mirror of
https://github.com/libguestfs/libguestfs.git
synced 2026-03-21 22:53:37 +00:00
Currently the guestfs_readdir() API can not list long directories, due to it sending back the whole directory listing in a single guestfs protocol response, which is limited to GUESTFS_MESSAGE_MAX (approx. 4MB) in size. Introduce the "internal_readdir" action, for transferring the directory listing from the daemon to the library through a FileOut parameter. Rewrite guestfs_readdir() on top of this new internal function: - The new "internal_readdir" action is a daemon action. Do not repurpose the "readdir" proc_nr (138) for "internal_readdir", as some distros ship the binary appliance to their users, and reusing the proc_nr could create a mismatch between library & appliance with obscure symptoms. Replace the old proc_nr (138) with a new proc_nr (511) instead; a mismatch would then produce a clear error message. Assume the new action will first be released in libguestfs-1.48.2. - Turn "readdir" from a daemon action into a non-daemon one. Call the daemon action guestfs_internal_readdir() manually, receive the FileOut parameter into a temp file, then deserialize the dirents array from the temp file. This patch sneakily fixes an independent bug, too. In the pre-patch do_readdir() function [daemon/readdir.c], when readdir() returns NULL, we don't distinguish "end of directory stream" from "readdir() failed". This rewrite fixes this problem -- I didn't see much value separating out the fix for the original do_readdir(). Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1674392 Signed-off-by: Laszlo Ersek <lersek@redhat.com> Message-Id: <20220502085601.15012-2-lersek@redhat.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
549 lines
17 KiB
Plaintext
549 lines
17 KiB
Plaintext
TODO list for libguestfs
|
|
======================================================================
|
|
|
|
This list contains random ideas and musings on features we could add
|
|
to libguestfs in future.
|
|
|
|
- RWMJ
|
|
|
|
FUSE API
|
|
--------
|
|
|
|
The API needs more test coverage, particularly lesser-used system
|
|
calls.
|
|
|
|
The big unresolved issue is UID/GID mapping between guest filesystem
|
|
IDs and the host. It's not easy to automate this because you need
|
|
extra details about the guest itself in order to get to its
|
|
UID->username map (eg. /etc/passwd from the guest).
|
|
|
|
Haskell bindings
|
|
----------------
|
|
|
|
Complete the Haskell bindings (see discussion on haskell-cafe).
|
|
|
|
PHP bindings
|
|
------------
|
|
|
|
Add bindtests to PHP bindings.
|
|
|
|
Complete bind tests
|
|
-------------------
|
|
|
|
Complete the bind tests - must test the return values and error cases.
|
|
|
|
virt-inspector - make libvirt XML
|
|
---------------------------------
|
|
|
|
It should be possible to generate libvirt XML from virt-inspector
|
|
data, at least partially. This would be just another output type so:
|
|
|
|
virt-inspector --libvirt guest.img
|
|
|
|
Note that recent versions of libvirt/virt-install allow guests to be
|
|
imported, so this is not so useful any more.
|
|
|
|
Ideas for extra commands
|
|
------------------------
|
|
|
|
General glibc / core programs:
|
|
chgrp
|
|
|
|
ext2 properties:
|
|
badblocks
|
|
debugfs
|
|
dumpe2fs
|
|
e2image
|
|
e2undo
|
|
filefrag
|
|
|
|
SELinux:
|
|
chcat
|
|
|
|
Oddball:
|
|
pivot_root
|
|
fts(3) / ftw(3)
|
|
|
|
sh-in, sh-out: run shell command with large input/output
|
|
debug sh-in, debug sh-out: debug versions of the above
|
|
|
|
Other initrd-* commands
|
|
-----------------------
|
|
|
|
Such as:
|
|
|
|
initrd-extract
|
|
initrd-replace
|
|
|
|
Port to Windows
|
|
---------------
|
|
|
|
"Port to Windows" means different things to different people.
|
|
|
|
The easiest is to port the library to Windows, but reuse the Linux
|
|
appliance/daemon. This would allow people to use libguestfs on
|
|
Windows hosts and link libguestfs to Windows programs. Doing this is
|
|
just a matter of chugging through the code fixing portability issues,
|
|
using gnulib as far as possible as the portability layer.
|
|
|
|
The hardest is probably a port of the daemon to Windows. The reason
|
|
to do this is so that you can use the native NTFS drivers (in Windows)
|
|
in order to edit Windows guests. Although back in 2009 we did some
|
|
work on this, I am now dubious about the utility of this, since
|
|
ntfs-3g works very well. See also discussion on list:
|
|
https://www.redhat.com/archives/libguestfs/2009-November/msg00165.html
|
|
|
|
virt-disk-explore
|
|
-----------------
|
|
|
|
For multi-level disk images such as live CDs:
|
|
http://rwmj.wordpress.com/2009/07/15/unpack-the-russian-doll-of-a-f11-live-cd/
|
|
|
|
It's possible with libguestfs to recursively look for anything that
|
|
might be a filesystem, mount-{,loop} it and look in those, revealing
|
|
anything in a disk image.
|
|
|
|
However this won't work easily for VM disk images in the disk image.
|
|
One would have to download those to the host and launch another
|
|
libguestfs instance.
|
|
|
|
[Not sure this is such a good idea. See also live CD inspection idea below.]
|
|
|
|
Map filesystems to disk blocks
|
|
------------------------------
|
|
|
|
Map files/filesystems/(any other object) to the actual disk
|
|
blocks they occupy.
|
|
|
|
And vice versa.
|
|
|
|
Is it even possible?
|
|
|
|
See also contribs/visualize-alignment/
|
|
|
|
Integration with host intrusion systems
|
|
---------------------------------------
|
|
|
|
Perfect way to monitor VMs from outside the VM. Look for file
|
|
hashes, log events, login/logout etc.
|
|
|
|
http://www.ossec.net/
|
|
http://la-samhna.de/samhain/
|
|
http://sourceforge.net/projects/aide/
|
|
http://osiris.shmoo.com/
|
|
http://sourceforge.net/projects/tripwire/
|
|
|
|
See also: virt-aide;
|
|
https://rwmj.wordpress.com/2013/05/16/scanning-offline-guests-using-openscap-and-guestmount/#content
|
|
|
|
Tips for new users in guestfish
|
|
-------------------------------
|
|
|
|
$ guestfish
|
|
Tip: You need to 'add disk.img' or 'alloc disk.img nn' to make a new image.
|
|
Type 'notips' to disable tips permanently.
|
|
><fs> add mydisk
|
|
Tip: You need to type 'run' before you can see into the disk image.
|
|
><fs> run
|
|
Tip: Use 'list-filesystems' to see what filesystems are available.
|
|
><fs> list-filesystems
|
|
/dev/vda1
|
|
Tip: Use 'mount fs /' to mount a filesystem.
|
|
><fs> mount /dev/vda1 /
|
|
Tip: Use 'll /' to view the filesystem or ...
|
|
><fs> ll /
|
|
|
|
Could we make guestfish interactive if commands are used without params?
|
|
------------------------------------------------------------------------
|
|
|
|
><fs> sparse
|
|
[[Prints man page]]
|
|
Image name? disk.img
|
|
Size of image? 10M
|
|
|
|
Display image as PS
|
|
-------------------
|
|
|
|
Display the structure of an image file as a PS.
|
|
|
|
Greater use of blkid / libblkid
|
|
-------------------------------
|
|
|
|
There are various useful functions in libblkid for listing partitions,
|
|
devices etc which we are essentially duplicating in the daemon. It
|
|
would make more sense to just use libblkid for this.
|
|
|
|
There are some places where we call out to the 'blkid' program. This
|
|
might be replaced by direct use of the library (if this is easier).
|
|
|
|
But it is very hard to be compatible between RHEL6 and RHEL5 when
|
|
using the library directly.
|
|
|
|
Properly fix device name translation
|
|
------------------------------------
|
|
|
|
Currently for Device parameters we pass a string name like "/dev/sda"
|
|
or "/dev/sdb2" or "/dev/VG/LV" to the daemon. Since Linux broke
|
|
device enumeration (RHBZ#1804207) this works less well, requiring
|
|
non-trivial translation within the daemon.
|
|
|
|
It would be far better if in the generator we mapped "/dev/XXX" to a
|
|
proper structure. Device index + partition | LV | MD | ...
|
|
This would be then used everywhere inside the daemon and mean we would
|
|
no longer need device name translation at all.
|
|
|
|
Visualization
|
|
-------------
|
|
|
|
Eric Sandeen pointed out the blktrace tool which is a better way of
|
|
capturing traces than using patched qemu (see
|
|
contrib/visualize-alignment). We would still use the same
|
|
visualization tools in conjunction with blktrace traces.
|
|
|
|
guestfish parsing
|
|
-----------------
|
|
|
|
At the moment guestfish uses an ad hoc parser which has many
|
|
shortcomings. We should change to using a lex/yacc-based scanner and
|
|
parser (there are better parsers out there, but yacc is sufficient and
|
|
very widely available).
|
|
|
|
The scanner must deal with the case of parsing a whole command string,
|
|
eg. for a command that the user types in:
|
|
|
|
><fs> add-drive-opts "/tmp/foo" readonly:true
|
|
|
|
and also with parsing single words from the command line:
|
|
|
|
guestfish add-drive-opts /tmp/foo readonly:true
|
|
|
|
Note the quotes are for scanning and don't indicate types.
|
|
|
|
We should also allow variables and expressions as part of this new
|
|
parsing code, eg:
|
|
|
|
set roots inspect-os
|
|
set product inspect-get-product-name %{roots[0]}
|
|
|
|
% is better than $ because of shell escaping and confusion with shell
|
|
variables.
|
|
|
|
Can we combine this with ability to set and read environment
|
|
variables? Currently guestfish uses many environment variables like
|
|
$EDITOR without any corresponding ability to set them.
|
|
|
|
set EDITOR /usr/bin/emacs
|
|
echo $EDITOR # or %{EDITOR}
|
|
edit /etc/resolv.conf
|
|
|
|
More ntfs tools
|
|
---------------
|
|
|
|
ntfsprogs actually has a lot more useful tools than we currently
|
|
use. Interesting ones are:
|
|
|
|
ntfscluster: display file(s) that occupy a cluster or sector
|
|
|
|
ntfsinfo: print various information about NTFS volume and files
|
|
|
|
ntfs streams: extract alternate streams from NTFS files
|
|
|
|
ntfsck: checker for NTFS filesystems
|
|
|
|
Undelete files
|
|
--------------
|
|
|
|
Two useful tools:
|
|
|
|
- ext2undelete
|
|
- ntfsundelete
|
|
|
|
More mkfs_opts options
|
|
----------------------
|
|
|
|
Useful options to offer:
|
|
- Set label.
|
|
- Set UUID.
|
|
|
|
Use /proc/self/mountinfo
|
|
------------------------
|
|
|
|
This file contains lots of interesting information about
|
|
what is mounted and where. eg:
|
|
|
|
16 21 0:3 / /proc rw,relatime - proc /proc rw
|
|
17 21 0:16 / /sys rw,relatime - sysfs /sys rw,seclabel
|
|
18 23 0:5 / /dev rw,relatime - devtmpfs udev rw,seclabel,size=1906740k,nr_inodes=476685,mode=755
|
|
26 21 253:3 / /home rw,relatime - ext4 /dev/mapper/vg-lv_home rw,seclabel,barrier=1,data=ordered
|
|
|
|
This could be used instead of current hairy code to parse the output
|
|
of the 'mount' command. We could add new APIs to return kernel mount
|
|
options, type of filesystem at a mountpoint etc.
|
|
|
|
guestfish drive letters
|
|
-----------------------
|
|
|
|
There should be an option to mount all Windows drives as separate
|
|
paths, like C: => /c/, D: => /d/ etc.
|
|
|
|
More inspection features
|
|
------------------------
|
|
|
|
- last shutdown time
|
|
- DHCP address
|
|
- last time the software was updated
|
|
- last user who logged in
|
|
- lastlog, last, who
|
|
|
|
Integrate virt-inspector with CMDBs
|
|
-----------------------------------
|
|
|
|
Either integrate virt-inspector with Configuration Management
|
|
Databases (CMDBs) or at least check that virt-inspector produces the
|
|
right range of data so that integration would be possible. The
|
|
standards for CMDBs come from the DMTF, see eg:
|
|
|
|
http://dmtf.org/news/pr/2009/7/dmtf-releases-cmdbf-standard-federating-configuration-management-data
|
|
|
|
Efficient way to visit all files
|
|
--------------------------------
|
|
|
|
https://rwmj.wordpress.com/2010/12/15/tip-audit-virtual-machine-for-setuid-files/#content
|
|
|
|
A naive method would look like:
|
|
|
|
g#visit ~return_stats:true "/" (
|
|
fun pathname stat ->
|
|
...
|
|
)
|
|
|
|
However this has two disadvantages:
|
|
|
|
- requires hand-written custom bindings in each language
|
|
- unclear about locking, thread-safety and re-entrancy of handle g
|
|
|
|
A better way would be to have some sort of explicit "download all
|
|
filenames and stat structures", which could then be iterated over:
|
|
|
|
let files = g#find_opts ~return_stats:true "/" in
|
|
List.iter (
|
|
fun pathname stat ->
|
|
...
|
|
)
|
|
|
|
The problem with this is that 'files' is going to be larger than a
|
|
protocol buffer.
|
|
|
|
This leads to thinking about changes to the protocol / generator to
|
|
make this simpler. The proposal would be to add RBigStringList,
|
|
RBigStructList [or RBig (Ranytype ...)]. These would work like
|
|
FileOut, in that they would use file streaming to stream XDR
|
|
structures (probably written to a file on the library side).
|
|
Generated code would hide most of the implementation.
|
|
|
|
We also need to think about security issues: is it possible for the
|
|
daemon to keep sending back data forever, and if so what happens on
|
|
the library side.
|
|
|
|
[Users can now use virt-ls to solve some of these problems, but it is
|
|
not a general solution at the API level]
|
|
|
|
Interactive disk creator
|
|
------------------------
|
|
|
|
An interactive disk creator program.
|
|
|
|
Backend for disconnected operation
|
|
----------------------------------
|
|
|
|
http://libguestfs.org/guestfs.3.html#guestfs_set_backend
|
|
|
|
"Librarian" has an idea that he should be able to attach to a regular
|
|
appliance, but disconnect from it and reconnect to it later. This
|
|
would be some sort of modified backend (see link above).
|
|
|
|
The complexity here is that we would no longer have access to
|
|
stdin/stdout (or we'd have to direct that somewhere else).
|
|
|
|
libosinfo mappings for virt-inspector
|
|
-------------------------------------
|
|
|
|
Return libosinfo mappings from inspection API.
|
|
|
|
virt-sysprep ideas
|
|
------------------
|
|
|
|
- other Spacewalk / RHN IDs (?)
|
|
- Windows sysprep
|
|
(see: https://github.com/clalancette/oz/blob/e74ce83283d468fd987583d6837b441608e5f8f0/oz/Windows.py )
|
|
- (librarian suggests ...)
|
|
. run external guestfish script virt-sysprep --fish=/tmp/foo.fish
|
|
- if drives are encrypted, then dm-crypt key should be changed
|
|
and drives all re-encrypted
|
|
- secure erase of inodes etc using scrub (Steve Grubb)
|
|
- fix the virt-sysprep fs-uuids plugin
|
|
|
|
- virt-sysprep should be able to zero-free space on the disks (a bit
|
|
like virt-sparsify). This is a security measure to stop people
|
|
trying to read the deleted files.
|
|
|
|
Kazuo Moriwaka adds:
|
|
|
|
- swap devices (both of block device and file) should be wiped. This may
|
|
good for security purpose, and size. I found virt-sparsify can clear
|
|
swap partition.
|
|
- --script is nice. Defining default sysprep script directory
|
|
like /usr/lib/guestfs/sysprep-scripts.d/ may be useful to integrate
|
|
other package maintainer(or ISV)'s effort.
|
|
- To achieve better (or crazy) coverage, a simple examination will be
|
|
help:
|
|
Install the same kickstart into VM twice, and diff the trees.
|
|
Many autogenerated IDs and configs can be found :)
|
|
|
|
As well as 'virt-sysprep' there is a case for a 'virt-customize' tool
|
|
which can customize templated guests. This would be useful within
|
|
companies/organizations that want to offer multiple guests, but all
|
|
customized with the organization logo etc. Some ideas:
|
|
|
|
- change the background image to some custom desktop
|
|
- change the sign-on messages (/etc/issue.net etc)
|
|
- Windows login script/service
|
|
|
|
Note that virt-sysprep has gradually gained some of these features,
|
|
eg. setting hostname, changing passwords. Since this precedent has
|
|
now been set, it could do more of the same.
|
|
|
|
virt-make-fs and virt-win-reg need to not be in Perl
|
|
----------------------------------------------------
|
|
|
|
Probably they should be in C or OCaml.
|
|
|
|
Integrate snap-type functionality in inspection tools
|
|
-----------------------------------------------------
|
|
|
|
Mo Morsi's "snap" program lets you describe a guest as the list of
|
|
packages (eg. RPMs) installed + changes made to those RPMs + files
|
|
added.
|
|
|
|
http://projects.morsi.org/wiki/Snap
|
|
|
|
This results in a compact description of the guest. He even managed
|
|
to do a kind of migration of guests by simply recreating the guest
|
|
from the description on the target machine.
|
|
|
|
It would be ideal to integrate this and/or use inspection to do this.
|
|
|
|
Ongoing code cleanups
|
|
---------------------
|
|
|
|
Examine every use of 'int' in C code for signed overflow problems.
|
|
|
|
All file descriptors in the library and daemon should normally be
|
|
opened with O_CLOEXEC. Therefore we need to examine every call to:
|
|
|
|
- open, openat
|
|
- creat
|
|
- pipe (see also: pipe2)
|
|
- dup, dup2 (see also: dup3)
|
|
- socket, socketpair
|
|
- accept (see also: accept4)
|
|
- signalfd, timerfd, epoll_create
|
|
|
|
virt-sparsify enhancements
|
|
--------------------------
|
|
|
|
'virt-sparsify --whitelist' option to generate skeletons (for
|
|
debugging, bug forensics, diagnosis). The whilelist option would
|
|
specify a list of files to be *preserved*. All other files in the
|
|
image would be replaced by equivalent files of zeroes, thus minimizing
|
|
the size of the debug image that needs to be shipped to us by the
|
|
customer.
|
|
|
|
Sort out partitioning
|
|
---------------------
|
|
|
|
Ignoring some legacy APIs, we currently have a mixed selection of
|
|
'part-*' APIs, implemented using parted. We don't like parted or
|
|
libparted very much, and would love to replace it with something else.
|
|
The part-* APIs are quirky, but not too bad and we should maintain and
|
|
extend them instead of making another set of APIs.
|
|
|
|
One option is to write "libmbr" and "libgpt" libraries that would just
|
|
do MBR and GPT respectively, and do it directly and do it well. They
|
|
wouldn't try to abstract anything (so, unlike libparted). We could
|
|
then reimplement the part-* APIs on top of these hopefully sensible
|
|
libraries. This is a lot of work.
|
|
|
|
Another option is to look for tools or libraries to replace parted.
|
|
For GPT there is a fairly obvious candidate: Rod Smith's GPT fdisk
|
|
(http://www.rodsbooks.com/gdisk/). Rod has spent a lot of time
|
|
studying GPT, and seems to know more about it than any sane man
|
|
should. There is a command line tool designed for scripts called
|
|
'sgdisk'. The tools are packaged for many Linux distros. Even if
|
|
this approach works, it doesn't solve the MBR problem, so likely we'd
|
|
have to write a library for that (or perhaps go back to sfdisk but
|
|
using a very abstracted interface over sfdisk).
|
|
|
|
hivex
|
|
-----
|
|
|
|
Reimplement virt-win-reg to use this API. (This is difficult because
|
|
the Perl libraries underneath access the hivex API directly).
|
|
|
|
ruby
|
|
----
|
|
|
|
Implement blocking calls. The API for this:
|
|
|
|
http://www.spacevatican.org/2012/7/5/whos-afraid-of-the-big-bad-lock/
|
|
|
|
is very poorly designed and essentially impossible for us to use:
|
|
|
|
https://bugs.ruby-lang.org/issues/5543
|
|
|
|
particularly if we also want to maintain backwards compatibility with
|
|
Ruby 1.8, and/or maintain volatile VALUEs on the stack.
|
|
|
|
virt-builder
|
|
------------
|
|
|
|
- how can we give users a shell for debugging purposes?
|
|
|
|
- let notes etc be localized, ie. notes[en]=...
|
|
|
|
- add a CLI option to print the in-built path/fingerprint(s)
|
|
|
|
- doing virt-builder then running (eg. via qemu, libvirt?) is common; is
|
|
it possible to make this more automatic?
|
|
|
|
- document:
|
|
|
|
* how to integrate with ansible, chef [puppet documented already]
|
|
* how to import to EC2
|
|
|
|
- /etc/resolv.conf handling works but is best described as a hack:
|
|
https://github.com/libguestfs/libguestfs/commit/9521422ce60578f7196cc8b7977d998159238c19
|
|
|
|
- sometimes (not always) aug_init takes ages, why?
|
|
|
|
Midnight Commander (mc) extension
|
|
---------------------------------
|
|
|
|
Write an extension for mc that would let people browse into
|
|
filesystems. See
|
|
http://repo.or.cz/w/midnight-commander.git/tree/HEAD:/misc/ext.d
|
|
|
|
Improvements in virt-log
|
|
------------------------
|
|
|
|
- Make it faster, especially if the user wants to grep the output.
|
|
|
|
- Support Windows guests, see
|
|
http://rwmj.wordpress.com/2011/04/17/decoding-the-windows-event-log-using-guestfish/
|
|
|
|
Subsecond handling in virt-diff, virt-ls
|
|
----------------------------------------
|
|
|
|
Handle nanoseconds properly. You should be able to specify them on
|
|
the command line and display them.
|