mirror of
https://github.com/libguestfs/libguestfs.git
synced 2026-03-22 07:03:38 +00:00
Febootstrap has been renamed upstream to 'supermin': https://www.redhat.com/archives/libguestfs/2013-February/msg00004.html This commit changes libguestfs so it can use either program to build the supermin appliance.
590 lines
18 KiB
Plaintext
590 lines
18 KiB
Plaintext
TODO list for libguestfs
|
|
======================================================================
|
|
|
|
This list contains random ideas and musings on features we could add
|
|
to libguestfs in future.
|
|
|
|
- RWMJ
|
|
|
|
FUSE API
|
|
--------
|
|
|
|
The API needs more test coverage, particularly lesser-used system
|
|
calls.
|
|
|
|
The big unresolved issue is UID/GID mapping between guest filesystem
|
|
IDs and the host. It's not easy to automate this because you need
|
|
extra details about the guest itself in order to get to its
|
|
UID->username map (eg. /etc/passwd from the guest).
|
|
|
|
Haskell bindings
|
|
----------------
|
|
|
|
Complete the Haskell bindings (see discussion on haskell-cafe).
|
|
|
|
PHP bindings
|
|
------------
|
|
|
|
Add bindtests to PHP bindings.
|
|
|
|
Complete bind tests
|
|
-------------------
|
|
|
|
Complete the bind tests - must test the return values and error cases.
|
|
|
|
virt-inspector - make libvirt XML
|
|
---------------------------------
|
|
|
|
It should be possible to generate libvirt XML from virt-inspector
|
|
data, at least partially. This would be just another output type so:
|
|
|
|
virt-inspector --libvirt guest.img
|
|
|
|
Note that recent versions of libvirt/virt-install allow guests to be
|
|
imported, so this is not so useful any more.
|
|
|
|
Ideas for extra commands
|
|
------------------------
|
|
|
|
General glibc / core programs:
|
|
chgrp
|
|
|
|
ext2 properties:
|
|
badblocks
|
|
debugfs
|
|
dumpe2fs
|
|
e2image
|
|
e2undo
|
|
filefrag
|
|
|
|
SELinux:
|
|
chcat
|
|
restorecon
|
|
[Wanlong Gao submitted patches for restorecon, but
|
|
there are problems with using the restorecon binary
|
|
from the host on the guest. Most of the time it
|
|
would do more harm than good.]
|
|
setfiles
|
|
|
|
Oddball:
|
|
pivot_root
|
|
fts(3) / ftw(3)
|
|
|
|
sh-in, sh-out: run shell command with large input/output
|
|
debug sh-in, debug sh-out: debug versions of the above
|
|
|
|
Other initrd-* commands
|
|
-----------------------
|
|
|
|
Such as:
|
|
|
|
initrd-extract
|
|
initrd-replace
|
|
|
|
virt-rescue pty
|
|
---------------
|
|
|
|
See:
|
|
http://search.cpan.org/~rgiersig/IO-Tty-1.08/Pty.pm
|
|
http://www.perlmonks.org/index.pl?node_id=582185
|
|
|
|
Note that pty requires cooperation inside the C code too (there are
|
|
two sides to a pty, and one has to be handled after the fork).
|
|
|
|
[I tried to implement this in the new C virt-rescue, but it doesn't
|
|
work. qemu is implementing its own ptys, and they are broken. Need
|
|
to fix qemu.]
|
|
|
|
Windows-based daemon/appliance
|
|
------------------------------
|
|
|
|
See discussion on list:
|
|
https://www.redhat.com/archives/libguestfs/2009-November/msg00165.html
|
|
|
|
virt-disk-explore
|
|
-----------------
|
|
|
|
For multi-level disk images such as live CDs:
|
|
http://rwmj.wordpress.com/2009/07/15/unpack-the-russian-doll-of-a-f11-live-cd/
|
|
|
|
It's possible with libguestfs to recursively look for anything that
|
|
might be a filesystem, mount-{,loop} it and look in those, revealing
|
|
anything in a disk image.
|
|
|
|
However this won't work easily for VM disk images in the disk image.
|
|
One would have to download those to the host and launch another
|
|
libguestfs instance.
|
|
|
|
[Not sure this is such a good idea. See also live CD inspection idea below.]
|
|
|
|
Map filesystems to disk blocks
|
|
------------------------------
|
|
|
|
Map files/filesystems/(any other object) to the actual disk
|
|
blocks they occupy.
|
|
|
|
And vice versa.
|
|
|
|
Is it even possible?
|
|
|
|
See also contribs/visualize-alignment/
|
|
|
|
Integration with host intrusion systems
|
|
---------------------------------------
|
|
|
|
Perfect way to monitor VMs from outside the VM. Look for file
|
|
hashes, log events, login/logout etc.
|
|
|
|
http://www.ossec.net/
|
|
http://la-samhna.de/samhain/
|
|
http://sourceforge.net/projects/aide/
|
|
http://osiris.shmoo.com/
|
|
http://sourceforge.net/projects/tripwire/
|
|
|
|
Freeze/thaw filesystems
|
|
-----------------------
|
|
|
|
Access to these ioctls:
|
|
http://git.kernel.org/linus/fcccf502540e3d7
|
|
|
|
Tips for new users in guestfish
|
|
-------------------------------
|
|
|
|
$ guestfish
|
|
Tip: You need to 'add disk.img' or 'alloc disk.img nn' to make a new image.
|
|
Type 'notips' to disable tips permanently.
|
|
><fs> add mydisk
|
|
Tip: You need to type 'run' before you can see into the disk image.
|
|
><fs> run
|
|
Tip: Use 'list-filesystems' to see what filesystems are available.
|
|
><fs> list-filesystems
|
|
/dev/vda1
|
|
Tip: Use 'mount fs /' to mount a filesystem.
|
|
><fs> mount /dev/vda1 /
|
|
Tip: Use 'll /' to view the filesystem or ...
|
|
><fs> ll /
|
|
|
|
Could we make guestfish interactive if commands are used without params?
|
|
------------------------------------------------------------------------
|
|
|
|
><fs> sparse
|
|
[[Prints man page]]
|
|
Image name? disk.img
|
|
Size of image? 10M
|
|
|
|
Better support for encrypted devices
|
|
------------------------------------
|
|
|
|
Currently LUKS support only works if the device contains volume
|
|
groups. If it contains, eg., partitions, you cannot access them.
|
|
We would like to add:
|
|
|
|
- Direct access to the /dev/mapper device (eg. if it contains
|
|
anything apart from VGs).
|
|
|
|
Display image as PS
|
|
-------------------
|
|
|
|
Display the structure of an image file as a PS.
|
|
|
|
Greater use of blkid / libblkid
|
|
-------------------------------
|
|
|
|
There are various useful functions in libblkid for listing partitions,
|
|
devices etc which we are essentially duplicating in the daemon. It
|
|
would make more sense to just use libblkid for this.
|
|
|
|
There are some places where we call out to the 'blkid' program. This
|
|
might be replaced by direct use of the library (if this is easier).
|
|
|
|
But it is very hard to be compatible between RHEL6 and RHEL5 when
|
|
using direct library.
|
|
|
|
Visualization
|
|
-------------
|
|
|
|
Eric Sandeen pointed out the blktrace tool which is a better way of
|
|
capturing traces than using patched qemu (see
|
|
contrib/visualize-alignment). We would still use the same
|
|
visualization tools in conjunction with blktrace traces.
|
|
|
|
guestfish parsing
|
|
-----------------
|
|
|
|
At the moment guestfish uses an ad hoc parser which has many
|
|
shortcomings. We should change to using a lex/yacc-based scanner and
|
|
parser (there are better parsers out there, but yacc is sufficient and
|
|
very widely available).
|
|
|
|
The scanner must deal with the case of parsing a whole command string,
|
|
eg. for a command that the user types in:
|
|
|
|
><fs> add-drive-opts "/tmp/foo" readonly:true
|
|
|
|
and also with parsing single words from the command line:
|
|
|
|
guestfish add-drive-opts /tmp/foo readonly:true
|
|
|
|
Note the quotes are for scanning and don't indicate types.
|
|
|
|
We should also allow variables and expressions as part of this new
|
|
parsing code, eg:
|
|
|
|
set roots inspect-os
|
|
set product inspect-get-product-name %{roots[0]}
|
|
|
|
% is better than $ because of shell escaping and confusion with shell
|
|
variables.
|
|
|
|
Can we combine this with ability to set and read environment
|
|
variables? Currently guestfish uses many environment variables like
|
|
$EDITOR without any corresponding ability to set them.
|
|
|
|
set EDITOR /usr/bin/emacs
|
|
echo $EDITOR # or %{EDITOR}
|
|
edit /etc/resolv.conf
|
|
|
|
More ntfs tools
|
|
---------------
|
|
|
|
ntfsprogs actually has a lot more useful tools than we currently
|
|
use. Interesting ones are:
|
|
|
|
ntfscluster: display file(s) that occupy a cluster or sector
|
|
|
|
ntfsinfo: print various information about NTFS volume and files
|
|
|
|
ntfs streams: extract alternate streams from NTFS files
|
|
|
|
ntfsck: checker for NTFS filesystems
|
|
|
|
Undelete files
|
|
--------------
|
|
|
|
Two useful tools:
|
|
|
|
- ext2undelete
|
|
- ntfsundelete
|
|
|
|
More mkfs_opts options
|
|
----------------------
|
|
|
|
Useful options to offer:
|
|
- Set label.
|
|
- Set UUID.
|
|
|
|
Use /proc/self/mountinfo
|
|
------------------------
|
|
|
|
This file contains lots of interesting information about
|
|
what is mounted and where. eg:
|
|
|
|
16 21 0:3 / /proc rw,relatime - proc /proc rw
|
|
17 21 0:16 / /sys rw,relatime - sysfs /sys rw,seclabel
|
|
18 23 0:5 / /dev rw,relatime - devtmpfs udev rw,seclabel,size=1906740k,nr_inodes=476685,mode=755
|
|
26 21 253:3 / /home rw,relatime - ext4 /dev/mapper/vg-lv_home rw,seclabel,barrier=1,data=ordered
|
|
|
|
This could be used instead of current hairy code to parse the output
|
|
of the 'mount' command. We could add new APIs to return kernel mount
|
|
options, type of filesystem at a mountpoint etc.
|
|
|
|
guestfish drive letters
|
|
-----------------------
|
|
|
|
There should be an option to mount all Windows drives as separate
|
|
paths, like C: => /c/, D: => /d/ etc.
|
|
|
|
More inspection features
|
|
------------------------
|
|
|
|
- last shutdown time
|
|
- DHCP address
|
|
- last time the software was updated
|
|
- last user who logged in
|
|
- lastlog, last, who
|
|
|
|
Integrate virt-inspector with CMDBs
|
|
-----------------------------------
|
|
|
|
Either integrate virt-inspector with Configuration Management
|
|
Databases (CMDBs) or at least check that virt-inspector produces the
|
|
right range of data so that integration would be possible. The
|
|
standards for CMDBs come from the DMTF, see eg:
|
|
|
|
http://dmtf.org/news/pr/2009/7/dmtf-releases-cmdbf-standard-federating-configuration-management-data
|
|
|
|
Efficient way to visit all files
|
|
--------------------------------
|
|
|
|
https://rwmj.wordpress.com/2010/12/15/tip-audit-virtual-machine-for-setuid-files/#content
|
|
|
|
A naive method would look like:
|
|
|
|
g#visit ~return_stats:true "/" (
|
|
fun pathname stat ->
|
|
...
|
|
)
|
|
|
|
However this has two disadvantages:
|
|
|
|
- requires hand-written custom bindings in each language
|
|
- unclear about locking, thread-safety and re-entrancy of handle g
|
|
|
|
A better way would be to have some sort of explicit "download all
|
|
filenames and stat structures", which could then be iterated over:
|
|
|
|
let files = g#find_opts ~return_stats:true "/" in
|
|
List.iter (
|
|
fun pathname stat ->
|
|
...
|
|
)
|
|
|
|
The problem with this is that 'files' is going to be larger than a
|
|
protocol buffer.
|
|
|
|
This leads to thinking about changes to the protocol / generator to
|
|
make this simpler. The proposal would be to add RBigStringList,
|
|
RBigStructList [or RBig (Ranytype ...)]. These would work like
|
|
FileOut, in that they would use file streaming to stream XDR
|
|
structures (probably written to a file on the library side).
|
|
Generated code would hide most of the implementation.
|
|
|
|
We also need to think about security issues: is it possible for the
|
|
daemon to keep sending back data forever, and if so what happens on
|
|
the library side.
|
|
|
|
[Users can now use virt-ls to solve some of these problems, but it is
|
|
not a general solution at the API level]
|
|
|
|
Interactive disk creator
|
|
------------------------
|
|
|
|
An interactive disk creator program.
|
|
|
|
Attach method for disconnected operation
|
|
----------------------------------------
|
|
|
|
http://libguestfs.org/guestfs.3.html#guestfs_set_attach_method
|
|
|
|
"Librarian" has an idea that he should be able to attach to a regular
|
|
appliance, but disconnect from it and reconnect to it later. This
|
|
would be some sort of modified attach method (see link above).
|
|
|
|
The complexity here is that we would no longer have access to
|
|
stdin/stdout (or we'd have to direct that somewhere else).
|
|
|
|
libosinfo mappings for virt-inspector
|
|
-------------------------------------
|
|
|
|
Return libosinfo mappings from inspection API.
|
|
|
|
virt-sysprep ideas
|
|
------------------
|
|
|
|
- other Spacewalk / RHN IDs (?)
|
|
- Windows sysprep
|
|
(see: https://github.com/clalancette/oz/blob/e74ce83283d468fd987583d6837b441608e5f8f0/oz/Windows.py )
|
|
- (librarian suggests ...)
|
|
. run external guestfish script virt-sysprep --fish=/tmp/foo.fish
|
|
- if drives are encrypted, then dm-crypt key should be changed
|
|
and drives all re-encrypted
|
|
- /etc/pki
|
|
(Steve says ...)
|
|
Rpm uses nss. Nss sets up its crypto database in
|
|
/etc/pki. Depending on how long the machine ran before cloning, you
|
|
may have picked up some certificates or things. This is an area
|
|
that you would want to look into.
|
|
- secure erase of inodes etc using scrub (Steve Grubb)
|
|
- other directories that could require cleaning include:
|
|
/var/run/*
|
|
(thanks Marko Myllynen, James Antill)
|
|
- remove or modify UUIDs in /etc/fstab (eg. on Ubuntu)
|
|
(thanks Joshua Daniel Franklin)
|
|
|
|
Kazuo Moriwaka adds:
|
|
|
|
- swap devices (both of block device and file) should be wiped. This may
|
|
good for security purpose, and size. I found virt-sparsify can clear
|
|
swap partition.
|
|
- --script is nice. Defining default sysprep script directory
|
|
like /usr/lib/guestfs/sysprep-scripts.d/ may be useful to integrate
|
|
other package maintainer(or ISV)'s effort.
|
|
- To achieve better (or crazy) coverage, a simple examination will be
|
|
help:
|
|
Install the same kickstart into VM twice, and diff the trees.
|
|
Many autogenerated IDs and configs can be found :)
|
|
|
|
As well as 'virt-sysprep' there is a case for a 'virt-customize' tool
|
|
which can customize templated guests. This would be useful within
|
|
companies/organizations that want to offer multiple guests, but all
|
|
customized with the organization logo etc. Some ideas:
|
|
|
|
- change the background image to some custom desktop
|
|
- change the sign-on messages (/etc/issue.net etc)
|
|
- Windows login script/service
|
|
|
|
Launch remote sessions over ssh
|
|
-------------------------------
|
|
|
|
We had an idea you could add a launch method that uses ssh, ie. all
|
|
supermin and qemu commands happen the same as now, but prefixed by
|
|
ssh so it happens on a remote machine.
|
|
|
|
Note that proper remote support and integration with libvirt is
|
|
different from this, and people are working on that. ssh would just
|
|
be "remote-lite".
|
|
|
|
virt-make-fs and virt-win-reg need to not be in Perl
|
|
----------------------------------------------------
|
|
|
|
Probably they should be in C or OCaml.
|
|
|
|
Integrate snap-type functionality in inspection tools
|
|
-----------------------------------------------------
|
|
|
|
Mo Morsi's "snap" program lets you describe a guest as the list of
|
|
packages (eg. RPMs) installed + changes made to those RPMs + files
|
|
added.
|
|
|
|
http://projects.morsi.org/wiki/Snap
|
|
|
|
This results in a compact description of the guest. He even managed
|
|
to do a kind of migration of guests by simply recreating the guest
|
|
from the description on the target machine.
|
|
|
|
It would be ideal to integrate this and/or use inspection to do this.
|
|
|
|
Ongoing code cleanups
|
|
---------------------
|
|
|
|
Examine every use of 'int' in C code for signed overflow problems.
|
|
|
|
All file descriptors in the library and daemon should normally be
|
|
opened with O_CLOEXEC. Therefore we need to examine every call to:
|
|
|
|
- open, openat
|
|
- creat
|
|
- pipe (see also: pipe2)
|
|
- dup, dup2 (see also: dup3)
|
|
- socket, socketpair
|
|
- accept (see also: accept4)
|
|
- signalfd, timerfd, epoll_create
|
|
|
|
virt-sparsify enhancements
|
|
--------------------------
|
|
|
|
TMPDIR should be checked to ensure that we won't run out of space
|
|
during the conversion, since current behaviour is very bad when this
|
|
happens (it usually causes virt-sparsify to hang). This requires
|
|
writing a small C binding to statvfs for OCaml.
|
|
|
|
'virt-sparsify --whitelist' option to generate skeletons (for
|
|
debugging, bug forensics, diagnosis). The whilelist option would
|
|
specify a list of files to be *preserved*. All other files in the
|
|
image would be replaced by equivalent files of zeroes, thus minimizing
|
|
the size of the debug image that needs to be shipped to us by the
|
|
customer.
|
|
|
|
Optimize the appliance
|
|
----------------------
|
|
|
|
Pass -cpu host. Anything else?
|
|
|
|
[The libvirt attach-method uses 'host-model' which is basically
|
|
the same as this]
|
|
|
|
Sort out partitioning
|
|
---------------------
|
|
|
|
Ignoring some legacy APIs, we currently have a mixed selection of
|
|
'part-*' APIs, implemented using parted. We don't like parted or
|
|
libparted very much, and would love to replace it with something else.
|
|
The part-* APIs are quirky, but not too bad and we should maintain and
|
|
extend them instead of making another set of APIs.
|
|
|
|
One option is to write "libmbr" and "libgpt" libraries that would just
|
|
do MBR and GPT respectively, and do it directly and do it well. They
|
|
wouldn't try to abstract anything (so, unlike libparted). We could
|
|
then reimplement the part-* APIs on top of these hopefully sensible
|
|
libraries. This is a lot of work.
|
|
|
|
Another option is to look for tools or libraries to replace parted.
|
|
For GPT there is a fairly obvious candidate: Rod Smith's GPT fdisk
|
|
(http://www.rodsbooks.com/gdisk/). Rod has spent a lot of time
|
|
studying GPT, and seems to know more about it than any sane man
|
|
should. There is a command line tool designed for scripts called
|
|
'sgdisk'. The tools are packaged for many Linux distros. Even if
|
|
this approach works, it doesn't solve the MBR problem, so likely we'd
|
|
have to write a library for that (or perhaps go back to sfdisk but
|
|
using a very abstracted interface over sfdisk).
|
|
|
|
qemu caching
|
|
------------
|
|
|
|
(Suggested by Paolo Bonzini and Kevin Wolf)
|
|
|
|
Measure the effect of cache=none, cache=directsync,
|
|
cache=writethrough, cache=writeback.
|
|
|
|
It's doubtful that using cache=none is useful, since it disables the
|
|
host cache making read-heavy workloads slower (they rely entirely on
|
|
the smaller appliance kernel's cache). And in libguestfs we don't
|
|
necessarily care about ongoing data integrity while writing, as long
|
|
as data is reliably written out when g.sync, g.shutdown or g.close
|
|
return. Also in libguestfs we effectively control the whole stack, so
|
|
we can ensure write barriers happen when we want.
|
|
|
|
libvirt attach-method
|
|
---------------------
|
|
|
|
Since libguestfs 1.19.24 this mostly works. Here are some suggested
|
|
items to work on:
|
|
|
|
- SELinux labelling of guestfsd.sock, console.sock
|
|
https://bugzilla.redhat.com/show_bug.cgi?id=842307
|
|
Once this is fixed, remove <seclabel type=none> from libvirt XML
|
|
|
|
- Check feature parity between src/launch-appliance.c and
|
|
src/launch-libvirt.c.
|
|
|
|
- Remote support. (This requires work on libvirt)
|
|
|
|
virt-sparsify should use discard
|
|
--------------------------------
|
|
|
|
This requires some changes to qemu to make discard work properly
|
|
throughout the entire stack.
|
|
|
|
Reimplement some APIs to avoid protocol limits
|
|
----------------------------------------------
|
|
|
|
We should reimplement the following APIs to avoid protocol limits.
|
|
These would be changed from daemon_functions to non_daemon_functions,
|
|
with the non-daemon versions implemented using guestfs_upload and
|
|
guestfs_download (and others). This change should be transparent from
|
|
the p.o.v of the API and ABI.
|
|
|
|
- guestfs_readdir
|
|
|
|
hivex
|
|
-----
|
|
|
|
Add more of hivex to the API, especially for writing.
|
|
|
|
Reimplement virt-win-reg to use this API. (This is difficult because
|
|
the Perl libraries underneath access the hivex API directly).
|
|
|
|
ruby
|
|
----
|
|
|
|
Implement blocking calls. The API for this:
|
|
|
|
http://www.spacevatican.org/2012/7/5/whos-afraid-of-the-big-bad-lock/
|
|
|
|
is very poorly designed and essentially impossible for us to use:
|
|
|
|
https://bugs.ruby-lang.org/issues/5543
|
|
|
|
particularly if we also want to maintain backwards compatibility with
|
|
Ruby 1.8, and/or maintain volatile VALUEs on the stack.
|