Earlier versions of qemu contained a bug in the qcow2 code which
causes qemu to segfault when shutting down and flushing its internal
cache, and this can result in data loss.
The new API splits orderly close into a two-step process:
if (guestfs_shutdown (g) == -1) {
/* handle the error, eg. qemu error */
}
guestfs_close (g);
Note that the explicit shutdown step is only necessary in the case
where you have made changes to the disk image and want to handle write
errors. Read the documentation for further information.
This change also:
- deprecates guestfs_kill_subprocess
- turns guestfs_kill_subprocess into the same as guestfs_shutdown
- changes guestfish and other tools to call shutdown + close
where necessary (not for read-only tools)
- updates documentation
- updates examples
The order is now:
- remove the handle from the list of handles
- send close trace message
- sync and shutdown qemu
- run user close callback
- free temporary directory
- free memory
This commit ought to be no functional change.
On Linux, sync(2) does not actually issue a write barrier, thus it
doesn't force a flush of the underlying hardware write cache (or
qemu's disk cache in the virtual case).
This can be a problem, because libguestfs relies on running sync in
the appliance, followed by killing qemu (using SIGTERM).
In most cases, this is fine, because killing qemu with SIGTERM should
cause it to flush out the disk cache before it exits. However we have
found various bugs in qemu which cause qemu to crash while doing the
flush, leaving the data unwritten (see RHBZ#836913).
The solution is to issue fsync(2) to the block devices. This has a
write barrier, so it ensures that qemu writes out its cache long
before we get around to killing qemu.
Replace:
cp tests/guests/fedora.img test.img
with the longer but possibly more space-efficient equivalent:
qemu-img create -F raw -b tests/guests/fedora.img -f qcow2 test.qcow2
This returns the number of whole block devices added. It is usually
simpler to call this than to list the devices and count them, which
is what we do in some places in the current codebase.
On Debian, the Ruby C extensions library isn't '-lruby', it's
something like '-lruby1.8' or '-lruby-1.9.1' and these can even be
parallel-installed.
Fix detection so we use Ruby's own rbconfig.rb file to find the right
library to use.