=encoding utf8

=head1 NAME

guestfs-performance - engineering libguestfs for greatest performance

=head1 DESCRIPTION

This page documents how to get the greatest performance out of
libguestfs, especially when you expect to use libguestfs to manipulate
thousands of virtual machines or disk images.

Three main areas are covered. Libguestfs runs an appliance (a small
Linux distribution) inside qemu/KVM. The first two areas are
minimizing the time taken to start this appliance, and reducing the
number of times the appliance has to be started. The third area is
shortening the time taken to inspect VMs.

=head1 BASELINE MEASUREMENTS

Before making changes to how you use libguestfs, take baseline
measurements.

=head2 BASELINE: STARTING THE APPLIANCE

On an unloaded machine, time how long it takes to start up the
appliance:

 time guestfish -a /dev/null run

Run this command several times in a row and discard the first few
runs, so that you are measuring a typical "hot cache" case.

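For example, a simple shell loop like this collects several samples
(discard the first run or two while the cache warms up):

 for i in 1 2 3 4 5; do
     time guestfish -a /dev/null run
 done
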
=head3 Explanation

This command starts up the libguestfs appliance on a null disk, and
then immediately shuts it down. The first time you run the command,
it will create an appliance and cache it (usually under
C</var/tmp/.guestfs-*>). Subsequent runs should reuse the cached
appliance.

=head3 Expected results

You should expect times under 6 seconds. If the times you see on an
unloaded machine are above this, see the section
L</TROUBLESHOOTING POOR PERFORMANCE> below.

=head2 BASELINE: PERFORMING INSPECTION OF A GUEST

For this test you will need an unloaded machine and at least one real
guest or disk image. If you are planning to use libguestfs against
only X guests (eg. X = Windows), then using an X guest here would be
most appropriate. If you are planning to run libguestfs against a mix
of guests, then use a mix of guests for testing here.

Time how long it takes to perform inspection and mount the disks of
the guest. Use the first command if you will be using disk images,
and the second command if you will be using libvirt:

 time guestfish --ro -a disk.img -i exit

 time guestfish --ro -d GuestName -i exit

Run the command several times in a row and discard the first few runs,
so that you are measuring a typical "hot cache" case.

=head3 Explanation

This command starts up the libguestfs appliance on the named disk
image or libvirt guest, performs libguestfs inspection on it (see
L<guestfs(3)/INSPECTION>), mounts the guest's disks, then discards all
these results and shuts down.

The first time you run the command, it will create an appliance and
cache it (usually under C</var/tmp/.guestfs-*>). Subsequent runs
should reuse the cached appliance.

=head3 Expected results

You should expect times which are E<le> 5 seconds greater than those
measured in the first baseline test above. (For example, if the first
baseline test ran in 5 seconds, then this test should run in E<le> 10
seconds.)

=head1 UNDERSTANDING THE APPLIANCE AND WHEN IT IS BUILT/CACHED

The first time you use libguestfs, it will build and cache an
appliance. This is usually in C</var/tmp/.guestfs-*>, unless you have
set C<$TMPDIR> or C<$LIBGUESTFS_CACHEDIR>, in which case the appliance
will be under that temporary directory.

For more information about how the appliance is constructed, see
L<supermin(1)/SUPERMIN APPLIANCES>.

Every time libguestfs runs it will check that no host files used by
the appliance have changed. If any have, then the appliance is
rebuilt. This usually happens when a package is installed or updated
on the host (eg. using programs like C<yum> or C<apt-get>). The
reason for rebuilding the appliance is security: the newly installed
program might contain a security fix, and so we want to include the
fixed program in the appliance automatically.

These are the performance implications:

=over 4

=item *

The process of building (or rebuilding) the cached appliance is slow,
and you can avoid it by using a fixed appliance (see below).

=item *

If you are not using a fixed appliance, be aware that updating
software on the host will cause a one-time rebuild of the appliance.

=item *

C</var/tmp> (or C<$TMPDIR>, C<$LIBGUESTFS_CACHEDIR>) should be on a
fast disk, and have plenty of space for the appliance.

=back

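For example, a minimal sketch that moves the cache to a fast scratch
disk (the C</fast> mount point is an assumption; substitute your own):

 # Assumption: /fast is a fast local disk with plenty of space.
 export LIBGUESTFS_CACHEDIR=/fast/guestfs-cache
 mkdir -p "$LIBGUESTFS_CACHEDIR"
 # The first run builds and caches the appliance there:
 time guestfish -a /dev/null run
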
=head1 USING A FIXED APPLIANCE

To fully control when the appliance is built, you can build a fixed
appliance. This appliance should be stored on a fast local disk.

To build the appliance, run the command:

 libguestfs-make-fixed-appliance <directory>

replacing C<E<lt>directoryE<gt>> with the name of a directory where
the appliance will be stored (normally you would name a subdirectory,
for example C</usr/local/lib/guestfs/appliance> or
C</dev/shm/appliance>).

Then set C<$LIBGUESTFS_PATH> (and ensure this environment variable is
set in your libguestfs program), or modify your program so it calls
C<guestfs_set_path>. For example:

 export LIBGUESTFS_PATH=/usr/local/lib/guestfs/appliance

Now you can run libguestfs programs, virt tools, guestfish etc. as
normal. The programs will use your fixed appliance, and will never
build, rebuild, or cache their own appliance.

(For detailed information on this subject, see
L<libguestfs-make-fixed-appliance(1)>.)

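Putting the steps together, a complete sequence (using the example
directory from above) looks like this:

 libguestfs-make-fixed-appliance /usr/local/lib/guestfs/appliance
 export LIBGUESTFS_PATH=/usr/local/lib/guestfs/appliance
 # All subsequent runs use the fixed appliance:
 time guestfish -a /dev/null run
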
=head2 PERFORMANCE OF THE FIXED APPLIANCE

In our testing we did not find that using a fixed appliance gave any
measurable performance benefit, even when the appliance was located in
memory (ie. on C</dev/shm>). However there are two points to
consider:

=over 4

=item 1.

Using a fixed appliance stops libguestfs from ever rebuilding the
appliance, meaning that libguestfs will have more predictable start-up
times.

=item 2.

The appliance is loaded on demand. A simple test such as:

 time guestfish -a /dev/null run

does not load very much of the appliance. A real libguestfs program
using complicated API calls would demand-load a lot more of the
appliance. Being able to store the appliance in a specified location
makes the performance more predictable.

=back

=head1 REDUCING THE NUMBER OF TIMES THE APPLIANCE IS LAUNCHED

By far the most effective, though not always the simplest, way to get
good performance is to ensure that the appliance is launched the
minimum number of times. This will probably involve changing your
libguestfs application.

Try to call C<guestfs_launch> at most once per target virtual machine
or disk image.

Instead of using a separate instance of L<guestfish(1)> to make a
series of changes to the same guest, use a single instance of
guestfish and/or use the guestfish I<--listen> option.

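For example, a remote-control sketch using I<--listen> (the disk image
name and the commands sent are illustrative):

 # Start one long-running guestfish instance; this exports
 # $GUESTFISH_PID into the current shell.
 eval "$(guestfish --listen -a disk.img)"
 guestfish --remote run
 guestfish --remote list-filesystems
 # ... further commands reuse the same appliance ...
 guestfish --remote exit
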
Consider writing your program as a daemon which holds a guest open
while making a series of changes. Or marshal all the operations you
want to perform before opening the guest.

You can also try adding disks from multiple guests to a single
appliance (a minimal sketch appears after the list below). Before
trying this, note the following points:

=over 4

=item 1.

Adding multiple guests to one appliance is a security problem because
it may allow one guest to interfere with the disks of another guest.
Only do it if you trust all the guests, or if you can group guests by
trust.

=item 2.

There is a hard limit on the number of disks you can add to a single
appliance. Call L<guestfs(3)/guestfs_max_disks> to get this limit.
For further information see L<guestfs(3)/LIMITS>.

=item 3.

Using libguestfs this way is complicated. Disks can have unexpected
interactions: for example, if two guests use the same UUID for a
filesystem (because they were cloned), or have volume groups with the
same name (but see C<guestfs_lvm_set_filter>).

=back

L<virt-df(1)> adds multiple disks by default, so the source code for
that program would be a good place to start.

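As that minimal sketch, two mutually trusted disk images can be added
to one appliance like this (the image names are illustrative):

 guestfish --ro -a guest1.img -a guest2.img <<'EOF'
 run
 list-filesystems
 EOF
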
=head1 SHORTENING THE TIME TAKEN FOR INSPECTION OF VMs

The main advice is obvious: do not perform inspection (which is
expensive) unless you need the results.

If you previously performed inspection on the guest, then it may be
safe to cache and reuse the results from last time.

Some disks don't need to be inspected at all: for example, if you are
creating a disk image, or if the disk image is not a VM, or if the
disk image has a known layout.

Even when basic inspection (C<guestfs_inspect_os>) is required,
auxiliary inspection operations may be avoided:

=over 4

=item *

Mounting disks is only necessary to get further filesystem
information, so it can be skipped when inspection alone is enough
(see the sketch after this list).

=item *

Listing applications (C<guestfs_inspect_list_applications>) is an
expensive operation on Linux, but almost free on Windows.

=item *

Generating a guest icon (C<guestfs_inspect_get_icon>) is cheap on
Linux but expensive on Windows.

=back

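For example, this performs inspection without mounting any disks (the
image name is illustrative):

 guestfish --ro -a disk.img run : inspect-os
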
=head1 PARALLEL APPLIANCES

Libguestfs appliances are mostly I/O bound and you can launch multiple
appliances in parallel. Provided there is enough free memory, there
should be little difference in launching 1 appliance vs N appliances
in parallel.

On a 2-core (4-thread) laptop with 16 GB of RAM, using the (not
especially realistic) test Perl script below, the following plot shows
excellent scalability when running between 1 and 20 appliances in
parallel:

 [Plot: total wall-clock time in seconds (y-axis, from about 3 to 12)
 against the number of parallel appliances (x-axis, from 1 to 20);
 the time rises roughly linearly, from about 3 seconds for a single
 appliance to about 12 seconds for 20 running in parallel.]

It is possible to run many more than 20 appliances in parallel, but if
you are using the libvirt backend then you should be aware that, out
of the box, libvirt limits the number of client connections to 20
(this limit can usually be raised; see the C<max_clients> setting in
libvirtd's configuration).

The simple Perl script below was used to collect the data for the plot
above, but there is much more information on this subject, including
more advanced test scripts and graphs, available in the following blog
postings:

L<http://rwmj.wordpress.com/2013/02/25/multiple-libguestfs-appliances-in-parallel-part-1/>
L<http://rwmj.wordpress.com/2013/02/25/multiple-libguestfs-appliances-in-parallel-part-2/>
L<http://rwmj.wordpress.com/2013/02/25/multiple-libguestfs-appliances-in-parallel-part-3/>
L<http://rwmj.wordpress.com/2013/02/25/multiple-libguestfs-appliances-in-parallel-part-4/>

 #!/usr/bin/perl -w

 use strict;
 use threads;
 use Sys::Guestfs;
 use Time::HiRes qw(time);

 # Launch a single appliance on a null disk, then shut it down.
 sub test {
     my $g = Sys::Guestfs->new;
     $g->add_drive_ro ("/dev/null");
     $g->launch ();

     # You could add some work for libguestfs to do here.

     $g->close ();
 }

 # Get everything into cache.
 test (); test (); test ();

 for my $nr_threads (1..20) {
     my $start_t = time ();
     my @threads;
     foreach (1..$nr_threads) {
         push @threads, threads->create (\&test);
     }
     foreach (@threads) {
         $_->join ();
         if (my $err = $_->error ()) {
             die "launch failed with $nr_threads threads: $err";
         }
     }
     my $end_t = time ();
     printf ("%d %.2f\n", $nr_threads, $end_t - $start_t);
 }

=head1 USING USER-MODE LINUX

Since libguestfs 1.24, it has been possible to use the User-Mode Linux
(uml) backend instead of KVM (see
L<guestfs(3)/USER-MODE LINUX BACKEND>). This section makes some
general remarks about this backend, but it is B<highly advisable> to
measure your own workload under UML rather than trusting comments or
intuition.

=over 4

=item *

On baremetal, UML usually performs the same as, or slightly slower
than, KVM.

=item *

However UML often performs the same under virtualization as it does on
baremetal, whereas KVM can run much slower under virtualization (since
hardware virt acceleration is not available).

=item *

Upload and download are as much as 10 times slower on UML than on KVM.
Libguestfs sends this data over the UML emulated serial port, which is
far less efficient than KVM's virtio-serial.

=item *

UML lacks some features (eg. qcow2 support), so it may not be
applicable at all.

=back

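To experiment, the backend is normally selected with the
C<LIBGUESTFS_BACKEND> environment variable (this sketch assumes a
libguestfs new enough to have the uml backend, and a suitable UML
binary installed):

 export LIBGUESTFS_BACKEND=uml
 time guestfish -a /dev/null run
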
For some actual figures, see:
L<http://rwmj.wordpress.com/2013/08/14/performance-of-user-mode-linux-as-a-libguestfs-backend/#content>

=head1 TROUBLESHOOTING POOR PERFORMANCE

=head2 ENSURE HARDWARE VIRTUALIZATION IS AVAILABLE

Use C</proc/cpuinfo> and this page:

http://virt-tools.org/learning/check-hardware-virt/

to ensure that hardware virtualization is available. Note that you
may need to enable it in your BIOS.

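For example, a quick check (a non-zero count means the CPU advertises
hardware virtualization):

 grep -c -E 'vmx|svm' /proc/cpuinfo
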
Hardware virt is not usually available inside VMs, and libguestfs will
run slowly inside another virtual machine whatever you do. Nested
virtualization does not work well in our experience, and is certainly
no substitute for running libguestfs on baremetal.

=head2 ENSURE KVM IS AVAILABLE

Ensure that KVM is enabled and available to the user that will run
libguestfs. It should be safe to set 0666 permissions on C</dev/kvm>,
and most distributions now do this.

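For example, to check that the KVM device exists and is accessible to
the current user:

 ls -l /dev/kvm
 [ -r /dev/kvm ] && [ -w /dev/kvm ] && echo "KVM is available"
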
=head2 PROCESSORS TO AVOID

Avoid processors that don't have hardware virtualization, and some
processors which are simply very slow (AMD Geode being a great
example).

=head1 DETAILED TIMINGS USING ANNOTATE

Use the L<annotate-output(1)> command to show detailed timings:

 $ annotate-output +'%T.%N' guestfish -a /dev/null run -v
 22:17:53.215784625 I: Started guestfish -a /dev/null run -v
 22:17:53.240335409 E: libguestfs: [00000ms] supermin-helper --verbose -f checksum '/usr/lib64/guestfs/supermin.d' x86_64
 22:17:53.266857866 E: supermin helper [00000ms] whitelist = (not specified), host_cpu = x86_64, kernel = (null), initrd = (null), appliance = (null)
 22:17:53.272704072 E: supermin helper [00000ms] inputs[0] = /usr/lib64/guestfs/supermin.d
 22:17:53.276528651 E: checking modpath /lib/modules/3.4.0-1.fc17.x86_64.debug is a directory
 [etc]

The timestamps are C<hours:minutes:seconds.nanoseconds>. By comparing
the timestamps you can see exactly how long each operation in the boot
sequence takes.

=head1 DETAILED TIMINGS USING SYSTEMTAP

You can use SystemTap (L<stap(1)>) to get detailed timings from
libguestfs programs.

Save the following script as C<time.stap>:

 global last;

 function display_time () {
   now = gettimeofday_us ();
   delta = 0;
   if (last > 0)
     delta = now - last;
   last = now;

   printf ("%d (+%d):", now, delta);
 }

 probe begin {
   last = 0;
   printf ("ready\n");
 }

 /* Display all calls to static markers. */
 probe process("/usr/lib*/libguestfs.so.0")
           .provider("guestfs").mark("*") ? {
   display_time();
   printf ("\t%s %s\n", $$name, $$parms);
 }

 /* Display all calls to guestfs_* functions. */
 probe process("/usr/lib*/libguestfs.so.0")
           .function("guestfs_[a-z]*") ? {
   display_time();
   printf ("\t%s %s\n", probefunc(), $$parms);
 }

Run it as root in one window:

 # stap time.stap
 ready

It prints "ready" when SystemTap has loaded the program. Run your
libguestfs program, guestfish or a virt tool in another window. For
example:

 $ guestfish -a /dev/null run

In the stap window you will see a large amount of output, with the
time taken for each step shown (microseconds in parentheses). For
example:

 xxxx (+0):   guestfs_create
 xxxx (+29):  guestfs_set_pgroup g=0x17a9de0 pgroup=0x1
 xxxx (+9):   guestfs_add_drive_opts_argv g=0x17a9de0 [...]
 xxxx (+8):   guestfs___safe_strdup g=0x17a9de0 str=0x7f8a153bed5d
 xxxx (+19):  guestfs___safe_malloc g=0x17a9de0 nbytes=0x38
 xxxx (+5):   guestfs___safe_strdup g=0x17a9de0 str=0x17a9f60
 xxxx (+10):  guestfs_launch g=0x17a9de0
 xxxx (+4):   launch_start
 [etc]

You will need to consult, and even modify, the source to libguestfs to
fully understand the output.

=head1 DETAILED DEBUGGING USING GDB

You can attach to the appliance BIOS/kernel using gdb. If you know
what you're doing, this can be a useful way to diagnose boot
regressions.

Firstly, you have to change qemu so it runs with the C<-S> and C<-s>
options. These options cause qemu to pause at boot and allow you to
attach a debugger. Read L<qemu(1)> for further information.
Libguestfs invokes qemu several times (to scan the help output and so
on) and you only want the final invocation of qemu to use these
options, so use a qemu wrapper script like this:

 #!/bin/bash -

 # Set this to point to the real qemu binary.
 qemu=/usr/bin/qemu-kvm

 if [ "$1" != "-global" ]; then
     # Scanning help output etc.
     exec "$qemu" "$@"
 else
     # Really running qemu.
     exec "$qemu" -S -s "$@"
 fi

Now run guestfish or another libguestfs tool with the qemu wrapper
(see L<guestfs(3)/QEMU WRAPPERS> to understand what this is doing):

 LIBGUESTFS_HV=/path/to/qemu-wrapper guestfish -a /dev/null -v run

This should pause just after qemu launches. In another window, attach
to qemu using gdb:

 $ gdb
 (gdb) set architecture i8086
 The target architecture is assumed to be i8086
 (gdb) target remote :1234
 Remote debugging using :1234
 0x0000fff0 in ?? ()
 (gdb) cont

At this point you can use standard gdb techniques, eg. hitting C<^C>
to interrupt the boot and C<bt> to get a stack trace, setting
breakpoints, etc. Note that when you are past the BIOS and into the
Linux kernel, you'll want to change the architecture back to 32 or 64
bit.

=head1 SEE ALSO

L<supermin(1)>,
L<guestfish(1)>,
L<guestfs(3)>,
L<guestfs-examples(3)>,
L<libguestfs-make-fixed-appliance(1)>,
L<stap(1)>,
L<qemu(1)>,
L<gdb(1)>,
L<http://libguestfs.org/>.

=head1 AUTHORS

Richard W.M. Jones (C<rjones at redhat dot com>)

=head1 COPYRIGHT

Copyright (C) 2012 Red Hat Inc.