For the libvirt backend we no longer need to determine if we're using
a custom hypervisor using black magic. Instead we can simply check if
g->hv is non-NULL.
Add a generic way for backends to report the default hypervisor
(ie. QEMU) back to the main code.
The direct implemention reflects the current way that the hypervisor
is chosen at configure time (see m4/guestfs-qemu.m4).
For the libvirt backend, we are already getting this from libvirt
domcapabilities, so we can just return that field.
Note this may return NULL (roughly "no data"), particularly in the
libvirt case because that requires us to have launched the appliance
already.
If a user passes a custom hv, and uses add_libvirt_dom (like
in test820RHBZ912499.py) , this hits a bogus libvirt validation
check which rejects disabling svirt at both domain and disk level.
Work around it by skipping the redundant disk override when
svirt is disabled at the domain level.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
On fedora, ./configure will set QEMU=qemu-kvm, but libvirt
domcaps will default to qemu-system-x86_64. The former is a symlink
to the latter, but libguestfs doesn't know that, so determines
the user requested a non-default hv override and changes some
settings as a result (like disabling svirt).
Check for the symlink case and consider it a non-custom config
if realpath matches libvirt's default
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Add debugging about is_custom_hv result.
cache the initial call so we are only hitting the debug() once.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
libvirt has DAC relabeled sockets for us for a decade, so we don't
need to do chowning here anymore
The chmod calls are still required, otherwise our created sockets
may be too permissive.
Update the comment to try and preserve the still relevant info,
though now it's a bit awkwardly placed.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
We no longer use the libvirt version anywhere, except when reporting
the version. Remove this from the handle.
Simplify the remaining code. In particular:
* don't bother parsing the libvirt version, just print what
virGetVersion gives us
* guestfs_int_version_from_libvirt is dead code, so it can be removed
Libvirt 9.0.0 was released in January 2023, and it seems safe to
assume that if you're enabling the non-default backend, you can at
least use a new version of libvirt.
If you're using new libvirt, might as well also assume passt is
available.
This saves us going into a loop if virDomainDestroyFlags keeps
returning -EBUSY quickly, which apparenrly can happen in containers.
The equivalent 'direct' backend code sleeps for 2 seconds in this case.
Many years ago we used to pass acpi=off on the Linux kernel command
line. In commit db1f811b2 we stopped doing that (around 2016).
However unless you also use:
<features>
<acpi/>
</features>
then it turns out that libvirt disables ACPI generation at the qemu
level. None of this mattered until SeaBIOS 1.17 changed its
behaviour, causing ACPI to be required for virtio devices to work.
Updates: commit db1f811b29
Related: https://bugzilla.redhat.com/show_bug.cgi?id=2372329
Thanks: Gerd Hoffmann
All recent compilers support this (except MS compilers which we don't
care about). Assume it is supported. We test it in ./configure and
hard fail if it doesn't work.
We still define HAVE_ATTRIBUTE_CLEANUP but you can now assume it is
always defined and don't have to check it.
This was only theoretically supported, via curl. It's unlikely that
it really worked as it was never tested.
If needed it's better to use nbdkit-curl-plugin instead (this applies
to http and ftp as well).
After many, many years, although libvirt does still often fail to
work, it's generally more secure to stick with libvirt than to try
running qemu directly. The main issue here is that people have
cargo-culted LIBGUESTFS_BACKEND=direct everywhere (even when it's not
necessary).
We generate the <interface type="user"> element on libvirt 3.8.0+ already.
For selecting passt rather than SLIRP, we only need to insert the child
element <backend type='passt'>. Make that child element conditional on
libvirt 9.0.0+, plus "passt --help" being executable.
For the latter, place the new helper function guestfs_int_passt_runnable()
in "lib/launch.c" -- we're going to use the same function for the direct
backend as well.
This change exposes a number of (perceived) shortcomings in libvirt; I've
filed <https://bugzilla.redhat.com/show_bug.cgi?id=2222766> about those.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2184967
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Message-Id: <20230714132213.96616-3-lersek@redhat.com>
We require libvirt >= 0.10.2, and we included code to check this at
configure-, compile- and run-time. Remove the checks at compile and
run time (keep the ./configure check). Libvirt 0.10.2 was released
over 10 years ago so it's safe to assume that everyone has it by now.
Run this command across the source:
perl -pi.bak -e 's/(20[012][0-9])-20[12][012]/$1-2023/g' `git ls-files`
and remove changes to po{,-docs}/*.po{,t} (these will be regenerated
later when we run 'make dist').
These were added in libguestfs 1.14, but never really used. Only a
handful of probes were available. When I was benchmarking libguestfs
in 2016 I didn't even use these probes because better/simpler
techniques were available.
In https://bugzilla.redhat.com/show_bug.cgi?id=2082806 we've been
tracking an insidious qemu bug which intermittently prevents the
libguestfs appliance from starting. The symptoms are that SeaBIOS
starts and displays its messages, but the kernel isn't reached. We
found that the kernel does in fact start, but when it tries to set up
page tables and jump to protected mode it gets a triple fault which
causes the emulated CPU in qemu to reset (qemu exits).
This seems to only affect TCG (not KVM).
Yesterday I found that this is caused by using -cpu max which enables
the "la57" feature (5-level page tables[0]), and that we can make the
problem go away using -cpu max,la57=off. Note that I still don't
fully understand the qemu bug, so this is only a workaround.
I chose to disable 5-level page tables for both TCG and KVM, partly to
make the patch simpler, and partly because I guess it's not a feature
(ie. 57 bit linear addresses) that is useful for the libguestfs
appliance case, where we have limited physical memory and no need to
run any programs with huge address spaces.
I tested this by running both the direct & libvirt paths overnight. I
expect that this patch will fail with old qemu/libvirt which doesn't
understand the "la57" feature, but this is only intended as a
temporary workaround.
[0] Article about 5-level page tables as background:
https://lwn.net/Articles/717293/
Thanks: Laszlo Ersek
Fixes: https://answers.launchpad.net/ubuntu/+source/libguestfs/+question/701625
Acked-by: Laszlo Ersek <lersek@redhat.com>
Representing "iface" in the "drive_create_data" and "drive" structures is
now useless; the direct backend ignores "iface", while the libvirt one
rejects it unless it is empty. Unify both backends -- make them both
ignore "iface". (Which only relaxes the libvirt backend, so it cannot
cause compatibility problems.) This lets us remove the fields. Update the
documentation as well.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1844341
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <20220504134155.11832-3-lersek@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
This was a feature that allowed you to add drives to the appliance
after launching it. It was complicated to implement, and only worked
for the libvirt backend (not "direct", which is the default backend).
It also turned out to be a bad idea. The original concept was that
appliance creation was slow, so to examine multiple guests you should
launch the handle once then hot-add the disks from each guest in turn
to manipulate them. However this is terrible from a security point of
view, especially for multi-tenant, because the drives from one guest
might compromise the appliance and thus the filesystems/drives from
subsequent guests.
It also turns out that hotplugging is very slow. Nowadays appliance
creation should be faster than hotplugging.
The main use case for this was virt-df, but virt-df no longer uses it
after we discovered the problems outlined above.
The 169.254.0.0/16 network specification (for the appliance) is currently
duplicated between the direct backend and the libvirt backend. In a
subsequent patch, we're going to need the network specification in yet
another spot; extract it now to the NETWORK_ADDRESS and NETWORK_PREFIX
macros (simply as strings).
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2034160
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <20211223103701.12702-3-lersek@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
The <qemu:commandline> trick we use for adding our virtio-net-pci device
in the libvirt backend can conflict with libvirtd's and QEMU's PCI address
assignment. Try to mitigate that by placing our device in slot 0x1e on the
root bus. In practice this could only conflict with a "dmi-to-pci-bridge"
device model, which libvirtd itself places in slot 0x1e. However, given
the XMLs we generate, and modern QEMU versions, libvirtd has no reason to
auto-add "dmi-to-pci-bridge". Refer to
<https://libvirt.org/formatdomain.html#controllers>.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2034160
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <20211223103701.12702-2-lersek@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Note this requires libvirt >= 7.1.0 which was only released in March 2021.
With an older libvirt you will see this error:
Original error from libvirt: unsupported configuration: Invalid mode attribute 'maximum' [code=67 int1=-1]
In theory we could check if this is supported by looking at the
libvirt capabilities and fall back, but this commit does not do that,
in the expectation that most people will be using the default backend
(direct) and on Fedora/RHEL we will add an explicit minimum version
dependency to the package.
qemu support has been around quite a bit longer (at least since 2017).
Fixes: commit 30f74f38bd
QEMU has a newish feature (from about 2017 / qemu 2.9) called -cpu max
which is supposed to select the best CPU, ideal for libguestfs.
After this change, on x86-64:
KVM TCG
Direct -cpu max -cpu max
(non-libvirt)
Libvirt <cpu mode="host-passthrough"> <cpu mode="host-model">
<model fallback="allow"/> <model fallback="allow"/>
</cpu> </cpu>
Thanks: Daniel Berrangé
Appliance device names are not reliable since the kernel no longer
enumerates virtio-scsi devices serially. Instead get the UUID of the
appliance and pass this as the parameter.
Note this requires supermin >= 5.1.18 (from around July 2017).
Nowadays there are hard drives and operating systems which support
"4K native" sector size. In this mode physical and logical block size
exposed to the operating system is equal to 4096 bytes.
GPT partition table (as a known example) being created in this mode will
place GPT header at LBA1 which is 4096 bytes. libguetfs is unable to
recognize partition table on such physical block devices or disk images.
The reason is that libguestfs appliance will look for a GPT header at
LBA1 which is seen at 512 byte offset.
In order to fix the issue we need a way to provide correct logical block
size for attached disks. Fortunately QEMU and libvirt already provides
a way to specify physical/logical block size per disk basis.
After discussion in a mailing list we agreed that physical block size is
rarely used and is not so important. Thus both physical and logical
block size will be set to the same value.
In this patch one more optional parameter 'blocksize' is added
to add_drive_opts API method. Valid values are 512 and 4096.
add_drive_scratch has the same optional parameter for a consistency and
testing purpose.
add-domain and add_libvirt_dom will pass logical_block_size value from
libvirt XML to add_drive_opts method.
Enhance the UEFI firmware lookup function with the information on the
libvirt firmware autoselection, allowing it to return a value to use for
the appliance.
At the moment no firmware is selected this way, so there is no behaviour
change.
Previously, is_custom_hv() used to compare the QEMU executable found
during configure to the hypervisor set to check whether it is a custom
one; however, the QEMU found at configure time can be different than
what libvirt was configured with.
This fixes the libvirt backend when libguestfs is configured with a
different QEMU, that now will be specified as emulator overriding the
libvirt one.