This change was done almost entirely automatically using the script
below. This uses the OCaml lexer to read the source files and extract
the strings and locations. Strings which are "candidates" (in this
case, longer than 3 lines) are replaced in the output with quoted
string literals.
Since the OCaml lexer is used, it already substitutes all escape
sequences correctly. I diffed the output of the generator and it is
identical after this change, except for UUIDs, which change because of
how Utils.stable_uuid is implemented.
Thanks: Nicolas Ojeda Bar
$ ocamlfind opt -package unix,compiler-libs.common find_strings.ml \
-o find_strings.opt -linkpkg
$ for f in $( git ls-files -- \*.ml ) ; do ./find_strings.opt $f ; done
open Printf
let read_whole_file path =
let buf = Buffer.create 16384 in
let chan = open_in path in
let maxlen = 16384 in
let b = Bytes.create maxlen in
let rec loop () =
let r = input chan b 0 maxlen in
if r > 0 then (
Buffer.add_substring buf (Bytes.to_string b) 0 r;
loop ()
)
in
loop ();
close_in chan;
Buffer.contents buf
let count_chars c str =
let count = ref 0 in
for i = 0 to String.length str - 1 do
if c = String.unsafe_get str i then incr count
done;
!count
let subs = ref []
let consider_string str loc =
let nr_lines = count_chars '\n' str in
if nr_lines > 3 then
subs := (str, loc) :: !subs
let () =
Lexer.init ();
let filename = Sys.argv.(1) in
let content = read_whole_file filename in
let lexbuf = Lexing.from_string content in
let rec loop () =
let token = Lexer.token lexbuf in
(match token with
| Parser.EOF -> ();
| STRING (s, loc, sopt) ->
consider_string s loc; (* sopt? *)
loop ();
| token ->
loop ();
)
in
loop ();
(* The list of subs is already reversed, which is convenient
* because we must the file substitutions in reverse order.
*)
let subs = !subs in
let new_content = ref content in
List.iter (
fun (str, loc) ->
let { Location.loc_start = { pos_cnum = p1 };
loc_end = { pos_cnum = p2 } } = loc in
let len = String.length !new_content in
let before = String.sub !new_content 0 (p1-1) in
let after = String.sub !new_content (p2+1) (len - p2 - 1) in
new_content := before ^ "{|" ^ str ^ "|}" ^ after
) subs;
let new_content = !new_content in
if content <> new_content then (
(* Update the file in place. *)
let new_filename = filename ^ ".new"
and backup_filename = filename ^ ".bak" in
let chan = open_out new_filename in
fprintf chan "%s" new_content;
close_out chan;
Unix.rename filename backup_filename;
Unix.rename new_filename filename
)
This acts just like FString except that we do reverse device name
translation on it. The only use is in the 'pvs-full' API where we
will use it (in a subsequent commit) to reverse translate the pv_name
field (a device name) before returning it from the daemon.
Compare this to the 'pvs' API which also returns a list of device
names, but using the generator's 'RStructList (RDevice,...)' return
type, where RDevice is similarly reverse translated.
Note in the library-side bindings, because the name has already been
translated in the daemon, we just treat it exactly the same as
FString. The vast majority of this patch is this mechanical change.
All recent compilers support this (except MS compilers which we don't
care about). Assume it is supported. We test it in ./configure and
hard fail if it doesn't work.
We still define HAVE_ATTRIBUTE_CLEANUP but you can now assume it is
always defined and don't have to check it.
Run this command across the source:
perl -pi.bak -e 's/(20[012][0-9])-20[12][012]/$1-2023/g' `git ls-files`
and remove changes to po{,-docs}/*.po{,t} (these will be regenerated
later when we run 'make dist').
This gnulib feature abstracts away threads, locks and TLS, and also
allowed libguestfs to be linked with or without pthread. However
since pthread these days is part of glibc and so every program is
using pthread, and we want to get rid of gnulib as a dependency, just
use pthread directly.
Add a simple way to do not even provide prototypes of deprecated
functions in the C library: this way, users (like our tools) can build
against the library making sure to not use any deprecated function, not
even when compiler deprecation warnings are disabled.
Add it to the majority of our tools/internal libraries, and make sure
that it is not defined when building the API bridges of our bindings.
Right now, deprecated functions of the library do not trigger any
compiler deprecation warning by default; they do that only if
GUESTFS_WARN_DEPRECATED=1 is defined. However, this is not something
that seems to be done often -- at least none of the projects using the
libguestfs C API does that.
Hence, do a small behaviour change to change this on the other way
round: now deprecated functions trigger compiler deprecation warnings by
default, using GUESTFS_NO_WARN_DEPRECATED to disable this (and revert
to the previous behaviour). Even though deprecated functions will not
be removed, we really want users to migrate away from them, as they were
deprecated for good reasons.
Define GUESTFS_NO_WARN_DEPRECATED where needed:
- in all the bindings, as they bind all the functions including the
deprecated ones
- in the guestfish actions, as it exposes almost all the APIs
- in the C API test, as it runs the automated tests of all the APIs that
have them
- for two tests that explicitly test deprecated functions
We reimplemented some functions which can now be found in the OCaml
stdlib since 4.01 (or earlier). The functions I have dropped are:
- String.map
- |>
- iteri (replaced by List.iteri)
- mapi (replaced by List.mapi)
Note that our definition of iteri was slightly wrong: the type of the
function parameter was too wide, allowing (int -> 'a -> 'b) instead of
(int -> 'a -> unit).
I also added this new function to the Std_utils.String module as an
export from stdlib String:
- String.iteri
Thanks: Pino Toscano
If you have a struct containing ‘field’, eg:
type t = { field : int }
then previously to pattern-match on this type, eg. in function
parameters, you had to write:
let f { field = field } =
(* ... use field ... *)
In OCaml >= 3.12 it is possible to abbreviate cases where the field
being matched and the variable being bound have the same name, so now
you can just write:
let f { field } =
(* ... use field ... *)
(Similarly for a field prefixed by a Module name you can use
‘{ Module.field }’ instead of ‘{ Module.field = field }’).
This style is widely used inside the OCaml compiler sources, and is
briefer than the long form, so it makes sense to use it. Furthermore
there was one place in virt-dib where we are already using this new
style, so the old code did not compile on OCaml < 3.12.
See also:
https://forge.ocamlcore.org/docman/view.php/77/112/leroy-cug2010.pdf
Acquire the per-handle lock on entering each public API function.
The lock is released by a cleanup handler, so we only need to use the
ACQUIRE_LOCK_FOR_CURRENT_SCOPE macro at the top of each function.
Note this means we require __attribute__((cleanup)). On platforms
where this is not supported, the code will probably hang whenever a
libguestfs function is called.
The only definitive list of public APIs is found indirectly in the
generator (in generator/c.ml : globals).
The new module ‘Std_utils’ contains only functions which are pure
OCaml and depend only on the OCaml stdlib. Therefore these functions
may be used by the generator.
The new module is moved to ‘common/mlstdutils’.
This also removes the "<stdlib>" hack, and the code which copied the
library around.
Also ‘Guestfs_config’, ‘Libdir’ and ‘StringMap’ modules are moved
since these are essentially the same.
The bulk of this change is just updating files which use
‘open Common_utils’ to add ‘open Std_utils’ where necessary.
Previously the generator did not change any string returned from the
daemon. Thus guestfs_list_devices (for example) might return internal
device names like /dev/vda (if virtio-blk was in use).
This changes calls to the daemon so that returned strings are
annotated as plain strings, devices or mountables:
old ---> new
RString "uuid" RString (RPlainString "uuid")
RString "device" RString (RDevice "device")
RString "fs" RString (RMountable "fs")
For hash tables, keys and values must be annotated separately. For
example a hash table of mountables (keys) -> plain strings (values)
would be annotated like this:
old ---> new
RHashtable "fses" RHashtable (RMountable, RPlainString, "fses")
The daemon calls reverse_device_name_translation (currently a no-op)
for devices and mountables.
Note that this has no effect for calls which are handled on the
library side.
(cherry picked from commit 6b77cc196ecb8d7e1d73592ef65a189a7412c97c)
Previously we had lots of types like String, Device, StringList,
DeviceList, etc. where Device was just a String with magical
properties (but only inside the daemon), and DeviceList was just a
list of Device strings.
Replace these with some simple top-level types:
String
StringList
and move the magic into a subtype.
The change is mechanical, for example:
old ---> new
FileIn "filename" String (FileIn, "filename")
DeviceList "devices" StringList (Device, "devices")
Handling BufferIn is sufficiently different from a plain String
throughout all the bindings that it still uses a top-level type.
(Compare with FileIn/FileOut where the only difference is in the
protocol, but the bindings can uniformly treat it as a plain String.)
There is no semantic change, and the generated files are identical
except for a minor change in the (deprecated) Perl
%guestfs_introspection table.
Daemon 'proc_nr's have to be assigned monotonically and uniquely to
each daemon function. However in practice it can be difficult to work
out which is the next free proc_nr. Placing all of them into a single
table in a new file (proc_nr.ml) should make this easier.
Sort the functions so the output is stable.
This changes the order in which the C API tests run. Previously we
ran the newest tests first, which was useful when we were frequently
adding new APIs. Now we run them in sorted order.
Run the following command over the source:
perl -pi.bak -e 's/(20[01][0-9])-2016/$1-2017/g' `git ls-files`
(Thanks Rich for the perl snippet, as used in past years.)
For a very long time we have maintained two sets of utility functions,
in mllib/common_utils.ml and generator/utils.ml. This changes things
so that the same set of utility functions can be shared with both
directories.
It's not possible to use common_utils.ml directly in the generator
because it provides several functions that use modules outside the
OCaml stdlib. Therefore we add some lightweight post-processing which
extracts the functions using only the stdlib:
(*<stdlib>*)
...
(*</stdlib>*)
and creates generator/common_utils.ml and generator/common_utils.mli
from that. The effect is we only need to write utility functions
once.
As with other tools, we still have generator-specific utility
functions in generator/utils.ml.
Also in this change:
- Use String.uppercase_ascii and String.lowercase_ascii in place
of deprecated String.uppercase/String.lowercase.
- Implement String.capitalize_ascii to replace deprecated
String.capitalize.
- Move isspace, isdigit, isxdigit functions to Char module.
Previously we used an awkward hack to split up the large src/actions.c
into smaller files (which can benefit from being compiled in
parallel). This commit generalizes that code, so that we pass a
subsetted actions list to certain generator functions.
The output of the generator is identical after this commit and the
previous commit, except for the UUID encoded into tests/c-api/tests.c
since that is derived from the MD5 hash of generator/actions.ml.
This mostly mechanical change moves all of the libguestfs API
lists of functions into a struct in the Actions module.
It also adds filter functions to be used elsewhere to get
subsets of these functions.
Original code Replacement
all_functions actions
daemon_functions actions |> daemon_functions
non_daemon_functions actions |> non_daemon_functions
external_functions actions |> external_functions
internal_functions actions |> internal_functions
documented_functions actions |> documented_functions
fish_functions actions |> fish_functions
*_functions_sorted ... replacement as above ... |> sort
Document which feature, if any, is needed for a function; this should
help users in properly checking feature availability when using certain
functions.
Extend the generator to generate a source (and the header for it) with
functions that print the content of a guestfs struct. The code is
modelled after the code for it in fish.ml, although made a bit more
generic (destination FILE*, line separator) so it can be used also in
the library, when tracing.
This just introduces the new code and builds it as part of the helper
libutils.la.
Previously we only defined GUESTFS_HAVE_* macros for functions that
were not deprecated, test or debug functions. The reasoning for that
is given in this commit message [note that 'LIBGUESTFS_HAVE_' is the
old/deprecated macro]:
commit 2d8fd7dacd
Author: Richard Jones <rjones@redhat.com>
Date: Thu Sep 2 22:45:54 2010 +0100
Define LIBGUESTFS_HAVE_<shortname> for C API functions.
The actions each have a corresponding define, eg:
#define LIBGUESTFS_HAVE_VGUUID 1
extern char *guestfs_vguuid (guestfs_h *g, const char *vgname);
However functions which are for testing, debugging or deprecated do
not have the corresponding define. Also a few functions are so
basic (eg. guestfs_create) that there is no point defining a symbol
for them.
This wasn't in fact carried through very consistently, since when we
replaced s/LIBGUESTFS_HAVE_/GUESTFS_HAVE_/, we kept the old
LIBGUESTFS_HAVE_* macros, but defined them for every function. Oops.
This commit defines GUESTFS_HAVE_* for every function in <guestfs.h>,
making it easier to write the Python language bindings (see following
commits).
libguestfs has used double and triple underscores in identifiers.
These aren't valid for global names in C++.
The second step is to replace all guestfs__* (2 underscores) with
guestfs_impl_*.
These functions are used where a libguestfs API call is implemented on
the library side. The generator creates a wrapper function which
calls guestfs_impl_* to do the work. (Libguestfs APIs which are
passed directly by the daemon work differently and don't require a
guestfs_impl_* function).
This is an entirely mechanical change done using:
git ls-files | xargs perl -pi.bak -e 's/guestfs___/guestfs_impl_/g'
Reference: http://stackoverflow.com/a/228797
libguestfs has used double and triple underscores in identifiers.
These aren't valid for global names in C++.
The first step is to replace all guestfs___* (3 underscores) with
guestfs_int_*. We've used guestfs_int_* elsewhere already as a prefix
for internal identifiers.
This is an entirely mechanical change done using:
git ls-files | xargs perl -pi.bak -e 's/guestfs___/guestfs_int_/g'
Reference: http://stackoverflow.com/a/228797
This API already existed (as guestfs___add_libvirt_dom), and was used
by a few tools.
This commit changes it to a public API.
Note that for reasons outlined in the previous commit message, it is
impossible to call this from guestfish or from non-C language
bindings.
This implements Pointer arguments properly, at least for certain
limited definitions of "implements" and "properly".
'Pointer' as an argument type is meant to indicate a pointer passed to
an API. The canonical example is the following proposed API:
int guestfs_add_libvirt_dom (guestfs_h *g, virDomainPtr dom, ...);
where 'dom' is described in the generator as:
Pointer ("virDomainPtr", "dom")
Pointer existed already in the generator, but the implementation was
broken. It is not used by any existing API.
There are two basic difficulties of implementing Pointer:
(1) In language bindings there is no portable way to turn (eg.) a Perl
Sys::Virt 'dom' object into a C virDomainPtr.
(2) We can't rely on <libvirt/libvirt.h> being included (since it's an
optional dependency).
In this commit, we solve (2) by using a 'void *'.
We don't solve (1), really. Instead we have a macro
POINTER_NOT_IMPLEMENTED which is used by currently all the non-C
language bindings. It complains loudly and passes a NULL to the
underlying function. The underlying function detects the NULL and
safely returns an error. It is to be hoped that people will
contribute patches to make each language binding work, although in
some bindings it will always remain impossible to implement.