This change was done almost entirely automatically using the script
below. This uses the OCaml lexer to read the source files and extract
the strings and locations. Strings which are "candidates" (in this
case, longer than 3 lines) are replaced in the output with quoted
string literals.
Since the OCaml lexer is used, it already substitutes all escape
sequences correctly. I diffed the output of the generator and it is
identical after this change, except for UUIDs, which change because of
how Utils.stable_uuid is implemented.
Thanks: Nicolas Ojeda Bar
$ ocamlfind opt -package unix,compiler-libs.common find_strings.ml \
-o find_strings.opt -linkpkg
$ for f in $( git ls-files -- \*.ml ) ; do ./find_strings.opt $f ; done
open Printf
let read_whole_file path =
let buf = Buffer.create 16384 in
let chan = open_in path in
let maxlen = 16384 in
let b = Bytes.create maxlen in
let rec loop () =
let r = input chan b 0 maxlen in
if r > 0 then (
Buffer.add_substring buf (Bytes.to_string b) 0 r;
loop ()
)
in
loop ();
close_in chan;
Buffer.contents buf
let count_chars c str =
let count = ref 0 in
for i = 0 to String.length str - 1 do
if c = String.unsafe_get str i then incr count
done;
!count
let subs = ref []
let consider_string str loc =
let nr_lines = count_chars '\n' str in
if nr_lines > 3 then
subs := (str, loc) :: !subs
let () =
Lexer.init ();
let filename = Sys.argv.(1) in
let content = read_whole_file filename in
let lexbuf = Lexing.from_string content in
let rec loop () =
let token = Lexer.token lexbuf in
(match token with
| Parser.EOF -> ();
| STRING (s, loc, sopt) ->
consider_string s loc; (* sopt? *)
loop ();
| token ->
loop ();
)
in
loop ();
(* The list of subs is already reversed, which is convenient
* because we must the file substitutions in reverse order.
*)
let subs = !subs in
let new_content = ref content in
List.iter (
fun (str, loc) ->
let { Location.loc_start = { pos_cnum = p1 };
loc_end = { pos_cnum = p2 } } = loc in
let len = String.length !new_content in
let before = String.sub !new_content 0 (p1-1) in
let after = String.sub !new_content (p2+1) (len - p2 - 1) in
new_content := before ^ "{|" ^ str ^ "|}" ^ after
) subs;
let new_content = !new_content in
if content <> new_content then (
(* Update the file in place. *)
let new_filename = filename ^ ".new"
and backup_filename = filename ^ ".bak" in
let chan = open_out new_filename in
fprintf chan "%s" new_content;
close_out chan;
Unix.rename filename backup_filename;
Unix.rename new_filename filename
)
No existing OCaml functions have a StringList parameter, but we would
like to add one.
The original plan seems to have been to map these to 'string array'
types, but 'string list' is more natural, albeit marginally less
efficient. The implementation here just has to convert the 'char **'
into the OCaml linked list of values.
This pulls in the commits below, requiring us to replace all uses of
String.is_prefix and String.is_suffix.
Mostly done with Perl like this, and carefully checked by hand
afterwards since this doesn't get everything right:
$ perl -pi.bak -e 's/String.is_prefix ([^[:space:]\)]+) ([^[:space:]\)]+)/String.starts_with \2 \1/g' -- `git ls-files`
Richard W.M. Jones (3):
mlstdutils: Fix comment that still referred to the old function names
mldrivers: Link to gettext-stub if ocaml-gettext is enabled
mlstdutils: Rename String.is_prefix -> starts_with, is_suffix -> ends_with
These were previously written in very convoluted C which had to deal
with parsing the crazy output of the "lvm" command. In fact the
parsing was so complex that it was generated by the generator. It's
easier to do this in OCaml.
These are basically legacy APIs. They cannot be expanded and LVM
already supports many more fields. We should replace these with APIs
for getting single named fields from LVM.
This was implemented wrongly. In the XDR protocol, UUIDs are fixed
buffers of length 32. We can just use memcpy to copy from the OCaml
string to the UUID, but we have to ensure the string length returned
by OCaml is correct (if not we just assert, it's an internal error).
(It didn't even compile before, so we know it was never used).
This acts just like FString except that we do reverse device name
translation on it. The only use is in the 'pvs-full' API where we
will use it (in a subsequent commit) to reverse translate the pv_name
field (a device name) before returning it from the daemon.
Compare this to the 'pvs' API which also returns a list of device
names, but using the generator's 'RStructList (RDevice,...)' return
type, where RDevice is similarly reverse translated.
Note in the library-side bindings, because the name has already been
translated in the daemon, we just treat it exactly the same as
FString. The vast majority of this patch is this mechanical change.
All recent compilers support this (except MS compilers which we don't
care about). Assume it is supported. We test it in ./configure and
hard fail if it doesn't work.
We still define HAVE_ATTRIBUTE_CLEANUP but you can now assume it is
always defined and don't have to check it.
Commit d5b6f1df5f ("daemon: Allow parts of the daemon and APIs to be
written in OCaml.", 2017) contained a bug where in any OCaml function
that returns int64_t, the result was truncated to an int. This
particularly affected part_get_gpt_attributes as that returns large 64
bit numbers, but probably affects other functions too, undetected.
Fixes: commit d5b6f1df5f
Run this command across the source:
perl -pi.bak -e 's/(20[012][0-9])-20[12][012]/$1-2023/g' `git ls-files`
and remove changes to po{,-docs}/*.po{,t} (these will be regenerated
later when we run 'make dist').
In some places in the generator we were still generating "noalloc".
It was hidden from the previous regexp I used to replace these because
of string escaping.
Updates: commit a69cde79ca
This commit deprecates luks-open/luks-open-ro/luks-close for the more
generic sounding names cryptsetup-open/cryptsetup-close, which also
correspond directly to the cryptsetup commands.
The optional cryptsetup-open readonly flag is used to replace the
functionality of luks-open-ro.
The optional cryptsetup-open crypttype parameter can be used to select
the type (corresponding to cryptsetup open --type), which allows us to
open BitLocker-encrypted disks with no extra effort. As a convenience
the crypttype parameter may be omitted, and libguestfs will use a
heuristic (based on vfs-type output) to try to determine the correct
type to use.
The deprecated functions and the new functions are all (re-)written in
OCaml.
There is no new test here, unfortunately. It would be nice to test
Windows BitLocker support in this new API, however the Linux tools do
not support creating BitLocker disks, and while it is possible to
create one under Windows, the smallest compressed disk I could create
is 37M because of a mixture of the minimum support size for BitLocker
disks and the fact that encrypted parts of NTFS cannot be compressed.
Also synchronise with common module.
Use the cleanup handlers to free the structs (and list of structs) in
case their OCaml->C transformation fails for any reason; use calloc()
to not try to use uninitialized memory.
In case of lists, avoid allocating the memory for the array if there
are no elements, since the returned pointer in that case is either NULL,
or a free()-only pointer; also, set the list size only after the array
is allocated, to not confuse the XDR routines.
Add a way to generate OCaml interfaces for all the modules in the
daemon that implement APIs: this makes sure that for them the
interface of each function matches the actual API specified in the
generator.
We reimplemented some functions which can now be found in the OCaml
stdlib since 4.01 (or earlier). The functions I have dropped are:
- String.map
- |>
- iteri (replaced by List.iteri)
- mapi (replaced by List.mapi)
Note that our definition of iteri was slightly wrong: the type of the
function parameter was too wide, allowing (int -> 'a -> 'b) instead of
(int -> 'a -> unit).
I also added this new function to the Std_utils.String module as an
export from stdlib String:
- String.iteri
Thanks: Pino Toscano
If you have a struct containing ‘field’, eg:
type t = { field : int }
then previously to pattern-match on this type, eg. in function
parameters, you had to write:
let f { field = field } =
(* ... use field ... *)
In OCaml >= 3.12 it is possible to abbreviate cases where the field
being matched and the variable being bound have the same name, so now
you can just write:
let f { field } =
(* ... use field ... *)
(Similarly for a field prefixed by a Module name you can use
‘{ Module.field }’ instead of ‘{ Module.field = field }’).
This style is widely used inside the OCaml compiler sources, and is
briefer than the long form, so it makes sense to use it. Furthermore
there was one place in virt-dib where we are already using this new
style, so the old code did not compile on OCaml < 3.12.
See also:
https://forge.ocamlcore.org/docman/view.php/77/112/leroy-cug2010.pdf
This change allows parts of the daemon to be written in the OCaml
programming language. I am using the ‘Main Program in C’ method along
with ‘-output-obj’ to create an object file from the OCaml code /
runtime, as described here:
https://caml.inria.fr/pub/docs/manual-ocaml/intfc.html
Furthermore, change the generator to allow individual APIs to be
implemented in OCaml. This is picked by setting:
impl = OCaml <ocaml_function>;
The generator creates ‘do_function’ (the same one you would have to
write by hand in C), with the function calling the named
‘ocaml_function’ and dealing with marshalling/unmarshalling the OCaml
parameters.
After the previous refactoring, we are able to link the daemon to
common/utils, and also remove some of the "duplicate" functions that
the daemon carried ("duplicate" in quotes because they were often not
exact duplicates).
Also this removes the duplicate reimplementation of (most) cleanup
functions in the daemon, since those are provided by libutils now.
It also allows us in future (but not in this commit) to move utility
functions from the daemon into libutils.
The new module ‘Std_utils’ contains only functions which are pure
OCaml and depend only on the OCaml stdlib. Therefore these functions
may be used by the generator.
The new module is moved to ‘common/mlstdutils’.
This also removes the "<stdlib>" hack, and the code which copied the
library around.
Also ‘Guestfs_config’, ‘Libdir’ and ‘StringMap’ modules are moved
since these are essentially the same.
The bulk of this change is just updating files which use
‘open Common_utils’ to add ‘open Std_utils’ where necessary.
Previously the generator did not change any string returned from the
daemon. Thus guestfs_list_devices (for example) might return internal
device names like /dev/vda (if virtio-blk was in use).
This changes calls to the daemon so that returned strings are
annotated as plain strings, devices or mountables:
old ---> new
RString "uuid" RString (RPlainString "uuid")
RString "device" RString (RDevice "device")
RString "fs" RString (RMountable "fs")
For hash tables, keys and values must be annotated separately. For
example a hash table of mountables (keys) -> plain strings (values)
would be annotated like this:
old ---> new
RHashtable "fses" RHashtable (RMountable, RPlainString, "fses")
The daemon calls reverse_device_name_translation (currently a no-op)
for devices and mountables.
Note that this has no effect for calls which are handled on the
library side.
(cherry picked from commit 6b77cc196ecb8d7e1d73592ef65a189a7412c97c)
Previously we had lots of types like String, Device, StringList,
DeviceList, etc. where Device was just a String with magical
properties (but only inside the daemon), and DeviceList was just a
list of Device strings.
Replace these with some simple top-level types:
String
StringList
and move the magic into a subtype.
The change is mechanical, for example:
old ---> new
FileIn "filename" String (FileIn, "filename")
DeviceList "devices" StringList (Device, "devices")
Handling BufferIn is sufficiently different from a plain String
throughout all the bindings that it still uses a top-level type.
(Compare with FileIn/FileOut where the only difference is in the
protocol, but the bindings can uniformly treat it as a plain String.)
There is no semantic change, and the generated files are identical
except for a minor change in the (deprecated) Perl
%guestfs_introspection table.
In every instance where we used the ‘cancel_stmt’ parameter of
these macros:
ABS_PATH
NEED_ROOT
the value was only ever ‘cancel_receive ()’ or empty. We only use
‘cancel_receive’ for FileIn functions, so replace it with a simple
flag for whether the current function is a FileIn function.
This also removes some incorrect comments about macros that cannot be
used with FileIn functions when in fact they can.
As a simple consequence of the previous commit, every instance of the
‘fail_stmt’ parameter to one of the following macros:
RESOLVE_DEVICE
RESOLVE_MOUNTABLE
REQUIRE_ROOT_OR_RESOLVE_DEVICE
REQUIRE_ROOT_OR_RESOLVE_MOUNTABLE
is now ‘return’ and therefore we can replace it in the macro and drop
the parameter completely.
The generated code had:
...
done_no_free:
return;
}
There was no possible code between the label and the return statement,
so both the label and the return statement are redundant. The
instances of ‘goto done_no_free’ can simply be replaced by ‘return’.
The only small complication is that there is a label just before this
which contains some optional code. So we might have ended up with
generated code looking like:
done:
// if there was no optional code, this would be empty
}
which provokes a parse error in C. Therefore I had to add a semicolon
after the ‘done:’ label. This will be removed in a subsequent commit,
so it's just to make the current commit bisectable.
Sort the functions so the output is stable.
This changes the order in which the C API tests run. Previously we
ran the newest tests first, which was useful when we were frequently
adding new APIs. Now we run them in sorted order.
Run the following command over the source:
perl -pi.bak -e 's/(20[01][0-9])-2016/$1-2017/g' `git ls-files`
(Thanks Rich for the perl snippet, as used in past years.)
For a very long time we have maintained two sets of utility functions,
in mllib/common_utils.ml and generator/utils.ml. This changes things
so that the same set of utility functions can be shared with both
directories.
It's not possible to use common_utils.ml directly in the generator
because it provides several functions that use modules outside the
OCaml stdlib. Therefore we add some lightweight post-processing which
extracts the functions using only the stdlib:
(*<stdlib>*)
...
(*</stdlib>*)
and creates generator/common_utils.ml and generator/common_utils.mli
from that. The effect is we only need to write utility functions
once.
As with other tools, we still have generator-specific utility
functions in generator/utils.ml.
Also in this change:
- Use String.uppercase_ascii and String.lowercase_ascii in place
of deprecated String.uppercase/String.lowercase.
- Implement String.capitalize_ascii to replace deprecated
String.capitalize.
- Move isspace, isdigit, isxdigit functions to Char module.
This mostly mechanical change moves all of the libguestfs API
lists of functions into a struct in the Actions module.
It also adds filter functions to be used elsewhere to get
subsets of these functions.
Original code Replacement
all_functions actions
daemon_functions actions |> daemon_functions
non_daemon_functions actions |> non_daemon_functions
external_functions actions |> external_functions
internal_functions actions |> internal_functions
documented_functions actions |> documented_functions
fish_functions actions |> fish_functions
*_functions_sorted ... replacement as above ... |> sort
Commit b91b39e06a changed the separator to
':', although this creates parsing issues when there are fields with
colons (for example lv_tags=imgbased:pool).
Change once more the separator, but this time using a non-printable
character such as '\r': while it will produce uglier debug logs, this
should greatly reduce the possibilities of conflicts with texts of
metadata.