From 79d7fc867458ca290d446cdd6b478bb5de74875f Mon Sep 17 00:00:00 2001
From: Laszlo Ersek
Date: Thu, 2 Sep 2021 15:51:23 +0200
Subject: [PATCH] tests/mount-local: exit child immediately when exec fails

Each worker thread of "test-parallel-mount-local" performs the following
steps (among others):

(1) it starts an appliance dedicated to that thread, using a private
    scratch disk image,

(2) exports a dedicated FUSE mount point on the host, exposing the file
    system on the appliance's disk,

(3) launches a child process for manipulating the particular FUSE mount
    point on the host,

(4) enters a FUSE request processing loop, translating requests between
    the host kernel (coming in via the FUSE mount point) and the
    appliance.

Items to note:

- The child process from step (3) consists of a single thread of execution
  (see fork() in POSIX): a duplicate of the parent process's respective
  worker thread.

- The child process from step (3) blocks on any FUSE mount point access on
  the host until the worker thread in the parent process starts processing
  FUSE requests, in step (4).

- The FUSE request processing in step (4), in the worker thread living in
  the parent process, terminates if and only if the child process unmounts
  the FUSE mount point originating from (2).

Should the exec call in step (3) fail for any reason, the child currently
jumps to the "error" label. This is wrong: under the error label, we call
guestfs_close() on the appliance -- but the appliance is owned by the
parent process's worker thread, not the child. What happens is that the
child kills off the appliance while the parent's worker thread is in the
FUSE request processing loop (4).

The "error" label was never meant to be reached by the child process -- if
exec fails for any reason, exit the child immediately. The parent will
remain in the FUSE request processing loop (4) forever, but no state will
be corrupted. For example, using another (interactive) session on the
host, the FUSE mount points can be interacted with, and if all of them are
manually unmounted, the FUSE request processing (4) completes in every
worker thread.

This patch does not fix the primary issue with
"test-parallel-mount-local", but removes "chaos" from the symptoms. The
next patch will fix the actual regression in this test case.

Signed-off-by: Laszlo Ersek
Message-Id: <20210902135124.15191-2-lersek@redhat.com>
Acked-by: Richard W.M. Jones
---
 tests/mount-local/test-parallel-mount-local.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/mount-local/test-parallel-mount-local.c b/tests/mount-local/test-parallel-mount-local.c
index d3db6914f..5f00e328a 100644
--- a/tests/mount-local/test-parallel-mount-local.c
+++ b/tests/mount-local/test-parallel-mount-local.c
@@ -223,7 +223,7 @@ start_thread (void *statevp)
       execlp ("./test-parallel-mount-local",
               "test-parallel-mount-local", "--test", state->mp, NULL);
       perror ("execlp");
-      goto error;
+      _exit (EXIT_FAILURE);
     }
 
     /* Run the FUSE main loop. We don't really want to see libguestfs