Message ID | 1465932382-28645-3-git-send-email-joel.holdsworth@vcatechnology.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Le 14/06/2016 à 21:26, Joel Holdsworth a écrit : > Previously, when emulating execve(2), qemu would execute a child > instance of the emulator with the environment variables provided by > the parent process. This caused problems with qemu if any of the > variables affected the child emulator's behaviour e.g. > LD_LIBRARY_PATH. The best way to avoid that is to use a statically linked qemu. > This patch solves this issue by passing the environment variables > with '-E' arguments to the child qemu instance. The call to > execve(2) is replaced by a call to execv(2) so that the parent > emulator's environment variable state is propagated into the child. > > Any variables from the host environment that are not in the in the > execve() call are removed with a '-U' argument. Run ./scripts/checkpatch.pl on your patch... and add your Signed-off-by here. The environment is already managed in linux-user/main.c:main(), I don't understand why the qemu_execve() special case should differ from the general case. Thanks, Laurent
On 15/06/16 20:59, Laurent Vivier wrote: > > Le 14/06/2016 à 21:26, Joel Holdsworth a écrit : >> Previously, when emulating execve(2), qemu would execute a child >> instance of the emulator with the environment variables provided by >> the parent process. This caused problems with qemu if any of the >> variables affected the child emulator's behaviour e.g. >> LD_LIBRARY_PATH. > The best way to avoid that is to use a statically linked qemu. Stepping back a bit; the problem I'm trying to solve is this... There are some processes that invoke a helper process to do some work for them e.g. gstreamer's gst-plugin-scanner. Previously qemu would attempt to execute the helper executable as if it were machine-native, which won't work. These patches modify qemu so that it will (optionally) run the child process inside a child instance of qemu. My experience as a user was that it took a couple of hours of searching through strace logs to figure out what the issue was. gstreamer would just fail with a generic error about the helper. These patches are meant to make qemu do the right thing. Saying to the user that they should make a static linked build of qemu isn't very practical. Having a command line argument is a much easier solution for the user, that doesn't force them not to used shared-library builds. The distros aren't going to go for that. Moreover, LD_LIBRARY_PATH is just one example. LD_PRELOAD is another. Timezone and locale environment variables are also an issue. In an ideal world, it wouldn't even be necessary to add an argument - qemu would just do the right thing automatically. Though I'm not sure how that could be done correctly, so a command line option is a good compromise for a starting point. > >> This patch solves this issue by passing the environment variables >> with '-E' arguments to the child qemu instance. The call to >> execve(2) is replaced by a call to execv(2) so that the parent >> emulator's environment variable state is propagated into the child. >> >> Any variables from the host environment that are not in the in the >> execve() call are removed with a '-U' argument. > Run ./scripts/checkpatch.pl on your patch... > > and add your Signed-off-by here. Sure ok. > The environment is already managed in linux-user/main.c:main(), I don't > understand why the qemu_execve() special case should differ from the > general case. Maybe I've missed something, but the problem I'm trying to solve here is the issue of correctly propagating the guest environment variables into the child process. The parent guest process (running inside qemu-user) constructs a set of environment variables and passes them to execve. However, if the parent qemu-user decides to run a child instance of qemu-user it should run with the same environment variables as the parent, but with environment variables the parent-guest passed through the arguments for the child-guest. If gstreamer decides to run gst-plugin-scanner with a certain environ, that is for the child qemu guest, not the child qemu instance itself.
Le 20/06/2016 à 21:51, Joel Holdsworth a écrit : > On 15/06/16 20:59, Laurent Vivier wrote: >> >> Le 14/06/2016 à 21:26, Joel Holdsworth a écrit : >>> Previously, when emulating execve(2), qemu would execute a child >>> instance of the emulator with the environment variables provided by >>> the parent process. This caused problems with qemu if any of the >>> variables affected the child emulator's behaviour e.g. >>> LD_LIBRARY_PATH. >> The best way to avoid that is to use a statically linked qemu. > > Stepping back a bit; the problem I'm trying to solve is this... > > There are some processes that invoke a helper process to do some work > for them e.g. gstreamer's gst-plugin-scanner. Previously qemu would > attempt to execute the helper executable as if it were machine-native, > which won't work. These patches modify qemu so that it will (optionally) > run the child process inside a child instance of qemu. If the context is to use qemu to have a cross build/test environment, I like the idea, but you should use chroot/binfmt to do that. Even without the architecture change, the build/test environment must be isolated (chroot) from the host environment to know exactly what you build/test. > > My experience as a user was that it took a couple of hours of searching > through strace logs to figure out what the issue was. gstreamer would > just fail with a generic error about the helper. These patches are meant > to make qemu do the right thing. > > Saying to the user that they should make a static linked build of qemu > isn't very practical. Having a command line argument is a much easier > solution for the user, that doesn't force them not to used > shared-library builds. The distros aren't going to go for that. You can provide the RPM/DEB with the statically linked qemu. (it will have no dependencies) > Moreover, LD_LIBRARY_PATH is just one example. LD_PRELOAD is another. > Timezone and locale environment variables are also an issue. all LD_ are for the ld.so, the dynamic loader, and with a statically linked qemu, you don't use the host ld.so (see ld.so(8)). Why timezone and local environment variables are also an issue? > In an ideal world, it wouldn't even be necessary to add an argument - > qemu would just do the right thing automatically. Though I'm not sure > how that could be done correctly, so a command line option is a good > compromise for a starting point. > > >> >>> This patch solves this issue by passing the environment variables >>> with '-E' arguments to the child qemu instance. The call to >>> execve(2) is replaced by a call to execv(2) so that the parent >>> emulator's environment variable state is propagated into the child. >>> >>> Any variables from the host environment that are not in the in the >>> execve() call are removed with a '-U' argument. >> Run ./scripts/checkpatch.pl on your patch... >> >> and add your Signed-off-by here. > Sure ok. > > >> The environment is already managed in linux-user/main.c:main(), I don't >> understand why the qemu_execve() special case should differ from the >> general case. > Maybe I've missed something, but the problem I'm trying to solve here is > the issue of correctly propagating the guest environment variables into > the child process. > > The parent guest process (running inside qemu-user) constructs a set of > environment variables and passes them to execve. However, if the parent > qemu-user decides to run a child instance of qemu-user it should run > with the same environment variables as the parent, but with environment > variables the parent-guest passed through the arguments for the > child-guest. > > If gstreamer decides to run gst-plugin-scanner with a certain environ, > that is for the child qemu guest, not the child qemu instance itself. Child qemu instance should just ignore it. Thanks, Laurent
On 20/06/16 21:29, Laurent Vivier wrote: > > Le 20/06/2016 à 21:51, Joel Holdsworth a écrit : >> On 15/06/16 20:59, Laurent Vivier wrote: >>> Le 14/06/2016 à 21:26, Joel Holdsworth a écrit : >>>> Previously, when emulating execve(2), qemu would execute a child >>>> instance of the emulator with the environment variables provided by >>>> the parent process. This caused problems with qemu if any of the >>>> variables affected the child emulator's behaviour e.g. >>>> LD_LIBRARY_PATH. >>> The best way to avoid that is to use a statically linked qemu. >> Stepping back a bit; the problem I'm trying to solve is this... >> >> There are some processes that invoke a helper process to do some work >> for them e.g. gstreamer's gst-plugin-scanner. Previously qemu would >> attempt to execute the helper executable as if it were machine-native, >> which won't work. These patches modify qemu so that it will (optionally) >> run the child process inside a child instance of qemu. > If the context is to use qemu to have a cross build/test environment, I > like the idea, but you should use chroot/binfmt to do that. > > Even without the architecture change, the build/test environment must be > isolated (chroot) from the host environment to know exactly what you > build/test. I do know what we test though: (a gstreamer unit test) -> (gst-plugin-scanner). chroot+binfmt is a fine solution for testing a whole user-space, but rather overkill for just a single program. Also, chroot and binfmt require root permissions. Also the libraries have to be installed in a rootfs tree - which isn't how my use case works. >> My experience as a user was that it took a couple of hours of searching >> through strace logs to figure out what the issue was. gstreamer would >> just fail with a generic error about the helper. These patches are meant >> to make qemu do the right thing. >> >> Saying to the user that they should make a static linked build of qemu >> isn't very practical. Having a command line argument is a much easier >> solution for the user, that doesn't force them not to used >> shared-library builds. The distros aren't going to go for that. > You can provide the RPM/DEB with the statically linked qemu. > (it will have no dependencies) Users could do that, but as far as I'm concerned that isn't really satisfactory. The current behaviour was quite unexpected to me - there were no warnings, and the need to link qemu statically isn't documented anywhere. If you really believe that static linking is the best answer here, then shouldn't the shared library option be removed? Because with the shared-library build, qemu-user is somewhat "broken". But the distros won't like that because of the induced bloat. >> Moreover, LD_LIBRARY_PATH is just one example. LD_PRELOAD is another. >> Timezone and locale environment variables are also an issue. > all LD_ are for the ld.so, the dynamic loader, and with a statically > linked qemu, you don't use the host ld.so (see ld.so(8)). > > Why timezone and local environment variables are also an issue? All these environment variables affect the behaviour of qemu's glibc - these are examples of the the parent-guest being able to modify the behaviour of the child-host if the execve qemu wrapper patch is integrated. The correct way to pass the execve environ to the child qemu wrapper is through -E and -U arguments. > Child qemu instance should just ignore it. Thanks, Laurent The child qemu can't control what glibc will respond to. There are a lot of environment variables that can affect it: http://www.scratchbox.org/documentation/general/tutorials/glibcenv.html For example with LANG=, the parent-guest process might want to run the child-guest in Japanese, but the child-qemu should still run in English.
On 20 June 2016 at 22:27, Joel Holdsworth <joel.holdsworth@vcatechnology.com> wrote: > The current behaviour was quite unexpected to me - there were no warnings, > and the need to link qemu statically isn't documented anywhere. If you > really believe that static linking is the best answer here, then shouldn't > the shared library option be removed? Because with the shared-library build, > qemu-user is somewhat "broken". Shared library QEMU works fine for simple use cases ("run a gcc test case", for instance) or where the guest binary is linked statically and doesn't much care where it runs. > But the distros won't like that because of the induced bloat. I don't know of a distro which doesn't ship a statically linked QEMU offhand. Debian, Ubuntu and SUSE certainly all do. You basically need it for the chroot case, which is a really common one. [These days if your distro-in-the-chroot is multiarch you could put all the dynamic libraries for the QEMU binary in it too, but in practice being able to just copy a single binary in is much easier.] thanks -- PMM
On 20/06/16 22:40, Peter Maydell wrote: > On 20 June 2016 at 22:27, Joel Holdsworth > <joel.holdsworth@vcatechnology.com> wrote: >> The current behaviour was quite unexpected to me - there were no warnings, >> and the need to link qemu statically isn't documented anywhere. If you >> really believe that static linking is the best answer here, then shouldn't >> the shared library option be removed? Because with the shared-library build, >> qemu-user is somewhat "broken". > Shared library QEMU works fine for simple use cases ("run a gcc > test case", for instance) or where the guest binary is linked > statically and doesn't much care where it runs. > >> But the distros won't like that because of the induced bloat. > I don't know of a distro which doesn't ship a statically linked > QEMU offhand. Debian, Ubuntu and SUSE certainly all do. You > basically need it for the chroot case, which is a really common one. > [These days if your distro-in-the-chroot is multiarch you could > put all the dynamic libraries for the QEMU binary in it too, > but in practice being able to just copy a single binary in is much > easier.] > > thanks > -- PMM Fair enough. It sounds like this is the way needs to be! Even so, there is still the issue of the other glibc environment variables - see my LANG= example of the parent-guest wanting to run a child-guest Japanese, but the child-qemu should still run in English.
On 20 June 2016 at 23:15, Joel Holdsworth <joel.holdsworth@vcatechnology.com> wrote: > Even so, there is still the issue of the other glibc environment variables - > see my LANG= example of the parent-guest wanting to run a child-guest > Japanese, but the child-qemu should still run in English. That particular example is easy in that I don't think qemu user-mode cares about LANG at all :-) Needing to be root to set up a chroot is genuinely awkward, though. proot (https://github.com/proot-me/PRoot) is one approach to that which uses and wraps around a stock QEMU to do many of the things you'd need a chroot for without requiring root to set it up. (it uses the ptrace API to intercept various syscalls and fix up paths, etc.) Mostly I think the set of possible solutions QEMU has to "run a guest binary" are the way they are for historical reasons: * we have "-L sysroot" which works for simple cases where you have the dynamic libraries to hand, but which has some bad failure modes if the directory tree you point it at is large or has symlinks (including QEMU just going into an infinite loop at startup) * the chroot approach is the most flexible, and handily doesn't need much support in QEMU at all, but needing root support to set it up is a pain and this hasn't much changed in many years. I think my main concern is that if we add a third model of how usermode works (or extend one of the existing ones) that we do it with a clear idea of where we're going with that, rather than adding bits and pieces ad-hoc. [I haven't read the rest of this patchset, so this isn't a specific comment on it, just a general observation.] We could certainly stand to document things better too. thanks -- PMM
diff --git a/linux-user/syscall.c b/linux-user/syscall.c index a478f56..440986e 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -6665,9 +6665,13 @@ static abi_long qemu_execve(char *filename, char *argv[], char *envp[]) { char *i_arg = NULL, *i_name = NULL; - char **new_argp; - int argc, fd, ret, i, offset = 3; + char **qemu_argp, **argp; + int i, j; + size_t qemu_argc = 3, argc, host_envc, envpc; + int fd, ret; char *cp; + size_t def_envc = 0, undef_envc = 0; + char **def_env, **undef_env; char buf[BINPRM_BUF_SIZE]; /* normal execve case */ @@ -6675,10 +6679,12 @@ static abi_long qemu_execve(char *filename, char *argv[], return get_errno(execve(filename, argv, envp)); } - for (argc = 0; argv[argc] != NULL; argc++) { - /* nothing */ ; - } + /* count the number of arguments and environment variables */ + for (argc = 0; argv[argc]; argc++); + for (host_envc = 0; environ[host_envc]; host_envc++); + for (envpc = 0; envp[envpc]; envpc++); + /* read the file header so we can check the shebang */ fd = open(filename, O_RDONLY); if (fd == -1) { return get_errno(fd); @@ -6739,36 +6745,74 @@ static abi_long qemu_execve(char *filename, char *argv[], i_arg = cp; } - if (i_arg) { - offset = 5; - } else { - offset = 4; - } + if (i_arg) + qemu_argc += 2; + else + qemu_argc += 1; + } + + /* list environment variables to define */ + def_env = alloca((envpc + 1) * sizeof(envp[0])); + for (i = 0; i != envpc; i++) { + for (j = 0; j != host_envc; j++) + if (!strcmp(envp[i], environ[j])) + break; + if (j == host_envc) + def_env[def_envc++] = envp[i]; } - new_argp = alloca((argc + offset + 1) * sizeof(void *)); + qemu_argc += def_envc * 2; - /* Copy the original arguments with offset */ - for (i = 0; i < argc; i++) { - new_argp[i + offset] = argv[i]; + /* list environment variables to undefine */ + undef_env = alloca((host_envc + 1) * sizeof(undef_env[0])); + for (i = 0; i != host_envc; i++) { + const char *const host_env = environ[i]; + const size_t key_len = strchr(host_env, '=') - host_env; + for (j = 0; j != envpc; j++) + if (!strncmp(host_env, envp[j], key_len)) + break; + if (j == envpc) + undef_env[undef_envc++] = strndup(environ[i], key_len); } - new_argp[0] = strdup(qemu_execve_path); - new_argp[1] = strdup("-0"); - new_argp[offset] = filename; - new_argp[argc + offset] = NULL; + qemu_argc += undef_envc * 2; - if (i_name) { - new_argp[2] = i_name; - new_argp[3] = i_name; + /* allocate the argument list */ + argp = qemu_argp = alloca((qemu_argc + 1) * sizeof(void *)); - if (i_arg) { - new_argp[4] = i_arg; - } + /* set up the qemu arguments */ + *argp++ = strdup(qemu_execve_path); + + /* add arguments for the enironment variables */ + for (i = 0; i < def_envc; i++) { + *argp++ = strdup("-E"); + *argp++ = def_env[i]; + } + + for (i = 0; i < undef_envc; i++) { + *argp++ = strdup("-U"); + *argp++ = undef_env[i]; + } + + /* add the path to the executable */ + *argp++ = strdup("-0"); + if (i_name) { + *argp++ = i_name; + *argp++ = i_name; + if (i_arg) + *argp++ = i_arg; } else { - new_argp[2] = argv[0]; + *argp++ = argv[0]; } + *argp++ = filename; + + /* copy the original arguments with offset */ + for (i = 1; i < argc; i++) + *argp++ = argv[i]; + + *argp++ = NULL; + /* Although execve() is not an interruptible syscall it is * a special case where we must use the safe_syscall wrapper: * if we allow a signal to happen before we make the host @@ -6779,7 +6823,7 @@ static abi_long qemu_execve(char *filename, char *argv[], * before the execve completes and makes it the other * program's problem. */ - return get_errno(safe_execve(qemu_execve_path, new_argp, envp)); + return get_errno(safe_execve(qemu_execve_path, qemu_argp, environ)); } /* do_syscall() should always have a single exit point at the end so