Message ID | 20240726075502.4054284-1-cleger@rivosinc.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v5] osdep: add a qemu_close_all_open_fd() helper | expand |
Hi Clément, On 26/7/24 09:54, Clément Léger wrote: > Since commit 03e471c41d8b ("qemu_init: increase NOFILE soft limit on > POSIX"), the maximum number of file descriptors that can be opened are > raised to nofile.rlim_max. On recent debian distro, this yield a maximum > of 1073741816 file descriptors. Now, when forking to start > qemu-bridge-helper, this actually calls close() on the full possible file > descriptor range (more precisely [3 - sysconf(_SC_OPEN_MAX)]) which > takes a considerable amount of time. In order to reduce that time, > factorize existing code to close all open files descriptors in a new > qemu_close_all_open_fd() function. This function uses various methods > to close all the open file descriptors ranging from the most efficient > one to the least one. It also accepts an ordered array of file > descriptors that should not be closed since this is required by the > callers that calls it after forking. Since this function is not used > for Win32, do not implement it to force an error at link time if used. > > Signed-off-by: Clément Léger <cleger@rivosinc.com> > Reviewed-by: Richard Henderson <richard.henderson@linaro.org> > --- > v5: > - Move qemu_close_all_open_fd() to oslib-posix.c since it does not > compile on windows and is not even used on it. > - v4: https://lore.kernel.org/qemu-devel/20240717124534.1200735-1-cleger@rivosinc.com/ > > v4: > - Add a comment saying that qemu_close_all_open_fd() can take a NULL skip > array and nskip == 0 > - Added an assert in qemu_close_all_open_fd() to check for skip/nskip > parameters > - Fix spurious tabs instead of spaces > - Applied checkpatch > - v3: https://lore.kernel.org/qemu-devel/20240716144006.6571-1-cleger@rivosinc.com/ > > v3: > - Use STD*_FILENO defines instead of raw values > - Fix indentation of close_all_fds_after_fork() > - Check for nksip in fallback code > - Check for path starting with a '.' in qemu_close_all_open_fd_proc() > - Use unsigned for cur_skip > - Move ifdefs inside close_fds functions rather than redefining them > - Remove uneeded 'if(nskip)' test > - Add comments to close_range version > - Reduce range of skip fd as we find them in > - v2: https://lore.kernel.org/qemu-devel/20240618111704.63092-1-cleger@rivosinc.com/ > > v2: > - Factorize async_teardown.c close_fds implementation as well as tap.c ones > - Apply checkpatch > - v1: https://lore.kernel.org/qemu-devel/20240617162520.4045016-1-cleger@rivosinc.com/ > > --- > include/qemu/osdep.h | 9 +++ > net/tap.c | 33 +++++----- > system/async-teardown.c | 37 +----------- > util/oslib-posix.c | 131 ++++++++++++++++++++++++++++++++++++++++ > 4 files changed, 160 insertions(+), 50 deletions(-) I'm getting this error on msys2, not sure if related: WARNING: Failed to terminate process: 1 error occurred: * failed to attach the runner process to the console of its parent process: The handle is invalid. I find your patch hard to review. Do you mind splitting as trivial changes? Something like: - Expose close_all_open_fd() renamed as qemu_close_all_open_fd() - Rework qemu_close_all_open_fd() - Factor close_all_fds_after_fork() in net/tap.c - Use qemu_close_all_open_fd() in net/tap.c Thanks, Phil.
On 29/07/2024 16:00, Philippe Mathieu-Daudé wrote: > Hi Clément, > > On 26/7/24 09:54, Clément Léger wrote: >> Since commit 03e471c41d8b ("qemu_init: increase NOFILE soft limit on >> POSIX"), the maximum number of file descriptors that can be opened are >> raised to nofile.rlim_max. On recent debian distro, this yield a maximum >> of 1073741816 file descriptors. Now, when forking to start >> qemu-bridge-helper, this actually calls close() on the full possible file >> descriptor range (more precisely [3 - sysconf(_SC_OPEN_MAX)]) which >> takes a considerable amount of time. In order to reduce that time, >> factorize existing code to close all open files descriptors in a new >> qemu_close_all_open_fd() function. This function uses various methods >> to close all the open file descriptors ranging from the most efficient >> one to the least one. It also accepts an ordered array of file >> descriptors that should not be closed since this is required by the >> callers that calls it after forking. Since this function is not used >> for Win32, do not implement it to force an error at link time if used. >> >> Signed-off-by: Clément Léger <cleger@rivosinc.com> >> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> >> --- >> v5: >> - Move qemu_close_all_open_fd() to oslib-posix.c since it does not >> compile on windows and is not even used on it. >> - v4: >> https://lore.kernel.org/qemu-devel/20240717124534.1200735-1-cleger@rivosinc.com/ >> >> v4: >> - Add a comment saying that qemu_close_all_open_fd() can take a NULL >> skip >> array and nskip == 0 >> - Added an assert in qemu_close_all_open_fd() to check for skip/nskip >> parameters >> - Fix spurious tabs instead of spaces >> - Applied checkpatch >> - v3: >> https://lore.kernel.org/qemu-devel/20240716144006.6571-1-cleger@rivosinc.com/ >> >> v3: >> - Use STD*_FILENO defines instead of raw values >> - Fix indentation of close_all_fds_after_fork() >> - Check for nksip in fallback code >> - Check for path starting with a '.' in qemu_close_all_open_fd_proc() >> - Use unsigned for cur_skip >> - Move ifdefs inside close_fds functions rather than redefining them >> - Remove uneeded 'if(nskip)' test >> - Add comments to close_range version >> - Reduce range of skip fd as we find them in >> - v2: >> https://lore.kernel.org/qemu-devel/20240618111704.63092-1-cleger@rivosinc.com/ >> >> v2: >> - Factorize async_teardown.c close_fds implementation as well as >> tap.c ones >> - Apply checkpatch >> - v1: >> https://lore.kernel.org/qemu-devel/20240617162520.4045016-1-cleger@rivosinc.com/ >> >> --- >> include/qemu/osdep.h | 9 +++ >> net/tap.c | 33 +++++----- >> system/async-teardown.c | 37 +----------- >> util/oslib-posix.c | 131 ++++++++++++++++++++++++++++++++++++++++ >> 4 files changed, 160 insertions(+), 50 deletions(-) > > I'm getting this error on msys2, not sure if related: > > WARNING: Failed to terminate process: 1 error occurred: > * failed to attach the runner process to the console of its parent > process: The handle is invalid. > > I find your patch hard to review. Do you mind splitting as trivial > changes? Something like: > > - Expose close_all_open_fd() renamed as qemu_close_all_open_fd() > - Rework qemu_close_all_open_fd() > - Factor close_all_fds_after_fork() in net/tap.c > - Use qemu_close_all_open_fd() in net/tap.c Yes sure, I'll do that. Clément > > Thanks, > > Phil.
On 7/30/24 00:20, Clément Léger wrote: > > > On 29/07/2024 16:00, Philippe Mathieu-Daudé wrote: >> Hi Clément, >> >> On 26/7/24 09:54, Clément Léger wrote: >>> Since commit 03e471c41d8b ("qemu_init: increase NOFILE soft limit on >>> POSIX"), the maximum number of file descriptors that can be opened are >>> raised to nofile.rlim_max. On recent debian distro, this yield a maximum >>> of 1073741816 file descriptors. Now, when forking to start >>> qemu-bridge-helper, this actually calls close() on the full possible file >>> descriptor range (more precisely [3 - sysconf(_SC_OPEN_MAX)]) which >>> takes a considerable amount of time. In order to reduce that time, >>> factorize existing code to close all open files descriptors in a new >>> qemu_close_all_open_fd() function. This function uses various methods >>> to close all the open file descriptors ranging from the most efficient >>> one to the least one. It also accepts an ordered array of file >>> descriptors that should not be closed since this is required by the >>> callers that calls it after forking. Since this function is not used >>> for Win32, do not implement it to force an error at link time if used. >>> >>> Signed-off-by: Clément Léger <cleger@rivosinc.com> >>> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> >>> --- >>> v5: >>> - Move qemu_close_all_open_fd() to oslib-posix.c since it does not >>> compile on windows and is not even used on it. >>> - v4: >>> https://lore.kernel.org/qemu-devel/20240717124534.1200735-1-cleger@rivosinc.com/ >>> >>> v4: >>> - Add a comment saying that qemu_close_all_open_fd() can take a NULL >>> skip >>> array and nskip == 0 >>> - Added an assert in qemu_close_all_open_fd() to check for skip/nskip >>> parameters >>> - Fix spurious tabs instead of spaces >>> - Applied checkpatch >>> - v3: >>> https://lore.kernel.org/qemu-devel/20240716144006.6571-1-cleger@rivosinc.com/ >>> >>> v3: >>> - Use STD*_FILENO defines instead of raw values >>> - Fix indentation of close_all_fds_after_fork() >>> - Check for nksip in fallback code >>> - Check for path starting with a '.' in qemu_close_all_open_fd_proc() >>> - Use unsigned for cur_skip >>> - Move ifdefs inside close_fds functions rather than redefining them >>> - Remove uneeded 'if(nskip)' test >>> - Add comments to close_range version >>> - Reduce range of skip fd as we find them in >>> - v2: >>> https://lore.kernel.org/qemu-devel/20240618111704.63092-1-cleger@rivosinc.com/ >>> >>> v2: >>> - Factorize async_teardown.c close_fds implementation as well as >>> tap.c ones >>> - Apply checkpatch >>> - v1: >>> https://lore.kernel.org/qemu-devel/20240617162520.4045016-1-cleger@rivosinc.com/ >>> >>> --- >>> include/qemu/osdep.h | 9 +++ >>> net/tap.c | 33 +++++----- >>> system/async-teardown.c | 37 +----------- >>> util/oslib-posix.c | 131 ++++++++++++++++++++++++++++++++++++++++ >>> 4 files changed, 160 insertions(+), 50 deletions(-) >> >> I'm getting this error on msys2, not sure if related: >> >> WARNING: Failed to terminate process: 1 error occurred: >> * failed to attach the runner process to the console of its parent >> process: The handle is invalid. >> >> I find your patch hard to review. Do you mind splitting as trivial >> changes? Something like: >> >> - Expose close_all_open_fd() renamed as qemu_close_all_open_fd() >> - Rework qemu_close_all_open_fd() >> - Factor close_all_fds_after_fork() in net/tap.c >> - Use qemu_close_all_open_fd() in net/tap.c > > Yes sure, I'll do that. If you're making updates, I think you should drop the linux ifdef for /proc/self/fd. This is also present on Solaris. Importantly, it'll compile on all POSIX systems, whether or not /proc is available. r~
On 29/07/2024 23:32, Richard Henderson wrote: > On 7/30/24 00:20, Clément Léger wrote: >> >> >> On 29/07/2024 16:00, Philippe Mathieu-Daudé wrote: >>> Hi Clément, >>> >>> On 26/7/24 09:54, Clément Léger wrote: >>>> Since commit 03e471c41d8b ("qemu_init: increase NOFILE soft limit on >>>> POSIX"), the maximum number of file descriptors that can be opened are >>>> raised to nofile.rlim_max. On recent debian distro, this yield a >>>> maximum >>>> of 1073741816 file descriptors. Now, when forking to start >>>> qemu-bridge-helper, this actually calls close() on the full possible >>>> file >>>> descriptor range (more precisely [3 - sysconf(_SC_OPEN_MAX)]) which >>>> takes a considerable amount of time. In order to reduce that time, >>>> factorize existing code to close all open files descriptors in a new >>>> qemu_close_all_open_fd() function. This function uses various methods >>>> to close all the open file descriptors ranging from the most efficient >>>> one to the least one. It also accepts an ordered array of file >>>> descriptors that should not be closed since this is required by the >>>> callers that calls it after forking. Since this function is not used >>>> for Win32, do not implement it to force an error at link time if used. >>>> >>>> Signed-off-by: Clément Léger <cleger@rivosinc.com> >>>> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> >>>> --- >>>> v5: >>>> - Move qemu_close_all_open_fd() to oslib-posix.c since it does not >>>> compile on windows and is not even used on it. >>>> - v4: >>>> https://lore.kernel.org/qemu-devel/20240717124534.1200735-1-cleger@rivosinc.com/ >>>> >>>> v4: >>>> - Add a comment saying that qemu_close_all_open_fd() can take a NULL >>>> skip >>>> array and nskip == 0 >>>> - Added an assert in qemu_close_all_open_fd() to check for >>>> skip/nskip >>>> parameters >>>> - Fix spurious tabs instead of spaces >>>> - Applied checkpatch >>>> - v3: >>>> https://lore.kernel.org/qemu-devel/20240716144006.6571-1-cleger@rivosinc.com/ >>>> >>>> v3: >>>> - Use STD*_FILENO defines instead of raw values >>>> - Fix indentation of close_all_fds_after_fork() >>>> - Check for nksip in fallback code >>>> - Check for path starting with a '.' in >>>> qemu_close_all_open_fd_proc() >>>> - Use unsigned for cur_skip >>>> - Move ifdefs inside close_fds functions rather than redefining them >>>> - Remove uneeded 'if(nskip)' test >>>> - Add comments to close_range version >>>> - Reduce range of skip fd as we find them in >>>> - v2: >>>> https://lore.kernel.org/qemu-devel/20240618111704.63092-1-cleger@rivosinc.com/ >>>> >>>> v2: >>>> - Factorize async_teardown.c close_fds implementation as well as >>>> tap.c ones >>>> - Apply checkpatch >>>> - v1: >>>> https://lore.kernel.org/qemu-devel/20240617162520.4045016-1-cleger@rivosinc.com/ >>>> >>>> --- >>>> include/qemu/osdep.h | 9 +++ >>>> net/tap.c | 33 +++++----- >>>> system/async-teardown.c | 37 +----------- >>>> util/oslib-posix.c | 131 >>>> ++++++++++++++++++++++++++++++++++++++++ >>>> 4 files changed, 160 insertions(+), 50 deletions(-) >>> >>> I'm getting this error on msys2, not sure if related: >>> >>> WARNING: Failed to terminate process: 1 error occurred: >>> * failed to attach the runner process to the console of its parent >>> process: The handle is invalid. >>> >>> I find your patch hard to review. Do you mind splitting as trivial >>> changes? Something like: >>> >>> - Expose close_all_open_fd() renamed as qemu_close_all_open_fd() >>> - Rework qemu_close_all_open_fd() >>> - Factor close_all_fds_after_fork() in net/tap.c >>> - Use qemu_close_all_open_fd() in net/tap.c >> >> Yes sure, I'll do that. > > If you're making updates, I think you should drop the linux ifdef for > /proc/self/fd. This is also present on Solaris. Importantly, it'll > compile on all POSIX systems, whether or not /proc is available. Acked Thanks, Clément > > > r~
diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h index 191916f38e..ae98c12810 100644 --- a/include/qemu/osdep.h +++ b/include/qemu/osdep.h @@ -757,6 +757,15 @@ static inline void qemu_reset_optind(void) int qemu_fdatasync(int fd); +/** + * Close all open file descriptors except the ones supplied in the @skip array + * + * @skip: ordered array of distinct file descriptors that should not be closed + * if any, or NULL. + * @nskip: number of entries in the @skip array or 0 if @skip is NULL. + */ +void qemu_close_all_open_fd(const int *skip, unsigned int nskip); + /** * Sync changes made to the memory mapped file back to the backing * storage. For POSIX compliant systems this will fallback diff --git a/net/tap.c b/net/tap.c index 51f7aec39d..45cb963142 100644 --- a/net/tap.c +++ b/net/tap.c @@ -385,6 +385,23 @@ static TAPState *net_tap_fd_init(NetClientState *peer, return s; } +static void close_all_fds_after_fork(int excluded_fd) +{ + const int skip_fd[] = {STDIN_FILENO, STDOUT_FILENO, STDERR_FILENO, + excluded_fd}; + unsigned int nskip = ARRAY_SIZE(skip_fd); + + /* + * skip_fd must be an ordered array of distinct fds, exclude + * excluded_fd if already included in the [STDIN_FILENO - STDERR_FILENO] + * range + */ + if (excluded_fd <= STDERR_FILENO) { + nskip--; + } + qemu_close_all_open_fd(skip_fd, nskip); +} + static void launch_script(const char *setup_script, const char *ifname, int fd, Error **errp) { @@ -400,13 +417,7 @@ static void launch_script(const char *setup_script, const char *ifname, return; } if (pid == 0) { - int open_max = sysconf(_SC_OPEN_MAX), i; - - for (i = 3; i < open_max; i++) { - if (i != fd) { - close(i); - } - } + close_all_fds_after_fork(fd); parg = args; *parg++ = (char *)setup_script; *parg++ = (char *)ifname; @@ -490,17 +501,11 @@ static int net_bridge_run_helper(const char *helper, const char *bridge, return -1; } if (pid == 0) { - int open_max = sysconf(_SC_OPEN_MAX), i; char *fd_buf = NULL; char *br_buf = NULL; char *helper_cmd = NULL; - for (i = 3; i < open_max; i++) { - if (i != sv[1]) { - close(i); - } - } - + close_all_fds_after_fork(sv[1]); fd_buf = g_strdup_printf("%s%d", "--fd=", sv[1]); if (strrchr(helper, ' ') || strrchr(helper, '\t')) { diff --git a/system/async-teardown.c b/system/async-teardown.c index 396963c091..9148ee8d04 100644 --- a/system/async-teardown.c +++ b/system/async-teardown.c @@ -26,40 +26,6 @@ static pid_t the_ppid; -/* - * Close all open file descriptors. - */ -static void close_all_open_fd(void) -{ - struct dirent *de; - int fd, dfd; - DIR *dir; - -#ifdef CONFIG_CLOSE_RANGE - int r = close_range(0, ~0U, 0); - if (!r) { - /* Success, no need to try other ways. */ - return; - } -#endif - - dir = opendir("/proc/self/fd"); - if (!dir) { - /* If /proc is not mounted, there is nothing that can be done. */ - return; - } - /* Avoid closing the directory. */ - dfd = dirfd(dir); - - for (de = readdir(dir); de; de = readdir(dir)) { - fd = atoi(de->d_name); - if (fd != dfd) { - close(fd); - } - } - closedir(dir); -} - static void hup_handler(int signal) { /* Check every second if this process has been reparented. */ @@ -85,9 +51,8 @@ static int async_teardown_fn(void *arg) /* * Close all file descriptors that might have been inherited from the * main qemu process when doing clone, needed to make libvirt happy. - * Not using close_range for increased compatibility with older kernels. */ - close_all_open_fd(); + qemu_close_all_open_fd(NULL, 0); /* Set up a handler for SIGHUP and unblock SIGHUP. */ sigaction(SIGHUP, &sa, NULL); diff --git a/util/oslib-posix.c b/util/oslib-posix.c index e76441695b..c632cb90a2 100644 --- a/util/oslib-posix.c +++ b/util/oslib-posix.c @@ -807,3 +807,134 @@ int qemu_msync(void *addr, size_t length, int fd) return msync(addr, length, MS_SYNC); } + +static bool qemu_close_all_open_fd_proc(const int *skip, unsigned int nskip) +{ +#ifdef CONFIG_LINUX + struct dirent *de; + int fd, dfd; + bool close_fd; + DIR *dir; + unsigned int i, skip_start = 0, skip_end = nskip; + + dir = opendir("/proc/self/fd"); + if (!dir) { + /* If /proc is not mounted, there is nothing that can be done. */ + return false; + } + /* Avoid closing the directory. */ + dfd = dirfd(dir); + + for (de = readdir(dir); de; de = readdir(dir)) { + if (de->d_name[0] == '.') { + continue; + } + fd = atoi(de->d_name); + close_fd = true; + if (fd == dfd) { + close_fd = false; + } else { + for (i = skip_start; i < skip_end; i++) { + if (fd < skip[i]) { + /* We are below the next skipped fd, break */ + break; + } else if (fd == skip[i]) { + close_fd = false; + /* Restrict the range as we found fds matching start/end */ + if (i == skip_start) { + skip_start++; + } else if (i == skip_end) { + skip_end--; + } + break; + } + } + } + if (close_fd) { + close(fd); + } + } + closedir(dir); + + return true; +#else + return false; +#endif +} + +static bool qemu_close_all_open_fd_close_range(const int *skip, + unsigned int nskip) +{ +#ifdef CONFIG_CLOSE_RANGE + int max_fd = sysconf(_SC_OPEN_MAX) - 1; + int first = 0, last = max_fd; + unsigned int cur_skip = 0; + int ret; + + do { + /* Find the start boundary of the range to close */ + while (cur_skip < nskip && first == skip[cur_skip]) { + cur_skip++; + first++; + } + + /* Find the upper boundary of the range to close */ + if (cur_skip < nskip) { + last = skip[cur_skip] - 1; + } + /* + * Adjust the maximum fd to close if it's above what the system + * supports + */ + if (last > max_fd) { + last = max_fd; + /* + * We can directly skip the remaining skip fds since the current + * one is already above the maximum supported one. + */ + cur_skip = nskip; + } + /* If last was adjusted, we might be > first, bail out */ + if (first > last) { + break; + } + + ret = close_range(first, last, 0); + if (ret < 0) { + return false; + } + first = last + 1; + last = max_fd; + } while (cur_skip < nskip); + + return true; +#else + return false; +#endif +} + +void qemu_close_all_open_fd(const int *skip, unsigned int nskip) +{ + int open_max = sysconf(_SC_OPEN_MAX); + unsigned int cur_skip = 0; + int i; + + assert(skip != NULL || nskip == 0); + + if (qemu_close_all_open_fd_close_range(skip, nskip)) { + return; + } + + if (qemu_close_all_open_fd_proc(skip, nskip)) { + return; + } + + /* Fallback */ + for (i = 0; i < open_max; i++) { + if (cur_skip < nskip && i == skip[cur_skip]) { + cur_skip++; + continue; + } + close(i); + } +}