diff mbox series

tap: use close_range() when forking scripts and helper

Message ID 20240617162520.4045016-1-cleger@rivosinc.com (mailing list archive)
State New, archived
Headers show
Series tap: use close_range() when forking scripts and helper | expand

Commit Message

Clément Léger June 17, 2024, 4:25 p.m. UTC
Since commit 03e471c41d8b ("qemu_init: increase NOFILE soft limit on
POSIX"), the maximum number of file descriptors that can be opened are
raised to nofile.rlim_max. On recent debian distro, this yield a maximum
of 1073741816 file descriptors. Now, when forking to start
qemu-bridge-helper, this actually calls close() on the full possible file
descriptor range (more precisely [3 - sysconf(_SC_OPEN_MAX)]) which
takes a considerable amount of time. Use close_range() which only
requires to be called twice and factorize it in a separate function for
both call sites.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
---
 net/tap.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

Comments

Peter Maydell June 17, 2024, 4:36 p.m. UTC | #1
On Mon, 17 Jun 2024 at 17:25, Clément Léger <cleger@rivosinc.com> wrote:
>
> Since commit 03e471c41d8b ("qemu_init: increase NOFILE soft limit on
> POSIX"), the maximum number of file descriptors that can be opened are
> raised to nofile.rlim_max. On recent debian distro, this yield a maximum
> of 1073741816 file descriptors. Now, when forking to start
> qemu-bridge-helper, this actually calls close() on the full possible file
> descriptor range (more precisely [3 - sysconf(_SC_OPEN_MAX)]) which
> takes a considerable amount of time. Use close_range() which only
> requires to be called twice and factorize it in a separate function for
> both call sites.
>
> Signed-off-by: Clément Léger <cleger@rivosinc.com>
> ---
>  net/tap.c | 25 +++++++++++++------------
>  1 file changed, 13 insertions(+), 12 deletions(-)
>
> diff --git a/net/tap.c b/net/tap.c
> index 51f7aec39d..6f5bf06bb5 100644
> --- a/net/tap.c
> +++ b/net/tap.c
> @@ -385,6 +385,17 @@ static TAPState *net_tap_fd_init(NetClientState *peer,
>      return s;
>  }
>
> +static void fork_close_all_fds_except(int fd)
> +{
> +    int open_max = sysconf(_SC_OPEN_MAX);
> +
> +    if (fd > 3)
> +        close_range(3, fd - 1, 0);
> +
> +    if (fd < open_max)
> +        close_range(fd + 1, open_max, 0);
> +}

We can't rely on close_range() being available, I think --
this should be ifdeffed with CONFIG_CLOSE_RANGE, with a
fallback if it's not there.

Also, QEMU coding style requires braces on all if() statements,
even single-line ones.

thanks
-- PMM
Clément Léger June 17, 2024, 4:38 p.m. UTC | #2
On 17/06/2024 18:36, Peter Maydell wrote:
> On Mon, 17 Jun 2024 at 17:25, Clément Léger <cleger@rivosinc.com> wrote:
>>
>> Since commit 03e471c41d8b ("qemu_init: increase NOFILE soft limit on
>> POSIX"), the maximum number of file descriptors that can be opened are
>> raised to nofile.rlim_max. On recent debian distro, this yield a maximum
>> of 1073741816 file descriptors. Now, when forking to start
>> qemu-bridge-helper, this actually calls close() on the full possible file
>> descriptor range (more precisely [3 - sysconf(_SC_OPEN_MAX)]) which
>> takes a considerable amount of time. Use close_range() which only
>> requires to be called twice and factorize it in a separate function for
>> both call sites.
>>
>> Signed-off-by: Clément Léger <cleger@rivosinc.com>
>> ---
>>  net/tap.c | 25 +++++++++++++------------
>>  1 file changed, 13 insertions(+), 12 deletions(-)
>>
>> diff --git a/net/tap.c b/net/tap.c
>> index 51f7aec39d..6f5bf06bb5 100644
>> --- a/net/tap.c
>> +++ b/net/tap.c
>> @@ -385,6 +385,17 @@ static TAPState *net_tap_fd_init(NetClientState *peer,
>>      return s;
>>  }
>>
>> +static void fork_close_all_fds_except(int fd)
>> +{
>> +    int open_max = sysconf(_SC_OPEN_MAX);
>> +
>> +    if (fd > 3)
>> +        close_range(3, fd - 1, 0);
>> +
>> +    if (fd < open_max)
>> +        close_range(fd + 1, open_max, 0);
>> +}
> 
> We can't rely on close_range() being available, I think --
> this should be ifdeffed with CONFIG_CLOSE_RANGE, with a
> fallback if it's not there.

Ok, makes sense.

> 
> Also, QEMU coding style requires braces on all if() statements,
> even single-line ones.

Sorry for that, noticed it after running checkpatch.

I'll send a V2 with these fixes.

Thanks,

Clément

> 
> thanks
> -- PMM
Daniel P. Berrangé June 17, 2024, 4:39 p.m. UTC | #3
On Mon, Jun 17, 2024 at 06:25:18PM +0200, Clément Léger wrote:
> Since commit 03e471c41d8b ("qemu_init: increase NOFILE soft limit on
> POSIX"), the maximum number of file descriptors that can be opened are
> raised to nofile.rlim_max. On recent debian distro, this yield a maximum
> of 1073741816 file descriptors. Now, when forking to start
> qemu-bridge-helper, this actually calls close() on the full possible file
> descriptor range (more precisely [3 - sysconf(_SC_OPEN_MAX)]) which
> takes a considerable amount of time. Use close_range() which only
> requires to be called twice and factorize it in a separate function for
> both call sites.
> 
> Signed-off-by: Clément Léger <cleger@rivosinc.com>
> ---
>  net/tap.c | 25 +++++++++++++------------
>  1 file changed, 13 insertions(+), 12 deletions(-)
> 
> diff --git a/net/tap.c b/net/tap.c
> index 51f7aec39d..6f5bf06bb5 100644
> --- a/net/tap.c
> +++ b/net/tap.c
> @@ -385,6 +385,17 @@ static TAPState *net_tap_fd_init(NetClientState *peer,
>      return s;
>  }
>  
> +static void fork_close_all_fds_except(int fd)
> +{
> +    int open_max = sysconf(_SC_OPEN_MAX);
> +
> +    if (fd > 3)
> +        close_range(3, fd - 1, 0);
> +
> +    if (fd < open_max)
> +        close_range(fd + 1, open_max, 0);
> +}

We can't assume that 'close_range' exists on all platforms/versions that
QEMU targets.

In system/async-teardown.c there is close_all_open_fd() that has a fallback
path to iterating over /proc, which gives good fallback for Linux. That
code doesn't have to deal with non-Linux though.

I'd suggest that util/osdep.c needs to have a 'close_all_open_fd()' method
that accepts an array of FDs to skip closing of, rather than assuming we
always skip STDIO + 1 extra FD. eg

  int close_all_open_fd(int *skip, int nskip);

Could either declare that 'skip' must be sorted, or we can explicitly
run qsort() on it.

Try native close_range first. If unavailable, then on Linux try /proc,
otherwise the simple for() loop.

Then use this common helper from both tap.c and asynct-teardown.c

> +
>  static void launch_script(const char *setup_script, const char *ifname,
>                            int fd, Error **errp)
>  {
> @@ -400,13 +411,8 @@ static void launch_script(const char *setup_script, const char *ifname,
>          return;
>      }
>      if (pid == 0) {
> -        int open_max = sysconf(_SC_OPEN_MAX), i;
> +        fork_close_all_fds_except(fd);
>  
> -        for (i = 3; i < open_max; i++) {
> -            if (i != fd) {
> -                close(i);
> -            }
> -        }
>          parg = args;
>          *parg++ = (char *)setup_script;
>          *parg++ = (char *)ifname;
> @@ -490,16 +496,11 @@ static int net_bridge_run_helper(const char *helper, const char *bridge,
>          return -1;
>      }
>      if (pid == 0) {
> -        int open_max = sysconf(_SC_OPEN_MAX), i;
>          char *fd_buf = NULL;
>          char *br_buf = NULL;
>          char *helper_cmd = NULL;
>  
> -        for (i = 3; i < open_max; i++) {
> -            if (i != sv[1]) {
> -                close(i);
> -            }
> -        }
> +        fork_close_all_fds_except(sv[1]);
>  
>          fd_buf = g_strdup_printf("%s%d", "--fd=", sv[1]);
>  
> -- 
> 2.45.2
> 
> 

With regards,
Daniel
diff mbox series

Patch

diff --git a/net/tap.c b/net/tap.c
index 51f7aec39d..6f5bf06bb5 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -385,6 +385,17 @@  static TAPState *net_tap_fd_init(NetClientState *peer,
     return s;
 }
 
+static void fork_close_all_fds_except(int fd)
+{
+    int open_max = sysconf(_SC_OPEN_MAX);
+
+    if (fd > 3)
+        close_range(3, fd - 1, 0);
+
+    if (fd < open_max)
+        close_range(fd + 1, open_max, 0);
+}
+
 static void launch_script(const char *setup_script, const char *ifname,
                           int fd, Error **errp)
 {
@@ -400,13 +411,8 @@  static void launch_script(const char *setup_script, const char *ifname,
         return;
     }
     if (pid == 0) {
-        int open_max = sysconf(_SC_OPEN_MAX), i;
+        fork_close_all_fds_except(fd);
 
-        for (i = 3; i < open_max; i++) {
-            if (i != fd) {
-                close(i);
-            }
-        }
         parg = args;
         *parg++ = (char *)setup_script;
         *parg++ = (char *)ifname;
@@ -490,16 +496,11 @@  static int net_bridge_run_helper(const char *helper, const char *bridge,
         return -1;
     }
     if (pid == 0) {
-        int open_max = sysconf(_SC_OPEN_MAX), i;
         char *fd_buf = NULL;
         char *br_buf = NULL;
         char *helper_cmd = NULL;
 
-        for (i = 3; i < open_max; i++) {
-            if (i != sv[1]) {
-                close(i);
-            }
-        }
+        fork_close_all_fds_except(sv[1]);
 
         fd_buf = g_strdup_printf("%s%d", "--fd=", sv[1]);