[v2,net-next,00/11] net: speedup netns dismantles

Message ID 20220208045038.2635826-1-eric.dumazet@gmail.com (mailing list archive)

Eric Dumazet Feb. 8, 2022, 4:50 a.m. UTC
From: Eric Dumazet <edumazet@google.com>

In this series, I made network namespace deletions more scalable,
by 4x on the little benchmark described in this cover letter.

- Remove a bottleneck in ipv6 addrconf, by replacing a global
  hash table with a per netns one.

- Convert many (struct pernet_operations)->exit() handlers into
  exit_batch() ones. This removes many rtnl acquisitions,
  and gives cleanup_net() a kind of priority over rtnl
  ownership.

Tested on a host with 24 CPUs (48 hyperthreads).

Test script:

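# Start 10 background workers; each creates (and immediately drops)
# 10000 network namespaces, bringing up lo in every one.
for nr in {1..10}
do
  (for i in {1..10000}; do unshare -n /bin/bash -c "ifconfig lo up"; done) &
done
wait

# No new netns is created from here on. Sample the net_namespace slab
# once per second for 10 seconds to watch the cleanup progress.
for i in {1..10}
do
  sleep 1
  echo 3 >/proc/sys/vm/drop_caches
  grep net_namespace /proc/slabinfo
done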

Before: the host struggles to clean up the netns, even once no new ones are being created.
Memory cost is high, because each netns consumes a good amount of memory.

time ./unshare10.sh
net_namespace      82634  82634   3968    1    1 : tunables   24   12    8 : slabdata  82634  82634      0
net_namespace      82634  82634   3968    1    1 : tunables   24   12    8 : slabdata  82634  82634      0
net_namespace      82634  82634   3968    1    1 : tunables   24   12    8 : slabdata  82634  82634      0
net_namespace      82634  82634   3968    1    1 : tunables   24   12    8 : slabdata  82634  82634      0
net_namespace      82634  82634   3968    1    1 : tunables   24   12    8 : slabdata  82634  82634      0
net_namespace      82634  82634   3968    1    1 : tunables   24   12    8 : slabdata  82634  82634      0
net_namespace      82634  82634   3968    1    1 : tunables   24   12    8 : slabdata  82634  82634      0
net_namespace      82634  82634   3968    1    1 : tunables   24   12    8 : slabdata  82634  82634      0
net_namespace      82634  82634   3968    1    1 : tunables   24   12    8 : slabdata  82634  82634      0
net_namespace      37214  37792   3968    1    1 : tunables   24   12    8 : slabdata  37214  37792    192

real	6m57.766s
user	3m37.277s
sys	40m4.826s

After: the script completes much faster, and
the kernel thread running cleanup_net() keeps up just fine.
Memory cost stays modest.

time ./unshare10.sh
net_namespace       9945   9945   4096    1    1 : tunables   24   12    8 : slabdata   9945   9945      0
net_namespace       4087   4665   4096    1    1 : tunables   24   12    8 : slabdata   4087   4665    192
net_namespace       4082   4607   4096    1    1 : tunables   24   12    8 : slabdata   4082   4607    192
net_namespace        234    761   4096    1    1 : tunables   24   12    8 : slabdata    234    761    192
net_namespace        224    751   4096    1    1 : tunables   24   12    8 : slabdata    224    751    192
net_namespace        218    745   4096    1    1 : tunables   24   12    8 : slabdata    218    745    192
net_namespace        193    667   4096    1    1 : tunables   24   12    8 : slabdata    193    667    172
net_namespace        167    609   4096    1    1 : tunables   24   12    8 : slabdata    167    609    152
net_namespace        167    609   4096    1    1 : tunables   24   12    8 : slabdata    167    609    152
net_namespace        157    609   4096    1    1 : tunables   24   12    8 : slabdata    157    609    152

real	1m43.876s
user	3m39.728s
sys	7m36.342s


v2: - fix a typo on ASSER_RTNL() (reported by kernel build bots)
    - add reviewers' approvals.

Eric Dumazet (11):
  ipv6/addrconf: allocate a per netns hash table
  ipv6/addrconf: use one delayed work per netns
  ipv6/addrconf: switch to per netns inet6_addr_lst hash table
  nexthop: change nexthop_net_exit() to nexthop_net_exit_batch()
  ipv4: add fib_net_exit_batch()
  ipv6: change fib6_rules_net_exit() to batch mode
  ip6mr: introduce ip6mr_net_exit_batch()
  ipmr: introduce ipmr_net_exit_batch()
  can: gw: switch cangw_pernet_exit() to batch mode
  bonding: switch bond_net_exit() to batch mode
  net: remove default_device_exit()

 drivers/net/bonding/bond_main.c   |  27 ++++--
 drivers/net/bonding/bond_procfs.c |   1 -
 include/net/netns/ipv6.h          |   5 ++
 net/can/gw.c                      |   9 +-
 net/core/dev.c                    |  22 +++--
 net/ipv4/fib_frontend.c           |  19 +++-
 net/ipv4/ipmr.c                   |  20 +++--
 net/ipv4/nexthop.c                |  12 ++-
 net/ipv6/addrconf.c               | 139 ++++++++++++++----------------
 net/ipv6/fib6_rules.c             |  11 ++-
 net/ipv6/ip6mr.c                  |  20 +++--
 11 files changed, 172 insertions(+), 113 deletions(-)

Comments

patchwork-bot+netdevbpf@kernel.org Feb. 9, 2022, 5:20 a.m. UTC | #1
Hello:

This series was applied to netdev/net-next.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Mon,  7 Feb 2022 20:50:27 -0800 you wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> In this series, I made network namespace deletions more scalable,
> by 4x on the little benchmark described in this cover letter.
> 
> - Remove bottleneck on ipv6 addrconf, by replacing a global
>   hash table to a per netns one.
> 
> [...]

Here is the summary with links:
  - [v2,net-next,01/11] ipv6/addrconf: allocate a per netns hash table
    https://git.kernel.org/netdev/net-next/c/21a216a8fc63
  - [v2,net-next,02/11] ipv6/addrconf: use one delayed work per netns
    https://git.kernel.org/netdev/net-next/c/8805d13ff1b2
  - [v2,net-next,03/11] ipv6/addrconf: switch to per netns inet6_addr_lst hash table
    https://git.kernel.org/netdev/net-next/c/e66d11722204
  - [v2,net-next,04/11] nexthop: change nexthop_net_exit() to nexthop_net_exit_batch()
    https://git.kernel.org/netdev/net-next/c/fea7b201320c
  - [v2,net-next,05/11] ipv4: add fib_net_exit_batch()
    https://git.kernel.org/netdev/net-next/c/1c6957646143
  - [v2,net-next,06/11] ipv6: change fib6_rules_net_exit() to batch mode
    https://git.kernel.org/netdev/net-next/c/ea3e91666ddd
  - [v2,net-next,07/11] ip6mr: introduce ip6mr_net_exit_batch()
    https://git.kernel.org/netdev/net-next/c/e2f736b753ec
  - [v2,net-next,08/11] ipmr: introduce ipmr_net_exit_batch()
    https://git.kernel.org/netdev/net-next/c/696e595f7075
  - [v2,net-next,09/11] can: gw: switch cangw_pernet_exit() to batch mode
    https://git.kernel.org/netdev/net-next/c/ef0de6696c38
  - [v2,net-next,10/11] bonding: switch bond_net_exit() to batch mode
    https://git.kernel.org/netdev/net-next/c/16a41634acca
  - [v2,net-next,11/11] net: remove default_device_exit()
    https://git.kernel.org/netdev/net-next/c/ee403248fa6d

You are awesome, thank you!