Message ID | 20220208045038.2635826-1-eric.dumazet@gmail.com (mailing list archive) |
---|---|
Series | net: speedup netns dismantles |
Hello:

This series was applied to netdev/net-next.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Mon, 7 Feb 2022 20:50:27 -0800 you wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> In this series, I made network namespace deletions more scalable,
> by 4x on the little benchmark described in this cover letter.
>
> - Remove bottleneck on ipv6 addrconf, by replacing a global
>   hash table to a per netns one.
>
> [...]

Here is the summary with links:
  - [v2,net-next,01/11] ipv6/addrconf: allocate a per netns hash table
    https://git.kernel.org/netdev/net-next/c/21a216a8fc63
  - [v2,net-next,02/11] ipv6/addrconf: use one delayed work per netns
    https://git.kernel.org/netdev/net-next/c/8805d13ff1b2
  - [v2,net-next,03/11] ipv6/addrconf: switch to per netns inet6_addr_lst hash table
    https://git.kernel.org/netdev/net-next/c/e66d11722204
  - [v2,net-next,04/11] nexthop: change nexthop_net_exit() to nexthop_net_exit_batch()
    https://git.kernel.org/netdev/net-next/c/fea7b201320c
  - [v2,net-next,05/11] ipv4: add fib_net_exit_batch()
    https://git.kernel.org/netdev/net-next/c/1c6957646143
  - [v2,net-next,06/11] ipv6: change fib6_rules_net_exit() to batch mode
    https://git.kernel.org/netdev/net-next/c/ea3e91666ddd
  - [v2,net-next,07/11] ip6mr: introduce ip6mr_net_exit_batch()
    https://git.kernel.org/netdev/net-next/c/e2f736b753ec
  - [v2,net-next,08/11] ipmr: introduce ipmr_net_exit_batch()
    https://git.kernel.org/netdev/net-next/c/696e595f7075
  - [v2,net-next,09/11] can: gw: switch cangw_pernet_exit() to batch mode
    https://git.kernel.org/netdev/net-next/c/ef0de6696c38
  - [v2,net-next,10/11] bonding: switch bond_net_exit() to batch mode
    https://git.kernel.org/netdev/net-next/c/16a41634acca
  - [v2,net-next,11/11] net: remove default_device_exit()
    https://git.kernel.org/netdev/net-next/c/ee403248fa6d

You are awesome, thank you!

From: Eric Dumazet <edumazet@google.com>

In this series, I made network namespace deletions more scalable,
by 4x on the little benchmark described in this cover letter.

- Remove bottleneck on ipv6 addrconf, by replacing a global
  hash table with a per netns one.

- Rework many (struct pernet_operations)->exit() handlers to
  exit_batch() ones. This removes many rtnl acquisitions,
  and gives cleanup_net() a kind of priority over rtnl
  ownership.

Tested on a host with 24 cpus (48 HT)

Test script:

    for nr in {1..10}
    do
      (for i in {1..10000}; do unshare -n /bin/bash -c "ifconfig lo up"; done) &
    done

    wait

    for i in {1..10}
    do
      sleep 1
      echo 3 >/proc/sys/vm/drop_caches
      grep net_namespace /proc/slabinfo
    done

Before: We can see the host struggles to clean the netns, even after there are no new creations.
Memory cost is high, because each netns consumes a good amount of memory.

    time ./unshare10.sh
    net_namespace      82634  82634   3968    1    1 : tunables   24   12    8 : slabdata  82634  82634      0
    net_namespace      82634  82634   3968    1    1 : tunables   24   12    8 : slabdata  82634  82634      0
    net_namespace      82634  82634   3968    1    1 : tunables   24   12    8 : slabdata  82634  82634      0
    net_namespace      82634  82634   3968    1    1 : tunables   24   12    8 : slabdata  82634  82634      0
    net_namespace      82634  82634   3968    1    1 : tunables   24   12    8 : slabdata  82634  82634      0
    net_namespace      82634  82634   3968    1    1 : tunables   24   12    8 : slabdata  82634  82634      0
    net_namespace      82634  82634   3968    1    1 : tunables   24   12    8 : slabdata  82634  82634      0
    net_namespace      82634  82634   3968    1    1 : tunables   24   12    8 : slabdata  82634  82634      0
    net_namespace      82634  82634   3968    1    1 : tunables   24   12    8 : slabdata  82634  82634      0
    net_namespace      37214  37792   3968    1    1 : tunables   24   12    8 : slabdata  37214  37792    192

    real    6m57.766s
    user    3m37.277s
    sys     40m4.826s

After: We can see the script completes much faster, and the kernel thread
doing the cleanup_net() keeps up just fine. Memory cost is not too big.

    time ./unshare10.sh
    net_namespace       9945   9945   4096    1    1 : tunables   24   12    8 : slabdata   9945   9945      0
    net_namespace       4087   4665   4096    1    1 : tunables   24   12    8 : slabdata   4087   4665    192
    net_namespace       4082   4607   4096    1    1 : tunables   24   12    8 : slabdata   4082   4607    192
    net_namespace        234    761   4096    1    1 : tunables   24   12    8 : slabdata    234    761    192
    net_namespace        224    751   4096    1    1 : tunables   24   12    8 : slabdata    224    751    192
    net_namespace        218    745   4096    1    1 : tunables   24   12    8 : slabdata    218    745    192
    net_namespace        193    667   4096    1    1 : tunables   24   12    8 : slabdata    193    667    172
    net_namespace        167    609   4096    1    1 : tunables   24   12    8 : slabdata    167    609    152
    net_namespace        167    609   4096    1    1 : tunables   24   12    8 : slabdata    167    609    152
    net_namespace        157    609   4096    1    1 : tunables   24   12    8 : slabdata    157    609    152

    real    1m43.876s
    user    3m39.728s
    sys     7m36.342s

v2: - fix a typo on ASSER_RTNL() (kernel build bots)
    - add reviewers approvals.

Eric Dumazet (11):
  ipv6/addrconf: allocate a per netns hash table
  ipv6/addrconf: use one delayed work per netns
  ipv6/addrconf: switch to per netns inet6_addr_lst hash table
  nexthop: change nexthop_net_exit() to nexthop_net_exit_batch()
  ipv4: add fib_net_exit_batch()
  ipv6: change fib6_rules_net_exit() to batch mode
  ip6mr: introduce ip6mr_net_exit_batch()
  ipmr: introduce ipmr_net_exit_batch()
  can: gw: switch cangw_pernet_exit() to batch mode
  bonding: switch bond_net_exit() to batch mode
  net: remove default_device_exit()

 drivers/net/bonding/bond_main.c   |  27 ++++--
 drivers/net/bonding/bond_procfs.c |   1 -
 include/net/netns/ipv6.h          |   5 ++
 net/can/gw.c                      |   9 +-
 net/core/dev.c                    |  22 +++--
 net/ipv4/fib_frontend.c           |  19 +++-
 net/ipv4/ipmr.c                   |  20 +++--
 net/ipv4/nexthop.c                |  12 ++-
 net/ipv6/addrconf.c               | 139 ++++++++++++++----------------
 net/ipv6/fib6_rules.c             |  11 ++-
 net/ipv6/ip6mr.c                  |  20 +++--
 11 files changed, 172 insertions(+), 113 deletions(-)
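
To put the "memory cost" remarks into numbers, the net_namespace lines printed by the test script can be turned into a rough size estimate. The helper below is a minimal sketch and not part of the series: the name `netns_slab_mib` is hypothetical, and it assumes the usual /proc/slabinfo column order (name, active_objs, num_objs, objsize, ...). It counts only the net_namespace slab objects themselves, not the other per-netns allocations hanging off them, so it is a lower bound.

```shell
#!/bin/sh
# Hypothetical helper (not from the patch series): estimate the memory held
# by live net_namespace slab objects, reading slabinfo-style lines on stdin.
# Assumed columns: $1 = cache name, $2 = active_objs, $4 = objsize in bytes.
netns_slab_mib() {
    awk '$1 == "net_namespace" {
        # active objects times object size, truncated to whole MiB
        printf "%d\n", int($2 * $4 / (1024 * 1024))
    }'
}

# On a live system (reading /proc/slabinfo normally requires root):
#   netns_slab_mib < /proc/slabinfo
```

Fed the first sample of each run above, this gives about 312 MiB before the series (82634 objects of 3968 bytes) and about 38 MiB after (9945 objects of 4096 bytes).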