Message ID | 20240607151357.421181-1-petrm@nvidia.com (mailing list archive) |
---|---|
Headers | show |
Series | Allow configuration of multipath hash seed | expand |
On 6/7/24 18:13, Petr Machata wrote: > Let me just quote the commit message of patch #2 here to inform the > motivation and some of the implementation: > > When calculating hashes for the purpose of multipath forwarding, > both IPv4 and IPv6 code currently fall back on > flow_hash_from_keys(). That uses a randomly-generated seed. That's a > fine choice by default, but unfortunately some deployments may need > a tighter control over the seed used. > > In this patchset, make the seed configurable by adding a new sysctl > key, net.ipv4.fib_multipath_hash_seed to control the seed. This seed > is used specifically for multipath forwarding and not for the other > concerns that flow_hash_from_keys() is used for, such as queue > selection. Expose the knob as sysctl because other such settings, > such as headers to hash, are also handled that way. > > Despite being placed in the net.ipv4 namespace, the multipath seed > sysctl is used for both IPv4 and IPv6, similarly to e.g. a number of > TCP variables. Like those, the multipath hash seed is a per-netns > variable. > > The seed used by flow_hash_from_keys() is a 128-bit quantity. > However it seems that usually the seed is a much more modest value. > 32 bits seem typical (Cisco, Cumulus), some systems go even lower. > For that reason, and to decouple the user interface from > implementation details, go with a 32-bit quantity, which is then > quadruplicated to form the siphash key. > > One example of use of this interface is avoiding hash polarization, > where two ECMP routers, one behind the other, happen to make consistent > hashing decisions, and as a result, part of the ECMP space of the latter > router is never used. Another is a load balancer where several machines > forward traffic to one of a number of leaves, and the forwarding > decisions need to be made consistently. (This is a case of a desired > hash polarization, mentioned e.g. in chapter 6.3 of [0].) > > There has already been a proposal to include a hash seed control > interface in the past[1]. > > - Patches #1-#2 contain the substance of the work > - Patch #3 is an mlxsw offload > - Patches #4 and #5 are a selftest > > [0] https://www.usenix.org/system/files/conference/nsdi18/nsdi18-araujo.pdf > [1] https://lore.kernel.org/netdev/YIlVpYMCn%2F8WfE1P@rnd/ > > v2: > - Patch #2: > - Instead of precomputing the siphash key, construct it in place > of use thus obviating the need to use RCU. > - Instead of dispatching to the flow dissector for cases where > user seed is 0, maintain a separate random seed. Initialize it > early so that we can avoid a branch at the seed reader. > - In documentation, s/only valid/only present/ (when > CONFIG_IP_ROUTE_MULTIPATH). Also mention the algorithm is > unspecified and unstable in principle. > - Patch #3: > - Update to match changes in patch #2. > - Patch #4: > - New patch. > - Patch #5: > - Do not set seed on test init and run the stability tests first to catch > the cases of a missed pernet seed init. > > Petr Machata (5): > net: ipv4,ipv6: Pass multipath hash computation through a helper > net: ipv4: Add a sysctl to set multipath hash seed > mlxsw: spectrum_router: Apply user-defined multipath hash seed > selftests: forwarding: lib: Split sysctl_save() out of sysctl_set() > selftests: forwarding: router_mpath_hash: Add a new selftest > > Documentation/networking/ip-sysctl.rst | 14 + > .../ethernet/mellanox/mlxsw/spectrum_router.c | 6 +- > include/net/flow_dissector.h | 2 + > include/net/ip_fib.h | 28 ++ > include/net/netns/ipv4.h | 8 + > net/core/flow_dissector.c | 7 + > net/ipv4/route.c | 12 +- > net/ipv4/sysctl_net_ipv4.c | 66 ++++ > net/ipv6/route.c | 12 +- > .../testing/selftests/net/forwarding/Makefile | 1 + > tools/testing/selftests/net/forwarding/lib.sh | 9 +- > .../net/forwarding/router_mpath_seed.sh | 333 ++++++++++++++++++ > 12 files changed, 484 insertions(+), 14 deletions(-) > create mode 100755 tools/testing/selftests/net/forwarding/router_mpath_seed.sh > Looks good to me, thank you! For the set, Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
On 6/7/24 9:13 AM, Petr Machata wrote: > Let me just quote the commit message of patch #2 here to inform the > motivation and some of the implementation: > > When calculating hashes for the purpose of multipath forwarding, > both IPv4 and IPv6 code currently fall back on > flow_hash_from_keys(). That uses a randomly-generated seed. That's a > fine choice by default, but unfortunately some deployments may need > a tighter control over the seed used. > > In this patchset, make the seed configurable by adding a new sysctl > key, net.ipv4.fib_multipath_hash_seed to control the seed. This seed > is used specifically for multipath forwarding and not for the other > concerns that flow_hash_from_keys() is used for, such as queue > selection. Expose the knob as sysctl because other such settings, > such as headers to hash, are also handled that way. > > Despite being placed in the net.ipv4 namespace, the multipath seed > sysctl is used for both IPv4 and IPv6, similarly to e.g. a number of > TCP variables. Like those, the multipath hash seed is a per-netns > variable. > > The seed used by flow_hash_from_keys() is a 128-bit quantity. > However it seems that usually the seed is a much more modest value. > 32 bits seem typical (Cisco, Cumulus), some systems go even lower. > For that reason, and to decouple the user interface from > implementation details, go with a 32-bit quantity, which is then > quadruplicated to form the siphash key. > > One example of use of this interface is avoiding hash polarization, > where two ECMP routers, one behind the other, happen to make consistent > hashing decisions, and as a result, part of the ECMP space of the latter > router is never used. Another is a load balancer where several machines > forward traffic to one of a number of leaves, and the forwarding > decisions need to be made consistently. (This is a case of a desired > hash polarization, mentioned e.g. in chapter 6.3 of [0].) > > There has already been a proposal to include a hash seed control > interface in the past[1]. > > - Patches #1-#2 contain the substance of the work > - Patch #3 is an mlxsw offload > - Patches #4 and #5 are a selftest > > [0] https://www.usenix.org/system/files/conference/nsdi18/nsdi18-araujo.pdf > [1] https://lore.kernel.org/netdev/YIlVpYMCn%2F8WfE1P@rnd/ > > v2: > - Patch #2: > - Instead of precomputing the siphash key, construct it in place > of use thus obviating the need to use RCU. > - Instead of dispatching to the flow dissector for cases where > user seed is 0, maintain a separate random seed. Initialize it > early so that we can avoid a branch at the seed reader. > - In documentation, s/only valid/only present/ (when > CONFIG_IP_ROUTE_MULTIPATH). Also mention the algorithm is > unspecified and unstable in principle. > - Patch #3: > - Update to match changes in patch #2. > - Patch #4: > - New patch. > - Patch #5: > - Do not set seed on test init and run the stability tests first to catch > the cases of a missed pernet seed init. > > Petr Machata (5): > net: ipv4,ipv6: Pass multipath hash computation through a helper > net: ipv4: Add a sysctl to set multipath hash seed > mlxsw: spectrum_router: Apply user-defined multipath hash seed > selftests: forwarding: lib: Split sysctl_save() out of sysctl_set() > selftests: forwarding: router_mpath_hash: Add a new selftest > > Documentation/networking/ip-sysctl.rst | 14 + > .../ethernet/mellanox/mlxsw/spectrum_router.c | 6 +- > include/net/flow_dissector.h | 2 + > include/net/ip_fib.h | 28 ++ > include/net/netns/ipv4.h | 8 + > net/core/flow_dissector.c | 7 + > net/ipv4/route.c | 12 +- > net/ipv4/sysctl_net_ipv4.c | 66 ++++ > net/ipv6/route.c | 12 +- > .../testing/selftests/net/forwarding/Makefile | 1 + > tools/testing/selftests/net/forwarding/lib.sh | 9 +- > .../net/forwarding/router_mpath_seed.sh | 333 ++++++++++++++++++ > 12 files changed, 484 insertions(+), 14 deletions(-) > create mode 100755 tools/testing/selftests/net/forwarding/router_mpath_seed.sh > For the set: Reviewed-by: David Ahern <dsahern@kernel.org>
Hello: This series was applied to netdev/net-next.git (main) by Jakub Kicinski <kuba@kernel.org>: On Fri, 7 Jun 2024 17:13:52 +0200 you wrote: > Let me just quote the commit message of patch #2 here to inform the > motivation and some of the implementation: > > When calculating hashes for the purpose of multipath forwarding, > both IPv4 and IPv6 code currently fall back on > flow_hash_from_keys(). That uses a randomly-generated seed. That's a > fine choice by default, but unfortunately some deployments may need > a tighter control over the seed used. > > [...] Here is the summary with links: - [net-next,v2,1/5] net: ipv4,ipv6: Pass multipath hash computation through a helper https://git.kernel.org/netdev/net-next/c/3e453ca122d4 - [net-next,v2,2/5] net: ipv4: Add a sysctl to set multipath hash seed https://git.kernel.org/netdev/net-next/c/4ee2a8cace3f - [net-next,v2,3/5] mlxsw: spectrum_router: Apply user-defined multipath hash seed https://git.kernel.org/netdev/net-next/c/60bcfede3f9f - [net-next,v2,4/5] selftests: forwarding: lib: Split sysctl_save() out of sysctl_set() https://git.kernel.org/netdev/net-next/c/6f51aed38a4f - [net-next,v2,5/5] selftests: forwarding: router_mpath_hash: Add a new selftest https://git.kernel.org/netdev/net-next/c/5f90d93b6108 You are awesome, thank you!