[net-next,2/4] net: ipv4: Add a sysctl to set multipath hash seed

Message ID: 20240529111844.13330-3-petrm@nvidia.com
State: Superseded
Delegated to: Netdev Maintainers
Series: Allow configuration of multipath hash seed

Checks

Context                         Check    Description
netdev/series_format            success  Posting correctly formatted
netdev/tree_selection           success  Clearly marked for net-next
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 5718 this patch: 5718
netdev/build_tools              success  Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers           warning  3 maintainers not CCed: aleksander.lobakin@intel.com rkannoth@marvell.com lixiaoyan@google.com
netdev/build_clang              success  Errors and warnings before: 1018 this patch: 1018
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 5993 this patch: 5993
netdev/checkpatch               warning  WARNING: line length of 85 exceeds 80 columns
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 47 this patch: 47
netdev/source_inline            success  Was 0 now: 0
netdev/contest                  success  net-next-2024-05-30--06-00 (tests: 1042)

Commit Message

Petr Machata May 29, 2024, 11:18 a.m. UTC
When calculating hashes for the purpose of multipath forwarding, both IPv4
and IPv6 code currently fall back on flow_hash_from_keys(). That uses a
randomly-generated seed. That's a fine choice by default, but unfortunately
some deployments may need a tighter control over the seed used.

In this patch, make the seed configurable by adding a new sysctl key,
net.ipv4.fib_multipath_hash_seed to control the seed. This seed is used
specifically for multipath forwarding and not for the other concerns that
flow_hash_from_keys() is used for, such as queue selection. Expose the knob
as sysctl because other such settings, such as headers to hash, are also
handled that way. Like those, the multipath hash seed is a per-netns
variable.

Despite being placed in the net.ipv4 namespace, the multipath seed sysctl
is used for both IPv4 and IPv6, similarly to e.g. a number of TCP
variables.

The seed used by flow_hash_from_keys() is a 128-bit quantity. However it
seems that usually the seed is a much more modest value. 32 bits seem
typical (Cisco, Cumulus), some systems go even lower. For that reason, and
to decouple the user interface from implementation details, go with a
32-bit quantity, which is then quadruplicated to form the siphash key.
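
As an illustration of the quadruplication described above, here is a minimal
user-space sketch. The names demo_siphash_key and demo_construct_seed are
hypothetical stand-ins for the kernel's siphash_key_t and the helper in this
patch; only the bit arithmetic is meant to match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the kernel's siphash_key_t. */
struct demo_siphash_key {
	uint64_t key[2];
};

/* Replicate a 32-bit seed four times to fill the 128-bit key. */
static void demo_construct_seed(uint32_t user_seed,
				struct demo_siphash_key *key)
{
	uint64_t seed64 = user_seed;

	assert(key != NULL);

	/* Two copies of the seed fill the first 64-bit word... */
	key->key[0] = (seed64 << 32) | seed64;
	/* ...and the second word repeats the first. */
	key->key[1] = key->key[0];
}
```

With a seed of 0xdeadbeef, both 64-bit words come out as
0xdeadbeefdeadbeef, so the user-visible value stays 32 bits wide while the
hash still receives a full-width key.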

For locking, use RTNL instead of a custom lock. This is based on feedback
given to a patch from Pavel Balaev, which also aimed to introduce multipath
hash seed control [0].

[0] https://lore.kernel.org/netdev/20210413.161521.2301224176572441397.davem@davemloft.net/

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
---

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Cc: Simon Horman <horms@kernel.org>

 Documentation/networking/ip-sysctl.rst | 10 ++++
 include/net/flow_dissector.h           |  2 +
 include/net/ip_fib.h                   | 19 +++++-
 include/net/netns/ipv4.h               | 10 ++++
 net/core/flow_dissector.c              |  7 +++
 net/ipv4/sysctl_net_ipv4.c             | 82 ++++++++++++++++++++++++++
 6 files changed, 129 insertions(+), 1 deletion(-)

Comments

Jakub Kicinski May 31, 2024, 1 a.m. UTC | #1
On Wed, 29 May 2024 13:18:42 +0200 Petr Machata wrote:
> +fib_multipath_hash_seed - UNSIGNED INTEGER
> +	The seed value used when calculating hash for multipath routes. Applies

nits..

For RSS we call it key rather than seed, is calling it seed well
established for ECMP?

Can we also call out that hashing implementation is not well defined?

> +	to both IPv4 and IPv6 datapath. Only valid for kernels built with

s/valid/present/ ?

> +	CONFIG_IP_ROUTE_MULTIPATH enabled.
> +
> +	When set to 0, the seed value used for multipath routing defaults to an
> +	internal random-generated one.

Eric Dumazet June 1, 2024, 8:46 a.m. UTC | #2
On Wed, May 29, 2024 at 1:21 PM Petr Machata <petrm@nvidia.com> wrote:
>
> When calculating hashes for the purpose of multipath forwarding, both IPv4
> and IPv6 code currently fall back on flow_hash_from_keys(). That uses a
> randomly-generated seed. That's a fine choice by default, but unfortunately
> some deployments may need a tighter control over the seed used.
>
> In this patch, make the seed configurable by adding a new sysctl key,
> net.ipv4.fib_multipath_hash_seed to control the seed. This seed is used
> specifically for multipath forwarding and not for the other concerns that
> flow_hash_from_keys() is used for, such as queue selection. Expose the knob
> as sysctl because other such settings, such as headers to hash, are also
> handled that way. Like those, the multipath hash seed is a per-netns
> variable.
>
> Despite being placed in the net.ipv4 namespace, the multipath seed sysctl
> is used for both IPv4 and IPv6, similarly to e.g. a number of TCP
> variables.
>
...

> +       rtnl_lock();
> +       old = rcu_replace_pointer_rtnl(net->ipv4.sysctl_fib_multipath_hash_seed,
> +                                      mphs);
> +       rtnl_unlock();
> +

In case you keep RCU for the next version, please do not use rtnl_lock() here.

A simple xchg() will work just fine.

old = xchg((__force struct sysctl_fib_multipath_hash_seed **)
           &net->ipv4.sysctl_fib_multipath_hash_seed, mphs);
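
The idea behind the suggestion, a single atomic exchange that publishes the
new pointer and hands back the old one without taking a lock, can be sketched
in user space with C11 atomics. This is only an analogy under stated
assumptions: demo_seed, demo_seed_slot and demo_seed_swap are hypothetical
names, atomic_exchange() stands in for the kernel's xchg(), and the RCU grace
period that the patch handles with kfree_rcu() is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sysctl_fib_multipath_hash_seed. */
struct demo_seed {
	uint32_t user_seed;
};

/* Hypothetical per-namespace slot; NULL means "use the random default". */
static _Atomic(struct demo_seed *) demo_seed_slot;

/*
 * Publish new_seed and return the previous pointer in one atomic step.
 * Concurrent writers each get a distinct old pointer back, so no writer
 * lock is needed just to decide who releases which object.
 */
static struct demo_seed *demo_seed_swap(struct demo_seed *new_seed)
{
	return atomic_exchange(&demo_seed_slot, new_seed);
}
```

In the kernel the returned pointer would still have to be released only after
readers are done with it, which is what kfree_rcu(old, rcu) does in the patch.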

Ido Schimmel June 2, 2024, 11:15 a.m. UTC | #3
On Thu, May 30, 2024 at 06:00:34PM -0700, Jakub Kicinski wrote:
> On Wed, 29 May 2024 13:18:42 +0200 Petr Machata wrote:
> > +fib_multipath_hash_seed - UNSIGNED INTEGER
> > +	The seed value used when calculating hash for multipath routes. Applies
> 
> nits..
> 
> For RSS we call it key rather than seed, is calling it seed well
> established for ECMP?

I have only seen documentation where it is called "seed". Examples:

Cumulus:
https://docs.nvidia.com/networking-ethernet-software/cumulus-linux-59/Layer-3/Routing/Equal-Cost-Multipath-Load-Sharing/#unique-hash-seed

Arista:
https://arista.my.site.com/AristaCommunity/s/article/hashing-for-l2-port-channels-and-l3-ecmp

Research from Fastly around load balancing (Section 6.3):
https://www.usenix.org/system/files/conference/nsdi18/nsdi18-araujo.pdf

Nicolas Dichtel June 3, 2024, 6:51 a.m. UTC | #4
On 02/06/2024 at 13:15, Ido Schimmel wrote:
> On Thu, May 30, 2024 at 06:00:34PM -0700, Jakub Kicinski wrote:
>> On Wed, 29 May 2024 13:18:42 +0200 Petr Machata wrote:
>>> +fib_multipath_hash_seed - UNSIGNED INTEGER
>>> +	The seed value used when calculating hash for multipath routes. Applies
>>
>> nits..
>>
>> For RSS we call it key rather than seed, is calling it seed well
>> established for ECMP?
It seems standard to me (that is what we call it in our products).

> 
> I have only seen documentation where it is called "seed". Examples:
> 
> Cumulus:
> https://docs.nvidia.com/networking-ethernet-software/cumulus-linux-59/Layer-3/Routing/Equal-Cost-Multipath-Load-Sharing/#unique-hash-seed
> 
> Arista:
> https://arista.my.site.com/AristaCommunity/s/article/hashing-for-l2-port-channels-and-l3-ecmp
> 
> Research from Fastly around load balancing (Section 6.3):
> https://www.usenix.org/system/files/conference/nsdi18/nsdi18-araujo.pdf
> 
You can add some others:

https://www.juniper.net/documentation/us/en/software/junos/interfaces-ethernet-switches/topics/topic-map/switches-interface-resilient-hashing.html

https://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus9000/sw/6-x/unicast/configuration/guide/l3_cli_nxos/l3_manage-routes.html

Toke Høiland-Jørgensen June 3, 2024, 7:29 a.m. UTC | #5
Eric Dumazet <edumazet@google.com> writes:

> On Wed, May 29, 2024 at 1:21 PM Petr Machata <petrm@nvidia.com> wrote:
>>
>> When calculating hashes for the purpose of multipath forwarding, both IPv4
>> and IPv6 code currently fall back on flow_hash_from_keys(). That uses a
>> randomly-generated seed. That's a fine choice by default, but unfortunately
>> some deployments may need a tighter control over the seed used.
>>
>> In this patch, make the seed configurable by adding a new sysctl key,
>> net.ipv4.fib_multipath_hash_seed to control the seed. This seed is used
>> specifically for multipath forwarding and not for the other concerns that
>> flow_hash_from_keys() is used for, such as queue selection. Expose the knob
>> as sysctl because other such settings, such as headers to hash, are also
>> handled that way. Like those, the multipath hash seed is a per-netns
>> variable.
>>
>> Despite being placed in the net.ipv4 namespace, the multipath seed sysctl
>> is used for both IPv4 and IPv6, similarly to e.g. a number of TCP
>> variables.
>>
> ...
>
>> +       rtnl_lock();
>> +       old = rcu_replace_pointer_rtnl(net->ipv4.sysctl_fib_multipath_hash_seed,
>> +                                      mphs);
>> +       rtnl_unlock();
>> +
>
> In case you keep RCU for the next version, please do not use rtnl_lock() here.
>
> A simple xchg() will work just fine.
>
> old = xchg((__force struct sysctl_fib_multipath_hash_seed **)
>            &net->ipv4.sysctl_fib_multipath_hash_seed, mphs);

We added a macro to do this kind of thing without triggering any of the
RCU type linter warnings, in:

76c8eaafe4f0 ("rcu: Create an unrcu_pointer() to remove __rcu from a pointer")

So as an alternative to open-coding the cast, something like this could
work - I guess it's mostly a matter of taste:

old = unrcu_pointer(xchg(&net->ipv4.sysctl_fib_multipath_hash_seed, RCU_INITIALIZER(mphs)));

-Toke

Eric Dumazet June 3, 2024, 8:25 a.m. UTC | #6
On Mon, Jun 3, 2024 at 9:30 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Eric Dumazet <edumazet@google.com> writes:
>
> > On Wed, May 29, 2024 at 1:21 PM Petr Machata <petrm@nvidia.com> wrote:
> >>
> >> When calculating hashes for the purpose of multipath forwarding, both IPv4
> >> and IPv6 code currently fall back on flow_hash_from_keys(). That uses a
> >> randomly-generated seed. That's a fine choice by default, but unfortunately
> >> some deployments may need a tighter control over the seed used.
> >>
> >> In this patch, make the seed configurable by adding a new sysctl key,
> >> net.ipv4.fib_multipath_hash_seed to control the seed. This seed is used
> >> specifically for multipath forwarding and not for the other concerns that
> >> flow_hash_from_keys() is used for, such as queue selection. Expose the knob
> >> as sysctl because other such settings, such as headers to hash, are also
> >> handled that way. Like those, the multipath hash seed is a per-netns
> >> variable.
> >>
> >> Despite being placed in the net.ipv4 namespace, the multipath seed sysctl
> >> is used for both IPv4 and IPv6, similarly to e.g. a number of TCP
> >> variables.
> >>
> > ...
> >
> >> +       rtnl_lock();
> >> +       old = rcu_replace_pointer_rtnl(net->ipv4.sysctl_fib_multipath_hash_seed,
> >> +                                      mphs);
> >> +       rtnl_unlock();
> >> +
> >
> > In case you keep RCU for the next version, please do not use rtnl_lock() here.
> >
> > A simple xchg() will work just fine.
> >
> > old = xchg((__force struct sysctl_fib_multipath_hash_seed **)
> >            &net->ipv4.sysctl_fib_multipath_hash_seed, mphs);
>
> We added a macro to do this kind of thing without triggering any of the
> RCU type linter warnings, in:
>
> 76c8eaafe4f0 ("rcu: Create an unrcu_pointer() to remove __rcu from a pointer")
>
> So as an alternative to open-coding the cast, something like this could
> work - I guess it's mostly a matter of taste:
>
> old = unrcu_pointer(xchg(&net->ipv4.sysctl_fib_multipath_hash_seed, RCU_INITIALIZER(mphs)));

Good to know, thanks.

Not sure why __kernel qualifier has been put there.

Toke Høiland-Jørgensen June 3, 2024, 8:58 a.m. UTC | #7
Eric Dumazet <edumazet@google.com> writes:

> On Mon, Jun 3, 2024 at 9:30 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Eric Dumazet <edumazet@google.com> writes:
>>
>> > On Wed, May 29, 2024 at 1:21 PM Petr Machata <petrm@nvidia.com> wrote:
>> >>
>> >> When calculating hashes for the purpose of multipath forwarding, both IPv4
>> >> and IPv6 code currently fall back on flow_hash_from_keys(). That uses a
>> >> randomly-generated seed. That's a fine choice by default, but unfortunately
>> >> some deployments may need a tighter control over the seed used.
>> >>
>> >> In this patch, make the seed configurable by adding a new sysctl key,
>> >> net.ipv4.fib_multipath_hash_seed to control the seed. This seed is used
>> >> specifically for multipath forwarding and not for the other concerns that
>> >> flow_hash_from_keys() is used for, such as queue selection. Expose the knob
>> >> as sysctl because other such settings, such as headers to hash, are also
>> >> handled that way. Like those, the multipath hash seed is a per-netns
>> >> variable.
>> >>
>> >> Despite being placed in the net.ipv4 namespace, the multipath seed sysctl
>> >> is used for both IPv4 and IPv6, similarly to e.g. a number of TCP
>> >> variables.
>> >>
>> > ...
>> >
>> >> +       rtnl_lock();
>> >> +       old = rcu_replace_pointer_rtnl(net->ipv4.sysctl_fib_multipath_hash_seed,
>> >> +                                      mphs);
>> >> +       rtnl_unlock();
>> >> +
>> >
>> > In case you keep RCU for the next version, please do not use rtnl_lock() here.
>> >
>> > A simple xchg() will work just fine.
>> >
>> > old = xchg((__force struct sysctl_fib_multipath_hash_seed **)
>> >            &net->ipv4.sysctl_fib_multipath_hash_seed, mphs);
>>
>> We added a macro to do this kind of thing without triggering any of the
>> RCU type linter warnings, in:
>>
>> 76c8eaafe4f0 ("rcu: Create an unrcu_pointer() to remove __rcu from a pointer")
>>
>> So as an alternative to open-coding the cast, something like this could
>> work - I guess it's mostly a matter of taste:
>>
>> old = unrcu_pointer(xchg(&net->ipv4.sysctl_fib_multipath_hash_seed, RCU_INITIALIZER(mphs)));
>
> Good to know, thanks.
>
> Not sure why __kernel qualifier has been put there.

Not sure either. Paul, care to enlighten us? :)

-Toke

Petr Machata June 3, 2024, 9:50 a.m. UTC | #8
Eric Dumazet <edumazet@google.com> writes:

> On Wed, May 29, 2024 at 1:21 PM Petr Machata <petrm@nvidia.com> wrote:
>>
>> When calculating hashes for the purpose of multipath forwarding, both IPv4
>> and IPv6 code currently fall back on flow_hash_from_keys(). That uses a
>> randomly-generated seed. That's a fine choice by default, but unfortunately
>> some deployments may need a tighter control over the seed used.
>>
>> In this patch, make the seed configurable by adding a new sysctl key,
>> net.ipv4.fib_multipath_hash_seed to control the seed. This seed is used
>> specifically for multipath forwarding and not for the other concerns that
>> flow_hash_from_keys() is used for, such as queue selection. Expose the knob
>> as sysctl because other such settings, such as headers to hash, are also
>> handled that way. Like those, the multipath hash seed is a per-netns
>> variable.
>>
>> Despite being placed in the net.ipv4 namespace, the multipath seed sysctl
>> is used for both IPv4 and IPv6, similarly to e.g. a number of TCP
>> variables.
>>
> ...
>
>> +       rtnl_lock();
>> +       old = rcu_replace_pointer_rtnl(net->ipv4.sysctl_fib_multipath_hash_seed,
>> +                                      mphs);
>> +       rtnl_unlock();
>> +
>
> In case you keep RCU for the next version, please do not use rtnl_lock() here.

Thanks. It looks like the seed is going to be stored inline and the key
constructed at the point of use, so no RCU.
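
One way to read "inline, with the key constructed at the point of use" is
sketched below in user-space C. This is not the code of the follow-up
version; demo_netns, demo_key and demo_get_key are hypothetical names, and a
relaxed C11 atomic load is assumed as the analogue of reading a single
word-sized sysctl value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the 128-bit siphash key. */
struct demo_key {
	uint64_t key[2];
};

/* Hypothetical per-namespace state: just the 32-bit seed, stored inline. */
struct demo_netns {
	_Atomic uint32_t user_seed;
};

/*
 * Build the key on demand from the inline seed.
 * Returns 1 and fills *out when a seed is configured; returns 0 when the
 * seed is 0, in which case the caller keeps using the random default.
 */
static int demo_get_key(struct demo_netns *ns, struct demo_key *out)
{
	uint32_t seed = atomic_load_explicit(&ns->user_seed,
					     memory_order_relaxed);
	uint64_t seed64 = seed;

	if (!seed)
		return 0;

	out->key[0] = (seed64 << 32) | seed64;
	out->key[1] = out->key[0];
	return 1;
}
```

Because the only shared state is one 32-bit word, there is no separately
allocated object whose lifetime readers and writers have to coordinate.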

Petr Machata June 3, 2024, 9:51 a.m. UTC | #9
Jakub Kicinski <kuba@kernel.org> writes:

> On Wed, 29 May 2024 13:18:42 +0200 Petr Machata wrote:
>> +fib_multipath_hash_seed - UNSIGNED INTEGER
>> +	The seed value used when calculating hash for multipath routes. Applies
>
> nits..
>
> For RSS we call it key rather than seed, is calling it seed well
> established for ECMP?
>
> Can we also call out that hashing implementation is not well defined?

As others note, this seems to be industry nomenclature, so I'll keep it.

>> +	to both IPv4 and IPv6 datapath. Only valid for kernels built with
>
> s/valid/present/ ?

Ack.

>> +	CONFIG_IP_ROUTE_MULTIPATH enabled.
>> +
>> +	When set to 0, the seed value used for multipath routing defaults to an
>> +	internal random-generated one.

Petr Machata June 3, 2024, 11:37 a.m. UTC | #10
Petr Machata <petrm@nvidia.com> writes:

> Jakub Kicinski <kuba@kernel.org> writes:
>
>> On Wed, 29 May 2024 13:18:42 +0200 Petr Machata wrote:
>>> +fib_multipath_hash_seed - UNSIGNED INTEGER
>>> +	The seed value used when calculating hash for multipath routes. Applies
>>
>> nits..
>>
>> For RSS we call it key rather than seed, is calling it seed well
>> established for ECMP?
>>
>> Can we also call out that hashing implementation is not well defined?
>
> As others note, this seems to be industry nomenclature, so I'll keep it.

I meant the "seed" name. I'll also mention that the algorithm is undefined
and doesn't constitute an ABI.

Paul E. McKenney June 3, 2024, 1:53 p.m. UTC | #11
On Mon, Jun 03, 2024 at 10:58:18AM +0200, Toke Høiland-Jørgensen wrote:
> Eric Dumazet <edumazet@google.com> writes:
> 
> > On Mon, Jun 3, 2024 at 9:30 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> Eric Dumazet <edumazet@google.com> writes:
> >>
> >> > On Wed, May 29, 2024 at 1:21 PM Petr Machata <petrm@nvidia.com> wrote:
> >> >>
> >> >> When calculating hashes for the purpose of multipath forwarding, both IPv4
> >> >> and IPv6 code currently fall back on flow_hash_from_keys(). That uses a
> >> >> randomly-generated seed. That's a fine choice by default, but unfortunately
> >> >> some deployments may need a tighter control over the seed used.
> >> >>
> >> >> In this patch, make the seed configurable by adding a new sysctl key,
> >> >> net.ipv4.fib_multipath_hash_seed to control the seed. This seed is used
> >> >> specifically for multipath forwarding and not for the other concerns that
> >> >> flow_hash_from_keys() is used for, such as queue selection. Expose the knob
> >> >> as sysctl because other such settings, such as headers to hash, are also
> >> >> handled that way. Like those, the multipath hash seed is a per-netns
> >> >> variable.
> >> >>
> >> >> Despite being placed in the net.ipv4 namespace, the multipath seed sysctl
> >> >> is used for both IPv4 and IPv6, similarly to e.g. a number of TCP
> >> >> variables.
> >> >>
> >> > ...
> >> >
> >> >> +       rtnl_lock();
> >> >> +       old = rcu_replace_pointer_rtnl(net->ipv4.sysctl_fib_multipath_hash_seed,
> >> >> +                                      mphs);
> >> >> +       rtnl_unlock();
> >> >> +
> >> >
> >> > In case you keep RCU for the next version, please do not use rtnl_lock() here.
> >> >
> >> > A simple xchg() will work just fine.
> >> >
> >> > old = xchg((__force struct sysctl_fib_multipath_hash_seed **)
> >> >            &net->ipv4.sysctl_fib_multipath_hash_seed, mphs);
> >>
> >> We added a macro to do this kind of thing without triggering any of the
> >> RCU type linter warnings, in:
> >>
> >> 76c8eaafe4f0 ("rcu: Create an unrcu_pointer() to remove __rcu from a pointer")
> >>
> >> So as an alternative to open-coding the cast, something like this could
> >> work - I guess it's mostly a matter of taste:
> >>
> >> old = unrcu_pointer(xchg(&net->ipv4.sysctl_fib_multipath_hash_seed, RCU_INITIALIZER(mphs)));
> >
> > Good to know, thanks.
> >
> > Not sure why __kernel qualifier has been put there.
> 
> Not sure either. Paul, care to enlighten us? :)

Because __kernel says "just plain kernel access".  Here are the options:

# define __kernel       __attribute__((address_space(0)))
# define __user         __attribute__((noderef, address_space(__user)))
# define __iomem        __attribute__((noderef, address_space(__iomem)))
# define __percpu       __attribute__((noderef, address_space(__percpu)))
# define __rcu          __attribute__((noderef, address_space(__rcu)))

So casting to __kernel removes the __rcu, thus avoiding the sparse
complaint.

							Thanx, Paul
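
The effect of that cast can be sketched outside the kernel as follows. This
is a simplified illustration, not the kernel's definition: the demo_ names
are hypothetical, the force attribute is assumed to be spelled as shown under
sparse, and the real unrcu_pointer() performs additional checks that are
omitted here. Without sparse (__CHECKER__ undefined) the annotations expand
to nothing, so the cast compiles to a plain pointer conversion.

```c
#include <assert.h>
#include <stddef.h>

/* Address-space annotations only mean something to sparse. */
#ifdef __CHECKER__
# define demo_rcu	__attribute__((noderef, address_space(__rcu)))
# define demo_kernel	__attribute__((address_space(0)))
# define demo_force	__attribute__((force))
#else
# define demo_rcu
# define demo_kernel
# define demo_force
#endif

/*
 * Cast an __rcu-style annotated pointer to the plain kernel address space.
 * The pointed-to type is preserved via __typeof__ (a GNU C extension);
 * only the address-space annotation is dropped.
 */
#define demo_unrcu_pointer(p) \
	((__typeof__(*(p)) demo_kernel demo_force *)(p))
```

A pointer declared as "int demo_rcu *p" can then be handed to code expecting
a plain "int *" via demo_unrcu_pointer(p), with no change to its value.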

Patch

diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index bd50df6a5a42..afcf3f323965 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -131,6 +131,16 @@  fib_multipath_hash_fields - UNSIGNED INTEGER
 
 	Default: 0x0007 (source IP, destination IP and IP protocol)
 
+fib_multipath_hash_seed - UNSIGNED INTEGER
+	The seed value used when calculating hash for multipath routes. Applies
+	to both IPv4 and IPv6 datapath. Only valid for kernels built with
+	CONFIG_IP_ROUTE_MULTIPATH enabled.
+
+	When set to 0, the seed value used for multipath routing defaults to an
+	internal random-generated one.
+
+	Default: 0 (random)
+
 fib_sync_mem - UNSIGNED INTEGER
 	Amount of dirty memory from fib entries that can be backlogged before
 	synchronize_rcu is forced.
diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 9ab376d1a677..a5423219dee1 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -433,6 +433,8 @@  static inline bool flow_keys_have_l4(const struct flow_keys *keys)
 }
 
 u32 flow_hash_from_keys(struct flow_keys *keys);
+u32 flow_hash_from_keys_seed(struct flow_keys *keys,
+			     const siphash_key_t *keyval);
 void skb_flow_get_icmp_tci(const struct sk_buff *skb,
 			   struct flow_dissector_key_icmp *key_icmp,
 			   const void *data, int thoff, int hlen);
diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index b8b3c07e8f7b..785c571e2cef 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -520,13 +520,30 @@  void fib_nhc_update_mtu(struct fib_nh_common *nhc, u32 new, u32 orig);
 #ifdef CONFIG_IP_ROUTE_MULTIPATH
 int fib_multipath_hash(const struct net *net, const struct flowi4 *fl4,
 		       const struct sk_buff *skb, struct flow_keys *flkeys);
-#endif
 
+static inline u32 fib_multipath_hash_from_keys(const struct net *net,
+					       struct flow_keys *keys)
+{
+	struct sysctl_fib_multipath_hash_seed *mphs;
+	u32 ret;
+
+	rcu_read_lock();
+	mphs = rcu_dereference(net->ipv4.sysctl_fib_multipath_hash_seed);
+	if (likely(!mphs))
+		ret = flow_hash_from_keys(keys);
+	else
+		ret = flow_hash_from_keys_seed(keys, &mphs->seed);
+	rcu_read_unlock();
+
+	return ret;
+}
+#else
 static inline u32 fib_multipath_hash_from_keys(const struct net *net,
 					       struct flow_keys *keys)
 {
 	return flow_hash_from_keys(keys);
 }
+#endif
 
 int fib_check_nh(struct net *net, struct fib_nh *nh, u32 table, u8 scope,
 		 struct netlink_ext_ack *extack);
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index c356c458b340..1f5043d32cb0 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -40,6 +40,14 @@  struct inet_timewait_death_row {
 
 struct tcp_fastopen_context;
 
+#ifdef CONFIG_IP_ROUTE_MULTIPATH
+struct sysctl_fib_multipath_hash_seed {
+	siphash_aligned_key_t	seed;
+	u32			user_seed;
+	struct rcu_head		rcu;
+};
+#endif
+
 struct netns_ipv4 {
 	/* Cacheline organization can be found documented in
 	 * Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst.
@@ -245,6 +253,8 @@  struct netns_ipv4 {
 #endif
 #endif
 #ifdef CONFIG_IP_ROUTE_MULTIPATH
+	struct sysctl_fib_multipath_hash_seed
+					__rcu *sysctl_fib_multipath_hash_seed;
 	u32 sysctl_fib_multipath_hash_fields;
 	u8 sysctl_fib_multipath_use_neigh;
 	u8 sysctl_fib_multipath_hash_policy;
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index f82e9a7d3b37..7b3283ad5b39 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -1792,6 +1792,13 @@  u32 flow_hash_from_keys(struct flow_keys *keys)
 }
 EXPORT_SYMBOL(flow_hash_from_keys);
 
+u32 flow_hash_from_keys_seed(struct flow_keys *keys,
+			     const siphash_key_t *keyval)
+{
+	return __flow_hash_from_keys(keys, keyval);
+}
+EXPORT_SYMBOL(flow_hash_from_keys_seed);
+
 static inline u32 ___skb_get_hash(const struct sk_buff *skb,
 				  struct flow_keys *keys,
 				  const siphash_key_t *keyval)
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index d7892f34a15b..18fae2bf881c 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -464,6 +464,72 @@  static int proc_fib_multipath_hash_fields(struct ctl_table *table, int write,
 
 	return ret;
 }
+
+static void
+proc_fib_multipath_hash_construct_seed(u32 user_seed, siphash_key_t *key)
+{
+	u64 user_seed_64 = user_seed;
+
+	key->key[0] = (user_seed_64 << 32) | user_seed_64;
+	key->key[1] = key->key[0];
+}
+
+static int proc_fib_multipath_hash_set_seed(struct net *net, u32 user_seed)
+{
+	struct sysctl_fib_multipath_hash_seed *mphs = NULL;
+	struct sysctl_fib_multipath_hash_seed *old;
+
+	if (user_seed) {
+		mphs = kzalloc(sizeof(*mphs), GFP_KERNEL);
+		if (!mphs)
+			return -ENOMEM;
+
+		mphs->user_seed = user_seed;
+		proc_fib_multipath_hash_construct_seed(user_seed, &mphs->seed);
+	}
+
+	rtnl_lock();
+	old = rcu_replace_pointer_rtnl(net->ipv4.sysctl_fib_multipath_hash_seed,
+				       mphs);
+	rtnl_unlock();
+
+	if (old)
+		kfree_rcu(old, rcu);
+
+	return 0;
+}
+
+static int proc_fib_multipath_hash_seed(struct ctl_table *table, int write,
+					void *buffer, size_t *lenp,
+					loff_t *ppos)
+{
+	struct sysctl_fib_multipath_hash_seed *mphs;
+	struct net *net = table->data;
+	struct ctl_table tmp;
+	u32 user_seed = 0;
+	int ret;
+
+	rcu_read_lock();
+	mphs = rcu_dereference(net->ipv4.sysctl_fib_multipath_hash_seed);
+	if (mphs)
+		user_seed = mphs->user_seed;
+	rcu_read_unlock();
+
+	tmp = *table;
+	tmp.data = &user_seed;
+
+	ret = proc_douintvec_minmax(&tmp, write, buffer, lenp, ppos);
+
+	if (write && ret == 0) {
+		ret = proc_fib_multipath_hash_set_seed(net, user_seed);
+		if (ret)
+			return ret;
+
+		call_netevent_notifiers(NETEVENT_IPV4_MPATH_HASH_UPDATE, net);
+	}
+
+	return ret;
+}
 #endif
 
 static struct ctl_table ipv4_table[] = {
@@ -1072,6 +1138,13 @@  static struct ctl_table ipv4_net_table[] = {
 		.extra1		= SYSCTL_ONE,
 		.extra2		= &fib_multipath_hash_fields_all_mask,
 	},
+	{
+		.procname	= "fib_multipath_hash_seed",
+		.data		= &init_net,
+		.maxlen		= sizeof(u32),
+		.mode		= 0644,
+		.proc_handler	= proc_fib_multipath_hash_seed,
+	},
 #endif
 	{
 		.procname	= "ip_unprivileged_port_start",
@@ -1557,6 +1630,15 @@  static __net_exit void ipv4_sysctl_exit_net(struct net *net)
 {
 	const struct ctl_table *table;
 
+#ifdef CONFIG_IP_ROUTE_MULTIPATH
+	{
+		struct sysctl_fib_multipath_hash_seed *mphs;
+
+		mphs = rcu_dereference_raw(net->ipv4.sysctl_fib_multipath_hash_seed);
+		kfree(mphs);
+	}
+#endif
+
 	kfree(net->ipv4.sysctl_local_reserved_ports);
 	table = net->ipv4.ipv4_hdr->ctl_table_arg;
 	unregister_net_sysctl_table(net->ipv4.ipv4_hdr);