Message ID | 20250127194024.3647-2-david.laight.linux@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Series | udplookup: Rescan udp hash chains if cross-linked |

Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Clearly marked for net |
netdev/apply | fail | Patch does not apply to net-0 |
From: David Laight <david.laight.linux@gmail.com>
Date: Mon, 27 Jan 2025 19:40:23 +0000
> udp_lib_rehash() can get called at any time and will move a
> socket to a different hash2 chain.
> This can cause udp4_lib_lookup2() (processing incoming UDP) to
> fail to find a socket and an ICMP port unreachable to be sent.
>
> Prior to ca065d0cf80fa the lookup used 'hlist_nulls' and checked
> that the 'end of list' marker was on the correct list.

I think we should use hlist_nulls for hash2, as hash4 does.

---8<---
commit dab78a1745ab3c6001e1e4d50a9d09efef8e260d
Author: Philo Lu <lulie@linux.alibaba.com>
Date:   Thu Nov 14 18:52:05 2024 +0800

    net/udp: Add 4-tuple hash list basis
...
    hash4 uses hlist_nulls to avoid moving wrongly onto another hlist due to
    concurrent rehash, because rehash() can happen with lookup().
---8<---

Also, the Fixes: tag is missing in both patches.

>
> Signed-off-by: David Laight <david.laight.linux@gmail.com>
> ---
>  net/ipv4/udp.c | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 86d282618515..a8e2b431d348 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -425,16 +425,21 @@ static struct sock *udp4_lib_lookup2(const struct net *net,
>  				     __be32 saddr, __be16 sport,
>  				     __be32 daddr, unsigned int hnum,
>  				     int dif, int sdif,
> +				     unsigned int hash2, unsigned int mask,
>  				     struct udp_hslot *hslot2,
>  				     struct sk_buff *skb)
>  {
> +	unsigned int hash2_rescan;
>  	struct sock *sk, *result;
>  	int score, badness;
>  	bool need_rescore;
>
> +rescan:
> +	hash2_rescan = hash2;
>  	result = NULL;
>  	badness = 0;
>  	udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
> +		hash2_rescan = udp_sk(sk)->udp_portaddr_hash;
>  		need_rescore = false;
>  rescore:
>  		score = compute_score(need_rescore ? result : sk, net, saddr,
> @@ -475,6 +480,16 @@ static struct sock *udp4_lib_lookup2(const struct net *net,
>  			goto rescore;
>  		}
>  	}
> +
> +	/* udp sockets can get moved to a different hash chain.
> +	 * If the chains have got crossed then rescan.
> +	 */

nit: trailing spaces here ^^^^^^^^

> +	if ((hash2_rescan ^ hash2) & mask) {
> +		/* Ensure hslot2->head is reread */
> +		barrier();
> +		goto rescan;
> +	}
> +
>  	return result;
>  }
>
> @@ -654,7 +669,7 @@ struct sock *__udp4_lib_lookup(const struct net *net, __be32 saddr,
>  	/* Lookup connected or non-wildcard socket */
>  	result = udp4_lib_lookup2(net, saddr, sport,
>  				  daddr, hnum, dif, sdif,
> -				  hslot2, skb);
> +				  hash2, udptable->mask, hslot2, skb);
>  	if (!IS_ERR_OR_NULL(result) && result->sk_state == TCP_ESTABLISHED)
>  		goto done;
>
> @@ -680,7 +695,7 @@ struct sock *__udp4_lib_lookup(const struct net *net, __be32 saddr,
>
>  	result = udp4_lib_lookup2(net, saddr, sport,
>  				  htonl(INADDR_ANY), hnum, dif, sdif,
> -				  hslot2, skb);
> +				  hash2, udptable->mask, hslot2, skb);
>  done:
>  	if (IS_ERR(result))
>  		return NULL;
> --
> 2.39.5
>
On Mon, 27 Jan 2025 12:33:04 -0800
Kuniyuki Iwashima <kuniyu@amazon.com> wrote:

> From: David Laight <david.laight.linux@gmail.com>
> Date: Mon, 27 Jan 2025 19:40:23 +0000
> > udp_lib_rehash() can get called at any time and will move a
> > socket to a different hash2 chain.
> > This can cause udp4_lib_lookup2() (processing incoming UDP) to
> > fail to find a socket and an ICMP port unreachable to be sent.
> >
> > Prior to ca065d0cf80fa the lookup used 'hlist_nulls' and checked
> > that the 'end of list' marker was on the correct list.
>
> I think we should use hlist_nulls for hash2, as hash4 does.

From what I remember when I first wrote this patch (mid-2023), using
hlist_nulls doesn't make much difference.
The code just did a rescan when the 'wrong NULL' was found rather than
when the last item wasn't on the starting hash chain.

ISTR it was removed to simplify other code paths.

>
> ---8<---
> commit dab78a1745ab3c6001e1e4d50a9d09efef8e260d
> Author: Philo Lu <lulie@linux.alibaba.com>
> Date:   Thu Nov 14 18:52:05 2024 +0800
>
>     net/udp: Add 4-tuple hash list basis
> ...
>     hash4 uses hlist_nulls to avoid moving wrongly onto another hlist due to
>     concurrent rehash, because rehash() can happen with lookup().
> ---8<---
>
>
> Also, the Fixes: tag is missing in both patches.

Semi-deliberate, to stop it being immediately backported.
While I think we have a system/test that fails, it is running Ubuntu on
a Dell server and I don't think the RAID controller driver is in the main
kernel tree.
(We're definitely seeing unexpected ICMP on localhost - hard to get otherwise.)

	David
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 86d282618515..a8e2b431d348 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -425,16 +425,21 @@ static struct sock *udp4_lib_lookup2(const struct net *net,
 				     __be32 saddr, __be16 sport,
 				     __be32 daddr, unsigned int hnum,
 				     int dif, int sdif,
+				     unsigned int hash2, unsigned int mask,
 				     struct udp_hslot *hslot2,
 				     struct sk_buff *skb)
 {
+	unsigned int hash2_rescan;
 	struct sock *sk, *result;
 	int score, badness;
 	bool need_rescore;

+rescan:
+	hash2_rescan = hash2;
 	result = NULL;
 	badness = 0;
 	udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
+		hash2_rescan = udp_sk(sk)->udp_portaddr_hash;
 		need_rescore = false;
 rescore:
 		score = compute_score(need_rescore ? result : sk, net, saddr,
@@ -475,6 +480,16 @@ static struct sock *udp4_lib_lookup2(const struct net *net,
 			goto rescore;
 		}
 	}
+
+	/* udp sockets can get moved to a different hash chain.
+	 * If the chains have got crossed then rescan.
+	 */
+	if ((hash2_rescan ^ hash2) & mask) {
+		/* Ensure hslot2->head is reread */
+		barrier();
+		goto rescan;
+	}
+
 	return result;
 }

@@ -654,7 +669,7 @@ struct sock *__udp4_lib_lookup(const struct net *net, __be32 saddr,
 	/* Lookup connected or non-wildcard socket */
 	result = udp4_lib_lookup2(net, saddr, sport,
 				  daddr, hnum, dif, sdif,
-				  hslot2, skb);
+				  hash2, udptable->mask, hslot2, skb);
 	if (!IS_ERR_OR_NULL(result) && result->sk_state == TCP_ESTABLISHED)
 		goto done;

@@ -680,7 +695,7 @@ struct sock *__udp4_lib_lookup(const struct net *net, __be32 saddr,

 	result = udp4_lib_lookup2(net, saddr, sport,
 				  htonl(INADDR_ANY), hnum, dif, sdif,
-				  hslot2, skb);
+				  hash2, udptable->mask, hslot2, skb);
 done:
 	if (IS_ERR(result))
 		return NULL;
udp_lib_rehash() can get called at any time and will move a
socket to a different hash2 chain.
This can cause udp4_lib_lookup2() (processing incoming UDP) to
fail to find a socket and an ICMP port unreachable to be sent.

Prior to ca065d0cf80fa the lookup used 'hlist_nulls' and checked
that the 'end of list' marker was on the correct list.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
---
 net/ipv4/udp.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)
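
The check the patch adds can be illustrated outside the kernel. The sketch below is a simplified userspace model, not the patched function: `struct entry`, `portaddr_hash` and `chain_was_crossed()` are hypothetical stand-ins for the socket, its `udp_portaddr_hash` field and the test at the end of udp4_lib_lookup2(), and it assumes `mask` has the form 2^n - 1. It only shows why comparing the last entry's hash with the lookup hash under the mask detects a walk that strayed onto another chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket linked on a hash2 chain. */
struct entry {
	unsigned int portaddr_hash;	/* full hash stored in the entry */
	struct entry *next;
};

/*
 * Walk the chain expected to be slot (hash2 & mask).  Returns true if
 * the last entry visited hashes to a different slot, i.e. the walk
 * followed a moved entry onto another chain and should be repeated.
 */
static bool chain_was_crossed(const struct entry *head,
			      unsigned int hash2, unsigned int mask)
{
	unsigned int last_hash = hash2;	/* empty chain: never rescan */
	const struct entry *e;

	assert((mask & (mask + 1)) == 0);	/* assumed: mask is 2^n - 1 */

	for (e = head; e != NULL; e = e->next)
		last_hash = e->portaddr_hash;

	/* Only the bits that select the slot matter. */
	return ((last_hash ^ hash2) & mask) != 0;
}
```

With `mask = 0x0f`, a lookup for `hash2 = 0x22` (slot 2) that ends on an entry whose hash is `0x15` (slot 5) makes `chain_was_crossed()` return `true`, which corresponds to the `goto rescan` path in the patch.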