diff mbox series

[net,2/2] net: ipv6: fix dst ref loops in rpl, seg6 and ioam6 lwtunnels

Message ID 20250129021346.2333089-2-kuba@kernel.org (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [net,1/2] net: ipv6: fix dst refleaks in rpl, seg6 and ioam6 lwtunnels | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 7 of 7 maintainers
netdev/build_clang success Errors and warnings before: 4 this patch: 4
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 2 this patch: 2
netdev/checkpatch warning WARNING: 'refrence' may be misspelled - perhaps 'reference'?
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2025-01-29--12-00 (tests: 885)

Commit Message

Jakub Kicinski Jan. 29, 2025, 2:13 a.m. UTC
Some lwtunnels have a dst cache for post-transformation dst.
If the packet destination did not change we may end up recording
a reference to the lwtunnel in its own cache, and the lwtunnel
state will never be freed.

Discovered by the ioam6.sh test, kmemleak was recently fixed
to catch per-cpu memory leaks. I'm not sure if rpl and seg6
can actually hit this, but in principle I don't see why not.

Fixes: 985ec6f5e623 ("net: ipv6: rpl_iptunnel: mitigate 2-realloc issue")
Fixes: 40475b63761a ("net: ipv6: seg6_iptunnel: mitigate 2-realloc issue")
Fixes: dce525185bc9 ("net: ipv6: ioam6_iptunnel: mitigate 2-realloc issue")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: dsahern@kernel.org
CC: justin.iurman@uliege.be
---
 net/ipv6/ioam6_iptunnel.c | 9 ++++++---
 net/ipv6/rpl_iptunnel.c   | 9 ++++++---
 net/ipv6/seg6_iptunnel.c  | 9 ++++++---
 3 files changed, 18 insertions(+), 9 deletions(-)

Comments

Justin Iurman Jan. 29, 2025, 4:50 p.m. UTC | #1
On 1/29/25 03:13, Jakub Kicinski wrote:
> Some lwtunnels have a dst cache for post-transformation dst.
> If the packet destination did not change we may end up recording
> a reference to the lwtunnel in its own cache, and the lwtunnel
> state will never be freed.
> 
> Discovered by the ioam6.sh test, kmemleak was recently fixed
> to catch per-cpu memory leaks. I'm not sure if rpl and seg6
> can actually hit this, but in principle I don't see why not.

Agree, both can theoretically hit this too.

> Fixes: 985ec6f5e623 ("net: ipv6: rpl_iptunnel: mitigate 2-realloc issue")
> Fixes: 40475b63761a ("net: ipv6: seg6_iptunnel: mitigate 2-realloc issue")
> Fixes: dce525185bc9 ("net: ipv6: ioam6_iptunnel: mitigate 2-realloc issue")
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: dsahern@kernel.org
> CC: justin.iurman@uliege.be
> ---
>   net/ipv6/ioam6_iptunnel.c | 9 ++++++---
>   net/ipv6/rpl_iptunnel.c   | 9 ++++++---
>   net/ipv6/seg6_iptunnel.c  | 9 ++++++---
>   3 files changed, 18 insertions(+), 9 deletions(-)
> 
> diff --git a/net/ipv6/ioam6_iptunnel.c b/net/ipv6/ioam6_iptunnel.c
> index 3936c137a572..0279f1327ad5 100644
> --- a/net/ipv6/ioam6_iptunnel.c
> +++ b/net/ipv6/ioam6_iptunnel.c
> @@ -410,9 +410,12 @@ static int ioam6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
>   			goto drop;
>   		}
>   
> -		local_bh_disable();
> -		dst_cache_set_ip6(&ilwt->cache, cache_dst, &fl6.saddr);
> -		local_bh_enable();
> +		/* cache only if we don't create a dst refrence loop */

s/refrence/reference

> +		if (dst->lwtstate != cache_dst->lwtstate) {
> +			local_bh_disable();
> +			dst_cache_set_ip6(&ilwt->cache, cache_dst, &fl6.saddr);
> +			local_bh_enable();
> +		}

I agree the above patch fixes what kmemleak reported. However, I think 
it'd bring the double-reallocation issue back when the packet 
destination did not change (i.e., cache will always be empty). I'll try 
to come up with a solution...

>   
>   		err = skb_cow_head(skb, LL_RESERVED_SPACE(cache_dst->dev));
>   		if (unlikely(err))
> diff --git a/net/ipv6/rpl_iptunnel.c b/net/ipv6/rpl_iptunnel.c
> index 9b7d03563115..f3febe4881a5 100644
> --- a/net/ipv6/rpl_iptunnel.c
> +++ b/net/ipv6/rpl_iptunnel.c
> @@ -235,9 +235,12 @@ static int rpl_output(struct net *net, struct sock *sk, struct sk_buff *skb)
>   			goto drop;
>   		}
>   
> -		local_bh_disable();
> -		dst_cache_set_ip6(&rlwt->cache, dst, &fl6.saddr);
> -		local_bh_enable();
> +		/* cache only if we don't create a dst refrence loop */

s/refrence/reference

Same comment as above.

> +		if (orig_dst->lwtstate != dst->lwtstate) {
> +			local_bh_disable();
> +			dst_cache_set_ip6(&rlwt->cache, dst, &fl6.saddr);
> +			local_bh_enable();
> +		}
>   
>   		err = skb_cow_head(skb, LL_RESERVED_SPACE(dst->dev));
>   		if (unlikely(err))
> diff --git a/net/ipv6/seg6_iptunnel.c b/net/ipv6/seg6_iptunnel.c
> index eacc4e91b48e..0da989f07376 100644
> --- a/net/ipv6/seg6_iptunnel.c
> +++ b/net/ipv6/seg6_iptunnel.c
> @@ -576,9 +576,12 @@ static int seg6_output_core(struct net *net, struct sock *sk,
>   			goto drop;
>   		}
>   
> -		local_bh_disable();
> -		dst_cache_set_ip6(&slwt->cache, dst, &fl6.saddr);
> -		local_bh_enable();
> +		/* cache only if we don't create a dst refrence loop */

s/refrence/reference

Same comment as above.

> +		if (orig_dst->lwtstate != dst->lwtstate) {
> +			local_bh_disable();
> +			dst_cache_set_ip6(&slwt->cache, dst, &fl6.saddr);
> +			local_bh_enable();
> +		}
>   
>   		err = skb_cow_head(skb, LL_RESERVED_SPACE(dst->dev));
>   		if (unlikely(err))
Jakub Kicinski Jan. 29, 2025, 8:14 p.m. UTC | #2
On Wed, 29 Jan 2025 17:50:14 +0100 Justin Iurman wrote:
> > +		if (dst->lwtstate != cache_dst->lwtstate) {
> > +			local_bh_disable();
> > +			dst_cache_set_ip6(&ilwt->cache, cache_dst, &fl6.saddr);
> > +			local_bh_enable();
> > +		}  
> 
> I agree the above patch fixes what kmemleak reported. However, I think 
> it'd bring the double-reallocation issue back when the packet 
> destination did not change (i.e., cache will always be empty). I'll try 
> to come up with a solution...

True, dunno enough about use cases so I may be missing the point.
But the naive solution would be to remember that the tunnel "doesn't
re-route" and use dst directly, instead of cache_dst?
Justin Iurman Jan. 30, 2025, 12:24 a.m. UTC | #3
On 1/29/25 21:14, Jakub Kicinski wrote:
> On Wed, 29 Jan 2025 17:50:14 +0100 Justin Iurman wrote:
>>> +		if (dst->lwtstate != cache_dst->lwtstate) {
>>> +			local_bh_disable();
>>> +			dst_cache_set_ip6(&ilwt->cache, cache_dst, &fl6.saddr);
>>> +			local_bh_enable();
>>> +		}
>>
>> I agree the above patch fixes what kmemleak reported. However, I think
>> it'd bring the double-reallocation issue back when the packet
>> destination did not change (i.e., cache will always be empty). I'll try
>> to come up with a solution...
> 
> True, dunno enough about use cases so I may be missing the point.

Possible use cases: (i) with inline mode, or (ii) with encap mode using 
the same destination address. For (ii), the egress node of the IOAM 
domain also happens to be the actual destination of the packet, but it's 
not "your" packet... so you use a tunnel to stay compliant with RFC8200.

> But the naive solution would be to remember that the tunnel "doesn't
> re-route" and use dst directly, instead of cache_dst?

Correct, that'd work.
diff mbox series

Patch

diff --git a/net/ipv6/ioam6_iptunnel.c b/net/ipv6/ioam6_iptunnel.c
index 3936c137a572..0279f1327ad5 100644
--- a/net/ipv6/ioam6_iptunnel.c
+++ b/net/ipv6/ioam6_iptunnel.c
@@ -410,9 +410,12 @@  static int ioam6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
 			goto drop;
 		}
 
-		local_bh_disable();
-		dst_cache_set_ip6(&ilwt->cache, cache_dst, &fl6.saddr);
-		local_bh_enable();
+		/* cache only if we don't create a dst refrence loop */
+		if (dst->lwtstate != cache_dst->lwtstate) {
+			local_bh_disable();
+			dst_cache_set_ip6(&ilwt->cache, cache_dst, &fl6.saddr);
+			local_bh_enable();
+		}
 
 		err = skb_cow_head(skb, LL_RESERVED_SPACE(cache_dst->dev));
 		if (unlikely(err))
diff --git a/net/ipv6/rpl_iptunnel.c b/net/ipv6/rpl_iptunnel.c
index 9b7d03563115..f3febe4881a5 100644
--- a/net/ipv6/rpl_iptunnel.c
+++ b/net/ipv6/rpl_iptunnel.c
@@ -235,9 +235,12 @@  static int rpl_output(struct net *net, struct sock *sk, struct sk_buff *skb)
 			goto drop;
 		}
 
-		local_bh_disable();
-		dst_cache_set_ip6(&rlwt->cache, dst, &fl6.saddr);
-		local_bh_enable();
+		/* cache only if we don't create a dst refrence loop */
+		if (orig_dst->lwtstate != dst->lwtstate) {
+			local_bh_disable();
+			dst_cache_set_ip6(&rlwt->cache, dst, &fl6.saddr);
+			local_bh_enable();
+		}
 
 		err = skb_cow_head(skb, LL_RESERVED_SPACE(dst->dev));
 		if (unlikely(err))
diff --git a/net/ipv6/seg6_iptunnel.c b/net/ipv6/seg6_iptunnel.c
index eacc4e91b48e..0da989f07376 100644
--- a/net/ipv6/seg6_iptunnel.c
+++ b/net/ipv6/seg6_iptunnel.c
@@ -576,9 +576,12 @@  static int seg6_output_core(struct net *net, struct sock *sk,
 			goto drop;
 		}
 
-		local_bh_disable();
-		dst_cache_set_ip6(&slwt->cache, dst, &fl6.saddr);
-		local_bh_enable();
+		/* cache only if we don't create a dst refrence loop */
+		if (orig_dst->lwtstate != dst->lwtstate) {
+			local_bh_disable();
+			dst_cache_set_ip6(&slwt->cache, dst, &fl6.saddr);
+			local_bh_enable();
+		}
 
 		err = skb_cow_head(skb, LL_RESERVED_SPACE(dst->dev));
 		if (unlikely(err))