Message ID | 20250129021346.2333089-2-kuba@kernel.org (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net,1/2] net: ipv6: fix dst refleaks in rpl, seg6 and ioam6 lwtunnels | expand |
On 1/29/25 03:13, Jakub Kicinski wrote: > Some lwtunnels have a dst cache for post-transformation dst. > If the packet destination did not change we may end up recording > a reference to the lwtunnel in its own cache, and the lwtunnel > state will never be freed. > > Discovered by the ioam6.sh test, kmemleak was recently fixed > to catch per-cpu memory leaks. I'm not sure if rpl and seg6 > can actually hit this, but in principle I don't see why not. Agree, both can theoretically hit this too. > Fixes: 985ec6f5e623 ("net: ipv6: rpl_iptunnel: mitigate 2-realloc issue") > Fixes: 40475b63761a ("net: ipv6: seg6_iptunnel: mitigate 2-realloc issue") > Fixes: dce525185bc9 ("net: ipv6: ioam6_iptunnel: mitigate 2-realloc issue") > Signed-off-by: Jakub Kicinski <kuba@kernel.org> > --- > CC: dsahern@kernel.org > CC: justin.iurman@uliege.be > --- > net/ipv6/ioam6_iptunnel.c | 9 ++++++--- > net/ipv6/rpl_iptunnel.c | 9 ++++++--- > net/ipv6/seg6_iptunnel.c | 9 ++++++--- > 3 files changed, 18 insertions(+), 9 deletions(-) > > diff --git a/net/ipv6/ioam6_iptunnel.c b/net/ipv6/ioam6_iptunnel.c > index 3936c137a572..0279f1327ad5 100644 > --- a/net/ipv6/ioam6_iptunnel.c > +++ b/net/ipv6/ioam6_iptunnel.c > @@ -410,9 +410,12 @@ static int ioam6_output(struct net *net, struct sock *sk, struct sk_buff *skb) > goto drop; > } > > - local_bh_disable(); > - dst_cache_set_ip6(&ilwt->cache, cache_dst, &fl6.saddr); > - local_bh_enable(); > + /* cache only if we don't create a dst refrence loop */ s/refrence/reference > + if (dst->lwtstate != cache_dst->lwtstate) { > + local_bh_disable(); > + dst_cache_set_ip6(&ilwt->cache, cache_dst, &fl6.saddr); > + local_bh_enable(); > + } I agree the above patch fixes what kmemleak reported. However, I think it'd bring the double-reallocation issue back when the packet destination did not change (i.e., cache will always be empty). I'll try to come up with a solution... > > err = skb_cow_head(skb, LL_RESERVED_SPACE(cache_dst->dev)); > if (unlikely(err)) > diff --git a/net/ipv6/rpl_iptunnel.c b/net/ipv6/rpl_iptunnel.c > index 9b7d03563115..f3febe4881a5 100644 > --- a/net/ipv6/rpl_iptunnel.c > +++ b/net/ipv6/rpl_iptunnel.c > @@ -235,9 +235,12 @@ static int rpl_output(struct net *net, struct sock *sk, struct sk_buff *skb) > goto drop; > } > > - local_bh_disable(); > - dst_cache_set_ip6(&rlwt->cache, dst, &fl6.saddr); > - local_bh_enable(); > + /* cache only if we don't create a dst refrence loop */ s/refrence/reference Same comment as above. > + if (orig_dst->lwtstate != dst->lwtstate) { > + local_bh_disable(); > + dst_cache_set_ip6(&rlwt->cache, dst, &fl6.saddr); > + local_bh_enable(); > + } > > err = skb_cow_head(skb, LL_RESERVED_SPACE(dst->dev)); > if (unlikely(err)) > diff --git a/net/ipv6/seg6_iptunnel.c b/net/ipv6/seg6_iptunnel.c > index eacc4e91b48e..0da989f07376 100644 > --- a/net/ipv6/seg6_iptunnel.c > +++ b/net/ipv6/seg6_iptunnel.c > @@ -576,9 +576,12 @@ static int seg6_output_core(struct net *net, struct sock *sk, > goto drop; > } > > - local_bh_disable(); > - dst_cache_set_ip6(&slwt->cache, dst, &fl6.saddr); > - local_bh_enable(); > + /* cache only if we don't create a dst refrence loop */ s/refrence/reference Same comment as above. > + if (orig_dst->lwtstate != dst->lwtstate) { > + local_bh_disable(); > + dst_cache_set_ip6(&slwt->cache, dst, &fl6.saddr); > + local_bh_enable(); > + } > > err = skb_cow_head(skb, LL_RESERVED_SPACE(dst->dev)); > if (unlikely(err))
On Wed, 29 Jan 2025 17:50:14 +0100 Justin Iurman wrote: > > + if (dst->lwtstate != cache_dst->lwtstate) { > > + local_bh_disable(); > > + dst_cache_set_ip6(&ilwt->cache, cache_dst, &fl6.saddr); > > + local_bh_enable(); > > + } > > I agree the above patch fixes what kmemleak reported. However, I think > it'd bring the double-reallocation issue back when the packet > destination did not change (i.e., cache will always be empty). I'll try > to come up with a solution... True, dunno enough about use cases so I may be missing the point. But the naive solution would be to remember that the tunnel "doesn't re-route" and use dst directly, instead of cache_dst?
On 1/29/25 21:14, Jakub Kicinski wrote: > On Wed, 29 Jan 2025 17:50:14 +0100 Justin Iurman wrote: >>> + if (dst->lwtstate != cache_dst->lwtstate) { >>> + local_bh_disable(); >>> + dst_cache_set_ip6(&ilwt->cache, cache_dst, &fl6.saddr); >>> + local_bh_enable(); >>> + } >> >> I agree the above patch fixes what kmemleak reported. However, I think >> it'd bring the double-reallocation issue back when the packet >> destination did not change (i.e., cache will always be empty). I'll try >> to come up with a solution... > > True, dunno enough about use cases so I may be missing the point. Possible use cases: (i) with inline mode, or (ii) with encap mode using the same destination address. For (ii), the egress node of the IOAM domain also happens to be the actual destination of the packet, but it's not "your" packet... so you use a tunnel to stay compliant with RFC8200. > But the naive solution would be to remember that the tunnel "doesn't > re-route" and use dst directly, instead of cache_dst? Correct, that'd work.
diff --git a/net/ipv6/ioam6_iptunnel.c b/net/ipv6/ioam6_iptunnel.c index 3936c137a572..0279f1327ad5 100644 --- a/net/ipv6/ioam6_iptunnel.c +++ b/net/ipv6/ioam6_iptunnel.c @@ -410,9 +410,12 @@ static int ioam6_output(struct net *net, struct sock *sk, struct sk_buff *skb) goto drop; } - local_bh_disable(); - dst_cache_set_ip6(&ilwt->cache, cache_dst, &fl6.saddr); - local_bh_enable(); + /* cache only if we don't create a dst refrence loop */ + if (dst->lwtstate != cache_dst->lwtstate) { + local_bh_disable(); + dst_cache_set_ip6(&ilwt->cache, cache_dst, &fl6.saddr); + local_bh_enable(); + } err = skb_cow_head(skb, LL_RESERVED_SPACE(cache_dst->dev)); if (unlikely(err)) diff --git a/net/ipv6/rpl_iptunnel.c b/net/ipv6/rpl_iptunnel.c index 9b7d03563115..f3febe4881a5 100644 --- a/net/ipv6/rpl_iptunnel.c +++ b/net/ipv6/rpl_iptunnel.c @@ -235,9 +235,12 @@ static int rpl_output(struct net *net, struct sock *sk, struct sk_buff *skb) goto drop; } - local_bh_disable(); - dst_cache_set_ip6(&rlwt->cache, dst, &fl6.saddr); - local_bh_enable(); + /* cache only if we don't create a dst refrence loop */ + if (orig_dst->lwtstate != dst->lwtstate) { + local_bh_disable(); + dst_cache_set_ip6(&rlwt->cache, dst, &fl6.saddr); + local_bh_enable(); + } err = skb_cow_head(skb, LL_RESERVED_SPACE(dst->dev)); if (unlikely(err)) diff --git a/net/ipv6/seg6_iptunnel.c b/net/ipv6/seg6_iptunnel.c index eacc4e91b48e..0da989f07376 100644 --- a/net/ipv6/seg6_iptunnel.c +++ b/net/ipv6/seg6_iptunnel.c @@ -576,9 +576,12 @@ static int seg6_output_core(struct net *net, struct sock *sk, goto drop; } - local_bh_disable(); - dst_cache_set_ip6(&slwt->cache, dst, &fl6.saddr); - local_bh_enable(); + /* cache only if we don't create a dst refrence loop */ + if (orig_dst->lwtstate != dst->lwtstate) { + local_bh_disable(); + dst_cache_set_ip6(&slwt->cache, dst, &fl6.saddr); + local_bh_enable(); + } err = skb_cow_head(skb, LL_RESERVED_SPACE(dst->dev)); if (unlikely(err))
Some lwtunnels have a dst cache for post-transformation dst. If the packet destination did not change we may end up recording a reference to the lwtunnel in its own cache, and the lwtunnel state will never be freed. Discovered by the ioam6.sh test, kmemleak was recently fixed to catch per-cpu memory leaks. I'm not sure if rpl and seg6 can actually hit this, but in principle I don't see why not. Fixes: 985ec6f5e623 ("net: ipv6: rpl_iptunnel: mitigate 2-realloc issue") Fixes: 40475b63761a ("net: ipv6: seg6_iptunnel: mitigate 2-realloc issue") Fixes: dce525185bc9 ("net: ipv6: ioam6_iptunnel: mitigate 2-realloc issue") Signed-off-by: Jakub Kicinski <kuba@kernel.org> --- CC: dsahern@kernel.org CC: justin.iurman@uliege.be --- net/ipv6/ioam6_iptunnel.c | 9 ++++++--- net/ipv6/rpl_iptunnel.c | 9 ++++++--- net/ipv6/seg6_iptunnel.c | 9 ++++++--- 3 files changed, 18 insertions(+), 9 deletions(-)