[net] net: do not delay dst_entries_add() in dst_release()

Message ID 20241008143110.1064899-1-edumazet@google.com (mailing list archive)
State Accepted
Commit ac888d58869bb99753e7652be19a151df9ecb35d
Delegated to: Netdev Maintainers
Series [net] net: do not delay dst_entries_add() in dst_release()

Checks

Context                         Check    Description
netdev/series_format            success  Single patches do not need cover letters
netdev/tree_selection           success  Clearly marked for net
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 6 this patch: 6
netdev/build_tools              success  No tools touched, skip
netdev/cc_maintainers           warning  1 maintainers not CCed: bigeasy@linutronix.de
netdev/build_clang              success  Errors and warnings before: 6 this patch: 6
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 5 this patch: 5
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 38 lines checked
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0
netdev/contest                  success  net-next-2024-10-10--09-00 (tests: 775)

Commit Message

Eric Dumazet Oct. 8, 2024, 2:31 p.m. UTC
dst_entries_add() uses per-cpu data that might be freed at netns
dismantle, when ip6_route_net_exit() calls dst_entries_destroy().

Before ip6_route_net_exit() can be called, we release all
the dsts associated with this netns via calls to dst_release(),
which waits for an RCU grace period before calling dst_destroy().

The dst_entries_add() call in dst_destroy() is racy, because
dst_entries_destroy() could have been called already.

Decrementing the number of dsts must happen sooner.
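
The ordering problem can be sketched in plain user-space C. This is a
single-threaded model, not kernel code: every model_* name is hypothetical,
a bool flag stands in for the freed per-cpu counter, and a pending flag
stands in for the queued RCU callback. It only shows why moving the
decrement from the deferred destroy step to the release step avoids
touching the counter after netns dismantle.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-cpu entry counter owned by dst_ops. */
struct model_ops {
	long entries;
	bool counter_freed;	/* set when the model "netns exit" has run */
	int late_accesses;	/* counter updates attempted after the free */
};

struct model_dst {
	struct model_ops *ops;
	bool nocount;		/* models DST_NOCOUNT */
	int refcnt;
	bool pending_destroy;	/* models a queued RCU callback */
};

static void model_entries_add(struct model_ops *ops, long val)
{
	if (ops->counter_freed) {
		/* In the kernel this would be a use-after-free. */
		ops->late_accesses++;
		return;
	}
	ops->entries += val;
}

static void model_count_dec(struct model_dst *dst)
{
	if (!dst->nocount)
		model_entries_add(dst->ops, -1);
}

/* Deferred part: runs only after the model "grace period". */
static void model_destroy(struct model_dst *dst, bool dec_in_destroy)
{
	if (dec_in_destroy)
		model_count_dec(dst);	/* ordering before the patch */
	dst->pending_destroy = false;
}

static void model_release(struct model_dst *dst, bool dec_in_destroy)
{
	if (--dst->refcnt == 0) {
		if (!dec_in_destroy)
			model_count_dec(dst);	/* ordering after the patch */
		dst->pending_destroy = true;
	}
}

/* Returns how many times the counter was touched after being freed. */
int run_scenario(bool dec_in_destroy)
{
	struct model_ops ops = { .entries = 1 };
	struct model_dst dst = { .ops = &ops, .refcnt = 1 };

	model_release(&dst, dec_in_destroy);	/* last reference dropped */
	ops.counter_freed = true;		/* netns exit frees the counter */
	if (dst.pending_destroy)
		model_destroy(&dst, dec_in_destroy);	/* grace period ends */
	return ops.late_accesses;
}
```

In this model run_scenario(true), the old ordering, reports one late
access, while run_scenario(false), the new ordering, reports none.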

Notes:

1) In the CONFIG_XFRM case, dst_destroy() can call
   dst_release_immediate(child), which might also cause a UAF
   if the child does not have DST_NOCOUNT set.
   IPSEC maintainers might take a look and see how to address this.

2) There is also discussion about removing this count of dst,
   which might happen in future kernels.
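
Note 1 can be illustrated with a similar user-space sketch. It assumes,
as the note describes, that the child is released from inside the deferred
destroy step, so its count is dropped only after the grace period. All
sketch_* names and late_child_accesses() are hypothetical, and the model
ignores everything else dst_destroy() does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-cpu entry counter. */
struct sketch_counter {
	bool freed;
	int late_accesses;	/* updates attempted after the free */
};

struct sketch_dst {
	struct sketch_counter *counter;
	struct sketch_dst *child;	/* models the xfrm child dst */
	bool nocount;			/* models DST_NOCOUNT */
	int refcnt;
};

static void sketch_count_dec(struct sketch_dst *dst)
{
	if (dst->nocount)
		return;
	if (dst->counter->freed)
		dst->counter->late_accesses++;
}

static void sketch_destroy(struct sketch_dst *dst);

/* Models the immediate release path: count drop, then destroy. */
static void sketch_release_immediate(struct sketch_dst *dst)
{
	if (dst && --dst->refcnt == 0) {
		sketch_count_dec(dst);
		sketch_destroy(dst);
	}
}

/* Models the destroy step releasing its child, if any. */
static void sketch_destroy(struct sketch_dst *dst)
{
	sketch_release_immediate(dst->child);
}

int late_child_accesses(bool child_nocount)
{
	struct sketch_counter counter = { 0 };
	struct sketch_dst child = {
		.counter = &counter, .nocount = child_nocount, .refcnt = 1,
	};
	struct sketch_dst parent = {
		.counter = &counter, .child = &child, .refcnt = 1,
	};

	/* Parent released: its count is dropped while the counter is alive. */
	parent.refcnt--;
	sketch_count_dec(&parent);

	counter.freed = true;		/* netns dismantle */
	sketch_destroy(&parent);	/* deferred destroy releases the child */
	return counter.late_accesses;
}
```

Here late_child_accesses(false) reports one late access for a counted
child, and late_child_accesses(true) reports none when the child is
marked as not counted.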

Fixes: f88649721268 ("ipv4: fix dst race in sk_dst_get()")
Closes: https://lore.kernel.org/lkml/CANn89iLCCGsP7SFn9HKpvnKu96Td4KD08xf7aGtiYgZnkjaL=w@mail.gmail.com/T/
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Xin Long <lucien.xin@gmail.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/core/dst.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

Comments

Xin Long Oct. 8, 2024, 9:59 p.m. UTC | #1
On Tue, Oct 8, 2024 at 10:31 AM Eric Dumazet <edumazet@google.com> wrote:
>
> dst_entries_add() uses per-cpu data that might be freed at netns
> dismantle from ip6_route_net_exit() calling dst_entries_destroy()
>
> Before ip6_route_net_exit() can be called, we release all
> the dsts associated with this netns, via calls to dst_release(),
> which waits an rcu grace period before calling dst_destroy()
>
> dst_entries_add() use in dst_destroy() is racy, because
> dst_entries_destroy() could have been called already.
>
> Decrementing the number of dsts must happen sooner.
>
> Notes:
>
> 1) in CONFIG_XFRM case, dst_destroy() can call
>    dst_release_immediate(child), this might also cause UAF
>    if the child does not have DST_NOCOUNT set.
>    IPSEC maintainers might take a look and see how to address this.
>
> 2) There is also discussion about removing this count of dst,
>    which might happen in future kernels.
>
> Fixes: f88649721268 ("ipv4: fix dst race in sk_dst_get()")
> Closes: https://lore.kernel.org/lkml/CANn89iLCCGsP7SFn9HKpvnKu96Td4KD08xf7aGtiYgZnkjaL=w@mail.gmail.com/T/
> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
> Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Xin Long <lucien.xin@gmail.com>
> Cc: Steffen Klassert <steffen.klassert@secunet.com>

Reviewed-by: Xin Long <lucien.xin@gmail.com>

patchwork-bot+netdevbpf@kernel.org Oct. 10, 2024, 9:40 a.m. UTC | #2
Hello:

This patch was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Tue,  8 Oct 2024 14:31:10 +0000 you wrote:
> dst_entries_add() uses per-cpu data that might be freed at netns
> dismantle from ip6_route_net_exit() calling dst_entries_destroy()
> 
> Before ip6_route_net_exit() can be called, we release all
> the dsts associated with this netns, via calls to dst_release(),
> which waits an rcu grace period before calling dst_destroy()
> 
> [...]

Here is the summary with links:
  - [net] net: do not delay dst_entries_add() in dst_release()
    https://git.kernel.org/netdev/net/c/ac888d58869b

You are awesome, thank you!

Patch

diff --git a/net/core/dst.c b/net/core/dst.c
index 95f533844f17f119c09f335ccf9bf09515dd3606..9552a90d4772dce49b5fe94d2f1d8da6979d9908 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -109,9 +109,6 @@  static void dst_destroy(struct dst_entry *dst)
 		child = xdst->child;
 	}
 #endif
-	if (!(dst->flags & DST_NOCOUNT))
-		dst_entries_add(dst->ops, -1);
-
 	if (dst->ops->destroy)
 		dst->ops->destroy(dst);
 	netdev_put(dst->dev, &dst->dev_tracker);
@@ -159,17 +156,27 @@  void dst_dev_put(struct dst_entry *dst)
 }
 EXPORT_SYMBOL(dst_dev_put);
 
+static void dst_count_dec(struct dst_entry *dst)
+{
+	if (!(dst->flags & DST_NOCOUNT))
+		dst_entries_add(dst->ops, -1);
+}
+
 void dst_release(struct dst_entry *dst)
 {
-	if (dst && rcuref_put(&dst->__rcuref))
+	if (dst && rcuref_put(&dst->__rcuref)) {
+		dst_count_dec(dst);
 		call_rcu_hurry(&dst->rcu_head, dst_destroy_rcu);
+	}
 }
 EXPORT_SYMBOL(dst_release);
 
 void dst_release_immediate(struct dst_entry *dst)
 {
-	if (dst && rcuref_put(&dst->__rcuref))
+	if (dst && rcuref_put(&dst->__rcuref)) {
+		dst_count_dec(dst);
 		dst_destroy(dst);
+	}
 }
 EXPORT_SYMBOL(dst_release_immediate);