[v2,net] ipv6: Fix infinite recursion in fib6_dump_done().

Message ID 20240401211003.25274-1-kuniyu@amazon.com (mailing list archive)
State Accepted
Commit d21d40605bca7bd5fc23ef03d4c1ca1f48bc2cae
Delegated to: Netdev Maintainers
Series [v2,net] ipv6: Fix infinite recursion in fib6_dump_done().

Checks

Context                         Check    Description
netdev/series_format            success  Single patches do not need cover letters
netdev/tree_selection           success  Clearly marked for net
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 944 this patch: 944
netdev/build_tools              success  No tools touched, skip
netdev/cc_maintainers           success  CCed 5 of 5 maintainers
netdev/build_clang              success  Errors and warnings before: 954 this patch: 954
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 955 this patch: 955
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 26 lines checked
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Kuniyuki Iwashima April 1, 2024, 9:10 p.m. UTC
syzkaller reported infinite recursive calls of fib6_dump_done() during
netlink socket destruction.  [1]

From the log, syzkaller sent an AF_UNSPEC RTM_GETROUTE message, and then
the response was generated.  The following recvmmsg() resumed the dump
for IPv6, but the first call of inet6_dump_fib() failed at kzalloc() due
to the fault injection.  [0]

  12:01:34 executing program 3:
  r0 = socket$nl_route(0x10, 0x3, 0x0)
  sendmsg$nl_route(r0, ... snip ...)
  recvmmsg(r0, ... snip ...) (fail_nth: 8)

Here, fib6_dump_done() was set to nlk_sk(sk)->cb.done, and the next call
of inet6_dump_fib() set it to nlk_sk(sk)->cb.args[3].  syzkaller stopped
receiving the response halfway through, and finally netlink_sock_destruct()
called nlk_sk(sk)->cb.done().

fib6_dump_done() calls fib6_dump_end() and then nlk_sk(sk)->cb.done() if it
is still not NULL.  fib6_dump_end() restores nlk_sk(sk)->cb.done from
nlk_sk(sk)->cb.args[3], but args[3] now holds fib6_dump_done() itself rather
than NULL, so the function calls itself recursively until it hits the stack
guard page.

To avoid the issue, let's set the destructor after kzalloc().
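
The sequence described above can be reproduced in a small user-space model.
This is only a sketch under stated assumptions: struct model_cb, every
model_* function, and the MODEL_* constants are hypothetical stand-ins, not
kernel code.  intptr_t replaces the kernel's long for portability, and an
artificial depth limit replaces the stack guard page so that the model
terminates.

```c
#include <stdint.h>
#include <stdlib.h>

#define MODEL_ENOMEM    12  /* stand-in for the kernel's ENOMEM */
#define MODEL_MAX_DEPTH 64  /* stand-in for the stack guard page */

struct model_cb;
typedef int (*model_done_fn)(struct model_cb *cb);

/* Simplified stand-in for struct netlink_callback (hypothetical layout). */
struct model_cb {
	model_done_fn done;
	intptr_t args[4];   /* args[2]: walker, args[3]: saved destructor */
};

static int model_depth;     /* how deep model_dump_done() recursed */

/* Models fib6_dump_end(): free the walker, restore the saved destructor. */
static void model_dump_end(struct model_cb *cb)
{
	free((void *)cb->args[2]);
	cb->args[2] = 0;
	cb->done = (model_done_fn)cb->args[3];
}

/* Models fib6_dump_done(): clean up, then chain to the restored destructor. */
static int model_dump_done(struct model_cb *cb)
{
	if (++model_depth >= MODEL_MAX_DEPTH)
		return -1;      /* the kernel has no such limit */
	model_dump_end(cb);
	return cb->done ? cb->done(cb) : 0;
}

/* Models the "new dump" branch of inet6_dump_fib(). */
static int model_dump_fib(struct model_cb *cb, int fail_alloc, int fixed)
{
	void *w;

	if (cb->args[2])
		return 0;       /* walker already set up */
	if (!fixed) {           /* old order: hook before allocating */
		cb->args[3] = (intptr_t)cb->done;
		cb->done = model_dump_done;
	}
	w = fail_alloc ? NULL : calloc(1, 16);
	if (!w)
		return -MODEL_ENOMEM;
	cb->args[2] = (intptr_t)w;
	if (fixed) {            /* new order: hook after allocating */
		cb->args[3] = (intptr_t)cb->done;
		cb->done = model_dump_done;
	}
	return 0;
}

/* Failed call, successful call, then socket destruction; returns the depth. */
static int model_destroy_depth(int fixed)
{
	struct model_cb cb = { NULL, { 0 } };

	model_depth = 0;
	model_dump_fib(&cb, 1, fixed);  /* first call: allocation fails */
	model_dump_fib(&cb, 0, fixed);  /* second call: allocation succeeds */
	if (cb.done)
		cb.done(&cb);           /* models netlink_sock_destruct() */
	return model_depth;
}
```

In this model, model_destroy_depth(0) (old ordering) recurses until it reaches
MODEL_MAX_DEPTH, because the second call saves model_dump_done into args[3].
model_destroy_depth(1) (destructor hooked after the allocation) returns 1.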

[0]:
FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 0
CPU: 1 PID: 432110 Comm: syz-executor.3 Not tainted 6.8.0-12821-g537c2e91d354-dirty #11
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl (lib/dump_stack.c:117)
 should_fail_ex (lib/fault-inject.c:52 lib/fault-inject.c:153)
 should_failslab (mm/slub.c:3733)
 kmalloc_trace (mm/slub.c:3748 mm/slub.c:3827 mm/slub.c:3992)
 inet6_dump_fib (./include/linux/slab.h:628 ./include/linux/slab.h:749 net/ipv6/ip6_fib.c:662)
 rtnl_dump_all (net/core/rtnetlink.c:4029)
 netlink_dump (net/netlink/af_netlink.c:2269)
 netlink_recvmsg (net/netlink/af_netlink.c:1988)
 ____sys_recvmsg (net/socket.c:1046 net/socket.c:2801)
 ___sys_recvmsg (net/socket.c:2846)
 do_recvmmsg (net/socket.c:2943)
 __x64_sys_recvmmsg (net/socket.c:3041 net/socket.c:3034 net/socket.c:3034)

[1]:
BUG: TASK stack guard page was hit at 00000000f2fa9af1 (stack is 00000000b7912430..000000009a436beb)
stack guard page: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 223719 Comm: kworker/1:3 Not tainted 6.8.0-12821-g537c2e91d354-dirty #11
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Workqueue: events netlink_sock_destruct_work
RIP: 0010:fib6_dump_done (net/ipv6/ip6_fib.c:570)
Code: 3c 24 e8 f3 e9 51 fd e9 28 fd ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 41 57 41 56 41 55 41 54 55 48 89 fd <53> 48 8d 5d 60 e8 b6 4d 07 fd 48 89 da 48 b8 00 00 00 00 00 fc ff
RSP: 0018:ffffc9000d980000 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffffffff84405990 RCX: ffffffff844059d3
RDX: ffff8881028e0000 RSI: ffffffff84405ac2 RDI: ffff88810c02f358
RBP: ffff88810c02f358 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000224 R12: 0000000000000000
R13: ffff888007c82c78 R14: ffff888007c82c68 R15: ffff888007c82c68
FS:  0000000000000000(0000) GS:ffff88811b100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffc9000d97fff8 CR3: 0000000102309002 CR4: 0000000000770ef0
PKRU: 55555554
Call Trace:
 <#DF>
 </#DF>
 <TASK>
 fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1))
 fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1))
 ...
 fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1))
 fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1))
 netlink_sock_destruct (net/netlink/af_netlink.c:401)
 __sk_destruct (net/core/sock.c:2177 (discriminator 2))
 sk_destruct (net/core/sock.c:2224)
 __sk_free (net/core/sock.c:2235)
 sk_free (net/core/sock.c:2246)
 process_one_work (kernel/workqueue.c:3259)
 worker_thread (kernel/workqueue.c:3329 kernel/workqueue.c:3416)
 kthread (kernel/kthread.c:388)
 ret_from_fork (arch/x86/kernel/process.c:153)
 ret_from_fork_asm (arch/x86/entry/entry_64.S:256)
Modules linked in:

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
Conflict with:
https://lore.kernel.org/netdev/20240329183053.644630-1-edumazet@google.com/

Changes:
  v2: Removed the garbage in the head of description
  v1: https://lore.kernel.org/netdev/20240401205020.22723-1-kuniyu@amazon.com/
---
 net/ipv6/ip6_fib.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

Comments

Eric Dumazet April 2, 2024, 9:43 a.m. UTC | #1
On Mon, Apr 1, 2024 at 11:10 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> syzkaller reported infinite recursive calls of fib6_dump_done() during
> netlink socket destruction.  [1]
>
> From the log, syzkaller sent an AF_UNSPEC RTM_GETROUTE message, and then
> the response was generated.  The following recvmmsg() resumed the dump
> for IPv6, but the first call of inet6_dump_fib() failed at kzalloc() due
> to the fault injection.  [0]
>
>   12:01:34 executing program 3:
>   r0 = socket$nl_route(0x10, 0x3, 0x0)
>   sendmsg$nl_route(r0, ... snip ...)
>   recvmmsg(r0, ... snip ...) (fail_nth: 8)
>
> Here, fib6_dump_done() was set to nlk_sk(sk)->cb.done, and the next call
> of inet6_dump_fib() set it to nlk_sk(sk)->cb.args[3].  syzkaller stopped
> receiving the response halfway through, and finally netlink_sock_destruct()
> called nlk_sk(sk)->cb.done().

It was not clear to me why we call inet6_dump_fib() a second time
after the first call returned -ENOMEM.

It seems to be caused by rtnl_dump_all(), if the skb had info from
IPv4 (skb->len != 0).
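
A rough user-space sketch of how that could happen, assuming the multi-family
dump loop reports the bytes already queued in the skb instead of the error
whenever the skb is non-empty, and remembers the failing family so the next
receive resumes there.  All names below (model_state, model_dump_all, the
per-family helpers) are hypothetical, and this is not the code of
rtnl_dump_all().

```c
#define MODEL_NFAMILIES 2   /* index 0 models IPv4, index 1 models IPv6 */

/* Hypothetical dump state carried across receive calls. */
struct model_state {
	int family;         /* family index to resume from */
	int ipv6_calls;     /* how often the IPv6 handler was entered */
};

typedef int (*model_dumpit)(int *skb_len, struct model_state *st);

/* IPv4 handler: always succeeds and puts 100 bytes into the buffer. */
static int model_dump_ipv4(int *skb_len, struct model_state *st)
{
	(void)st;
	*skb_len += 100;
	return 0;
}

/* IPv6 handler: fails on its first call (models the injected kzalloc()
 * failure, -12 standing in for -ENOMEM), succeeds afterwards. */
static int model_dump_ipv6(int *skb_len, struct model_state *st)
{
	(void)skb_len;
	return ++st->ipv6_calls == 1 ? -12 : 0;
}

/* Walk the families from the resume point; stop at the first error. */
static int model_dump_all(int *skb_len, struct model_state *st)
{
	static const model_dumpit families[MODEL_NFAMILIES] = {
		model_dump_ipv4, model_dump_ipv6,
	};
	int idx, ret = 0;

	for (idx = st->family; idx < MODEL_NFAMILIES; idx++) {
		ret = families[idx](skb_len, st);
		if (ret)
			break;
	}
	st->family = idx;   /* the failing family is retried next time */

	/* A non-empty buffer hides the error from the caller. */
	return *skb_len ? *skb_len : ret;
}
```

Under these assumptions, the first model_dump_all() call returns 100 although
the IPv6 handler failed, and a second call with an empty buffer enters the
IPv6 handler again.  Starting directly at the IPv6 family with an empty buffer
would return the error instead.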

"ip -6 ro" alone is not triggering the bug.

>
> fib6_dump_done() calls fib6_dump_end() and nlk_sk(sk)->cb.done() if it
> is still not NULL.  fib6_dump_end() rewrites nlk_sk(sk)->cb.done() by
> nlk_sk(sk)->cb.args[3], but it has the same function, not NULL, calling
> itself recursively and hitting the stack guard page.
>
> To avoid the issue, let's set the destructor after kzalloc().
>
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Reported-by: syzkaller <syzkaller@googlegroups.com>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>

Reviewed-by: Eric Dumazet <edumazet@google.com>

David Ahern April 2, 2024, 2:46 p.m. UTC | #2
On 4/1/24 3:10 PM, Kuniyuki Iwashima wrote:
> syzkaller reported infinite recursive calls of fib6_dump_done() during
> netlink socket destruction.  [1]
> 
> From the log, syzkaller sent an AF_UNSPEC RTM_GETROUTE message, and then
> the response was generated.  The following recvmmsg() resumed the dump
> for IPv6, but the first call of inet6_dump_fib() failed at kzalloc() due
> to the fault injection.  [0]
> 
>   12:01:34 executing program 3:
>   r0 = socket$nl_route(0x10, 0x3, 0x0)
>   sendmsg$nl_route(r0, ... snip ...)
>   recvmmsg(r0, ... snip ...) (fail_nth: 8)
> 
> Here, fib6_dump_done() was set to nlk_sk(sk)->cb.done, and the next call
> of inet6_dump_fib() set it to nlk_sk(sk)->cb.args[3].  syzkaller stopped
> receiving the response halfway through, and finally netlink_sock_destruct()
> called nlk_sk(sk)->cb.done().
> 
> fib6_dump_done() calls fib6_dump_end() and nlk_sk(sk)->cb.done() if it
> is still not NULL.  fib6_dump_end() rewrites nlk_sk(sk)->cb.done() by
> nlk_sk(sk)->cb.args[3], but it has the same function, not NULL, calling
> itself recursively and hitting the stack guard page.
> 
> To avoid the issue, let's set the destructor after kzalloc().
> 

...

> Modules linked in:
> 
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Reported-by: syzkaller <syzkaller@googlegroups.com>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> ---
> Conflict with:
> https://lore.kernel.org/netdev/20240329183053.644630-1-edumazet@google.com/
> 
> Changes:
>   v2: Removed the garbage in the head of description
>   v1: https://lore.kernel.org/netdev/20240401205020.22723-1-kuniyu@amazon.com/
> ---
>  net/ipv6/ip6_fib.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>

patchwork-bot+netdevbpf@kernel.org April 3, 2024, 2:20 a.m. UTC | #3
Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Mon, 1 Apr 2024 14:10:04 -0700 you wrote:
> syzkaller reported infinite recursive calls of fib6_dump_done() during
> netlink socket destruction.  [1]
> 
> From the log, syzkaller sent an AF_UNSPEC RTM_GETROUTE message, and then
> the response was generated.  The following recvmmsg() resumed the dump
> for IPv6, but the first call of inet6_dump_fib() failed at kzalloc() due
> to the fault injection.  [0]
> 
> [...]

Here is the summary with links:
  - [v2,net] ipv6: Fix infinite recursion in fib6_dump_done().
    https://git.kernel.org/netdev/net/c/d21d40605bca

You are awesome, thank you!

Patch

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 5c558dc1c683..7209419cfb0e 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -651,19 +651,19 @@  static int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
 	if (!w) {
 		/* New dump:
 		 *
-		 * 1. hook callback destructor.
-		 */
-		cb->args[3] = (long)cb->done;
-		cb->done = fib6_dump_done;
-
-		/*
-		 * 2. allocate and initialize walker.
+		 * 1. allocate and initialize walker.
 		 */
 		w = kzalloc(sizeof(*w), GFP_ATOMIC);
 		if (!w)
 			return -ENOMEM;
 		w->func = fib6_dump_node;
 		cb->args[2] = (long)w;
+
+		/* 2. hook callback destructor.
+		 */
+		cb->args[3] = (long)cb->done;
+		cb->done = fib6_dump_done;
+
 	}
 
 	arg.skb = skb;