diff mbox series

[net] mptcp: fix TCP options overflow.

Message ID 025d9df8cde3c9a557befc47e9bc08fbbe3476e5.1734771049.git.pabeni@redhat.com (mailing list archive)
State Accepted
Commit f7c91576f26a62e4f2dc84cd5e8df9c66083a40d
Delegated to: Matthieu Baerts
Headers show
Series [net] mptcp: fix TCP options overflow. | expand

Checks

Context Check Description
matttbe/build success Build and static analysis OK
matttbe/checkpatch warning total: 0 errors, 3 warnings, 0 checks, 15 lines checked
matttbe/shellcheck success MPTCP selftests files have not been modified
matttbe/KVM_Validation__normal fail Critical: Global Timeout ❌
matttbe/KVM_Validation__debug success Success! ✅
matttbe/KVM_Validation__btf-normal__only_bpftest_all_ success Success! ✅
matttbe/KVM_Validation__btf-debug__only_bpftest_all_ success Success! ✅

Commit Message

Paolo Abeni Dec. 21, 2024, 8:51 a.m. UTC
Syzbot reported the following splat:

Oops: general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 1 UID: 0 PID: 5836 Comm: sshd Not tainted 6.13.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/25/2024
RIP: 0010:_compound_head include/linux/page-flags.h:242 [inline]
RIP: 0010:put_page+0x23/0x260 include/linux/mm.h:1552
Code: 90 90 90 90 90 90 90 55 41 57 41 56 53 49 89 fe 48 bd 00 00 00 00 00 fc ff df e8 f8 5e 12 f8 49 8d 5e 08 48 89 d8 48 c1 e8 03 <80> 3c 28 00 74 08 48 89 df e8 8f c7 78 f8 48 8b 1b 48 89 de 48 83
RSP: 0000:ffffc90003916c90 EFLAGS: 00010202
RAX: 0000000000000001 RBX: 0000000000000008 RCX: ffff888030458000
RDX: 0000000000000100 RSI: 0000000000000000 RDI: 0000000000000000
RBP: dffffc0000000000 R08: ffffffff898ca81d R09: 1ffff110054414ac
R10: dffffc0000000000 R11: ffffed10054414ad R12: 0000000000000007
R13: ffff88802a20a542 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f34f496e800(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f9d6ec9ec28 CR3: 000000004d260000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 skb_page_unref include/linux/skbuff_ref.h:43 [inline]
 __skb_frag_unref include/linux/skbuff_ref.h:56 [inline]
 skb_release_data+0x483/0x8a0 net/core/skbuff.c:1119
 skb_release_all net/core/skbuff.c:1190 [inline]
 __kfree_skb+0x55/0x70 net/core/skbuff.c:1204
 tcp_clean_rtx_queue net/ipv4/tcp_input.c:3436 [inline]
 tcp_ack+0x2442/0x6bc0 net/ipv4/tcp_input.c:4032
 tcp_rcv_state_process+0x8eb/0x44e0 net/ipv4/tcp_input.c:6805
 tcp_v4_do_rcv+0x77d/0xc70 net/ipv4/tcp_ipv4.c:1939
 tcp_v4_rcv+0x2dc0/0x37f0 net/ipv4/tcp_ipv4.c:2351
 ip_protocol_deliver_rcu+0x22e/0x440 net/ipv4/ip_input.c:205
 ip_local_deliver_finish+0x341/0x5f0 net/ipv4/ip_input.c:233
 NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314
 NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314
 __netif_receive_skb_one_core net/core/dev.c:5672 [inline]
 __netif_receive_skb+0x2bf/0x650 net/core/dev.c:5785
 process_backlog+0x662/0x15b0 net/core/dev.c:6117
 __napi_poll+0xcb/0x490 net/core/dev.c:6883
 napi_poll net/core/dev.c:6952 [inline]
 net_rx_action+0x89b/0x1240 net/core/dev.c:7074
 handle_softirqs+0x2d4/0x9b0 kernel/softirq.c:561
 __do_softirq kernel/softirq.c:595 [inline]
 invoke_softirq kernel/softirq.c:435 [inline]
 __irq_exit_rcu+0xf7/0x220 kernel/softirq.c:662
 irq_exit_rcu+0x9/0x30 kernel/softirq.c:678
 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline]
 sysvec_apic_timer_interrupt+0x57/0xc0 arch/x86/kernel/apic/apic.c:1049
 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
RIP: 0033:0x7f34f4519ad5
Code: 85 d2 74 0d 0f 10 02 48 8d 54 24 20 0f 11 44 24 20 64 8b 04 25 18 00 00 00 85 c0 75 27 41 b8 08 00 00 00 b8 0f 01 00 00 0f 05 <48> 3d 00 f0 ff ff 76 75 48 8b 15 24 73 0d 00 f7 d8 64 89 02 48 83
RSP: 002b:00007ffec5b32ce0 EFLAGS: 00000246
RAX: 0000000000000001 RBX: 00000000000668a0 RCX: 00007f34f4519ad5
RDX: 00007ffec5b32d00 RSI: 0000000000000004 RDI: 0000564f4bc6cae0
RBP: 0000564f4bc6b5a0 R08: 0000000000000008 R09: 0000000000000000
R10: 00007ffec5b32de8 R11: 0000000000000246 R12: 0000564f48ea8aa4
R13: 0000000000000001 R14: 0000564f48ea93e8 R15: 00007ffec5b32d68
 </TASK>

Eric noted a probable shinfo->nr_frags corruption, which indeed
occurs.

The root cause is a buggy MPTCP option len computation in some
circumstances: the ADD_ADDR option should be mutually exclusive
with DSS since the blamed commit.

Still, mptcp_established_options_add_addr() tries to set the
relevant info in mptcp_out_options, if the remaining space is
large enough even when DSS is present.

Since the ADD_ADDR infos and the DSS share the same union
fields, adding first corrupts the latter. In the worst-case
scenario, such corruption increases the DSS binary layout,
exceeding the computed length and possibly overwriting the
skb shared info.

Address the issue by enforcing mutual exclusion in
mptcp_established_options_add_addr(), too.

Reported-by: syzbot+38a095a81f30d82884c1@syzkaller.appspotmail.com
Fixes: 1bff1e43a30e ("mptcp: optimize out option generation")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/options.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Matthieu Baerts Dec. 21, 2024, 10:28 a.m. UTC | #1
Hi Paolo,

On 21/12/2024 09:51, Paolo Abeni wrote:
> Syzbot reported the following splat:

(...)

> Eric noted a probable shinfo->nr_frags corruption, which indeed
> occurs.
> 
> The root cause is a buggy MPTCP option len computation in some
> circumstances: the ADD_ADDR option should be mutually exclusive
> with DSS since the blamed commit.
> 
> Still, mptcp_established_options_add_addr() tries to set the
> relevant info in mptcp_out_options, if the remaining space is
> large enough even when DSS is present.
> 
> Since the ADD_ADDR infos and the DSS share the same union
> fields, adding first corrupts the latter. In the worst-case
> scenario, such corruption increases the DSS binary layout,
> exceeding the computed length and possibly overwriting the
> skb shared info.
> 
> Address the issue by enforcing mutual exclusion in
> mptcp_established_options_add_addr(), too.

Thank you for the investigation and the fix, it looks good to me:

Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>

> Reported-by: syzbot+38a095a81f30d82884c1@syzkaller.appspotmail.com

If you don't mind, can you please add these two tags when applying the
patches to help to track the backports?

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/538
Cc: stable@vger.kernel.org

> Fixes: 1bff1e43a30e ("mptcp: optimize out option generation")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Cheers,
Matt
MPTCP CI Dec. 21, 2024, 10:33 a.m. UTC | #2
Hi Paolo,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal: Critical: Global Timeout ❌
- KVM Validation: debug: Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/12443930793

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/54b09e873b02
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=920093


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
Eric Dumazet Dec. 23, 2024, 9:46 a.m. UTC | #3
On Sat, Dec 21, 2024 at 11:28 AM Matthieu Baerts <matttbe@kernel.org> wrote:
>
> Hi Paolo,
>
> On 21/12/2024 09:51, Paolo Abeni wrote:
> > Syzbot reported the following splat:
>
> (...)
>
> > Eric noted a probable shinfo->nr_frags corruption, which indeed
> > occurs.
> >
> > The root cause is a buggy MPTCP option len computation in some
> > circumstances: the ADD_ADDR option should be mutually exclusive
> > with DSS since the blamed commit.
> >
> > Still, mptcp_established_options_add_addr() tries to set the
> > relevant info in mptcp_out_options, if the remaining space is
> > large enough even when DSS is present.
> >
> > Since the ADD_ADDR infos and the DSS share the same union
> > fields, adding first corrupts the latter. In the worst-case
> > scenario, such corruption increases the DSS binary layout,
> > exceeding the computed length and possibly overwriting the
> > skb shared info.
> >
> > Address the issue by enforcing mutual exclusion in
> > mptcp_established_options_add_addr(), too.
>
> Thank you for the investigation and the fix, it looks good to me:
>
> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>
> > Reported-by: syzbot+38a095a81f30d82884c1@syzkaller.appspotmail.com
>
> If you don't mind, can you please add these two tags when applying the
> patches to help to track the backports?
>
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/538
> Cc: stable@vger.kernel.org
>

Thanks for the fix !

Reviewed-by: Eric Dumazet <edumazet@google.com>
patchwork-bot+netdevbpf@kernel.org Dec. 31, 2024, 2 a.m. UTC | #4
Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Sat, 21 Dec 2024 09:51:46 +0100 you wrote:
> Syzbot reported the following splat:
> 
> Oops: general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN PTI
> KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
> CPU: 1 UID: 0 PID: 5836 Comm: sshd Not tainted 6.13.0-rc3-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/25/2024
> RIP: 0010:_compound_head include/linux/page-flags.h:242 [inline]
> RIP: 0010:put_page+0x23/0x260 include/linux/mm.h:1552
> Code: 90 90 90 90 90 90 90 55 41 57 41 56 53 49 89 fe 48 bd 00 00 00 00 00 fc ff df e8 f8 5e 12 f8 49 8d 5e 08 48 89 d8 48 c1 e8 03 <80> 3c 28 00 74 08 48 89 df e8 8f c7 78 f8 48 8b 1b 48 89 de 48 83
> RSP: 0000:ffffc90003916c90 EFLAGS: 00010202
> RAX: 0000000000000001 RBX: 0000000000000008 RCX: ffff888030458000
> RDX: 0000000000000100 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: dffffc0000000000 R08: ffffffff898ca81d R09: 1ffff110054414ac
> R10: dffffc0000000000 R11: ffffed10054414ad R12: 0000000000000007
> R13: ffff88802a20a542 R14: 0000000000000000 R15: 0000000000000000
> FS:  00007f34f496e800(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f9d6ec9ec28 CR3: 000000004d260000 CR4: 00000000003526f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <TASK>
>  skb_page_unref include/linux/skbuff_ref.h:43 [inline]
>  __skb_frag_unref include/linux/skbuff_ref.h:56 [inline]
>  skb_release_data+0x483/0x8a0 net/core/skbuff.c:1119
>  skb_release_all net/core/skbuff.c:1190 [inline]
>  __kfree_skb+0x55/0x70 net/core/skbuff.c:1204
>  tcp_clean_rtx_queue net/ipv4/tcp_input.c:3436 [inline]
>  tcp_ack+0x2442/0x6bc0 net/ipv4/tcp_input.c:4032
>  tcp_rcv_state_process+0x8eb/0x44e0 net/ipv4/tcp_input.c:6805
>  tcp_v4_do_rcv+0x77d/0xc70 net/ipv4/tcp_ipv4.c:1939
>  tcp_v4_rcv+0x2dc0/0x37f0 net/ipv4/tcp_ipv4.c:2351
>  ip_protocol_deliver_rcu+0x22e/0x440 net/ipv4/ip_input.c:205
>  ip_local_deliver_finish+0x341/0x5f0 net/ipv4/ip_input.c:233
>  NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314
>  NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314
>  __netif_receive_skb_one_core net/core/dev.c:5672 [inline]
>  __netif_receive_skb+0x2bf/0x650 net/core/dev.c:5785
>  process_backlog+0x662/0x15b0 net/core/dev.c:6117
>  __napi_poll+0xcb/0x490 net/core/dev.c:6883
>  napi_poll net/core/dev.c:6952 [inline]
>  net_rx_action+0x89b/0x1240 net/core/dev.c:7074
>  handle_softirqs+0x2d4/0x9b0 kernel/softirq.c:561
>  __do_softirq kernel/softirq.c:595 [inline]
>  invoke_softirq kernel/softirq.c:435 [inline]
>  __irq_exit_rcu+0xf7/0x220 kernel/softirq.c:662
>  irq_exit_rcu+0x9/0x30 kernel/softirq.c:678
>  instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline]
>  sysvec_apic_timer_interrupt+0x57/0xc0 arch/x86/kernel/apic/apic.c:1049
>  asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
> RIP: 0033:0x7f34f4519ad5
> Code: 85 d2 74 0d 0f 10 02 48 8d 54 24 20 0f 11 44 24 20 64 8b 04 25 18 00 00 00 85 c0 75 27 41 b8 08 00 00 00 b8 0f 01 00 00 0f 05 <48> 3d 00 f0 ff ff 76 75 48 8b 15 24 73 0d 00 f7 d8 64 89 02 48 83
> RSP: 002b:00007ffec5b32ce0 EFLAGS: 00000246
> RAX: 0000000000000001 RBX: 00000000000668a0 RCX: 00007f34f4519ad5
> RDX: 00007ffec5b32d00 RSI: 0000000000000004 RDI: 0000564f4bc6cae0
> RBP: 0000564f4bc6b5a0 R08: 0000000000000008 R09: 0000000000000000
> R10: 00007ffec5b32de8 R11: 0000000000000246 R12: 0000564f48ea8aa4
> R13: 0000000000000001 R14: 0000564f48ea93e8 R15: 00007ffec5b32d68
>  </TASK>
> 
> [...]

Here is the summary with links:
  - [net] mptcp: fix TCP options overflow.
    https://git.kernel.org/netdev/net/c/cbb26f7d8451

You are awesome, thank you!
diff mbox series

Patch

diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index ad22622843a2..9ea7356d9fab 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -667,8 +667,15 @@  static bool mptcp_established_options_add_addr(struct sock *sk, struct sk_buff *
 		    &echo, &drop_other_suboptions))
 		return false;
 
+	/*
+	 * Later on, mptcp_write_options() will enforce mutually exclusion with
+	 * DSS, bail out if such option is set and we can't drop it.
+	 */
 	if (drop_other_suboptions)
 		remaining += opt_size;
+	else if (opts->suboptions & OPTION_MPTCP_DSS)
+		return false;
+
 	len = mptcp_add_addr_len(opts->addr.family, echo, !!opts->addr.port);
 	if (remaining < len)
 		return false;