Message ID | 3b0ca23eaa8906f967a841f55bafed48d0f895cb.1713367699.git.pabeni@redhat.com (mailing list archive) |
---|---|
State | Accepted, archived |
Commit | b3584792f65e88e4a86318096aafbbe3900f4292 |
Delegated to: | Matthieu Baerts |
Headers | show |
Series | [mptcp-net,v3] mptcp: ensure snd_nxt is properly initialized on connect | expand |
Context | Check | Description |
---|---|---|
matttbe/build | success | Build and static analysis OK |
matttbe/checkpatch | fail | total: 1 errors, 3 warnings, 0 checks, 23 lines checked |
matttbe/shellcheck | success | MPTCP selftests files have not been modified |
matttbe/KVM_Validation__normal | success | Success! ✅ |
matttbe/KVM_Validation__debug | warning | Unstable: 1 failed test(s): packetdrill_sockopts |
matttbe/KVM_Validation__btf__only_bpftest_all_ | success | Success! ✅ |
Hi Paolo, Thank you for your modifications, that's great! Our CI did some validations and here is its report: - KVM Validation: normal: Success! ✅ - KVM Validation: debug: Unstable: 1 failed test(s): packetdrill_sockopts
On Thu, 18 Apr 2024, Paolo Abeni wrote: > Christoph reported a splat hinting at a corrupted snd_una: > > WARNING: CPU: 1 PID: 38 at net/mptcp/protocol.c:1005 __mptcp_clean_una+0x4b3/0x620 net/mptcp/protocol.c:1005 > Modules linked in: > CPU: 1 PID: 38 Comm: kworker/1:1 Not tainted 6.9.0-rc1-gbbeac67456c9 #59 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 > Workqueue: events mptcp_worker > RIP: 0010:__mptcp_clean_una+0x4b3/0x620 net/mptcp/protocol.c:1005 > Code: be 06 01 00 00 bf 06 01 00 00 e8 a8 12 e7 fe e9 00 fe ff ff e8 > 8e 1a e7 fe 0f b7 ab 3e 02 00 00 e9 d3 fd ff ff e8 7d 1a e7 fe > <0f> 0b 4c 8b bb e0 05 00 00 e9 74 fc ff ff e8 6a 1a e7 fe 0f 0b e9 > RSP: 0018:ffffc9000013fd48 EFLAGS: 00010293 > RAX: 0000000000000000 RBX: ffff8881029bd280 RCX: ffffffff82382fe4 > RDX: ffff8881003cbd00 RSI: ffffffff823833c3 RDI: 0000000000000001 > RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 > R10: 0000000000000000 R11: fefefefefefefeff R12: ffff888138ba8000 > R13: 0000000000000106 R14: ffff8881029bd908 R15: ffff888126560000 > FS: 0000000000000000(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007f604a5dae38 CR3: 0000000101dac002 CR4: 0000000000170ef0 > Call Trace: > <TASK> > __mptcp_clean_una_wakeup net/mptcp/protocol.c:1055 [inline] > mptcp_clean_una_wakeup net/mptcp/protocol.c:1062 [inline] > __mptcp_retrans+0x7f/0x7e0 net/mptcp/protocol.c:2615 > mptcp_worker+0x434/0x740 net/mptcp/protocol.c:2767 > process_one_work+0x1e0/0x560 kernel/workqueue.c:3254 > process_scheduled_works kernel/workqueue.c:3335 [inline] > worker_thread+0x3c7/0x640 kernel/workqueue.c:3416 > kthread+0x121/0x170 kernel/kthread.c:388 > ret_from_fork+0x44/0x50 arch/x86/kernel/process.c:147 > ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243 > </TASK> > > When fallback to TCP happens early on a client socket, snd_nxt > is not yet initialized and any incoming ack will copy such value > into snd_una. If the mptcp worker (dumbly) tries mptcp-level > re-injection after such ack, that would unconditionally trigger a send > buffer cleanup using 'bad' snd_una values. > > We could easily disable re-injection for fallback sockets, but such > dumb behavior already helped catching a few subtle issues and a very > low to zero impact in practice. > > Instead address the issue always initializing snd_nxt (and write_seq, > for consistency) at connect time. > > Reported-by: Christoph Paasch <cpaasch@apple.com> > Tested-by: Christoph Paasch <cpaasch@apple.com> > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/485 > Fixes: 8fd738049ac3 ("mptcp: fallback in case of simultaneous connect") > Signed-off-by: Paolo Abeni <pabeni@redhat.com> > --- > v2 -> v3: > - init snd_nxt/write_seq instead > > v1 -> v2: > - moved the initialization at established time (Mat) > - dropped debug code wrongly landed in v1 > - added fixes tag > --- > net/mptcp/protocol.c | 10 +++++++++- > 1 file changed, 9 insertions(+), 1 deletion(-) > > diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c > index 3e1b15d76442..4205cef2d45f 100644 > --- a/net/mptcp/protocol.c > +++ b/net/mptcp/protocol.c > @@ -1002,8 +1002,13 @@ static void __mptcp_clean_una(struct sock *sk) > > if (unlikely(dfrag == msk->first_pending)) { > /* in recovery mode can see ack after the current snd head */ > - if (WARN_ON_ONCE(!msk->recovery)) > + if (WARN_ON_ONCE(!msk->recovery)) { > + pr_err("sk state %d:%d snd_una %llx write_seq %llx dfrag end seq %llx frag len %d bytes sent %lld acked %lld", > + sk->sk_state, msk->first ? msk->first->sk_state: -1, msk->write_seq, > + msk->snd_una, dfrag->data_seq + dfrag->data_len, dfrag->data_len, > + msk->bytes_sent, msk->bytes_acked); > break; > + } Hi Paolo - You dropped this debug code in v2, do you want it included? Otherwise, LGTM: Reviewed-by: Mat Martineau <martineau@kernel.org> > > WRITE_ONCE(msk->first_pending, mptcp_send_next(sk)); > } > @@ -3730,6 +3735,9 @@ static int mptcp_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) > MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_TOKENFALLBACKINIT); > mptcp_subflow_early_fallback(msk, subflow); > } > + > + WRITE_ONCE(msk->write_seq, subflow->idsn); > + WRITE_ONCE(msk->snd_nxt, subflow->idsn); > if (likely(!__mptcp_check_fallback(msk))) > MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_MPCAPABLEACTIVE); > > -- > 2.43.2 > > >
On Thu, 2024-04-18 at 09:59 -0700, Mat Martineau wrote: > On Thu, 18 Apr 2024, Paolo Abeni wrote: > > > Christoph reported a splat hinting at a corrupted snd_una: > > > > WARNING: CPU: 1 PID: 38 at net/mptcp/protocol.c:1005 __mptcp_clean_una+0x4b3/0x620 net/mptcp/protocol.c:1005 > > Modules linked in: > > CPU: 1 PID: 38 Comm: kworker/1:1 Not tainted 6.9.0-rc1-gbbeac67456c9 #59 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 > > Workqueue: events mptcp_worker > > RIP: 0010:__mptcp_clean_una+0x4b3/0x620 net/mptcp/protocol.c:1005 > > Code: be 06 01 00 00 bf 06 01 00 00 e8 a8 12 e7 fe e9 00 fe ff ff e8 > > 8e 1a e7 fe 0f b7 ab 3e 02 00 00 e9 d3 fd ff ff e8 7d 1a e7 fe > > <0f> 0b 4c 8b bb e0 05 00 00 e9 74 fc ff ff e8 6a 1a e7 fe 0f 0b e9 > > RSP: 0018:ffffc9000013fd48 EFLAGS: 00010293 > > RAX: 0000000000000000 RBX: ffff8881029bd280 RCX: ffffffff82382fe4 > > RDX: ffff8881003cbd00 RSI: ffffffff823833c3 RDI: 0000000000000001 > > RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 > > R10: 0000000000000000 R11: fefefefefefefeff R12: ffff888138ba8000 > > R13: 0000000000000106 R14: ffff8881029bd908 R15: ffff888126560000 > > FS: 0000000000000000(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 00007f604a5dae38 CR3: 0000000101dac002 CR4: 0000000000170ef0 > > Call Trace: > > <TASK> > > __mptcp_clean_una_wakeup net/mptcp/protocol.c:1055 [inline] > > mptcp_clean_una_wakeup net/mptcp/protocol.c:1062 [inline] > > __mptcp_retrans+0x7f/0x7e0 net/mptcp/protocol.c:2615 > > mptcp_worker+0x434/0x740 net/mptcp/protocol.c:2767 > > process_one_work+0x1e0/0x560 kernel/workqueue.c:3254 > > process_scheduled_works kernel/workqueue.c:3335 [inline] > > worker_thread+0x3c7/0x640 kernel/workqueue.c:3416 > > kthread+0x121/0x170 kernel/kthread.c:388 > > ret_from_fork+0x44/0x50 arch/x86/kernel/process.c:147 > > ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243 > > </TASK> > > > > When fallback to TCP happens early on a client socket, snd_nxt > > is not yet initialized and any incoming ack will copy such value > > into snd_una. If the mptcp worker (dumbly) tries mptcp-level > > re-injection after such ack, that would unconditionally trigger a send > > buffer cleanup using 'bad' snd_una values. > > > > We could easily disable re-injection for fallback sockets, but such > > dumb behavior already helped catching a few subtle issues and a very > > low to zero impact in practice. > > > > Instead address the issue always initializing snd_nxt (and write_seq, > > for consistency) at connect time. > > > > Reported-by: Christoph Paasch <cpaasch@apple.com> > > Tested-by: Christoph Paasch <cpaasch@apple.com> > > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/485 > > Fixes: 8fd738049ac3 ("mptcp: fallback in case of simultaneous connect") > > Signed-off-by: Paolo Abeni <pabeni@redhat.com> > > --- > > v2 -> v3: > > - init snd_nxt/write_seq instead > > > > v1 -> v2: > > - moved the initialization at established time (Mat) > > - dropped debug code wrongly landed in v1 > > - added fixes tag > > --- > > net/mptcp/protocol.c | 10 +++++++++- > > 1 file changed, 9 insertions(+), 1 deletion(-) > > > > diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c > > index 3e1b15d76442..4205cef2d45f 100644 > > --- a/net/mptcp/protocol.c > > +++ b/net/mptcp/protocol.c > > @@ -1002,8 +1002,13 @@ static void __mptcp_clean_una(struct sock *sk) > > > > if (unlikely(dfrag == msk->first_pending)) { > > /* in recovery mode can see ack after the current snd head */ > > - if (WARN_ON_ONCE(!msk->recovery)) > > + if (WARN_ON_ONCE(!msk->recovery)) { > > + pr_err("sk state %d:%d snd_una %llx write_seq %llx dfrag end seq %llx frag len %d bytes sent %lld acked %lld", > > + sk->sk_state, msk->first ? msk->first->sk_state: -1, msk->write_seq, > > + msk->snd_una, dfrag->data_seq + dfrag->data_len, dfrag->data_len, > > + msk->bytes_sent, msk->bytes_acked); > > break; > > + } > > Hi Paolo - > > You dropped this debug code in v2, do you want it included? Nope, it's there by mistake. @Matttbe: could you please drop this chunk while applying the patch or do you prefer me to resend the patch? Thanks! Paolo
Hi Paolo, On 19/04/2024 12:33, Paolo Abeni wrote: > On Thu, 2024-04-18 at 09:59 -0700, Mat Martineau wrote: >> On Thu, 18 Apr 2024, Paolo Abeni wrote: >> >>> Christoph reported a splat hinting at a corrupted snd_una: >>> >>> WARNING: CPU: 1 PID: 38 at net/mptcp/protocol.c:1005 __mptcp_clean_una+0x4b3/0x620 net/mptcp/protocol.c:1005 >>> Modules linked in: >>> CPU: 1 PID: 38 Comm: kworker/1:1 Not tainted 6.9.0-rc1-gbbeac67456c9 #59 >>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 >>> Workqueue: events mptcp_worker >>> RIP: 0010:__mptcp_clean_una+0x4b3/0x620 net/mptcp/protocol.c:1005 >>> Code: be 06 01 00 00 bf 06 01 00 00 e8 a8 12 e7 fe e9 00 fe ff ff e8 >>> 8e 1a e7 fe 0f b7 ab 3e 02 00 00 e9 d3 fd ff ff e8 7d 1a e7 fe >>> <0f> 0b 4c 8b bb e0 05 00 00 e9 74 fc ff ff e8 6a 1a e7 fe 0f 0b e9 >>> RSP: 0018:ffffc9000013fd48 EFLAGS: 00010293 >>> RAX: 0000000000000000 RBX: ffff8881029bd280 RCX: ffffffff82382fe4 >>> RDX: ffff8881003cbd00 RSI: ffffffff823833c3 RDI: 0000000000000001 >>> RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 >>> R10: 0000000000000000 R11: fefefefefefefeff R12: ffff888138ba8000 >>> R13: 0000000000000106 R14: ffff8881029bd908 R15: ffff888126560000 >>> FS: 0000000000000000(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000 >>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> CR2: 00007f604a5dae38 CR3: 0000000101dac002 CR4: 0000000000170ef0 >>> Call Trace: >>> <TASK> >>> __mptcp_clean_una_wakeup net/mptcp/protocol.c:1055 [inline] >>> mptcp_clean_una_wakeup net/mptcp/protocol.c:1062 [inline] >>> __mptcp_retrans+0x7f/0x7e0 net/mptcp/protocol.c:2615 >>> mptcp_worker+0x434/0x740 net/mptcp/protocol.c:2767 >>> process_one_work+0x1e0/0x560 kernel/workqueue.c:3254 >>> process_scheduled_works kernel/workqueue.c:3335 [inline] >>> worker_thread+0x3c7/0x640 kernel/workqueue.c:3416 >>> kthread+0x121/0x170 kernel/kthread.c:388 >>> ret_from_fork+0x44/0x50 arch/x86/kernel/process.c:147 >>> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243 >>> </TASK> >>> >>> When fallback to TCP happens early on a client socket, snd_nxt >>> is not yet initialized and any incoming ack will copy such value >>> into snd_una. If the mptcp worker (dumbly) tries mptcp-level >>> re-injection after such ack, that would unconditionally trigger a send >>> buffer cleanup using 'bad' snd_una values. >>> >>> We could easily disable re-injection for fallback sockets, but such >>> dumb behavior already helped catching a few subtle issues and a very >>> low to zero impact in practice. >>> >>> Instead address the issue always initializing snd_nxt (and write_seq, >>> for consistency) at connect time. >>> >>> Reported-by: Christoph Paasch <cpaasch@apple.com> >>> Tested-by: Christoph Paasch <cpaasch@apple.com> >>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/485 >>> Fixes: 8fd738049ac3 ("mptcp: fallback in case of simultaneous connect") >>> Signed-off-by: Paolo Abeni <pabeni@redhat.com> >>> --- >>> v2 -> v3: >>> - init snd_nxt/write_seq instead >>> >>> v1 -> v2: >>> - moved the initialization at established time (Mat) >>> - dropped debug code wrongly landed in v1 >>> - added fixes tag >>> --- >>> net/mptcp/protocol.c | 10 +++++++++- >>> 1 file changed, 9 insertions(+), 1 deletion(-) >>> >>> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c >>> index 3e1b15d76442..4205cef2d45f 100644 >>> --- a/net/mptcp/protocol.c >>> +++ b/net/mptcp/protocol.c >>> @@ -1002,8 +1002,13 @@ static void __mptcp_clean_una(struct sock *sk) >>> >>> if (unlikely(dfrag == msk->first_pending)) { >>> /* in recovery mode can see ack after the current snd head */ >>> - if (WARN_ON_ONCE(!msk->recovery)) >>> + if (WARN_ON_ONCE(!msk->recovery)) { >>> + pr_err("sk state %d:%d snd_una %llx write_seq %llx dfrag end seq %llx frag len %d bytes sent %lld acked %lld", >>> + sk->sk_state, msk->first ? msk->first->sk_state: -1, msk->write_seq, >>> + msk->snd_una, dfrag->data_seq + dfrag->data_len, dfrag->data_len, >>> + msk->bytes_sent, msk->bytes_acked); >>> break; >>> + } >> >> Hi Paolo - >> >> You dropped this debug code in v2, do you want it included? > > Nope, it's there by mistake. @Matttbe: could you please drop this chunk > while applying the patch or do you prefer me to resend the patch? No problem, I can drop it when applying it. Cheers, Matt
Hi Paolo, Mat, Christoph, On 18/04/2024 09:44, Paolo Abeni wrote: > Christoph reported a splat hinting at a corrupted snd_una: Thank you for the patch, the review, and the bug report + test. I forgot to send a notification after I applied this patch. In between, the CI already validated this, and both the build for export and export-net have the same issue with Packetdrill (fastopen): - export: > not ok 32 packetdrill: mptcp/fastopen/server-TCP_FASTOPEN-cookie-req.pkt (ipv4-mapped-v6) > # server-TCP_FASTOPEN-cookie-req.pkt:23: error handling packet: live packet field ipv4_total_length: expected: 48 (0x30) vs actual: 64 (0x40) > # script packet: 0.660855 . 1:1(0) ack 1 <dss dack4 3007449510 flags: A> > # actual packet: 0.651162 . 1:1(0) ack 1 win 256 <dss dack4 3007449509 dsn8 4250179620018965011 ssn 0 dll 1 no_checksum flags: MmAF,nop,nop> https://github.com/multipath-tcp/mptcp_net-next/actions/runs/8752422517 - export-net: > not ok 32 packetdrill: mptcp/fastopen/server-TCP_FASTOPEN-cookie-req.pkt (ipv4-mapped-v6) > # server-TCP_FASTOPEN-cookie-req.pkt:23: error handling packet: live packet field ipv4_total_length: expected: 48 (0x30) vs actual: 64 (0x40) > # script packet: 0.667029 . 1:1(0) ack 1 <dss dack4 3007449510 flags: A> > # actual packet: 0.665988 . 1:1(0) ack 1 win 256 <dss dack4 3007449509 dsn8 4812888791396531368 ssn 0 dll 1 no_checksum flags: MmAF,nop,nop> https://github.com/multipath-tcp/mptcp_net-next/actions/runs/8752426856 Do you think this is linked? New patches for t/upstream-net and t/upstream: - b3584792f65e: mptcp: ensure snd_nxt is properly initialized on connect - Results: 428f275bfbb0..e065a4657da5 (export-net) - Results: b88f530f98b9..47bb272bba61 (export) Cheers, Matt
On Fri, 2024-04-19 at 17:15 +0200, Matthieu Baerts wrote: > Hi Paolo, Mat, Christoph, > > On 18/04/2024 09:44, Paolo Abeni wrote: > > Christoph reported a splat hinting at a corrupted snd_una: > Thank you for the patch, the review, and the bug report + test. > > I forgot to send a notification after I applied this patch. In between, > the CI already validated this, and both the build for export and > export-net have the same issue with Packetdrill (fastopen): > > - export: > > > not ok 32 packetdrill: mptcp/fastopen/server-TCP_FASTOPEN-cookie-req.pkt (ipv4-mapped-v6) > > # server-TCP_FASTOPEN-cookie-req.pkt:23: error handling packet: live packet field ipv4_total_length: expected: 48 (0x30) vs actual: 64 (0x40) > > # script packet: 0.660855 . 1:1(0) ack 1 <dss dack4 3007449510 flags: A> > > # actual packet: 0.651162 . 1:1(0) ack 1 win 256 <dss dack4 3007449509 dsn8 4250179620018965011 ssn 0 dll 1 no_checksum flags: MmAF,nop,nop> > > https://github.com/multipath-tcp/mptcp_net-next/actions/runs/8752422517 > > - export-net: > > > not ok 32 packetdrill: mptcp/fastopen/server-TCP_FASTOPEN-cookie-req.pkt (ipv4-mapped-v6) > > # server-TCP_FASTOPEN-cookie-req.pkt:23: error handling packet: live packet field ipv4_total_length: expected: 48 (0x30) vs actual: 64 (0x40) > > # script packet: 0.667029 . 1:1(0) ack 1 <dss dack4 3007449510 flags: A> > > # actual packet: 0.665988 . 1:1(0) ack 1 win 256 <dss dack4 3007449509 dsn8 4812888791396531368 ssn 0 dll 1 no_checksum flags: MmAF,nop,nop> > > https://github.com/multipath-tcp/mptcp_net-next/actions/runs/8752426856 > > Do you think this is linked? I hope it's not, but the timing hints otherwise :( I'll try to have a better look. Thanks for the head-up! Paolo
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 3e1b15d76442..4205cef2d45f 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -1002,8 +1002,13 @@ static void __mptcp_clean_una(struct sock *sk) if (unlikely(dfrag == msk->first_pending)) { /* in recovery mode can see ack after the current snd head */ - if (WARN_ON_ONCE(!msk->recovery)) + if (WARN_ON_ONCE(!msk->recovery)) { + pr_err("sk state %d:%d snd_una %llx write_seq %llx dfrag end seq %llx frag len %d bytes sent %lld acked %lld", + sk->sk_state, msk->first ? msk->first->sk_state: -1, msk->write_seq, + msk->snd_una, dfrag->data_seq + dfrag->data_len, dfrag->data_len, + msk->bytes_sent, msk->bytes_acked); break; + } WRITE_ONCE(msk->first_pending, mptcp_send_next(sk)); } @@ -3730,6 +3735,9 @@ static int mptcp_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_TOKENFALLBACKINIT); mptcp_subflow_early_fallback(msk, subflow); } + + WRITE_ONCE(msk->write_seq, subflow->idsn); + WRITE_ONCE(msk->snd_nxt, subflow->idsn); if (likely(!__mptcp_check_fallback(msk))) MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_MPCAPABLEACTIVE);