diff mbox series

[mptcp-net,2/2] mptcp: stricter state check in mptcp_worker

Message ID 10b158069455d831d9efc547bcdb3a6379234f1c.1679654040.git.pabeni@redhat.com (mailing list archive)
State Accepted, archived
Commit e8fc17094d8543f75eb1e26c93468779118f32bf
Delegated to: Matthieu Baerts
Headers show
Series [mptcp-net,1/2] mptcp: use mptcp_schedule_work instead of open-codying it | expand

Checks

Context Check Description
matttbe/build success Build and static analysis OK
matttbe/checkpatch warning total: 0 errors, 3 warnings, 0 checks, 8 lines checked
matttbe/KVM_Validation__normal__except_selftest_mptcp_join_ success Success! ✅
matttbe/KVM_Validation__debug__only_selftest_mptcp_join_ warning Unstable: 1 failed test(s): selftest_mptcp_join
matttbe/KVM_Validation__normal__only_selftest_mptcp_join_ warning Unstable: 1 failed test(s): selftest_mptcp_join
matttbe/KVM_Validation__debug__except_selftest_mptcp_join_ success Success! ✅

Commit Message

Paolo Abeni March 24, 2023, 10:34 a.m. UTC
As reported by Christpher, the mptcp protocol can run the
worker when the relevant msk socket is in an unexpected state:

connect()
// incoming reset + fastclose
// the mptcp worker is scheduled
mptcp_disconnect()
// msk is now CLOSED
listen()
mptcp_worker()

Leading to the following splat:

divide error: 0000 [#1] PREEMPT SMP
CPU: 1 PID: 21 Comm: kworker/1:0 Not tainted 6.3.0-rc1-gde5e8fd0123c #11
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
Workqueue: events mptcp_worker
RIP: 0010:__tcp_select_window+0x22c/0x4b0 net/ipv4/tcp_output.c:3018
RSP: 0018:ffffc900000b3c98 EFLAGS: 00010293
RAX: 000000000000ffd7 RBX: 000000000000ffd7 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff8214ce97 RDI: 0000000000000004
RBP: 000000000000ffd7 R08: 0000000000000004 R09: 0000000000010000
R10: 000000000000ffd7 R11: ffff888005afa148 R12: 000000000000ffd7
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88803ed00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000405270 CR3: 000000003011e006 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 tcp_select_window net/ipv4/tcp_output.c:262 [inline]
 __tcp_transmit_skb+0x356/0x1280 net/ipv4/tcp_output.c:1345
 tcp_transmit_skb net/ipv4/tcp_output.c:1417 [inline]
 tcp_send_active_reset+0x13e/0x320 net/ipv4/tcp_output.c:3459
 mptcp_check_fastclose net/mptcp/protocol.c:2530 [inline]
 mptcp_worker+0x6c7/0x800 net/mptcp/protocol.c:2705
 process_one_work+0x3bd/0x950 kernel/workqueue.c:2390
 worker_thread+0x5b/0x610 kernel/workqueue.c:2537
 kthread+0x138/0x170 kernel/kthread.c:376
 ret_from_fork+0x2c/0x50 arch/x86/entry/entry_64.S:308
 </TASK>

This change address the issue explicitly checking for bad states
before running the mptcp worker.

Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/374
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/protocol.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

MPTCP CI March 24, 2023, 12:34 p.m. UTC | #1
Hi Paolo,

Thank you for your modifications, that's great!

But sadly, our CI spotted some issues with it when trying to build it.

You can find more details there:

  https://patchwork.kernel.org/project/mptcp/patch/10b158069455d831d9efc547bcdb3a6379234f1c.1679654040.git.pabeni@redhat.com/
  https://github.com/multipath-tcp/mptcp_net-next/actions/runs/4511121675

Status: failure
Initiator: matttbe
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/0a8809a9f3ab

Feel free to reply to this email if you cannot access logs, if you need
some support to fix the error, if this doesn't seem to be caused by your
modifications or if the error is a false positive one.

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)
MPTCP CI March 24, 2023, 12:36 p.m. UTC | #2
Hi Paolo,

Thank you for your modifications, that's great!

But sadly, our CI spotted some issues with it when trying to build it.

You can find more details there:

  https://patchwork.kernel.org/project/mptcp/patch/10b158069455d831d9efc547bcdb3a6379234f1c.1679654040.git.pabeni@redhat.com/
  https://github.com/multipath-tcp/mptcp_net-next/actions/runs/4511130601

Status: failure
Initiator: matttbe
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/d79b0ae2a966

Feel free to reply to this email if you cannot access logs, if you need
some support to fix the error, if this doesn't seem to be caused by your
modifications or if the error is a false positive one.

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)
MPTCP CI March 24, 2023, 2:25 p.m. UTC | #3
Hi Paolo,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join):
  - Unstable: 1 failed test(s): selftest_simult_flows 
MPTCP CI March 24, 2023, 3:12 p.m. UTC | #4
Hi Paolo,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/5434954880909312
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/5434954880909312/summary/summary.txt

- KVM Validation: debug (only selftest_mptcp_join):
  - Unstable: 1 failed test(s): selftest_mptcp_join 
Matthieu Baerts March 27, 2023, 3:33 p.m. UTC | #5
Hi Paolo,

On 24/03/2023 11:34, Paolo Abeni wrote:
> As reported by Christpher, the mptcp protocol can run the
> worker when the relevant msk socket is in an unexpected state:
> 
> connect()
> // incoming reset + fastclose
> // the mptcp worker is scheduled
> mptcp_disconnect()
> // msk is now CLOSED
> listen()
> mptcp_worker()
> 
> Leading to the following splat:

(...)

> This change address the issue explicitly checking for bad states
> before running the mptcp worker.
> 
> Reported-by: Christoph Paasch <cpaasch@apple.com>
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/374
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Thank you for the two fixes! They look good to me:

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>

As I said on IRC, it looks like your patch 1/2 could also include the
commit with the same title we have queued for net-next, in our export
branch[1]. Please tell me if you are OK with that.

[1] https://github.com/multipath-tcp/mptcp_net-next/commit/58e381ebe218

I just applied the two patches in our tree (fix for -net):

New patches for t/upstream-net and t/upstream:
- f5fbf4c258e7: mptcp: use mptcp_schedule_work instead of open-coding it
(net)
- e8fc17094d85: mptcp: stricter state check in mptcp_worker
- Results: ad9a6d9ee2fb..c50cadb14dbb (export-net)
- Results: 3ab8c05f4350..3dca512f38c6 (export)

Tests are now in progress:

https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export-net/20230327T152231
https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20230327T152231

Cheers,
Matt
Christoph Paasch March 27, 2023, 5:17 p.m. UTC | #6
> On Mar 24, 2023, at 3:34 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> 
> As reported by Christpher, the mptcp protocol can run the
> worker when the relevant msk socket is in an unexpected state:
> 
> connect()
> // incoming reset + fastclose
> // the mptcp worker is scheduled
> mptcp_disconnect()
> // msk is now CLOSED
> listen()
> mptcp_worker()
> 
> Leading to the following splat:
> 
> divide error: 0000 [#1] PREEMPT SMP
> CPU: 1 PID: 21 Comm: kworker/1:0 Not tainted 6.3.0-rc1-gde5e8fd0123c #11
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
> Workqueue: events mptcp_worker
> RIP: 0010:__tcp_select_window+0x22c/0x4b0 net/ipv4/tcp_output.c:3018
> RSP: 0018:ffffc900000b3c98 EFLAGS: 00010293
> RAX: 000000000000ffd7 RBX: 000000000000ffd7 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffffff8214ce97 RDI: 0000000000000004
> RBP: 000000000000ffd7 R08: 0000000000000004 R09: 0000000000010000
> R10: 000000000000ffd7 R11: ffff888005afa148 R12: 000000000000ffd7
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffff88803ed00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000405270 CR3: 000000003011e006 CR4: 0000000000370ee0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> tcp_select_window net/ipv4/tcp_output.c:262 [inline]
> __tcp_transmit_skb+0x356/0x1280 net/ipv4/tcp_output.c:1345
> tcp_transmit_skb net/ipv4/tcp_output.c:1417 [inline]
> tcp_send_active_reset+0x13e/0x320 net/ipv4/tcp_output.c:3459
> mptcp_check_fastclose net/mptcp/protocol.c:2530 [inline]
> mptcp_worker+0x6c7/0x800 net/mptcp/protocol.c:2705
> process_one_work+0x3bd/0x950 kernel/workqueue.c:2390
> worker_thread+0x5b/0x610 kernel/workqueue.c:2537
> kthread+0x138/0x170 kernel/kthread.c:376
> ret_from_fork+0x2c/0x50 arch/x86/entry/entry_64.S:308
> </TASK>
> 
> This change address the issue explicitly checking for bad states
> before running the mptcp worker.
> 
> Reported-by: Christoph Paasch <cpaasch@apple.com>
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/374
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> net/mptcp/protocol.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)

Works!

Tested-by: Christoph Paasch <cpaasch@apple.com>
diff mbox series

Patch

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 3a8add6aa74c..1116a64d072e 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2697,7 +2697,7 @@  static void mptcp_worker(struct work_struct *work)
 
 	lock_sock(sk);
 	state = sk->sk_state;
-	if (unlikely(state == TCP_CLOSE))
+	if (unlikely((1 << state) & (TCPF_CLOSE | TCPF_LISTEN)))
 		goto unlock;
 
 	mptcp_check_data_fin_ack(sk);