diff mbox series

[v2,mptcp-net] mptcp: more conservative check for zero probes.

Message ID 0ee2a306b00ef868b993e2426d9d51e14882bd72.1697125647.git.pabeni@redhat.com (mailing list archive)
State Accepted, archived
Commit 5c315b72c90e0d0aaecbe7eb5f3c6b57c3513d64
Delegated to: Matthieu Baerts
Headers show
Series [v2,mptcp-net] mptcp: more conservative check for zero probes. | expand

Checks

Context Check Description
matttbe/build success Build and static analysis OK
matttbe/checkpatch warning total: 0 errors, 2 warnings, 0 checks, 26 lines checked
matttbe/KVM_Validation__normal__except_selftest_mptcp_join_ success Success! ✅
matttbe/KVM_Validation__normal__only_selftest_mptcp_join_ success Success! ✅
matttbe/KVM_Validation__debug__except_selftest_mptcp_join_ success Success! ✅
matttbe/KVM_Validation__debug__only_selftest_mptcp_join_ fail Script error! ❓

Commit Message

Paolo Abeni Oct. 12, 2023, 3:47 p.m. UTC
Christoph reported that the MPTCP protocol can find the subflow-level
write queue unexpectedly not empty while crafting a zero-window probe,
hitting a warning:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 188 at net/mptcp/protocol.c:1312 mptcp_sendmsg_frag+0xc06/0xe70
Modules linked in:
CPU: 0 PID: 188 Comm: kworker/0:2 Not tainted 6.6.0-rc2-g1176aa719d7a #47
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
Workqueue: events mptcp_worker
RIP: 0010:mptcp_sendmsg_frag+0xc06/0xe70 net/mptcp/protocol.c:1312
RAX: 47d0530de347ff6a RBX: 47d0530de347ff6b RCX: ffff8881015d3c00
RDX: ffff8881015d3c00 RSI: 47d0530de347ff6b RDI: 47d0530de347ff6b
RBP: 47d0530de347ff6b R08: ffffffff8243c6a8 R09: ffffffff82042d9c
R10: 0000000000000002 R11: ffffffff82056850 R12: ffff88812a13d580
R13: 0000000000000001 R14: ffff88812b375e50 R15: ffff88812bbf3200
FS:  0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000695118 CR3: 0000000115dfc001 CR4: 0000000000170ef0
Call Trace:
 <TASK>
 __subflow_push_pending+0xa4/0x420 net/mptcp/protocol.c:1545
 __mptcp_push_pending+0x128/0x3b0 net/mptcp/protocol.c:1614
 mptcp_release_cb+0x218/0x5b0 net/mptcp/protocol.c:3391
 release_sock+0xf6/0x100 net/core/sock.c:3521
 mptcp_worker+0x6e8/0x8f0 net/mptcp/protocol.c:2746
 process_scheduled_works+0x341/0x690 kernel/workqueue.c:2630
 worker_thread+0x3a7/0x610 kernel/workqueue.c:2784
 kthread+0x143/0x180 kernel/kthread.c:388
 ret_from_fork+0x4d/0x60 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1b/0x30 arch/x86/entry/entry_64.S:304
 </TASK>

The root cause of the issue is that expectations are wrong: e.g. due
to MPTCP-level re-injection we can hit the critical condition.

Explicitly avoid the zero-window probe when the subflow write queue
is not empty and drop the related warning.

Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/444
Fixes: f70cad1085d1 ("mptcp: stop relying on tcp_tx_skb_cache")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
v1 -> v2:
 - drop additional, obsoleted warn_on() (Mat)
---
 net/mptcp/protocol.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

Comments

MPTCP CI Oct. 12, 2023, 5:20 p.m. UTC | #1
Hi Paolo,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join):
  - Unstable: 1 failed test(s): packetdrill_regressions 
Mat Martineau Oct. 12, 2023, 7:50 p.m. UTC | #2
On Thu, 12 Oct 2023, Paolo Abeni wrote:

> Christoph reported that the MPTCP protocol can find the subflow-level
> write queue unexpectedly not empty while crafting a zero-window probe,
> hitting a warning:
>
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 188 at net/mptcp/protocol.c:1312 mptcp_sendmsg_frag+0xc06/0xe70
> Modules linked in:
> CPU: 0 PID: 188 Comm: kworker/0:2 Not tainted 6.6.0-rc2-g1176aa719d7a #47
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
> Workqueue: events mptcp_worker
> RIP: 0010:mptcp_sendmsg_frag+0xc06/0xe70 net/mptcp/protocol.c:1312
> RAX: 47d0530de347ff6a RBX: 47d0530de347ff6b RCX: ffff8881015d3c00
> RDX: ffff8881015d3c00 RSI: 47d0530de347ff6b RDI: 47d0530de347ff6b
> RBP: 47d0530de347ff6b R08: ffffffff8243c6a8 R09: ffffffff82042d9c
> R10: 0000000000000002 R11: ffffffff82056850 R12: ffff88812a13d580
> R13: 0000000000000001 R14: ffff88812b375e50 R15: ffff88812bbf3200
> FS:  0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000695118 CR3: 0000000115dfc001 CR4: 0000000000170ef0
> Call Trace:
> <TASK>
> __subflow_push_pending+0xa4/0x420 net/mptcp/protocol.c:1545
> __mptcp_push_pending+0x128/0x3b0 net/mptcp/protocol.c:1614
> mptcp_release_cb+0x218/0x5b0 net/mptcp/protocol.c:3391
> release_sock+0xf6/0x100 net/core/sock.c:3521
> mptcp_worker+0x6e8/0x8f0 net/mptcp/protocol.c:2746
> process_scheduled_works+0x341/0x690 kernel/workqueue.c:2630
> worker_thread+0x3a7/0x610 kernel/workqueue.c:2784
> kthread+0x143/0x180 kernel/kthread.c:388
> ret_from_fork+0x4d/0x60 arch/x86/kernel/process.c:147
> ret_from_fork_asm+0x1b/0x30 arch/x86/entry/entry_64.S:304
> </TASK>
>
> The root cause of the issue is that expectations are wrong: e.g. due
> to MPTCP-level re-injection we can hit the critical condition.
>
> Explicitly avoid the zero-window probe when the subflow write queue
> is not empty and drop the related warning.
>
> Reported-by: Christoph Paasch <cpaasch@apple.com>
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/444
> Fixes: f70cad1085d1 ("mptcp: stop relying on tcp_tx_skb_cache")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> v1 -> v2:
> - drop additional, obsoleted warn_on() (Mat)

Thanks Paolo, v2 looks good.

Reviewed-by: Mat Martineau <martineau@kernel.org>

> ---
> net/mptcp/protocol.c | 8 +-------
> 1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 30e0c29ae0a4..48aa4045cb0d 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -1297,7 +1297,7 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
> 	if (copy == 0) {
> 		u64 snd_una = READ_ONCE(msk->snd_una);
>
> -		if (snd_una != msk->snd_nxt) {
> +		if (snd_una != msk->snd_nxt || tcp_write_queue_tail(ssk)) {
> 			tcp_remove_empty_skb(ssk);
> 			return 0;
> 		}
> @@ -1305,11 +1305,6 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
> 		zero_window_probe = true;
> 		data_seq = snd_una - 1;
> 		copy = 1;
> -
> -		/* all mptcp-level data is acked, no skbs should be present into the
> -		 * ssk write queue
> -		 */
> -		WARN_ON_ONCE(reuse_skb);
> 	}
>
> 	copy = min_t(size_t, copy, info->limit - info->sent);
> @@ -1338,7 +1333,6 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
> 	if (reuse_skb) {
> 		TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_PSH;
> 		mpext->data_len += copy;
> -		WARN_ON_ONCE(zero_window_probe);
> 		goto out;
> 	}
>
> -- 
> 2.41.0
>
>
>
MPTCP CI Oct. 12, 2023, 9:15 p.m. UTC | #3
Hi Paolo,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/5346035443171328
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/5346035443171328/summary/summary.txt

- KVM Validation: normal (only selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/6471935350013952
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/6471935350013952/summary/summary.txt

- KVM Validation: debug (except selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/5636778166583296
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/5636778166583296/summary/summary.txt

- KVM Validation: debug (only selftest_mptcp_join):
  - Script error! ❓:
  - Task: https://cirrus-ci.com/task/6190460373303296
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/6190460373303296/summary/summary.txt

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/742d67fc90cb


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-debug

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)
Matthieu Baerts (NGI0) Oct. 13, 2023, 10:19 a.m. UTC | #4
Hi Paolo, Mat,

On 12/10/2023 17:47, Paolo Abeni wrote:
> Christoph reported that the MPTCP protocol can find the subflow-level
> write queue unexpectedly not empty while crafting a zero-window probe,
> hitting a warning:
> 
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 188 at net/mptcp/protocol.c:1312 mptcp_sendmsg_frag+0xc06/0xe70
> Modules linked in:
> CPU: 0 PID: 188 Comm: kworker/0:2 Not tainted 6.6.0-rc2-g1176aa719d7a #47
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
> Workqueue: events mptcp_worker
> RIP: 0010:mptcp_sendmsg_frag+0xc06/0xe70 net/mptcp/protocol.c:1312
> RAX: 47d0530de347ff6a RBX: 47d0530de347ff6b RCX: ffff8881015d3c00
> RDX: ffff8881015d3c00 RSI: 47d0530de347ff6b RDI: 47d0530de347ff6b
> RBP: 47d0530de347ff6b R08: ffffffff8243c6a8 R09: ffffffff82042d9c
> R10: 0000000000000002 R11: ffffffff82056850 R12: ffff88812a13d580
> R13: 0000000000000001 R14: ffff88812b375e50 R15: ffff88812bbf3200
> FS:  0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000695118 CR3: 0000000115dfc001 CR4: 0000000000170ef0
> Call Trace:
>  <TASK>
>  __subflow_push_pending+0xa4/0x420 net/mptcp/protocol.c:1545
>  __mptcp_push_pending+0x128/0x3b0 net/mptcp/protocol.c:1614
>  mptcp_release_cb+0x218/0x5b0 net/mptcp/protocol.c:3391
>  release_sock+0xf6/0x100 net/core/sock.c:3521
>  mptcp_worker+0x6e8/0x8f0 net/mptcp/protocol.c:2746
>  process_scheduled_works+0x341/0x690 kernel/workqueue.c:2630
>  worker_thread+0x3a7/0x610 kernel/workqueue.c:2784
>  kthread+0x143/0x180 kernel/kthread.c:388
>  ret_from_fork+0x4d/0x60 arch/x86/kernel/process.c:147
>  ret_from_fork_asm+0x1b/0x30 arch/x86/entry/entry_64.S:304
>  </TASK>
> 
> The root cause of the issue is that expectations are wrong: e.g. due
> to MPTCP-level re-injection we can hit the critical condition.
> 
> Explicitly avoid the zero-window probe when the subflow write queue
> is not empty and drop the related warning.

(I added a 's' after 'warning' and removed the '.' in the title).

Thank you for the patch and the review!

Now in our tree (fixes for -net):

New patches for t/upstream-net and t/upstream:
- 5c315b72c90e: mptcp: more conservative check for zero probes
- Results: 8bb9623ee0ad..83a52a5f95c9 (export-net)
- Results: 3d46662d175d..c0f977cdc194 (export)

Tests are now in progress:

https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export-net/20231013T100443
https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20231013T100443

Cheers,
Matt
diff mbox series

Patch

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 30e0c29ae0a4..48aa4045cb0d 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1297,7 +1297,7 @@  static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 	if (copy == 0) {
 		u64 snd_una = READ_ONCE(msk->snd_una);
 
-		if (snd_una != msk->snd_nxt) {
+		if (snd_una != msk->snd_nxt || tcp_write_queue_tail(ssk)) {
 			tcp_remove_empty_skb(ssk);
 			return 0;
 		}
@@ -1305,11 +1305,6 @@  static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 		zero_window_probe = true;
 		data_seq = snd_una - 1;
 		copy = 1;
-
-		/* all mptcp-level data is acked, no skbs should be present into the
-		 * ssk write queue
-		 */
-		WARN_ON_ONCE(reuse_skb);
 	}
 
 	copy = min_t(size_t, copy, info->limit - info->sent);
@@ -1338,7 +1333,6 @@  static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 	if (reuse_skb) {
 		TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_PSH;
 		mpext->data_len += copy;
-		WARN_ON_ONCE(zero_window_probe);
 		goto out;
 	}