
[net-next] mptcp: deal with large GSO size.

Message ID ce93989da39fd2e47baa4506fbbf4c2049c2da13.1698416467.git.pabeni@redhat.com (mailing list archive)
State Accepted, archived
Commit 6ee098165aecb3d00308d70e27095a4f976167bc
Delegated to: Matthieu Baerts
Series [net-next] mptcp: deal with large GSO size.

Checks

Context Check Description
matttbe/build success Build and static analysis OK
matttbe/checkpatch success total: 0 errors, 0 warnings, 0 checks, 16 lines checked
matttbe/KVM_Validation__normal__except_selftest_mptcp_join_ success Success! ✅
matttbe/KVM_Validation__debug__only_selftest_mptcp_join_ warning Unstable: 1 failed test(s): selftest_mptcp_join
matttbe/KVM_Validation__normal__only_selftest_mptcp_join_ success Success! ✅
matttbe/KVM_Validation__debug__except_selftest_mptcp_join_ success Success! ✅

Commit Message

Paolo Abeni Oct. 27, 2023, 2:21 p.m. UTC
After the blamed commit below, the TCP sockets (and the MPTCP subflows)
can build egress packets larger than 64K. That exceeds the maximum DSS
data size, the length being misrepresented on the wire and the stream being
corrupted, as later observed on the receiver:

WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
PKRU: 55555554
Call Trace:
 <IRQ>
 mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
 subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
 tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
 tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
 tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
 tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
 ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
 ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
 ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
 __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
 process_backlog+0x353/0x660 net/core/dev.c:5974
 __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
 net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
 __do_softirq+0x184/0x524 kernel/softirq.c:553
 do_softirq+0xdd/0x130 kernel/softirq.c:454

Address the issue by explicitly bounding the maximum GSO size to what MPTCP
actually allows.

Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/450
Fixes: 7c4e983c4f3c ("net: allow gso_max_size to exceed 65536")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
This is a rough, quick attempt to address the issue. A cleaner one
would be catching sk_gso_max_size in the control path and bounding the
value there. Done this way to:
- allow earlier testing
- create a much smaller patch.
---
 net/mptcp/protocol.c | 4 ++++
 1 file changed, 4 insertions(+)
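
The truncation is visible in the register dump above: RSI/R12 hold 0x13908
(80136) while RDI/R14 hold 0x3908 (14600), the same value truncated to
16 bits, consistent with the DSS data-level length wrapping. A minimal,
hypothetical userspace sketch of that wrap (illustration only, not kernel
code; the variable names are made up):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical illustration: the DSS option carries the data-level
 * length in a 16-bit field, so the length of any egress packet larger
 * than 64K wraps on the wire.
 */
int main(void)
{
	uint32_t skb_len = 0x13908;		/* 80136: packet > 64K, as in RSI/R12 */
	uint16_t dss_len = (uint16_t)skb_len;	/* what a 16-bit length field keeps */

	assert(dss_len == 0x3908);		/* 14600, as seen in RDI/R14 */
	printf("sent %u bytes, DSS length announces %u\n",
	       (unsigned)skb_len, (unsigned)dss_len);
	return 0;
}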

Comments

MPTCP CI Oct. 27, 2023, 2:42 p.m. UTC | #1
Hi Paolo,

Thank you for your modifications, that's great!

But sadly, our CI spotted some issues with it when trying to build it.

You can find more details there:

  https://patchwork.kernel.org/project/mptcp/patch/ce93989da39fd2e47baa4506fbbf4c2049c2da13.1698416467.git.pabeni@redhat.com/
  https://github.com/multipath-tcp/mptcp_net-next/actions/runs/6668654076

Status: failure
Initiator: MPTCPimporter
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/3ec0ab8000a0

Feel free to reply to this email if you cannot access logs, if you need
some support to fix the error, if this doesn't seem to be caused by your
modifications or if the error is a false positive one.

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)
Paolo Abeni Oct. 27, 2023, 2:46 p.m. UTC | #2
On Fri, 2023-10-27 at 16:21 +0200, Paolo Abeni wrote:
> After the blamed commit below, the TCP sockets (and the MPTCP subflows)
> can build egress packets larger than 64K. That exceeds the maximum DSS
> data size, the length being misrepresented on the wire and the stream being
> corrupted, as later observed on the receiver:
> 
> [...]
> 
> Address the issue by explicitly bounding the maximum GSO size to what MPTCP
> actually allows.
> 
> Reported-by: Christoph Paasch <cpaasch@apple.com>
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/450
> Fixes: 7c4e983c4f3c ("net: allow gso_max_size to exceed 65536")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> This is a rough, quick attempt to address the issue. A cleaner one
> would be catching sk_gso_max_size in the control path and bounding the
> value there. Done this way to:
> - allow earlier testing
> - create a much smaller patch.

To share more details, the other option would need to check and bound
sk_gso_max_size at every place where that field is updated:

1) after icsk_af_ops->syn_recv_sock
2) after icsk_af_ops->rebuild_header
3) after icsk_af_ops->mtu_reduced
4) after icsk_af_ops->queue_xmit
5) after connect()

1) would be straightforward; 2) would require separate modifications for
ipv4 and ipv6; 3) and 4) the same, plus an additional subflow-specific
implementation; and finally 5) is problematic in the fastopen case,
because we would need a new hook inside the TCP code.

All in all, I think we are better off adding just an additional
conditional in the fastpath.
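
For reference, a rough sketch of what just case 1) could look like; the
helper name is hypothetical and this was never implemented:

/* Hypothetical sketch only: bound the subflow's GSO size once in the
 * control path, right after icsk_af_ops->syn_recv_sock (case 1 above).
 * Uses MPTCP_MAX_GSO_SIZE as defined in the patch; cases 2)-5) would
 * still need equivalent hooks.
 */
static void subflow_bound_gso_size(struct sock *ssk)
{
	if (ssk->sk_gso_max_size > MPTCP_MAX_GSO_SIZE)
		ssk->sk_gso_max_size = MPTCP_MAX_GSO_SIZE;
}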

Cheers,

Paolo
Paolo Abeni Oct. 27, 2023, 2:59 p.m. UTC | #3
On Fri, 2023-10-27 at 14:42 +0000, MPTCP CI wrote:
> Hi Paolo,
> 
> Thank you for your modifications, that's great!
> 
> But sadly, our CI spotted some issues with it when trying to build it.
> 
> You can find more details there:
> 
>   https://patchwork.kernel.org/project/mptcp/patch/ce93989da39fd2e47baa4506fbbf4c2049c2da13.1698416467.git.pabeni@redhat.com/
>   https://github.com/multipath-tcp/mptcp_net-next/actions/runs/6668654076
> 
> Status: failure
> Initiator: MPTCPimporter
> Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/3ec0ab8000a0

It looks like the (upstream) tree is broken:

 kernel/bpf/task_iter.c: In function 'bpf_iter_css_task_new':
  kernel/bpf/task_iter.c:919:14: error: 'CSS_TASK_ITER_PROCS' undeclared (first use in this function)
    919 |         case CSS_TASK_ITER_PROCS | CSS_TASK_ITER_THREADED:
        |              ^~~~~~~~~~~~~~~~~~~
  kernel/bpf/task_iter.c:919:14: note: each undeclared identifier is reported only once for each function it appears in
  kernel/bpf/task_iter.c:919:36: error: 'CSS_TASK_ITER_THREADED' undeclared (first use in this function)
    919 |         case CSS_TASK_ITER_PROCS | CSS_TASK_ITER_THREADED:
        |                                    ^~~~~~~~~~~~~~~~~~~~~~
  kernel/bpf/task_iter.c:927:60: error: invalid application of 'sizeof' to incomplete type 'struct css_task_iter'
    927 |         kit->css_it = bpf_mem_alloc(&bpf_global_ma, sizeof(struct css_task_iter));
        |                                                            ^~~~~~
  kernel/bpf/task_iter.c:930:9: error: implicit declaration of function 'css_task_iter_start'; did you mean 'task_seq_start'? [-Werror=implicit-function-declaration]
    930 |         css_task_iter_start(css, flags, kit->css_it);
        |         ^~~~~~~~~~~~~~~~~~~
        |         task_seq_start


/P
MPTCP CI Oct. 27, 2023, 4:41 p.m. UTC | #4
Hi Paolo,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/6054814614814720
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/6054814614814720/summary/summary.txt

- KVM Validation: debug (except selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/6415959792025600
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/6415959792025600/summary/summary.txt

- KVM Validation: normal (only selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/6220357317689344
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/6220357317689344/summary/summary.txt

- KVM Validation: debug (only selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/6343048695644160
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/6343048695644160/summary/summary.txt

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/3ec0ab8000a0


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-debug

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts already made to have a stable
test suite when executed on a public CI like this one, it is possible that
some reported issues are not due to your modifications. Still, do not
hesitate to help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)
MPTCP CI Oct. 31, 2023, 5:26 p.m. UTC | #5
Hi Paolo,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/4778133740584960
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/4778133740584960/summary/summary.txt

- KVM Validation: debug (except selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/5341083694006272
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/5341083694006272/summary/summary.txt

- KVM Validation: debug (only selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/6466983600848896
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/6466983600848896/summary/summary.txt

- KVM Validation: normal (only selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/5904033647427584
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/5904033647427584/summary/summary.txt

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/1c8a63f49894


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-debug

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts already made to have a stable
test suite when executed on a public CI like this one, it is possible that
some reported issues are not due to your modifications. Still, do not
hesitate to help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (Tessares)
Matthieu Baerts (NGI0) Oct. 31, 2023, 5:31 p.m. UTC | #6
Hi Paolo,

On 27/10/2023 16:59, Paolo Abeni wrote:
> On Fri, 2023-10-27 at 14:42 +0000, MPTCP CI wrote:
>> Hi Paolo,
>>
>> Thank you for your modifications, that's great!
>>
>> But sadly, our CI spotted some issues with it when trying to build it.
>>
>> You can find more details there:
>>
>>   https://patchwork.kernel.org/project/mptcp/patch/ce93989da39fd2e47baa4506fbbf4c2049c2da13.1698416467.git.pabeni@redhat.com/
>>   https://github.com/multipath-tcp/mptcp_net-next/actions/runs/6668654076
>>
>> Status: failure
>> Initiator: MPTCPimporter
>> Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/3ec0ab8000a0
> 
> It looks like the (upstream) tree is broken:
> 
>  kernel/bpf/task_iter.c: In function 'bpf_iter_css_task_new':
>  kernel/bpf/task_iter.c:919:14: error: 'CSS_TASK_ITER_PROCS' undeclared (first use in this function)
>     919 |         case CSS_TASK_ITER_PROCS | CSS_TASK_ITER_THREADED:
>         |              ^~~~~~~~~~~~~~~~~~~

Indeed, there is an issue on the BPF side. I sent a patch a few hours ago[1]
and re-launched the CI: now it is all good!

Cheers,
Matt

[1] https://lore.kernel.org/mptcp/ZUEzzc%2FSod8OR28B@krava/T/
Mat Martineau Nov. 1, 2023, 11:55 p.m. UTC | #7
On Fri, 27 Oct 2023, Paolo Abeni wrote:

> On Fri, 2023-10-27 at 16:21 +0200, Paolo Abeni wrote:
>> After the blamed commit below, the TCP sockets (and the MPTCP subflows)
>> can build egress packets larger than 64K. That exceeds the maximum DSS
>> data size, the length being misrepresented on the wire and the stream being
>> corrupted, as later observed on the receiver:
>>
>> [...]
>>
>> Address the issue by explicitly bounding the maximum GSO size to what MPTCP
>> actually allows.
>>
>> Reported-by: Christoph Paasch <cpaasch@apple.com>
>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/450
>> Fixes: 7c4e983c4f3c ("net: allow gso_max_size to exceed 65536")
>> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>> ---
>> This is a rough, quick attempt to address the issue. A cleaner one
>> would be catching sk_gso_max_size in the control path and bounding the
>> value there. Done this way to:
>> - allow earlier testing
>> - create a much smaller patch.
>
> To share more details, the other option would need to check and bound
> sk_gso_max_size at every place where that field is updated:
>
> 1) after icsk_af_ops->syn_recv_sock
> 2) after icsk_af_ops->rebuild_header
> 3) after icsk_af_ops->mtu_reduced
> 4) after icsk_af_ops->queue_xmit
> 5) after connect()
>
> 1) would be straightforward; 2) would require separate modifications for
> ipv4 and ipv6; 3) and 4) the same, plus an additional subflow-specific
> implementation; and finally 5) is problematic in the fastopen case,
> because we would need a new hook inside the TCP code.
>
> All in all, I think we are better off adding just an additional
> conditional in the fastpath.

Hi Paolo -

I only see one place where sk_gso_max_size is updated, in sk_setup_caps():

 	sk->sk_gso_max_size = sk_dst_gso_max_size(sk, dst);

It looks like your list above matches the call sites for sk_setup_caps()?

It seems like an earlier check, and a smaller patch, would be to instead
update the logic in sk_dst_gso_max_size():

 	if (max_size > GSO_LEGACY_MAX_SIZE && (!sk_is_tcp(sk) || sk_is_mptcp(sk)))
 		max_size = GSO_LEGACY_MAX_SIZE;

Fortunately, by the time sk_is_mptcp() is evaluated, it has already been
checked that sk is a struct tcp_sock *. It makes me wonder whether
sk_is_mptcp() should accept any sock pointer; I'll add a GitHub issue for
that.
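
For context, a simplified paraphrase of how that check could sit inside
sk_dst_gso_max_size(); the body below is a sketch from memory, not a
verbatim copy of net/core/sock.c:

/* Sketch only: simplified shape of sk_dst_gso_max_size() with the
 * suggested sk_is_mptcp() tweak folded in.  Details of the real
 * function (e.g. the IPv4 vs IPv6 gso_max_size split) are omitted.
 */
static u32 sk_dst_gso_max_size(struct sock *sk, struct dst_entry *dst)
{
	u32 max_size = READ_ONCE(dst->dev->gso_max_size);

	/* Only plain TCP sockets may exceed the legacy 64K limit;
	 * adding sk_is_mptcp() keeps MPTCP subflows capped too.
	 */
	if (max_size > GSO_LEGACY_MAX_SIZE &&
	    (!sk_is_tcp(sk) || sk_is_mptcp(sk)))
		max_size = GSO_LEGACY_MAX_SIZE;

	return max_size;
}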

- Mat
Mat Martineau Nov. 7, 2023, 5:23 p.m. UTC | #8
On Fri, 27 Oct 2023, Paolo Abeni wrote:

> After the blamed commit below, the TCP sockets (and the MPTCP subflows)
> can build egress packets larger than 64K. That exceeds the maximum DSS
> data size, the length being misrepresented on the wire and the stream being
> corrupted, as later observed on the receiver:
>
> [...]
>
> Address the issue by explicitly bounding the maximum GSO size to what MPTCP
> actually allows.
>
> Reported-by: Christoph Paasch <cpaasch@apple.com>
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/450
> Fixes: 7c4e983c4f3c ("net: allow gso_max_size to exceed 65536")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> This is a rough, quick attempt to address the issue. A cleaner one
> would be catching sk_gso_max_size in the control path and bounding the
> value there. Done this way to:
> - allow earlier testing
> - create a much smaller patch.
> ---
> net/mptcp/protocol.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index b41f8c3e0c00..d522fabc863d 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -1234,6 +1234,8 @@ static void mptcp_update_infinite_map(struct mptcp_sock *msk,
> 	mptcp_do_fallback(ssk);
> }
>
> +#define MPTCP_MAX_GSO_SIZE (GSO_LEGACY_MAX_SIZE - (MAX_TCP_HEADER + 1))
> +
> static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
> 			      struct mptcp_data_frag *dfrag,
> 			      struct mptcp_sendmsg_info *info)
> @@ -1260,6 +1262,8 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
> 		return -EAGAIN;
>
> 	/* compute send limit */
> +	if (unlikely(ssk->sk_gso_max_size > MPTCP_MAX_GSO_SIZE))
> +		ssk->sk_gso_max_size = MPTCP_MAX_GSO_SIZE;
> 	info->mss_now = tcp_send_mss(ssk, &info->size_goal, info->flags);
> 	copy = info->size_goal;
>

As we discussed in the meeting, since the limit is related to the 
MPTCP-specific DSS limitations (which could change) rather than anything 
directly related to GSO, it's helpful to have this check in the MPTCP 
code.

Reviewed-by: Mat Martineau <martineau@kernel.org>
MPTCP CI Nov. 7, 2023, 6:58 p.m. UTC | #9
Hi Paolo,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join):
  - Success! ✅:
  - Task: https://cirrus-ci.com/task/6483594420420608
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/6483594420420608/summary/summary.txt

- KVM Validation: debug (only selftest_mptcp_join):
  - Unstable: 1 failed test(s): selftest_mptcp_join 
Matthieu Baerts (NGI0) Nov. 7, 2023, 9:12 p.m. UTC | #10
Hi Paolo, Mat,

On 27/10/2023 16:21, Paolo Abeni wrote:
> After the blamed commit below, the TCP sockets (and the MPTCP subflows)
> can build egress packets larger than 64K. That exceeds the maximum DSS
> data size, the length being misrepresented on the wire and the stream being
> corrupted, as later observed on the receiver:

Thank you for the patch and the review!

Now in our tree (fix for -net) with Mat's RvB tag:

New patches for t/upstream-net and t/upstream:
- 6ee098165aec: mptcp: deal with large GSO size
- Results: 9fa919e1e582..f1fb06db6c5f (export-net)
- Results: fd17d7e00023..58f8fd527fbf (export)

Tests are now in progress:

https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export-net/20231107T210949
https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20231107T210949

Cheers,
Matt

Patch

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index b41f8c3e0c00..d522fabc863d 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1234,6 +1234,8 @@  static void mptcp_update_infinite_map(struct mptcp_sock *msk,
 	mptcp_do_fallback(ssk);
 }
 
+#define MPTCP_MAX_GSO_SIZE (GSO_LEGACY_MAX_SIZE - (MAX_TCP_HEADER + 1))
+
 static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 			      struct mptcp_data_frag *dfrag,
 			      struct mptcp_sendmsg_info *info)
@@ -1260,6 +1262,8 @@  static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 		return -EAGAIN;
 
 	/* compute send limit */
+	if (unlikely(ssk->sk_gso_max_size > MPTCP_MAX_GSO_SIZE))
+		ssk->sk_gso_max_size = MPTCP_MAX_GSO_SIZE;
 	info->mss_now = tcp_send_mss(ssk, &info->size_goal, info->flags);
 	copy = info->size_goal;
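
As a closing sanity check, the macro arithmetic can be verified on its own.
A hypothetical userspace snippet follows; GSO_LEGACY_MAX_SIZE is 65536 in
the kernel, while the MAX_TCP_HEADER value below is only a config-dependent
placeholder, not the kernel's exact number:

#include <assert.h>
#include <stdio.h>

/* Standalone check of the patch's bound: payload plus full header room
 * must stay strictly below 64K so the 16-bit DSS data-level length can
 * never wrap.
 */
#define GSO_LEGACY_MAX_SIZE	65536u
#define MAX_TCP_HEADER		320u	/* placeholder, config dependent */
#define MPTCP_MAX_GSO_SIZE	(GSO_LEGACY_MAX_SIZE - (MAX_TCP_HEADER + 1))

int main(void)
{
	assert(MPTCP_MAX_GSO_SIZE + MAX_TCP_HEADER < 65536u);
	printf("MPTCP_MAX_GSO_SIZE = %u\n", MPTCP_MAX_GSO_SIZE);
	return 0;
}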