Message ID | 02fe7afaec6fac65fc2da5922efe25d6ebdebc88.1706132185.git.pabeni@redhat.com (mailing list archive) |
---|---|
State | Accepted, archived |
Commit | d59403d0c91bd78663dad043b348f1f3ad2cc88d |
Delegated to: | Matthieu Baerts |
Headers | show |
Series | [mptcp-net] mptcp: really cope with fastopen race. | expand |
Context | Check | Description |
---|---|---|
matttbe/build | success | Build and static analysis OK |
matttbe/checkpatch | warning | total: 0 errors, 1 warnings, 0 checks, 9 lines checked |
matttbe/KVM_Validation__normal | warning | Unstable: 2 failed test(s): packetdrill_mp_reset packetdrill_regressions |
matttbe/KVM_Validation__debug__except_selftest_mptcp_join_ | success | Success! ✅ |
matttbe/KVM_Validation__debug__only_selftest_mptcp_join_ | warning | Unstable: 1 failed test(s): selftest_mptcp_join |
Hi Paolo, Thank you for your modifications, that's great! Our CI (GitHub Action) did some validations and here is its report: - KVM Validation: normal: - Unstable: 2 failed test(s): packetdrill_regressions selftest_mptcp_join
Hi Paolo, Thank you for your modifications, that's great! Our CI (Cirrus) did some validations with a debug kernel and here is its report: - KVM Validation: debug (except selftest_mptcp_join): - Critical: 1 Call Trace(s) ❌: - Task: https://cirrus-ci.com/task/5675809764016128 - Summary: https://api.cirrus-ci.com/v1/artifact/task/5675809764016128/summary/summary.txt - KVM Validation: debug (only selftest_mptcp_join): - Critical: 1 Call Trace(s) ❌: - Task: https://cirrus-ci.com/task/5112859810594816 - Summary: https://api.cirrus-ci.com/v1/artifact/task/5112859810594816/summary/summary.txt Initiator: Patchew Applier Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/f24f750c6f62 If there are some issues, you can reproduce them using the same environment as the one used by the CI thanks to a docker image, e.g.: $ cd [kernel source code] $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \ --pull always mptcp/mptcp-upstream-virtme-docker:latest \ auto-debug For more details: https://github.com/multipath-tcp/mptcp-upstream-virtme-docker Please note that despite all the efforts that have been already done to have a stable tests suite when executed on a public CI like here, it is possible some reported issues are not due to your modifications. Still, do not hesitate to help us improve that ;-) Cheers, MPTCP GH Action bot Bot operated by Matthieu Baerts (NGI0 Core)
On Wed, 24 Jan 2024, Paolo Abeni wrote: > Fastopen and PM-trigger subflow shutdown can race, as reported by > syzkaller. > > In my first attempt to close such race, I missed the fact that > the subflow status can change again before the subflow_state_change > callback is invoked. > > Address the issue additionally copying with all the states directly > reachable from TCP_FIN_WAIT1. > > Fixes: 1e777f39b4d7 ("mptcp: add MSG_FASTOPEN sendmsg flag support") > Fixes: 4fd19a307016 ("mptcp: fix inconsistent state on fastopen race") > Reported-by: syzbot+c53d4d3ddb327e80bc51@syzkaller.appspotmail.com > Signed-off-by: Paolo Abeni <pabeni@redhat.com> I wasn't able to reproduce the CI failure, and the change looks good to me. We'll see if CI is happy when this is applied to export-net. Reviewed-by: Mat Martineau <martineau@kernel.org> > --- > net/mptcp/protocol.h | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h > index 4a32d3d11fb6..de04b97e8dd1 100644 > --- a/net/mptcp/protocol.h > +++ b/net/mptcp/protocol.h > @@ -1171,7 +1171,8 @@ static inline bool subflow_simultaneous_connect(struct sock *sk) > { > struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); > > - return (1 << sk->sk_state) & (TCPF_ESTABLISHED | TCPF_FIN_WAIT1) && > + return (1 << sk->sk_state) & > + (TCPF_ESTABLISHED | TCPF_FIN_WAIT1 | TCPF_FIN_WAIT2 | TCPF_CLOSING) && > is_active_ssk(subflow) && > !subflow->conn_finished; > } > -- > 2.43.0 > > >
Hi Paolo, Thank you for your modifications, that's great! Our CI (GitHub Action) did some validations and here is its report: - KVM Validation: normal: - Unstable: 2 failed test(s): packetdrill_mp_reset packetdrill_regressions
Hi Paolo, Thank you for your modifications, that's great! Our CI (Cirrus) did some validations with a debug kernel and here is its report: - KVM Validation: debug (except selftest_mptcp_join): - Success! ✅: - Task: https://cirrus-ci.com/task/4896041841983488 - Summary: https://api.cirrus-ci.com/v1/artifact/task/4896041841983488/summary/summary.txt - KVM Validation: debug (only selftest_mptcp_join): - Unstable: 1 failed test(s): selftest_mptcp_join
On Fri, 2024-01-26 at 17:18 -0800, Mat Martineau wrote: > On Wed, 24 Jan 2024, Paolo Abeni wrote: > > > Fastopen and PM-trigger subflow shutdown can race, as reported by > > syzkaller. > > > > In my first attempt to close such race, I missed the fact that > > the subflow status can change again before the subflow_state_change > > callback is invoked. > > > > Address the issue additionally copying with all the states directly > > reachable from TCP_FIN_WAIT1. > > > > Fixes: 1e777f39b4d7 ("mptcp: add MSG_FASTOPEN sendmsg flag support") > > Fixes: 4fd19a307016 ("mptcp: fix inconsistent state on fastopen race") > > Reported-by: syzbot+c53d4d3ddb327e80bc51@syzkaller.appspotmail.com > > Signed-off-by: Paolo Abeni <pabeni@redhat.com> > > I wasn't able to reproduce the CI failure, and the change looks good to > me. We'll see if CI is happy when this is applied to export-net. I used the repro provided by syzbot upstream: https://lore.kernel.org/netdev/000000000000e8ecae060fae7a47@google.com/ https://syzkaller.appspot.com/x/repro.c?x=17abc23be80000 with my default debug config, and here it reproduced the issue reliably on the unpatched kernel (and no splat with the patched one). > > Reviewed-by: Mat Martineau <martineau@kernel.org> Thanks, Paolo
On Mon, 29 Jan 2024, Paolo Abeni wrote: > On Fri, 2024-01-26 at 17:18 -0800, Mat Martineau wrote: >> On Wed, 24 Jan 2024, Paolo Abeni wrote: >> >>> Fastopen and PM-trigger subflow shutdown can race, as reported by >>> syzkaller. >>> >>> In my first attempt to close such race, I missed the fact that >>> the subflow status can change again before the subflow_state_change >>> callback is invoked. >>> >>> Address the issue additionally copying with all the states directly >>> reachable from TCP_FIN_WAIT1. >>> >>> Fixes: 1e777f39b4d7 ("mptcp: add MSG_FASTOPEN sendmsg flag support") >>> Fixes: 4fd19a307016 ("mptcp: fix inconsistent state on fastopen race") >>> Reported-by: syzbot+c53d4d3ddb327e80bc51@syzkaller.appspotmail.com >>> Signed-off-by: Paolo Abeni <pabeni@redhat.com> >> >> I wasn't able to reproduce the CI failure, and the change looks good to >> me. We'll see if CI is happy when this is applied to export-net. > > I used the repro provided by syzbot upstream: > > https://lore.kernel.org/netdev/000000000000e8ecae060fae7a47@google.com/ > https://syzkaller.appspot.com/x/repro.c?x=17abc23be80000 > > with my default debug config, and here it reproduced the issue reliably > on the unpatched kernel (and no splat with the patched one). > I was not very specific in my wording: I was referring to the CI failures when it applied this patch to export-net: https://lore.kernel.org/mptcp/482549332bcc96a8b4dd37292cb289b271b49511.camel@redhat.com/T/#mfb81b40b8a42912362eede622b5a15b17313d2a2 ...not the failure reported by syzbot that this patch is intended to fix. I did run export-net plus this patch through the mptcp-upstream-virtme-docker environment and it worked fine, so I think it's ok to apply. - Mat
On Mon, 2024-01-29 at 11:16 -0800, Mat Martineau wrote: > On Mon, 29 Jan 2024, Paolo Abeni wrote: > > > On Fri, 2024-01-26 at 17:18 -0800, Mat Martineau wrote: > > > On Wed, 24 Jan 2024, Paolo Abeni wrote: > > > > > > > Fastopen and PM-trigger subflow shutdown can race, as reported by > > > > syzkaller. > > > > > > > > In my first attempt to close such race, I missed the fact that > > > > the subflow status can change again before the subflow_state_change > > > > callback is invoked. > > > > > > > > Address the issue additionally copying with all the states directly > > > > reachable from TCP_FIN_WAIT1. > > > > > > > > Fixes: 1e777f39b4d7 ("mptcp: add MSG_FASTOPEN sendmsg flag support") > > > > Fixes: 4fd19a307016 ("mptcp: fix inconsistent state on fastopen race") > > > > Reported-by: syzbot+c53d4d3ddb327e80bc51@syzkaller.appspotmail.com > > > > Signed-off-by: Paolo Abeni <pabeni@redhat.com> > > > > > > I wasn't able to reproduce the CI failure, and the change looks good to > > > me. We'll see if CI is happy when this is applied to export-net. > > > > I used the repro provided by syzbot upstream: > > > > https://lore.kernel.org/netdev/000000000000e8ecae060fae7a47@google.com/ > > https://syzkaller.appspot.com/x/repro.c?x=17abc23be80000 > > > > with my default debug config, and here it reproduced the issue reliably > > on the unpatched kernel (and no splat with the patched one). > > > > I was not very specific in my wording: I was referring to the CI failures > when it applied this patch to export-net: > > https://lore.kernel.org/mptcp/482549332bcc96a8b4dd37292cb289b271b49511.camel@redhat.com/T/#mfb81b40b8a42912362eede622b5a15b17313d2a2 Ah, that one! I think it's caused by: commit 198bc90e0e734e5f98c3d2833e8390cac3df61b2 Author: Zhengchao Shao <shaozhengchao@huawei.com> Date: Thu Jan 18 09:20:19 2024 +0800 tcp: make sure init the accept_queue's spinlocks once and fixed by: commit 435e202d645c197dcfd39d7372eb2a56529b6640 (net68/tmp_png) Author: Zhengchao Shao <shaozhengchao@huawei.com> Date: Mon Jan 22 18:20:01 2024 +0800 ipv6: init the accept_queue's spinlocks in inet6_create Cheers, Paolo
Hi Paolo, Mat, On 24/01/2024 22:36, Paolo Abeni wrote: > Fastopen and PM-trigger subflow shutdown can race, as reported by > syzkaller. > > In my first attempt to close such race, I missed the fact that > the subflow status can change again before the subflow_state_change > callback is invoked. > > Address the issue additionally copying with all the states directly > reachable from TCP_FIN_WAIT1. Thank you for the patch and the review! Now in our tree (fix for -net) without the '.' at the end of the commit title and a Closes tag ;) New patches for t/upstream-net and t/upstream: - d59403d0c91b: mptcp: really cope with fastopen race - Results: ca82f8ccb3cc..6d56aa24e733 (export-net) - Results: 4b8f1ec22243..cdd4fe00f4c5 (export) Tests are now in progress: https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export-net/20240130T120521 https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20240130T120521 Cheers, Matt
On Wed, 2024-01-24 at 22:36 +0100, Paolo Abeni wrote: > Fastopen and PM-trigger subflow shutdown can race, as reported by > syzkaller. > > In my first attempt to close such race, I missed the fact that > the subflow status can change again before the subflow_state_change > callback is invoked. > > Address the issue additionally copying with all the states directly > reachable from TCP_FIN_WAIT1. > > Fixes: 1e777f39b4d7 ("mptcp: add MSG_FASTOPEN sendmsg flag support") > Fixes: 4fd19a307016 ("mptcp: fix inconsistent state on fastopen race") > Reported-by: syzbot+c53d4d3ddb327e80bc51@syzkaller.appspotmail.com Hopefully it's not too late... The above should be: Reported-by: syzbot+732ab7be796ec0d104ac@syzkaller.appspotmail.com @Mat(s), could you please update the path directly in the export branch? Thanks! Paolo
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 4a32d3d11fb6..de04b97e8dd1 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -1171,7 +1171,8 @@ static inline bool subflow_simultaneous_connect(struct sock *sk) { struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); - return (1 << sk->sk_state) & (TCPF_ESTABLISHED | TCPF_FIN_WAIT1) && + return (1 << sk->sk_state) & + (TCPF_ESTABLISHED | TCPF_FIN_WAIT1 | TCPF_FIN_WAIT2 | TCPF_CLOSING) && is_active_ssk(subflow) && !subflow->conn_finished; }
Fastopen and PM-trigger subflow shutdown can race, as reported by syzkaller. In my first attempt to close such race, I missed the fact that the subflow status can change again before the subflow_state_change callback is invoked. Address the issue additionally copying with all the states directly reachable from TCP_FIN_WAIT1. Fixes: 1e777f39b4d7 ("mptcp: add MSG_FASTOPEN sendmsg flag support") Fixes: 4fd19a307016 ("mptcp: fix inconsistent state on fastopen race") Reported-by: syzbot+c53d4d3ddb327e80bc51@syzkaller.appspotmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> --- net/mptcp/protocol.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)