Message ID | 4538e7bca21fdb7f997c026d74f886aedb547d39.1714654631.git.tanggeliang@kylinos.cn (mailing list archive) |
---|---|
State | Superseded, archived |
Delegated to: | Mat Martineau |
Headers | show |
Series | [mptcp-net] mptcp: fix EAGAIN errors in MPTCP BPF tests | expand |
Context | Check | Description |
---|---|---|
matttbe/build | success | Build and static analysis OK |
matttbe/checkpatch | warning | total: 0 errors, 1 warnings, 0 checks, 11 lines checked |
matttbe/shellcheck | success | MPTCP selftests files have not been modified |
matttbe/KVM_Validation__normal | success | Success! ✅ |
matttbe/KVM_Validation__debug | success | Success! ✅ |
matttbe/KVM_Validation__btf__only_bpftest_all_ | success | Success! ✅ |
Hi Geliang, Thank you for your modifications, that's great! Our CI did some validations and here is its report: - KVM Validation: normal: Success! ✅ - KVM Validation: debug: Success! ✅ - KVM Validation: btf (only bpftest_all): Success! ✅ - Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/8924476878 Initiator: Patchew Applier Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/66c948e86929 Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=849887 If there are some issues, you can reproduce them using the same environment as the one used by the CI thanks to a docker image, e.g.: $ cd [kernel source code] $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \ --pull always mptcp/mptcp-upstream-virtme-docker:latest \ auto-normal For more details: https://github.com/multipath-tcp/mptcp-upstream-virtme-docker Please note that despite all the efforts that have been already done to have a stable tests suite when executed on a public CI like here, it is possible some reported issues are not due to your modifications. Still, do not hesitate to help us improve that ;-) Cheers, MPTCP GH Action bot Bot operated by Matthieu Baerts (NGI0 Core)
On Thu, 2 May 2024, Geliang Tang wrote: > From: Geliang Tang <tanggeliang@kylinos.cn> > > BPF tests fail sometimes with "bytes != total_bytes" errors: > > test_default:PASS:sched_init:default 0 nsec > send_data:PASS:pthread_create 0 nsec > send_data:FAIL:recv 936000 != 10485760 nr_recv:-1 errno:11 > default: 3041 ms > server:FAIL:send 7579500 != 10485760 nr_sent:-1 errno:11 > send_data:FAIL:pthread_join thread_ret:-11 test_default:PASS: \ > has_bytes_sent addr_1 0 nsec > test_default:PASS:has_bytes_sent addr_2 0 nsec > close_netns:PASS:setns 0 nsec > > In this case mptcp_recvmsg gets EAGAIN errors. This issue introduces > by commit dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale"). > This patch fixes it by adding sk_is_mptcp() check defore update > scaling_ratio in tcp_measure_rcv_mss(). Do not update scaling_ratio if > this is a MPTCP socket. > > Fixes: dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale") > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/487 > Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> > --- > net/ipv4/tcp_input.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > index ad8fa129fcfe..39237d6713ee 100644 > --- a/net/ipv4/tcp_input.c > +++ b/net/ipv4/tcp_input.c > @@ -235,8 +235,10 @@ static void tcp_measure_rcv_mss(struct sock *sk, const struct sk_buff *skb) > /* Note: divides are still a bit expensive. > * For the moment, only adjust scaling_ratio > * when we update icsk_ack.rcv_mss. > + * > + * This breaks MPTCP BPF tests, skip it. > */ > - if (unlikely(len != icsk->icsk_ack.rcv_mss)) { > + if (unlikely(len != icsk->icsk_ack.rcv_mss && !sk_is_mptcp(sk))) { Hi Geliang - It's not clear to me why avoiding scaling ratio changes when len > icsk->icsk_ack.rcv_mss is the right solution to the problem. From the conditional right above this diff, it looks like scaling_ratio still could change when len == icsk->icsk_ack.rcv_mss Does the MPTCP code handle that case, or is that not exercised by the tests? Are you sure there isn't an issue with the MPTCP code where it should be able to handle the TCP layer updating scaling_ratios on different subflows? Paolo do you have any suggestions or notes? Thanks, Mat > u64 val = (u64)skb->len << TCP_RMEM_TO_WIN_SCALE; > > do_div(val, skb->truesize); > -- > 2.43.0 > > >
On Mon, 2024-05-06 at 18:02 -0700, Mat Martineau wrote: > On Thu, 2 May 2024, Geliang Tang wrote: > > > From: Geliang Tang <tanggeliang@kylinos.cn> > > > > BPF tests fail sometimes with "bytes != total_bytes" errors: > > > > test_default:PASS:sched_init:default 0 nsec > > send_data:PASS:pthread_create 0 nsec > > send_data:FAIL:recv 936000 != 10485760 nr_recv:-1 errno:11 > > default: 3041 ms > > server:FAIL:send 7579500 != 10485760 nr_sent:-1 errno:11 > > send_data:FAIL:pthread_join thread_ret:-11 test_default:PASS: \ > > has_bytes_sent addr_1 0 nsec > > test_default:PASS:has_bytes_sent addr_2 0 nsec > > close_netns:PASS:setns 0 nsec > > > > In this case mptcp_recvmsg gets EAGAIN errors. This issue > > introduces > > by commit dfa2f0483360 ("tcp: get rid of > > sysctl_tcp_adv_win_scale"). > > This patch fixes it by adding sk_is_mptcp() check defore update > > scaling_ratio in tcp_measure_rcv_mss(). Do not update scaling_ratio > > if > > this is a MPTCP socket. > > > > Fixes: dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale") > > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/487 > > Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> > > --- > > net/ipv4/tcp_input.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > > index ad8fa129fcfe..39237d6713ee 100644 > > --- a/net/ipv4/tcp_input.c > > +++ b/net/ipv4/tcp_input.c > > @@ -235,8 +235,10 @@ static void tcp_measure_rcv_mss(struct sock > > *sk, const struct sk_buff *skb) > > /* Note: divides are still a bit expensive. > > * For the moment, only adjust scaling_ratio > > * when we update icsk_ack.rcv_mss. > > + * > > + * This breaks MPTCP BPF tests, skip it. > > */ > > - if (unlikely(len != icsk->icsk_ack.rcv_mss)) { > > + if (unlikely(len != icsk->icsk_ack.rcv_mss && > > !sk_is_mptcp(sk))) { > > > Hi Geliang - > > It's not clear to me why avoiding scaling ratio changes when > > len > icsk->icsk_ack.rcv_mss > > is the right solution to the problem. From the conditional right > above > this diff, it looks like scaling_ratio still could change when > > len == icsk->icsk_ack.rcv_mss > > Does the MPTCP code handle that case, or is that not exercised by the > tests? Are you sure there isn't an issue with the MPTCP code where it > should be able to handle the TCP layer updating scaling_ratios on > different subflows? > > Paolo do you have any suggestions or notes? Thanks Mat, I finally found the root cause of this issue. It is indeed an MPTCP bug. And v2 has just sent out, renamed as "fix the default value of scaling_ratio". -Geliang > > > Thanks, > Mat > > > u64 val = (u64)skb->len << > > TCP_RMEM_TO_WIN_SCALE; > > > > do_div(val, skb->truesize); > > -- > > 2.43.0 > > > > > > >
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index ad8fa129fcfe..39237d6713ee 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -235,8 +235,10 @@ static void tcp_measure_rcv_mss(struct sock *sk, const struct sk_buff *skb) /* Note: divides are still a bit expensive. * For the moment, only adjust scaling_ratio * when we update icsk_ack.rcv_mss. + * + * This breaks MPTCP BPF tests, skip it. */ - if (unlikely(len != icsk->icsk_ack.rcv_mss)) { + if (unlikely(len != icsk->icsk_ack.rcv_mss && !sk_is_mptcp(sk))) { u64 val = (u64)skb->len << TCP_RMEM_TO_WIN_SCALE; do_div(val, skb->truesize);