diff mbox series

[bpf-next] selftests/bpf: Handle EAGAIN in bpf_tcp_ca

Message ID 78a17486bd12d7afb4cce565d58dac3f15e55c49.1711620349.git.tanggeliang@kylinos.cn (mailing list archive)
State Rejected
Delegated to: BPF
Headers show
Series [bpf-next] selftests/bpf: Handle EAGAIN in bpf_tcp_ca | expand

Checks

Context Check Description
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-4 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-34 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-next-VM_Test-33 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-9 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-41 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-10 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-11 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-35 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-26 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-30 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-42 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-next-VM_Test-20 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-37 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-19 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-25 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-40 success Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-32 success Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-38 success Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-22 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-39 success Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-6 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for s390x-gcc / test (test_maps, false, 360) / test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-36 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18 and -O2 optimization
bpf/vmtest-bpf-next-VM_Test-31 success Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-15 success Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17 and -O2 optimization
bpf/vmtest-bpf-next-VM_Test-14 success Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for bpf-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 8 this patch: 8
netdev/build_tools success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers warning 2 maintainers not CCed: linux-kselftest@vger.kernel.org yuran.pereira@hotmail.com
netdev/build_clang success Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 8 this patch: 8
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 16 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Geliang Tang March 28, 2024, 10:23 a.m. UTC
From: Geliang Tang <tanggeliang@kylinos.cn>

bpf_tcp_ca tests may emit EAGAIN sometimes. In that case, tests fail with
"bytes != total_bytes" errors. Sending should continue, not break when
errno is EAGAIN. This patch can make bpf_tcp_ca tests stable.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Martin KaFai Lau March 28, 2024, 4:55 p.m. UTC | #1
On 3/28/24 3:23 AM, Geliang Tang wrote:
> From: Geliang Tang <tanggeliang@kylinos.cn>
> 
> bpf_tcp_ca tests may emit EAGAIN sometimes. In that case, tests fail with
> "bytes != total_bytes" errors. Sending should continue, not break when
> errno is EAGAIN. This patch can make bpf_tcp_ca tests stable.
> 
> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> ---
>   tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
> index 077b107130f6..fbc219c2d53b 100644
> --- a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
> +++ b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
> @@ -56,7 +56,7 @@ static void *server(void *arg)
>   	while (bytes < total_bytes && !READ_ONCE(stop)) {
>   		nr_sent = send(fd, &batch,
>   			       MIN(total_bytes - bytes, sizeof(batch)), 0);
> -		if (nr_sent == -1 && errno == EINTR)
> +		if (nr_sent == -1 && (errno == EINTR || errno == EAGAIN))

This is a non blocking socket. EAGAIN is hitting the timeout situation?

The default timeout is 3s and it has not been changed after the recent 
connect_fd_to_fd and start_server simplifications. I don't find bpf CI failing 
in this test in the last month also.

I would prefer to fail after timeout instead of keep retrying. Do you really hit 
that in your environment for this specific bpf_tcp_ca test? There are many tests 
using this timeout value also.
Geliang Tang March 29, 2024, 5:57 a.m. UTC | #2
Hi Martin,

On Thu, 2024-03-28 at 09:55 -0700, Martin KaFai Lau wrote:
> On 3/28/24 3:23 AM, Geliang Tang wrote:
> > From: Geliang Tang <tanggeliang@kylinos.cn>
> > 
> > bpf_tcp_ca tests may emit EAGAIN sometimes. In that case, tests
> > fail with
> > "bytes != total_bytes" errors. Sending should continue, not break
> > when
> > errno is EAGAIN. This patch can make bpf_tcp_ca tests stable.
> > 
> > Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> > ---
> >   tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
> > b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
> > index 077b107130f6..fbc219c2d53b 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
> > @@ -56,7 +56,7 @@ static void *server(void *arg)
> >   	while (bytes < total_bytes && !READ_ONCE(stop)) {
> >   		nr_sent = send(fd, &batch,
> >   			       MIN(total_bytes - bytes,
> > sizeof(batch)), 0);
> > -		if (nr_sent == -1 && errno == EINTR)
> > +		if (nr_sent == -1 && (errno == EINTR || errno ==
> > EAGAIN))
> 
> This is a non blocking socket. EAGAIN is hitting the timeout
> situation?
> 
> The default timeout is 3s and it has not been changed after the
> recent 
> connect_fd_to_fd and start_server simplifications. I don't find bpf
> CI failing 
> in this test in the last month also.
> 
> I would prefer to fail after timeout instead of keep retrying. Do you
> really hit 
> that in your environment for this specific bpf_tcp_ca test? There are
> many tests 
> using this timeout value also.

This is the 2nd patch of "refactor mptcp bpf tests" series:

https://patchwork.kernel.org/project/mptcp/cover/cover.1711688054.git.tanggeliang@kylinos.cn/

I didn't get the mentioned EAGAIN errors in bpf_tcp_ca tests, but got
them in MPTCP BPF sched tests (see patch 1). MPTCP BPF sched tests (not
upstream yet) use the same sending and receiving functions as
bpf_tcp_ca tests (patch 15). So it makes sense to add this fix for
bpf_tcp_ca tests too.

And here's another reason. I want to move these functions from
bpf_tcp_ca into network_helpers as public ones (patch 4), which can be
used by both bpf_tcp_ca and MPTCP BPF sched tests. So we must add this
fix to the public ones too.

Maybe the commit log of this patch needs to be updated. Or I should
send patches 2, 3 and 4 together to bpf-next?

I'd like to hear your opinion.

Thanks,
-Geliang
Martin KaFai Lau March 29, 2024, 5:27 p.m. UTC | #3
On 3/28/24 10:57 PM, Geliang Tang wrote:
> Hi Martin,
> 
> On Thu, 2024-03-28 at 09:55 -0700, Martin KaFai Lau wrote:
>> On 3/28/24 3:23 AM, Geliang Tang wrote:
>>> From: Geliang Tang <tanggeliang@kylinos.cn>
>>>
>>> bpf_tcp_ca tests may emit EAGAIN sometimes. In that case, tests
>>> fail with
>>> "bytes != total_bytes" errors. Sending should continue, not break
>>> when
>>> errno is EAGAIN. This patch can make bpf_tcp_ca tests stable.
>>>
>>> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
>>> ---
>>>    tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c | 4 ++--
>>>    1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
>>> b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
>>> index 077b107130f6..fbc219c2d53b 100644
>>> --- a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
>>> +++ b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
>>> @@ -56,7 +56,7 @@ static void *server(void *arg)
>>>    	while (bytes < total_bytes && !READ_ONCE(stop)) {
>>>    		nr_sent = send(fd, &batch,
>>>    			       MIN(total_bytes - bytes,
>>> sizeof(batch)), 0);
>>> -		if (nr_sent == -1 && errno == EINTR)
>>> +		if (nr_sent == -1 && (errno == EINTR || errno ==
>>> EAGAIN))
>>
>> This is a non blocking socket. EAGAIN is hitting the timeout
>> situation?
>>
>> The default timeout is 3s and it has not been changed after the
>> recent
>> connect_fd_to_fd and start_server simplifications. I don't find bpf
>> CI failing
>> in this test in the last month also.
>>
>> I would prefer to fail after timeout instead of keep retrying. Do you
>> really hit
>> that in your environment for this specific bpf_tcp_ca test? There are
>> many tests
>> using this timeout value also.
> 
> This is the 2nd patch of "refactor mptcp bpf tests" series:
> 
> https://patchwork.kernel.org/project/mptcp/cover/cover.1711688054.git.tanggeliang@kylinos.cn/ >
> I didn't get the mentioned EAGAIN errors in bpf_tcp_ca tests, but got
> them in MPTCP BPF sched tests (see patch 1). MPTCP BPF sched tests (not
> upstream yet) use the same sending and receiving functions as
> bpf_tcp_ca tests (patch 15). So it makes sense to add this fix for
> bpf_tcp_ca tests too.

It sounds like the EAGAIN is specific to mptcp sched test and is not due to a 
timeout? Did you try to increase the timeout and see if it resolves the issue?

afaik, there is no fix needed for the bpf_tcp_ca test. bpf CI expects the test 
to finish. If the default 3s turns out to be too flaky for most tests, it could 
be increased instead of loop testing EAGAIN because the bpf CI expects the test 
to finish. However, so far I don't see this (i.e. 3s is too short) to be the 
case from looking at one month old data in bpf CI.


> And here's another reason. I want to move these functions from
> bpf_tcp_ca into network_helpers as public ones (patch 4), which can be
> used by both bpf_tcp_ca and MPTCP BPF sched tests. So we must add this
> fix to the public ones too.

Lets understand why mptcp sched test hits EAGAIN first.

> 
> Maybe the commit log of this patch needs to be updated. Or I should
> send patches 2, 3 and 4 together to bpf-next?

All selftests/bpf changes have to pass bpf CI. It is always a good idea to 
target bpf-next to kick off the bpf CI to bar any surprise later when it got 
merged into bpf-next.

Unrelated, since we are in the mptcp bpf test, one thing needs to fix in 
test_mptcpify(). The mptcpify test is upgrading a IPPROTO_TCP socket to 
IPPROTO_MPTCP socket. This could break other tests when the test_progs is run in 
parallel (test_progs "-j" which fork() to run prog_tests/* in parallel). It 
could unexpectedly upgrade the tcp socket of another test. One option is to 
limit the upgrade by checking the pid in the mptcpify.c bpf prog.
Geliang Tang April 2, 2024, 10:53 a.m. UTC | #4
Hi Martin,

On Fri, 2024-03-29 at 10:27 -0700, Martin KaFai Lau wrote:
> > > > On 3/28/24 10:57 PM, Geliang Tang wrote:
> > > > > > > > Hi Martin,
> > > > > > > > 
> > > > > > > > On Thu, 2024-03-28 at 09:55 -0700, Martin KaFai Lau
> > > > > > > > wrote:
> > > > > > > > > > > > On 3/28/24 3:23 AM, Geliang Tang wrote:
> > > > > > > > > > > > > > > > From: Geliang Tang
> > > > > > > > > > > > > > > > <tanggeliang@kylinos.cn>
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > bpf_tcp_ca tests may emit EAGAIN
> > > > > > > > > > > > > > > > sometimes. In that
> > > > > > > > > > > > > > > > case, tests
> > > > > > > > > > > > > > > > fail with
> > > > > > > > > > > > > > > > "bytes != total_bytes" errors. Sending
> > > > > > > > > > > > > > > > should continue,
> > > > > > > > > > > > > > > > not
> > > > > > > > > > > > > > > > break
> > > > > > > > > > > > > > > > when
> > > > > > > > > > > > > > > > errno is EAGAIN. This patch can make
> > > > > > > > > > > > > > > > bpf_tcp_ca tests
> > > > > > > > > > > > > > > > stable.
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > Signed-off-by: Geliang Tang
> > > > > > > > > > > > > > > > <tanggeliang@kylinos.cn>
> > > > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > > >   
> > > > > > > > > > > > > > > > tools/testing/selftests/bpf/prog_tests/
> > > > > > > > > > > > > > > > bpf_tcp_ca.c
> > > > > > > > > > > > > > > > | 4 ++--
> > > > > > > > > > > > > > > >    1 file changed, 2 insertions(+), 2
> > > > > > > > > > > > > > > > deletions(-)
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > diff --git
> > > > > > > > > > > > > > > > a/tools/testing/selftests/bpf/prog_test
> > > > > > > > > > > > > > > > s/bpf_tcp_ca.c
> > > > > > > > > > > > > > > > b/tools/testing/selftests/bpf/prog_test
> > > > > > > > > > > > > > > > s/bpf_tcp_ca.c
> > > > > > > > > > > > > > > > index 077b107130f6..fbc219c2d53b 100644
> > > > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > > > a/tools/testing/selftests/bpf/prog_test
> > > > > > > > > > > > > > > > s/bpf_tcp_ca.c
> > > > > > > > > > > > > > > > +++
> > > > > > > > > > > > > > > > b/tools/testing/selftests/bpf/prog_test
> > > > > > > > > > > > > > > > s/bpf_tcp_ca.c
> > > > > > > > > > > > > > > > @@ -56,7 +56,7 @@ static void
> > > > > > > > > > > > > > > > *server(void *arg)
> > > > > > > > > > > > > > > >     while (bytes < total_bytes &&
> > > > > > > > > > > > > > > > !READ_ONCE(stop)) {
> > > > > > > > > > > > > > > >     nr_sent = send(fd, &batch,
> > > > > > > > > > > > > > > >            MIN(total_bytes - bytes,
> > > > > > > > > > > > > > > > sizeof(batch)), 0);
> > > > > > > > > > > > > > > > - if (nr_sent == -1 && errno == EINTR)
> > > > > > > > > > > > > > > > + if (nr_sent == -1 && (errno == EINTR
> > > > > > > > > > > > > > > > || errno ==
> > > > > > > > > > > > > > > > EAGAIN))
> > > > > > > > > > > > 
> > > > > > > > > > > > This is a non blocking socket. EAGAIN is
> > > > > > > > > > > > hitting the
> > > > > > > > > > > > timeout
> > > > > > > > > > > > situation?
> > > > > > > > > > > > 
> > > > > > > > > > > > The default timeout is 3s and it has not been
> > > > > > > > > > > > changed after
> > > > > > > > > > > > the
> > > > > > > > > > > > recent
> > > > > > > > > > > > connect_fd_to_fd and start_server
> > > > > > > > > > > > simplifications. I don't
> > > > > > > > > > > > find
> > > > > > > > > > > > bpf
> > > > > > > > > > > > CI failing
> > > > > > > > > > > > in this test in the last month also.
> > > > > > > > > > > > 
> > > > > > > > > > > > I would prefer to fail after timeout instead of
> > > > > > > > > > > > keep
> > > > > > > > > > > > retrying. Do
> > > > > > > > > > > > you
> > > > > > > > > > > > really hit
> > > > > > > > > > > > that in your environment for this specific
> > > > > > > > > > > > bpf_tcp_ca test?
> > > > > > > > > > > > There
> > > > > > > > > > > > are
> > > > > > > > > > > > many tests
> > > > > > > > > > > > using this timeout value also.
> > > > > > > > 
> > > > > > > > This is the 2nd patch of "refactor mptcp bpf tests"
> > > > > > > > series:
> > > > > > > > 
> > > > > > > > https://patchwork.kernel.org/project/mptcp/cover/cover.1711688054.git.tanggeliang@kylinos.cn/
> > > > > > > >  >
> > > > > > > > I didn't get the mentioned EAGAIN errors in bpf_tcp_ca
> > > > > > > > tests,
> > > > > > > > but
> > > > > > > > got
> > > > > > > > them in MPTCP BPF sched tests (see patch 1). MPTCP BPF
> > > > > > > > sched
> > > > > > > > tests
> > > > > > > > (not
> > > > > > > > upstream yet) use the same sending and receiving
> > > > > > > > functions as
> > > > > > > > bpf_tcp_ca tests (patch 15). So it makes sense to add
> > > > > > > > this fix
> > > > > > > > for
> > > > > > > > bpf_tcp_ca tests too.
> > > > 
> > > > It sounds like the EAGAIN is specific to mptcp sched test and
> > > > is
> > > > not
> > > > due to a 
> > > > timeout? Did you try to increase the timeout and see if it
> > > > resolves
> > > > the issue?
> > > > 
> > > > afaik, there is no fix needed for the bpf_tcp_ca test. bpf CI
> > > > expects
> > > > the test 
> > > > to finish. If the default 3s turns out to be too flaky for most
> > > > tests, it could 
> > > > be increased instead of loop testing EAGAIN because the bpf CI
> > > > expects the test 
> > > > to finish. However, so far I don't see this (i.e. 3s is too
> > > > short)
> > > > to
> > > > be the 
> > > > case from looking at one month old data in bpf CI.

Thanks for your suggestions. I finally fixed this issue by explicitly
setting the server sockets with nonblock flags in MPTCP BPF tests. Then
EAGAIN will not appear in the tests anymore. The fix is here:

https://patchwork.kernel.org/project/mptcp/patch/f0b66813ae8274e5653988d80d16171508f05796.1712042049.git.tanggeliang@kylinos.cn/

That means this patch "selftests/bpf: Handle EAGAIN in bpf_tcp_ca" can
be dropped now.

> > > > > > > > And here's another reason. I want to move these
> > > > > > > > functions from
> > > > > > > > bpf_tcp_ca into network_helpers as public ones (patch
> > > > > > > > 4), which
> > > > > > > > can
> > > > > > > > be
> > > > > > > > used by both bpf_tcp_ca and MPTCP BPF sched tests. So
> > > > > > > > we must
> > > > > > > > add
> > > > > > > > this
> > > > > > > > fix to the public ones too.
> > > > 
> > > > Lets understand why mptcp sched test hits EAGAIN first.

The patches "export send_byte and send_recv_data into network_helpers"
have been sent to bpf-next, please review them when you have time:

https://patchwork.kernel.org/project/netdevbpf/cover/cover.1712039441.git.tanggeliang@kylinos.cn/


> > > > > > > > Maybe the commit log of this patch needs to be updated.
> > > > > > > > Or I
> > > > > > > > should
> > > > > > > > send patches 2, 3 and 4 together to bpf-next?
> > > > 
> > > > All selftests/bpf changes have to pass bpf CI. It is always a
> > > > good
> > > > idea to 
> > > > target bpf-next to kick off the bpf CI to bar any surprise
> > > > later
> > > > when
> > > > it got 
> > > > merged into bpf-next.
> > > > 
> > > > Unrelated, since we are in the mptcp bpf test, one thing needs
> > > > to
> > > > fix
> > > > in 
> > > > test_mptcpify(). The mptcpify test is upgrading a IPPROTO_TCP
> > > > socket
> > > > to 
> > > > IPPROTO_MPTCP socket. This could break other tests when the
> > > > test_progs is run in 
> > > > parallel (test_progs "-j" which fork() to run prog_tests/* in
> > > > parallel). It 
> > > > could unexpectedly upgrade the tcp socket of another test. One
> > > > option
> > > > is to 
> > > > limit the upgrade by checking the pid in the mptcpify.c bpf
> > > > prog.

I just sent a patch named "selftests/bpf: Add pid limit for mptcpify
prog" to fix this and added your "suggested-by" tag in it.

Thanks,
-Geliang

> > > > 
> > > >
diff mbox series

Patch

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
index 077b107130f6..fbc219c2d53b 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
@@ -56,7 +56,7 @@  static void *server(void *arg)
 	while (bytes < total_bytes && !READ_ONCE(stop)) {
 		nr_sent = send(fd, &batch,
 			       MIN(total_bytes - bytes, sizeof(batch)), 0);
-		if (nr_sent == -1 && errno == EINTR)
+		if (nr_sent == -1 && (errno == EINTR || errno == EAGAIN))
 			continue;
 		if (nr_sent == -1) {
 			err = -errno;
@@ -131,7 +131,7 @@  static void do_test(const char *tcp_ca, const struct bpf_map *sk_stg_map)
 	while (bytes < total_bytes && !READ_ONCE(stop)) {
 		nr_recv = recv(fd, &batch,
 			       MIN(total_bytes - bytes, sizeof(batch)), 0);
-		if (nr_recv == -1 && errno == EINTR)
+		if (nr_recv == -1 && (errno == EINTR || errno == EAGAIN))
 			continue;
 		if (nr_recv == -1)
 			break;