Message ID | bd55c0d5a90b35f7eeee6d132e950ca338ea1d67.1739895412.git.pablmart@redhat.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | [net] selftests/net: big_tcp: longer netperf session on slow machines |
On Tue, 18 Feb 2025 17:19:28 +0100 Pablo Martin Medrano wrote:
> After debugging the following output for big_tcp.sh on a board:
>
> CLI GSO | GW GRO | GW GSO | SER GRO
> on        on       on       on      : [PASS]
> on        off      on       off     : [PASS]
> off       on       on       on      : [FAIL_on_link1]
> on        on       off      on      : [FAIL_on_link1]
>
> Davide Caratti found that by default the test duration 1s is too short
> in slow systems to reach the correct cwd size necessary for tcp/ip to
> generate at least one packet bigger than 65536 (matching the iptables
> match on length rule the test evaluates)

Why not increase the test duration then?
On 2/21/25 1:54 AM, Jakub Kicinski wrote:
> On Tue, 18 Feb 2025 17:19:28 +0100 Pablo Martin Medrano wrote:
>> After debugging the following output for big_tcp.sh on a board:
>>
>> CLI GSO | GW GRO | GW GSO | SER GRO
>> on        on       on       on      : [PASS]
>> on        off      on       off     : [PASS]
>> off       on       on       on      : [FAIL_on_link1]
>> on        on       off      on      : [FAIL_on_link1]
>>
>> Davide Caratti found that by default the test duration 1s is too short
>> in slow systems to reach the correct cwd size necessary for tcp/ip to
>> generate at least one packet bigger than 65536 (matching the iptables
>> match on length rule the test evaluates)
>
> Why not increase the test duration then?

I gave this guidance, as with arbitrary slow machines we would need very
long runtime. Similarly to the packetdrill tests, instead of increasing
the allowed time, simply allow xfail on KSFT_MACHINE_SLOW.

Cheers,

Paolo
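The xfail approach described above can be sketched roughly as follows. This is only an illustration, not the selftest library's implementation: `result_to_exit_code` is a hypothetical helper name, and the sketch assumes that `KSFT_MACHINE_SLOW=yes` marks a slow machine and that the kselftest exit codes are 0 (pass), 1 (fail) and 2 (xfail).

```shell
#!/bin/bash
# Hypothetical helper, NOT taken from the selftests' lib.sh.
# Assumed kselftest exit codes: 0 = pass, 1 = fail, 2 = xfail.
ksft_pass=0
ksft_fail=1
ksft_xfail=2

# Map a raw test result to the code the script should exit with.
# $1: raw result (0 on success, non-zero on failure)
result_to_exit_code() {
	local rc=$1

	if [ "$rc" -eq 0 ]; then
		echo "$ksft_pass"
	elif [ "${KSFT_MACHINE_SLOW:-no}" = "yes" ]; then
		# Slow machine: report an expected failure, not a hard one.
		echo "$ksft_xfail"
	else
		echo "$ksft_fail"
	fi
}
```

A test script could then finish with something like `exit "$(result_to_exit_code "$ret")"`, where `ret` holds the accumulated result, so that a failure on a machine flagged as slow is reported as xfail while the same failure elsewhere stays a hard fail.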
On Fri, 21 Feb 2025, Paolo Abeni wrote:
> On 2/21/25 1:54 AM, Jakub Kicinski wrote:
>> Why not increase the test duration then?
>
> I gave this guidance, as with arbitrary slow machines we would need very
> long runtime. Similarly to the packetdrill tests, instead of increasing
> the allowed time, simply allow xfail on KSFT_MACHINE_SLOW.

I have resubmitted a properly versioned and tagged patch (and with the
right title, as indeed it does not increase the netperf session duration)
at:

https://lore.kernel.org/netdev/23340252eb7bbc1547f5e873be7804adbd7ad092.1739983848.git.pablmart@redhat.com/

In that patch the Fixes: commit, found by Paolo, is the one where the
duration moved from the netperf default (10 seconds) to 1 second. As he
mentions, even with 10 seconds it is not guaranteed that the test will
not fail on slow systems and/or under load, hence the skip/xfail.
On Fri, 21 Feb 2025 10:14:35 +0100 Paolo Abeni wrote:
> >> Davide Caratti found that by default the test duration 1s is too short
> >> in slow systems to reach the correct cwd size necessary for tcp/ip to
> >> generate at least one packet bigger than 65536 (matching the iptables
> >> match on length rule the test evaluates)
> >
> > Why not increase the test duration then?
>
> I gave this guidance, as with arbitrary slow machines we would need very
> long runtime. Similarly to the packetdrill tests, instead of increasing
> the allowed time, simply allow xfail on KSFT_MACHINE_SLOW.

Hm. Wouldn't we ideally specify the flow length in bytes? Instead of
giving all machines 1 sec, ask to transfer ${TDB number of bytes} and
on fast machines it will complete in 1 sec, on slower machines take
longer but have a good chance of still growing the windows?
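For illustration, a byte-bounded flow as suggested above could be sketched like this. It relies on netperf treating a negative `-l` value on stream tests as a byte count instead of a duration in seconds. `netperf_bytes_cmd` is a hypothetical helper, and the server address and byte count used with it are placeholders, not values proposed anywhere in this thread.

```shell
#!/bin/bash
# Hypothetical helper: build a netperf command line for a TCP_STREAM flow
# bounded by bytes transferred rather than by elapsed seconds.
# $1: IP family (4 or 6), $2: server address, $3: bytes to transfer
netperf_bytes_cmd() {
	local family=$1 server=$2 bytes=$3

	# A negative -l value asks a stream test to stop after that many bytes.
	printf '%s\n' "netperf -$family -t TCP_STREAM -l -$bytes -H $server"
}
```

A caller might run it as `ip netns exec "$CLIENT_NS" $(netperf_bytes_cmd 4 198.51.100.1 100000000)`. The amount of data needed for the congestion window to grow far enough is left open here, just as it is in the mail above.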
diff --git a/tools/testing/selftests/net/big_tcp.sh b/tools/testing/selftests/net/big_tcp.sh
index 2db9d15cd45f..dc2ecfd58961 100755
--- a/tools/testing/selftests/net/big_tcp.sh
+++ b/tools/testing/selftests/net/big_tcp.sh
@@ -21,8 +21,7 @@ CLIENT_GW6="2001:db8:1::2"
 MAX_SIZE=128000
 CHK_SIZE=65535
 
-# Kselftest framework requirement - SKIP code is 4.
-ksft_skip=4
+source lib.sh
 
 setup() {
 	ip netns add $CLIENT_NS
@@ -143,21 +142,20 @@ do_test() {
 	start_counter link3 $SERVER_NS
 	do_netperf $CLIENT_NS
 
-	if check_counter link1 $ROUTER_NS; then
-		check_counter link3 $SERVER_NS || ret="FAIL_on_link3"
-	else
-		ret="FAIL_on_link1"
-	fi
+	check_counter link1 $ROUTER_NS
+	check_err $? "fail on link1"
+	check_counter link3 $SERVER_NS
+	check_err $? "fail on link3"
 
 	stop_counter link1 $ROUTER_NS
 	stop_counter link3 $SERVER_NS
-	printf "%-9s %-8s %-8s %-8s: [%s]\n" \
-		$cli_tso $gw_gro $gw_tso $ser_gro $ret
+	log_test "$(printf "%-9s %-8s %-8s %-8s" \
+		$cli_tso $gw_gro $gw_tso $ser_gro)"
 	test $ret = "PASS"
 }
 
 testup() {
-	echo "CLI GSO | GW GRO | GW GSO | SER GRO" && \
+	echo " CLI GSO | GW GRO | GW GSO | SER GRO" && \
 	do_test "on" "on" "on" "on" && \
 	do_test "on" "off" "on" "off" && \
 	do_test "off" "on" "on" "on" && \
@@ -176,7 +174,8 @@ if ! ip link help 2>&1 | grep gso_ipv4_max_size &> /dev/null; then
 fi
 
 trap cleanup EXIT
+xfail_on_slow
 setup && echo "Testing for BIG TCP:" && \
 NF=4 testup && echo "***v4 Tests Done***" && \
 NF=6 testup && echo "***v6 Tests Done***"
-exit $?
+exit $EXIT_STATUS