diff mbox series

[net] selftests/net: big_tcp: longer netperf session on slow machines

Message ID b800a71479a24a4142542051636e980c3b547434.1739794830.git.pablmart@redhat.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [net] selftests/net: big_tcp: longer netperf session on slow machines | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present fail Series targets non-next tree, but doesn't contain any Fixes tags
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/build_tools success Errors and warnings before: 26 (+1) this patch: 26 (+1)
netdev/cc_maintainers warning 1 maintainers not CCed: linux-kselftest@vger.kernel.org
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/verify_signedoff fail author Signed-off-by missing
netdev/deprecated_api success None detected
netdev/check_selftest success net selftest script(s) already in Makefile
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 35 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2025-02-17--21-00 (tests: 886)

Commit Message

Pablo Martin Medrano Feb. 17, 2025, 12:32 p.m. UTC
After debugging the following output for big_tcp.sh on a board:

CLI GSO | GW GRO | GW GSO | SER GRO
on        on       on       on      : [PASS]
on        off      on       off     : [PASS]
off       on       on       on      : [FAIL_on_link1]
on        on       off      on      : [FAIL_on_link1]

Davide Caratti found that by default the test duration 1s is too short
in slow systems to reach the correct cwd size necessary for tcp/ip to
generate at least one packet bigger than 65536 to hit the iptables match
on length rule the test uses.

This skips (with xfail) the aforementioned failing combinations when
KSFT_MACHINE_SLOW is set.
---
 tools/testing/selftests/net/big_tcp.sh | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

Comments

Petr Machata Feb. 17, 2025, 5:50 p.m. UTC | #1
Pablo Martin Medrano <pablmart@redhat.com> writes:

> After debugging the following output for big_tcp.sh on a board:
>
> CLI GSO | GW GRO | GW GSO | SER GRO
> on        on       on       on      : [PASS]
> on        off      on       off     : [PASS]
> off       on       on       on      : [FAIL_on_link1]
> on        on       off      on      : [FAIL_on_link1]
>
> Davide Caratti found that by default the test duration 1s is too short
> in slow systems to reach the correct cwd size necessary for tcp/ip to
> generate at least one packet bigger than 65536 to hit the iptables match
> on length rule the test uses.
>
> This skips (with xfail) the aforementioned failing combinations when
> KSFT_MACHINE_SLOW is set.
> ---
>  tools/testing/selftests/net/big_tcp.sh | 23 +++++++++++++++--------
>  1 file changed, 15 insertions(+), 8 deletions(-)
>
> diff --git a/tools/testing/selftests/net/big_tcp.sh b/tools/testing/selftests/net/big_tcp.sh
> index 2db9d15cd45f..e613dc3d84ad 100755
> --- a/tools/testing/selftests/net/big_tcp.sh
> +++ b/tools/testing/selftests/net/big_tcp.sh
> @@ -21,8 +21,7 @@ CLIENT_GW6="2001:db8:1::2"
>  MAX_SIZE=128000
>  CHK_SIZE=65535
>  
> -# Kselftest framework requirement - SKIP code is 4.
> -ksft_skip=4
> +source lib.sh
>  
>  setup() {
>  	ip netns add $CLIENT_NS
> @@ -157,12 +156,20 @@ do_test() {
>  }
>  
>  testup() {
> -	echo "CLI GSO | GW GRO | GW GSO | SER GRO" && \
> -	do_test "on"  "on"  "on"  "on"  && \
> -	do_test "on"  "off" "on"  "off" && \
> -	do_test "off" "on"  "on"  "on"  && \
> -	do_test "on"  "on"  "off" "on"  && \
> -	do_test "off" "on"  "off" "on"
> +	echo "CLI GSO | GW GRO | GW GSO | SER GRO"
> +	input_by_test=(
> +	" on  on  on  on"
> +	" on off  on off"
> +	"off  on  on  on"
> +	" on  on off  on"
> +	"off  on off  on"
> +	)
> +	for test_values in "${input_by_test[@]}"; do
> +		do_test ${test_values[0]}
> +		xfail_on_slow check_err $? "test failed"
> +		# check_err sets $RET with $ksft_xfail or $ksft_fail (or 0)
> +		test $RET = 0 || return $RET

This bails out on first failure though, whereas previously it would run
all the tests. Is that intentional?

Looking at the test, it looks like do_test itself could be converted to
lib.sh as follows (sorry, this is a cut-n-paste from the terminal, so
tabs are gone):

@@ -134,3 +133,4 @@ do_test() {
         local ser_gro=$4
-        local ret="PASS"
+
+        RET=0

@@ -145,7 +145,8 @@ do_test() {

-        if check_counter link1 $ROUTER_NS; then
-                check_counter link3 $SERVER_NS || ret="FAIL_on_link3"
-        else
-                ret="FAIL_on_link1"
-        fi
+        check_counter link1 $ROUTER_NS
+        false
+        check_err $? "fail on link1"
+
+        check_counter link3 $SERVER_NS
+        check_err $? "fail on link3"

@@ -153,5 +154,6 @@ do_test() {
         stop_counter link3 $SERVER_NS
-        printf "%-9s %-8s %-8s %-8s: [%s]\n" \
-                $cli_tso $gw_gro $gw_tso $ser_gro $ret
-        test $ret = "PASS"
+
+        log_test "$(printf "%-9s %-8s %-8s %-8s" \
+                            $cli_tso $gw_gro $gw_tso $ser_gro)"
+        :
 }
@@ -159,3 +161,3 @@ do_test() {
 testup() {
-        echo "CLI GSO | GW GRO | GW GSO | SER GRO" && \
+        echo "      CLI GSO | GW GRO | GW GSO | SER GRO" && \
         do_test "on"  "on"  "on"  "on"  && \
@@ -178,2 +177,3 @@ fi
 trap cleanup EXIT
+xfail_on_slow
 setup && echo "Testing for BIG TCP:" && \
@@ -181,2 +181,2 @@ NF=4 testup && echo "***v4 Tests Done***" && \
 NF=6 testup && echo "***v6 Tests Done***"
-exit $?
+exit $EXIT_STATUS

That way you only really touch the bits that do the actual checks to
port them over to the log_test framework. xfail_on_slow() is usually
called on a per-check basis, but if anything in the test can fail, I
think it's fair to just call it like I show so that it toggles the
condition globally.

Then I'm getting this for slow machine with an injected failure:

bash-5.2# KSFT_MACHINE_SLOW=yes ./big_tcp.sh                                                        │
Error: Failed to load TC action module.                                                             │
We have an error talking to the kernel                                                              │
Error: Failed to load TC action module.                                                             │
We have an error talking to the kernel                                                              │
Testing for BIG TCP:                                                                                │
      CLI GSO | GW GRO | GW GSO | SER GRO                                                           │
TEST: on        on       on       on                                [XFAIL]                         │
        fail on link1                                                                               │
TEST: on        off      on       off                               [XFAIL]                         │
        fail on link1                                                                               │
TEST: off       on       on       on                                [XFAIL]                         │
        fail on link1                                                                               │
TEST: on        on       off      on                                [XFAIL]                         │
        fail on link1                                                                               │
TEST: off       on       off      on                                [XFAIL]                         │
        fail on link1                                                                               │
***v4 Tests Done***                                                                                 │
      CLI GSO | GW GRO | GW GSO | SER GRO                                                           │
TEST: on        on       on       on                                [XFAIL]                         │
        fail on link1                                                                               │
TEST: on        off      on       off                               [XFAIL]                         │
        fail on link1                                                                               │
TEST: off       on       on       on                                [XFAIL]                         │
        fail on link1                                                                               │
TEST: on        on       off      on                                [XFAIL]                         │
        fail on link1                                                                               │
TEST: off       on       off      on                                [XFAIL]                         │
        fail on link1                                                                               │
***v6 Tests Done***                                                                                 │
bash-5.2# echo $?                                                                                   │
0                                                                                                   │

... and for non-KSFT_MACHINE_SLOW, I get FAILs with $? of 1, i.e. what
we are after.

> +	done
>  }
>  
>  if ! netperf -V &> /dev/null; then
Pablo Martin Medrano Feb. 18, 2025, 2:26 p.m. UTC | #2
On Mon, 17 Feb 2025, Petr Machata wrote:

> This bails out on first failure though, whereas previously it would run
> all the tests. Is that intentional?

Previously it did bail also on first error, as it did do_test ... && 
do_test && and do_test returned != 0 anytime [PASS] was not printed

But I understand from the semantics of lib.sh that the custom is that all 
tests are passed and then fail/xfail returned if any of them 
failed/xfailed

Thank you Petr! I am resending a patch with your proposed changes
diff mbox series

Patch

diff --git a/tools/testing/selftests/net/big_tcp.sh b/tools/testing/selftests/net/big_tcp.sh
index 2db9d15cd45f..e613dc3d84ad 100755
--- a/tools/testing/selftests/net/big_tcp.sh
+++ b/tools/testing/selftests/net/big_tcp.sh
@@ -21,8 +21,7 @@  CLIENT_GW6="2001:db8:1::2"
 MAX_SIZE=128000
 CHK_SIZE=65535
 
-# Kselftest framework requirement - SKIP code is 4.
-ksft_skip=4
+source lib.sh
 
 setup() {
 	ip netns add $CLIENT_NS
@@ -157,12 +156,20 @@  do_test() {
 }
 
 testup() {
-	echo "CLI GSO | GW GRO | GW GSO | SER GRO" && \
-	do_test "on"  "on"  "on"  "on"  && \
-	do_test "on"  "off" "on"  "off" && \
-	do_test "off" "on"  "on"  "on"  && \
-	do_test "on"  "on"  "off" "on"  && \
-	do_test "off" "on"  "off" "on"
+	echo "CLI GSO | GW GRO | GW GSO | SER GRO"
+	input_by_test=(
+	" on  on  on  on"
+	" on off  on off"
+	"off  on  on  on"
+	" on  on off  on"
+	"off  on off  on"
+	)
+	for test_values in "${input_by_test[@]}"; do
+		do_test ${test_values[0]}
+		xfail_on_slow check_err $? "test failed"
+		# check_err sets $RET with $ksft_xfail or $ksft_fail (or 0)
+		test $RET = 0 || return $RET
+	done
 }
 
 if ! netperf -V &> /dev/null; then