Message ID | 20240827090023.8917-1-fw@strlen.de (mailing list archive) |
---|---|
State | Accepted |
Commit | 0a8b08c554dabea952a75363c89050b1fbcbfffb |
Delegated to: | Netdev Maintainers |
Series | [net-next] selftests: netfilter: nft_queue.sh: reduce test file size for debug build |
On Tue, Aug 27, 2024 at 11:00:12AM +0200, Florian Westphal wrote:
> The sctp selftest is very slow on debug kernels.
>
> Reported-by: Jakub Kicinski <kuba@kernel.org>
> Closes: https://lore.kernel.org/netdev/20240826192500.32efa22c@kernel.org/
> Fixes: 4e97d521c2be ("selftests: netfilter: nft_queue.sh: sctp coverage")
> Signed-off-by: Florian Westphal <fw@strlen.de>

Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>

> ---
>  Let's see if CI is happy after this tweak.
>
>  tools/testing/selftests/net/netfilter/nft_queue.sh | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/net/netfilter/nft_queue.sh b/tools/testing/selftests/net/netfilter/nft_queue.sh
> index f3bdeb1271eb..9e5f423bff09 100755
> --- a/tools/testing/selftests/net/netfilter/nft_queue.sh
> +++ b/tools/testing/selftests/net/netfilter/nft_queue.sh
> @@ -39,7 +39,9 @@ TMPFILE2=$(mktemp)
>  TMPFILE3=$(mktemp)
>
>  TMPINPUT=$(mktemp)
> -dd conv=sparse status=none if=/dev/zero bs=1M count=200 of="$TMPINPUT"
> +COUNT=200
> +[ "$KSFT_MACHINE_SLOW" = "yes" ] && COUNT=25
> +dd conv=sparse status=none if=/dev/zero bs=1M count=$COUNT of="$TMPINPUT"
>
>  if ! ip link add veth0 netns "$nsrouter" type veth peer name eth0 netns "$ns1" > /dev/null 2>&1; then
>  	echo "SKIP: No virtual ethernet pair device support in kernel"
> --
> 2.46.0

On Tue, 27 Aug 2024 11:00:12 +0200 Florian Westphal wrote:
> The sctp selftest is very slow on debug kernels.
I think there may be something more going on here? :(
For reference net-next-2024-08-27--12-00 is when this fix got queued:
https://netdev.bots.linux.dev/contest.html?executor=vmksft-nf-dbg&test=nft-queue-sh
Since then we still see occasional flakes. But take a look at
the runtime. If it's happy the test case takes under a minute.
When it's unhappy it times out (after 5 minutes). I'll increase
the timeout to 10 minutes, but 1min vs 5min feels like it may
be getting stuck rather than being slow..
Jakub Kicinski <kuba@kernel.org> wrote:
> On Tue, 27 Aug 2024 11:00:12 +0200 Florian Westphal wrote:
> > The sctp selftest is very slow on debug kernels.
>
> I think there may be something more going on here? :(
>
> For reference net-next-2024-08-27--12-00 is when this fix got queued:
>
> https://netdev.bots.linux.dev/contest.html?executor=vmksft-nf-dbg&test=nft-queue-sh
>
> Since then we still see occasional flakes. But take a look at
> the runtime. If it's happy the test case takes under a minute.
> When it's unhappy it times out (after 5 minutes). I'll increase
> the timeout to 10 minutes, but 1min vs 5min feels like it may
> be getting stuck rather than being slow..

Yes, it's stuck.

Only reason I could imagine is that there is a 2s delay between
starting the nf_queue test prog and the first packet getting sent.

That would make the listener exit early and then socat sender would hang.

I'll test the following tomorrow on an old / slow machine:

diff --git a/tools/testing/selftests/net/netfilter/nft_queue.sh b/tools/testing/selftests/net/netfilter/nft_queue.sh
--- a/tools/testing/selftests/net/netfilter/nft_queue.sh
+++ b/tools/testing/selftests/net/netfilter/nft_queue.sh
@@ -39,7 +39,10 @@ TMPFILE2=$(mktemp)
 TMPFILE3=$(mktemp)
 
 TMPINPUT=$(mktemp)
-dd conv=sparse status=none if=/dev/zero bs=1M count=200 of="$TMPINPUT"
+
+COUNT=200
+[ "$KSFT_MACHINE_SLOW" = "yes" ] && COUNT=25
+dd conv=sparse status=none if=/dev/zero bs=1M count=$COUNT of="$TMPINPUT"
 
 if ! ip link add veth0 netns "$nsrouter" type veth peer name eth0 netns "$ns1" > /dev/null 2>&1; then
 	echo "SKIP: No virtual ethernet pair device support in kernel"
@@ -398,7 +401,7 @@ EOF
 
 	busywait "$BUSYWAIT_TIMEOUT" sctp_listener_ready "$ns2"
 
-	ip netns exec "$nsrouter" ./nf_queue -q 10 -G -t "$timeout" &
+	ip netns exec "$nsrouter" ./nf_queue -q 10 -G &
 	local nfqpid=$!
 
 	ip netns exec "$ns1" socat -u STDIN SCTP:10.0.2.99:12345 <"$TMPINPUT" >/dev/null
@@ -409,6 +412,7 @@ EOF
 	fi
 
 	wait "$rpid" && echo "PASS: sctp and nfqueue in forward chain"
+	kill "$nfqpid"
 
 	if ! diff -u "$TMPINPUT" "$TMPFILE1" ; then
 		echo "FAIL: lost packets?!" 1>&2
@@ -434,7 +438,7 @@ EOF
 
 	busywait "$BUSYWAIT_TIMEOUT" sctp_listener_ready "$ns2"
 
-	ip netns exec "$ns1" ./nf_queue -q 11 -t "$timeout" &
+	ip netns exec "$ns1" ./nf_queue -q 11 &
 	local nfqpid=$!
 
 	ip netns exec "$ns1" socat -u STDIN SCTP:10.0.2.99:12345 <"$TMPINPUT" >/dev/null
@@ -446,6 +450,7 @@ EOF
 
 	# must wait before checking completeness of output file.
 	wait "$rpid" && echo "PASS: sctp and nfqueue in output chain with GSO"
+	kill "$nfqpid"
 
 	if ! diff -u "$TMPINPUT" "$TMPFILE1" ; then
 		echo "FAIL: lost packets?!" 1>&2

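The experimental diff above replaces a self-timed helper (`nf_queue -t "$timeout"`) with one that runs until the script kills it. A minimal sketch of that pattern follows, under loud assumptions: `sleep` stands in for the `./nf_queue` helper, the workload passed as arguments stands in for the socat transfer, and `run_with_helper` is a hypothetical name that does not exist in nft_queue.sh. It illustrates the idea only and is not the fix that was eventually applied.

```shell
#!/bin/sh
# Sketch only: `sleep` stands in for ./nf_queue, and run_with_helper
# is a hypothetical function, not part of nft_queue.sh.
run_with_helper() {
	# Start the helper with no self-imposed timeout, so a slow
	# start of the sender cannot make the helper exit first.
	sleep 600 &
	helper_pid=$!

	# Run the workload given as arguments (stand-in for the
	# socat transfer) and remember its exit status.
	rc=0
	"$@" || rc=$?

	# The caller, not a timer, decides when the helper is done.
	kill "$helper_pid"
	wait "$helper_pid" 2>/dev/null || true

	return "$rc"
}

run_with_helper true && echo "PASS: helper outlived the transfer"
```

The trade-off is that the script now owns the helper's lifetime: if the workload itself hangs, nothing stops the helper, so an outer test timeout is still needed.
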
On 8/29/24 10:01, Florian Westphal wrote:
> Jakub Kicinski <kuba@kernel.org> wrote:
>> On Tue, 27 Aug 2024 11:00:12 +0200 Florian Westphal wrote:
>>> The sctp selftest is very slow on debug kernels.
>>
>> I think there may be something more going on here? :(
>>
>> For reference net-next-2024-08-27--12-00 is when this fix got queued:
>>
>> https://netdev.bots.linux.dev/contest.html?executor=vmksft-nf-dbg&test=nft-queue-sh
>>
>> Since then we still see occasional flakes. But take a look at
>> the runtime. If it's happy the test case takes under a minute.
>> When it's unhappy it times out (after 5 minutes). I'll increase
>> the timeout to 10 minutes, but 1min vs 5min feels like it may
>> be getting stuck rather than being slow..
>
> Yes, it's stuck. Only reason I could imagine is that there is a 2s
> delay between starting the nf_queue test prog and the first packet
> getting sent. That would make the listener exit early and then
> socat sender would hang.

As the root cause of this latter hang looks unrelated, and this patch
improves the current CI status, I'll apply it as-is. The other issue
will be fixed by a separate patch.

Thanks,

Paolo

Hello:

This patch was applied to netdev/net-next.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Tue, 27 Aug 2024 11:00:12 +0200 you wrote:
> The sctp selftest is very slow on debug kernels.
>
> Reported-by: Jakub Kicinski <kuba@kernel.org>
> Closes: https://lore.kernel.org/netdev/20240826192500.32efa22c@kernel.org/
> Fixes: 4e97d521c2be ("selftests: netfilter: nft_queue.sh: sctp coverage")
> Signed-off-by: Florian Westphal <fw@strlen.de>
>
> [...]

Here is the summary with links:
  - [net-next] selftests: netfilter: nft_queue.sh: reduce test file size for debug build
    https://git.kernel.org/netdev/net-next/c/0a8b08c554da

You are awesome, thank you!

diff --git a/tools/testing/selftests/net/netfilter/nft_queue.sh b/tools/testing/selftests/net/netfilter/nft_queue.sh
index f3bdeb1271eb..9e5f423bff09 100755
--- a/tools/testing/selftests/net/netfilter/nft_queue.sh
+++ b/tools/testing/selftests/net/netfilter/nft_queue.sh
@@ -39,7 +39,9 @@ TMPFILE2=$(mktemp)
 TMPFILE3=$(mktemp)
 
 TMPINPUT=$(mktemp)
-dd conv=sparse status=none if=/dev/zero bs=1M count=200 of="$TMPINPUT"
+COUNT=200
+[ "$KSFT_MACHINE_SLOW" = "yes" ] && COUNT=25
+dd conv=sparse status=none if=/dev/zero bs=1M count=$COUNT of="$TMPINPUT"
 
 if ! ip link add veth0 netns "$nsrouter" type veth peer name eth0 netns "$ns1" > /dev/null 2>&1; then
 	echo "SKIP: No virtual ethernet pair device support in kernel"

The sctp selftest is very slow on debug kernels.

Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20240826192500.32efa22c@kernel.org/
Fixes: 4e97d521c2be ("selftests: netfilter: nft_queue.sh: sctp coverage")
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 Let's see if CI is happy after this tweak.

 tools/testing/selftests/net/netfilter/nft_queue.sh | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
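
For readers who want to see the size selection in isolation, here is a small sketch of the logic the patch adds: a 200 MiB input by default, 25 MiB when `KSFT_MACHINE_SLOW` is `yes`. The helper name `pick_count_mib` is hypothetical and is not part of nft_queue.sh, which sets `COUNT` inline. The sketch only prints the chosen size instead of calling `dd`.

```shell
#!/bin/sh
# Sketch only: pick_count_mib is a hypothetical helper, not part of
# nft_queue.sh, which assigns COUNT inline.
pick_count_mib() {
	# $1 mirrors the value of KSFT_MACHINE_SLOW. Anything other
	# than the exact string "yes" keeps the full-size input.
	if [ "$1" = "yes" ]; then
		echo 25
	else
		echo 200
	fi
}

# Treat an unset variable as "not slow".
COUNT=$(pick_count_mib "${KSFT_MACHINE_SLOW:-}")
echo "input file size: ${COUNT} MiB"
```

Assuming the test runner exports the variable, invoking the script as `KSFT_MACHINE_SLOW=yes sh sketch.sh` (file name is illustrative) would select the smaller input.
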