Message ID | 20240919124412.3014326-1-willemdebruijn.kernel@gmail.com |
---|---|
State | Accepted |
Commit | 72ef07554c5dcabb0053a147c4fd221a8e39bcfd |
Series | [net] selftests/net: packetdrill: increase timing tolerance in debug mode |

On Thu, Sep 19, 2024 at 08:43:42AM -0400, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@google.com>
>
> Some packetdrill tests are flaky in debug mode. As discussed, increase
> tolerance.
>
> We have been doing this for debug builds outside ksft too.
>
> Previous setting was 10000. A manual 50 runs in virtme-ng showed two
> failures that needed 12000. To be on the safe side, Increase to 14000.
>
> Link: https://lore.kernel.org/netdev/Zuhhe4-MQHd3EkfN@mini-arch/
> Fixes: 1e42f73fd3c2 ("selftests/net: packetdrill: import tcp/zerocopy")
> Reported-by: Stanislav Fomichev <sdf@fomichev.me>
> Signed-off-by: Willem de Bruijn <willemb@google.com>

Reviewed-by: Simon Horman <horms@kernel.org>

On 09/19, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@google.com>
>
> Some packetdrill tests are flaky in debug mode. As discussed, increase
> tolerance.
>
> We have been doing this for debug builds outside ksft too.
>
> Previous setting was 10000. A manual 50 runs in virtme-ng showed two
> failures that needed 12000. To be on the safe side, Increase to 14000.
>
> Link: https://lore.kernel.org/netdev/Zuhhe4-MQHd3EkfN@mini-arch/
> Fixes: 1e42f73fd3c2 ("selftests/net: packetdrill: import tcp/zerocopy")
> Reported-by: Stanislav Fomichev <sdf@fomichev.me>
> Signed-off-by: Willem de Bruijn <willemb@google.com>

Acked-by: Stanislav Fomichev <sdf@fomichev.me>

Thanks! Should probably go to net-next though? (Not sure what's
the bar for selftests fixes for 'net')

Hi Willem,

On 19/09/2024 14:43, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@google.com>
>
> Some packetdrill tests are flaky in debug mode. As discussed, increase
> tolerance.

Thank you for the patch!

> We have been doing this for debug builds outside ksft too.
>
> Previous setting was 10000. A manual 50 runs in virtme-ng showed two
> failures that needed 12000. To be on the safe side, Increase to 14000.

So far (in 3 runs), it looks like 14000 is enough. But I guess it is
still a bit too early to conclude that.

https://netdev.bots.linux.dev/contest.html?executor=vmksft-packetdrill-dbg

(Your patch has been introduced in the net-next-2024-09-19--15-00 branch.)

Personally, I would not be shocked if the tolerance was even 10x higher
to cope with this very slow environment, where we care less about
timing, I think. But if less works, that's good:

Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>

Just one question for later: in the GitHub repo, some tests set the
tolerance in the .pkt file. Will it be OK for these tests? I guess yes,
because the max they set is 10k, but I just want to double-check.

(Note that it is now easier to spot other errors :) e.g.)

https://netdev-3.bots.linux.dev/vmksft-packetdrill-dbg/results/779660/22-tcp-zerocopy-epoll-exclusive-pkt/stdout

Cheers,
Matt

On 9/19/24 23:04, Stanislav Fomichev wrote:
> On 09/19, Willem de Bruijn wrote:
>> From: Willem de Bruijn <willemb@google.com>
>>
>> Some packetdrill tests are flaky in debug mode. As discussed, increase
>> tolerance.
>>
>> We have been doing this for debug builds outside ksft too.
>>
>> Previous setting was 10000. A manual 50 runs in virtme-ng showed two
>> failures that needed 12000. To be on the safe side, Increase to 14000.
>>
>> Link: https://lore.kernel.org/netdev/Zuhhe4-MQHd3EkfN@mini-arch/
>> Fixes: 1e42f73fd3c2 ("selftests/net: packetdrill: import tcp/zerocopy")
>> Reported-by: Stanislav Fomichev <sdf@fomichev.me>
>> Signed-off-by: Willem de Bruijn <willemb@google.com>
>
> Acked-by: Stanislav Fomichev <sdf@fomichev.me>
>
> Thanks! Should probably go to net-next though? (Not sure what's
> the bar for selftests fixes for 'net')

FTR, we want this kind of fixes in net, to reach self-test stability in
both trees ASAP.

Cheers,

Paolo

Hello:

This patch was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Thu, 19 Sep 2024 08:43:42 -0400 you wrote:
> From: Willem de Bruijn <willemb@google.com>
>
> Some packetdrill tests are flaky in debug mode. As discussed, increase
> tolerance.
>
> We have been doing this for debug builds outside ksft too.
>
> [...]

Here is the summary with links:
  - [net] selftests/net: packetdrill: increase timing tolerance in debug mode
    https://git.kernel.org/netdev/net/c/72ef07554c5d

You are awesome, thank you!

Hi Willem,

On 20/09/2024 00:03, Matthieu Baerts wrote:
> On 19/09/2024 14:43, Willem de Bruijn wrote:
>> From: Willem de Bruijn <willemb@google.com>

(...)

>> We have been doing this for debug builds outside ksft too.
>>
>> Previous setting was 10000. A manual 50 runs in virtme-ng showed two
>> failures that needed 12000. To be on the safe side, Increase to 14000.
>
> So far (in 3 runs), it looks like 14000 is enough. But I guess it is
> still a bit too early to conclude that.
>
> https://netdev.bots.linux.dev/contest.html?executor=vmksft-packetdrill-dbg
>
> (Your patch has been introduced in the net-next-2024-09-19--15-00 branch.)

One week after the introduction of this patch and >50 builds, it looks
like the results are good, only one issue related to timing issues:

https://netdev-3.bots.linux.dev/vmksft-packetdrill-dbg/results/782181/1-tcp-slow-start-slow-start-after-win-update-pkt/stdout

And it passed after a retry.

https://netdev.bots.linux.dev/flakes.html?min-flip=0&tn-needle=packetdrill

Cheers,
Matt

Matthieu Baerts wrote:
> Hi Willem,
>
> On 20/09/2024 00:03, Matthieu Baerts wrote:
> > On 19/09/2024 14:43, Willem de Bruijn wrote:
> >> From: Willem de Bruijn <willemb@google.com>
> >
> > (...)
> >
> >> We have been doing this for debug builds outside ksft too.
> >>
> >> Previous setting was 10000. A manual 50 runs in virtme-ng showed two
> >> failures that needed 12000. To be on the safe side, Increase to 14000.
> >
> > So far (in 3 runs), it looks like 14000 is enough. But I guess it is
> > still a bit too early to conclude that.
> >
> > https://netdev.bots.linux.dev/contest.html?executor=vmksft-packetdrill-dbg
> >
> > (Your patch has been introduced in the net-next-2024-09-19--15-00 branch.)
>
> One week after the introduction of this patch and >50 builds, it looks
> like the results are good, only one issue related to timing issues:
>
> https://netdev-3.bots.linux.dev/vmksft-packetdrill-dbg/results/782181/1-tcp-slow-start-slow-start-after-win-update-pkt/stdout
>
> And it passed after a retry.
>
> https://netdev.bots.linux.dev/flakes.html?min-flip=0&tn-needle=packetdrill

Thanks Matthieu.

diff --git a/tools/testing/selftests/net/packetdrill/ksft_runner.sh b/tools/testing/selftests/net/packetdrill/ksft_runner.sh
index 7478c0c0c9aa..4071c133f29e 100755
--- a/tools/testing/selftests/net/packetdrill/ksft_runner.sh
+++ b/tools/testing/selftests/net/packetdrill/ksft_runner.sh
@@ -30,12 +30,17 @@ if [ -z "$(which packetdrill)" ]; then
 	exit "$KSFT_SKIP"
 fi
 
+declare -a optargs
+if [[ -n "${KSFT_MACHINE_SLOW}" ]]; then
+	optargs+=('--tolerance_usecs=14000')
+fi
+
 ktap_print_header
 ktap_set_plan 2
 
-unshare -n packetdrill ${ipv4_args[@]} $(basename $script) > /dev/null \
+unshare -n packetdrill ${ipv4_args[@]} ${optargs[@]} $(basename $script) > /dev/null \
 	&& ktap_test_pass "ipv4" || ktap_test_fail "ipv4"
-unshare -n packetdrill ${ipv6_args[@]} $(basename $script) > /dev/null \
+unshare -n packetdrill ${ipv6_args[@]} ${optargs[@]} $(basename $script) > /dev/null \
 	&& ktap_test_pass "ipv6" || ktap_test_fail "ipv6"
 
 ktap_finished
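
For readers unfamiliar with the idiom, the patch relies on an optional bash array: `optargs` stays empty unless `KSFT_MACHINE_SLOW` is set to a non-empty value, and an empty array expands to zero words on the packetdrill command line. The sketch below isolates that behaviour so it can be tried without packetdrill installed. The function name `build_optargs` is hypothetical and not part of `ksft_runner.sh`; only the variable name and the `--tolerance_usecs=14000` value are taken from the patch.

```shell
#!/bin/bash
# Illustrative sketch only: build_optargs is a hypothetical helper,
# not a function in ksft_runner.sh.
build_optargs() {
	local -a optargs=()

	# Same condition as the patch: any non-empty value counts as "slow".
	if [[ -n "${KSFT_MACHINE_SLOW:-}" ]]; then
		optargs+=('--tolerance_usecs=14000')
	fi

	# Print one option per line; print nothing for an empty array.
	if (( ${#optargs[@]} > 0 )); then
		printf '%s\n' "${optargs[@]}"
	fi
}

# Normal machine: the variable is unset, so no extra option is produced.
unset KSFT_MACHINE_SLOW
echo "fast: [$(build_optargs)]"

# Slow (debug) machine: the wider tolerance is added.
KSFT_MACHINE_SLOW=yes
echo "slow: [$(build_optargs)]"
```

Run with bash, this should print `fast: []` followed by `slow: [--tolerance_usecs=14000]`, mirroring how the runner would leave the packetdrill invocation unchanged on normal builds.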