[net-next,0/3] selftests: openvswitch: Address some flakes in the CI environment

Message ID 20240702132830.213384-1-aconole@redhat.com
Series selftests: openvswitch: Address some flakes in the CI environment

Message

Aaron Conole July 2, 2024, 1:28 p.m. UTC
These patches aim to make using the openvswitch testsuite more reliable.
These should address the major sources of flakiness in the openvswitch
test suite allowing the CI infrastructure to exercise the openvswitch
module for patch series.  There should be no change for users who simply
run the tests (except that patch 3/3 does make some of the debugging a bit
easier by making some output more verbose).

Aaron Conole (3):
  selftests: openvswitch: Bump timeout to 15 minutes.
  selftests: openvswitch: Attempt to autoload module.
  selftests: openvswitch: Be more verbose with selftest debugging.

 .../selftests/net/openvswitch/openvswitch.sh  | 23 ++++++++++++-------
 .../selftests/net/openvswitch/settings        |  1 +
 2 files changed, 16 insertions(+), 8 deletions(-)
 create mode 100644 tools/testing/selftests/net/openvswitch/settings
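
For readers without the patches at hand, the idea behind patch 2/3 (attempting to load the module before running the tests) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the series: the helper name `try_load_module` and the `MODPROBE` override variable are hypothetical and exist only for this example. The skip value 4 is the usual kselftest skip exit code.

```shell
#!/bin/sh
# Hypothetical sketch, not the actual patch: try to load a kernel module
# and report a kselftest "skip" if that is not possible.

ksft_skip=4                      # conventional kselftest skip exit code
MODPROBE=${MODPROBE:-modprobe}   # hypothetical override, handy for dry runs

try_load_module() {
    # modprobe exits 0 when the module loads or is already available.
    if "$MODPROBE" -q "$1" 2>/dev/null; then
        return 0
    fi
    echo "SKIP: could not load module $1"
    return "$ksft_skip"
}

# Example use at the top of a test script (needs root on a real system):
#   try_load_module openvswitch || exit $?
```

For patch 1/3, a kselftest `settings` file carries a line of the form `timeout=<seconds>`, so a 15 minute limit corresponds to a value of 900.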

Comments

patchwork-bot+netdevbpf@kernel.org July 4, 2024, 2:40 a.m. UTC | #1
Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue,  2 Jul 2024 09:28:27 -0400 you wrote:
> These patches aim to make using the openvswitch testsuite more reliable.
> These should address the major sources of flakiness in the openvswitch
> test suite allowing the CI infrastructure to exercise the openvswitch
> module for patch series.  There should be no change for users who simply
> run the tests (except that patch 3/3 does make some of the debugging a bit
> easier by making some output more verbose).
> 
> [...]

Here is the summary with links:
  - [net-next,1/3] selftests: openvswitch: Bump timeout to 15 minutes.
    https://git.kernel.org/netdev/net-next/c/ff015706fc73
  - [net-next,2/3] selftests: openvswitch: Attempt to autoload module.
    https://git.kernel.org/netdev/net-next/c/818481db3df4
  - [net-next,3/3] selftests: openvswitch: Be more verbose with selftest debugging.
    https://git.kernel.org/netdev/net-next/c/7abfd8ecb785

You are awesome, thank you!

Jakub Kicinski July 5, 2024, 1:28 p.m. UTC | #2
On Tue,  2 Jul 2024 09:28:27 -0400 Aaron Conole wrote:
> These patches aim to make using the openvswitch testsuite more reliable.
> These should address the major sources of flakiness in the openvswitch
> test suite allowing the CI infrastructure to exercise the openvswitch
> module for patch series.  There should be no change for users who simply
> run the tests (except that patch 3/3 does make some of the debugging a bit
> easier by making some output more verbose).

Hi Aaron!

The results look solid on normal builds now, but with a debug kernel
the test is failing consistently:

https://netdev.bots.linux.dev/contest.html?executor=vmksft-net-dbg&test=openvswitch-sh

Aaron Conole July 5, 2024, 1:49 p.m. UTC | #3
Jakub Kicinski <kuba@kernel.org> writes:

> On Tue,  2 Jul 2024 09:28:27 -0400 Aaron Conole wrote:
>> These patches aim to make using the openvswitch testsuite more reliable.
>> These should address the major sources of flakiness in the openvswitch
>> test suite allowing the CI infrastructure to exercise the openvswitch
>> module for patch series.  There should be no change for users who simply
>> run the tests (except that patch 3/3 does make some of the debugging a bit
>> easier by making some output more verbose).
>
> Hi Aaron!
>
> The results look solid on normal builds now, but with a debug kernel
> the test is failing consistently:
>
> https://netdev.bots.linux.dev/contest.html?executor=vmksft-net-dbg&test=openvswitch-sh

Yes - it shows a test case issue with the upcall and psample tests.

Adrian and I discussed the correct approach would be using a wait_for
instead of just sleeping, because it seems the dbg environment might be
too racy.  I think he is working on a follow up to submit after the
psample work gets merged - we were hoping not to hold that patch series
up with more potential conflicts or merge issues if that's okay.
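
To illustrate the difference between a fixed sleep and a wait_for style check, here is a minimal sketch of such a polling helper. It is an assumption about the general shape of the idea, not the follow-up patch itself: the function signature, the one second polling interval, and the example condition are all hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: poll a command until it succeeds or a timeout expires.
# Usage: wait_for <timeout_seconds> <command> [args...]
wait_for() {
    timeout_s=$1
    shift
    elapsed=0
    while [ "$elapsed" -lt "$timeout_s" ]; do
        # Return as soon as the condition holds; fast systems pay no
        # fixed delay, slow (debug) systems get up to the full timeout.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    # One last check once the timeout has elapsed.
    "$@" >/dev/null 2>&1
}

# Example: instead of "sleep 3; grep -q pattern logfile", one could write
#   wait_for 10 grep -q pattern logfile
```

Unlike a bare sleep, the helper's exit status says whether the condition was ever met, so a test can fail with a clear reason instead of racing against a slow kernel.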

Jakub Kicinski July 5, 2024, 1:53 p.m. UTC | #4
On Fri, 05 Jul 2024 09:49:12 -0400 Aaron Conole wrote:
> > The results look solid on normal builds now, but with a debug kernel
> > the test is failing consistently:
> >
> > https://netdev.bots.linux.dev/contest.html?executor=vmksft-net-dbg&test=openvswitch-sh  
> 
> Yes - it shows a test case issue with the upcall and psample tests.
> 
> Adrian and I discussed the correct approach would be using a wait_for
> instead of just sleeping, because it seems the dbg environment might be
> too racy.  I think he is working on a follow up to submit after the
> psample work gets merged - we were hoping not to hold that patch series
> up with more potential conflicts or merge issues if that's okay.

Makes sense, thanks!

Adrián Moreno July 5, 2024, 2:01 p.m. UTC | #5
On Fri, Jul 05, 2024 at 09:49:12AM GMT, Aaron Conole wrote:
> Jakub Kicinski <kuba@kernel.org> writes:
>
> > On Tue,  2 Jul 2024 09:28:27 -0400 Aaron Conole wrote:
> >> These patches aim to make using the openvswitch testsuite more reliable.
> >> These should address the major sources of flakiness in the openvswitch
> >> test suite allowing the CI infrastructure to exercise the openvswitch
> >> module for patch series.  There should be no change for users who simply
> >> run the tests (except that patch 3/3 does make some of the debugging a bit
> >> easier by making some output more verbose).
> >
> > Hi Aaron!
> >
> > The results look solid on normal builds now, but with a debug kernel
> > the test is failing consistently:
> >
> > https://netdev.bots.linux.dev/contest.html?executor=vmksft-net-dbg&test=openvswitch-sh
>
> Yes - it shows a test case issue with the upcall and psample tests.
>
> Adrian and I discussed the correct approach would be using a wait_for
> instead of just sleeping, because it seems the dbg environment might be
> too racy.  I think he is working on a follow up to submit after the
> psample work gets merged - we were hoping not to hold that patch series
> up with more potential conflicts or merge issues if that's okay.
>

Yes. I am working on a patch to fix the failures on slow systems.

Thanks.
Adrián