
[net-next,v2,0/3] selftests/net: packetdrill: netns and two imports

Message ID 20240912005317.1253001-1-willemdebruijn.kernel@gmail.com

Message

Willem de Bruijn Sept. 12, 2024, 12:52 a.m. UTC
From: Willem de Bruijn <willemb@google.com>

1/3: run in nets, as discussed, and add missing CONFIGs
2/3: import tcp/zerocopy
3/3: import tcp/slow_start

Willem de Bruijn (3):
  selftests/net: packetdrill: run in netns and expand config
  selftests/net: packetdrill: import tcp/zerocopy
  selftests/net: packetdrill: import tcp/slow_start

 .../selftests/net/packetdrill/Makefile        |   1 +
 .../testing/selftests/net/packetdrill/config  |   6 +
 .../selftests/net/packetdrill/ksft_runner.sh  |   4 +-
 .../selftests/net/packetdrill/set_sysctls.py  |  38 ++++++
 ...tcp_slow_start_slow-start-ack-per-1pkt.pkt |  56 +++++++++
 ...tart_slow-start-ack-per-2pkt-send-5pkt.pkt |  33 +++++
 ...tart_slow-start-ack-per-2pkt-send-6pkt.pkt |  34 +++++
 ...tcp_slow_start_slow-start-ack-per-2pkt.pkt |  42 +++++++
 ...tcp_slow_start_slow-start-ack-per-4pkt.pkt |  35 ++++++
 .../tcp_slow_start_slow-start-after-idle.pkt  |  39 ++++++
 ...slow_start_slow-start-after-win-update.pkt |  50 ++++++++
 ...t_slow-start-app-limited-9-packets-out.pkt |  38 ++++++
 .../tcp_slow_start_slow-start-app-limited.pkt |  36 ++++++
 ..._slow_start_slow-start-fq-ack-per-2pkt.pkt |  63 ++++++++++
 .../net/packetdrill/tcp_zerocopy_basic.pkt    |  55 ++++++++
 .../net/packetdrill/tcp_zerocopy_batch.pkt    |  41 ++++++
 .../net/packetdrill/tcp_zerocopy_client.pkt   |  30 +++++
 .../net/packetdrill/tcp_zerocopy_closed.pkt   |  44 +++++++
 .../packetdrill/tcp_zerocopy_epoll_edge.pkt   |  61 +++++++++
 .../tcp_zerocopy_epoll_exclusive.pkt          |  63 ++++++++++
 .../tcp_zerocopy_epoll_oneshot.pkt            |  66 ++++++++++
 .../tcp_zerocopy_fastopen-client.pkt          |  56 +++++++++
 .../tcp_zerocopy_fastopen-server.pkt          |  44 +++++++
 .../net/packetdrill/tcp_zerocopy_maxfrags.pkt | 118 ++++++++++++++++++
 .../net/packetdrill/tcp_zerocopy_small.pkt    |  57 +++++++++
 25 files changed, 1108 insertions(+), 2 deletions(-)
 create mode 100755 tools/testing/selftests/net/packetdrill/set_sysctls.py
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-ack-per-1pkt.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-ack-per-2pkt-send-5pkt.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-ack-per-2pkt-send-6pkt.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-ack-per-2pkt.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-ack-per-4pkt.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-after-idle.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-after-win-update.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-app-limited-9-packets-out.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-app-limited.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-fq-ack-per-2pkt.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_zerocopy_basic.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_zerocopy_batch.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_zerocopy_client.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_zerocopy_closed.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_zerocopy_epoll_edge.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_zerocopy_epoll_exclusive.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_zerocopy_epoll_oneshot.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_zerocopy_fastopen-client.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_zerocopy_fastopen-server.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_zerocopy_maxfrags.pkt
 create mode 100644 tools/testing/selftests/net/packetdrill/tcp_zerocopy_small.pkt
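As a rough illustration of the netns isolation added in patch 1/3: each test is launched in its own private network namespace so concurrent tests cannot clash with each other or with host state. This is a hypothetical sketch, not the actual ksft_runner.sh contents; the wrapper name and flags are assumptions, and the echo makes it a dry run.

```shell
# Illustrative-only sketch of per-test netns isolation. The function
# name is made up for this example; drop the echo to execute for real.
run_packetdrill() {
    script="$1"
    # unshare -n gives the child an empty network namespace;
    # packetdrill creates its tun device and test addresses inside it.
    echo unshare -n -- packetdrill "$script"
}

run_packetdrill tcp_zerocopy_basic.pkt
```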

Comments

Matthieu Baerts (NGI0) Sept. 12, 2024, 11:22 a.m. UTC | #1
Hi Willem,

On 12/09/2024 02:52, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@google.com>
> 
> 1/3: run in nets, as discussed, and add missing CONFIGs
> 2/3: import tcp/zerocopy
> 3/3: import tcp/slow_start

Thank you for the v2. This new version looks good to me:

Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>


I didn't pay too much attention to the new tests, because they look
good, heavily tested I suppose, and I guess the goal is not to diverge
from the original ones for the moment. Still, please note that the CI
reported some timing issues with tcp_zerocopy_closed.pkt when using a
debug kernel config, e.g.

> tcp_zerocopy_closed.pkt:22: timing error: expected system call return at 0.100596 sec but happened at 0.109564 sec; tolerance 0.004000 sec

https://netdev.bots.linux.dev/contest.html?executor=vmksft-packetdrill-dbg&test=tcp-zerocopy-closed-pkt
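For reference, the check behind that message can be sketched as follows. This is a simplified model of packetdrill's timing comparison, not its source; the numbers are taken from the failure quoted above (4000 usec is packetdrill's default --tolerance_usecs).

```shell
# Simplified model of packetdrill's timing check: an event passes if it
# lands within --tolerance_usecs of its scripted time. Values are from
# the CI failure above.
expected_us=100596
actual_us=109564
tolerance_us=4000

delta=$((actual_us - expected_us))
[ "$delta" -lt 0 ] && delta=$((0 - delta))

if [ "$delta" -le "$tolerance_us" ]; then
    echo "ok: off by ${delta} us"
else
    echo "timing error: off by ${delta} us, tolerance ${tolerance_us} us"
fi
# prints: timing error: off by 8968 us, tolerance 4000 us
```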

Cheers,
Matt
Willem de Bruijn Sept. 12, 2024, 12:10 p.m. UTC | #2
Matthieu Baerts wrote:
> Hi Willem,
> 
> On 12/09/2024 02:52, Willem de Bruijn wrote:
> > From: Willem de Bruijn <willemb@google.com>
> > 
> > 1/3: run in nets, as discussed, and add missing CONFIGs
> > 2/3: import tcp/zerocopy
> > 3/3: import tcp/slow_start
> 
> Thank you for the v2. This new version looks good to me:
> 
> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> 
> 
> I didn't pay too much attention to the new tests, because they look
> good, heavily tested I suppose, and I guess the goal is not to diverge
> from the original ones for the moment. Still, please note that the CI
> reported some timing issues with tcp_zerocopy_closed.pkt when using a
> debug kernel config, e.g.
> 
> > tcp_zerocopy_closed.pkt:22: timing error: expected system call return at 0.100596 sec but happened at 0.109564 sec; tolerance 0.004000 sec
> 
> https://netdev.bots.linux.dev/contest.html?executor=vmksft-packetdrill-dbg&test=tcp-zerocopy-closed-pkt

Thanks Matthieu. I did not run the dbg variant often enough to observe
that. Note to self to run more times before I submit.

It seems to fail 2/10 times on the dbg spinner. I don't have an
explanation for the failure yet. The line itself has no expected delay

# script packet:  0.113203 S 0:0(0) <mss 1460,sackOK,TS val 0 ecr 0,nop,wscale 8>
# actual packet:  0.107191 S 0:0(0) win 65535 <mss 1460,sackOK,TS val 0 ecr 0,nop,wscale 8>

   +0.1 recvmsg(4, {msg_name(...)=...,
                    msg_iov(1)=[{...,0}],
                    msg_flags=MSG_ERRQUEUE,
                    msg_control=[]}, MSG_ERRQUEUE) = -1 EAGAIN (Resource temporarily unavailable)

   +0...0 connect(4, ..., ...) = 0

   +0 > S 0:0(0) <mss 1460,sackOK,TS val 0 ecr 0,nop,wscale 8>

I guess the expectation includes the +0.1 delay before calling recvmsg, and that
timer fired a bit early.

I previously shared a draft patch to adjust --tolerance_usecs in dbg runs.
May have to send that after all.

https://lore.kernel.org/netdev/66da5b8b27259_27bb41294c@willemb.c.googlers.com.notmuch/
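A minimal sketch of that idea: the runner could widen the tolerance when it detects a slow (e.g. debug-config) machine. The KSFT_MACHINE_SLOW variable and detection mechanism are assumptions for illustration; only the 4000 usec default and the 10000 usec value mentioned later in this thread come from the discussion.

```shell
# Hypothetical sketch of bumping packetdrill's timing tolerance on slow
# (debug-config) kernels; variable name is an assumption.
tolerance_usecs() {
    if [ "${KSFT_MACHINE_SLOW:-no}" = "yes" ]; then
        echo 10000   # debug kernels need more slack
    else
        echo 4000    # packetdrill default
    fi
}

KSFT_MACHINE_SLOW=yes
echo "would run: packetdrill --tolerance_usecs=$(tolerance_usecs)"
```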
patchwork-bot+netdevbpf@kernel.org Sept. 13, 2024, 2:20 a.m. UTC | #3
Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 11 Sep 2024 20:52:39 -0400 you wrote:
> From: Willem de Bruijn <willemb@google.com>
> 
> 1/3: run in nets, as discussed, and add missing CONFIGs
> 2/3: import tcp/zerocopy
> 3/3: import tcp/slow_start
> 
> Willem de Bruijn (3):
>   selftests/net: packetdrill: run in netns and expand config
>   selftests/net: packetdrill: import tcp/zerocopy
>   selftests/net: packetdrill: import tcp/slow_start
> 
> [...]

Here is the summary with links:
  - [net-next,v2,1/3] selftests/net: packetdrill: run in netns and expand config
    https://git.kernel.org/netdev/net-next/c/cded7e0479c9
  - [net-next,v2,2/3] selftests/net: packetdrill: import tcp/zerocopy
    https://git.kernel.org/netdev/net-next/c/1e42f73fd3c2
  - [net-next,v2,3/3] selftests/net: packetdrill: import tcp/slow_start
    https://git.kernel.org/netdev/net-next/c/e874be276ee4

You are awesome, thank you!
Stanislav Fomichev Sept. 16, 2024, 4:48 p.m. UTC | #4
On 09/12, Willem de Bruijn wrote:
> Matthieu Baerts wrote:
> > Hi Willem,
> > 
> > On 12/09/2024 02:52, Willem de Bruijn wrote:
> > > From: Willem de Bruijn <willemb@google.com>
> > > 
> > > 1/3: run in nets, as discussed, and add missing CONFIGs
> > > 2/3: import tcp/zerocopy
> > > 3/3: import tcp/slow_start
> > 
> > Thank you for the v2. This new version looks good to me:
> > 
> > Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> > 
> > 
> > I didn't pay too much attention to the new tests, because they look
> > good, heavily tested I suppose, and I guess the goal is not to diverge
> > from the original ones for the moment. Still, please note that the CI
> > reported some timing issues with tcp_zerocopy_closed.pkt when using a
> > debug kernel config, e.g.
> > 
> > > tcp_zerocopy_closed.pkt:22: timing error: expected system call return at 0.100596 sec but happened at 0.109564 sec; tolerance 0.004000 sec
> > 
> > https://netdev.bots.linux.dev/contest.html?executor=vmksft-packetdrill-dbg&test=tcp-zerocopy-closed-pkt
> 
> Thanks Matthieu. I did not run the dbg variant often enough to observe
> that. Note to self to run more times before I submit.
> 
> It seems to fail 2/10 times on the dbg spinner. I don't have an
> explanation for the failure yet. The line itself has no expected delay
> 
> # script packet:  0.113203 S 0:0(0) <mss 1460,sackOK,TS val 0 ecr 0,nop,wscale 8>
> # actual packet:  0.107191 S 0:0(0) win 65535 <mss 1460,sackOK,TS val 0 ecr 0,nop,wscale 8>
> 
>    +0.1 recvmsg(4, {msg_name(...)=...,
>                     msg_iov(1)=[{...,0}],
>                     msg_flags=MSG_ERRQUEUE,
>                     msg_control=[]}, MSG_ERRQUEUE) = -1 EAGAIN (Resource temporarily unavailable)
> 
>    +0...0 connect(4, ..., ...) = 0
> 
>    +0 > S 0:0(0) <mss 1460,sackOK,TS val 0 ecr 0,nop,wscale 8>
> 
> I guess the expectation includes the +0.1 delay before calling recvmsg, and that
> timer fired a bit early.
> 
> I previously shared a draft patch to adjust --tolerance_usecs in dbg runs.
> May have to send that after all.
> 
> https://lore.kernel.org/netdev/66da5b8b27259_27bb41294c@willemb.c.googlers.com.notmuch/

Not sure you've seen, but tcp_slow_start_slow-start-after-win-update.pkt
also just popped up on the dashboard for dbg:

# tcp_slow_start_slow-start-after-win-update.pkt:39: error handling packet: timing error: expected outbound packet in relative time range +0.600000~+0.620000

https://netdev-3.bots.linux.dev/vmksft-packetdrill-dbg/results/774981/1-tcp-slow-start-slow-start-after-win-update-pkt/stdout

Do we want to follow up with that '--tolerance_usecs=10000' you've
mentioned above?
Willem de Bruijn Sept. 16, 2024, 8:54 p.m. UTC | #5
Stanislav Fomichev wrote:
> On 09/12, Willem de Bruijn wrote:
> > Matthieu Baerts wrote:
> > > Hi Willem,
> > > 
> > > On 12/09/2024 02:52, Willem de Bruijn wrote:
> > > > From: Willem de Bruijn <willemb@google.com>
> > > > 
> > > > 1/3: run in nets, as discussed, and add missing CONFIGs
> > > > 2/3: import tcp/zerocopy
> > > > 3/3: import tcp/slow_start
> > > 
> > > Thank you for the v2. This new version looks good to me:
> > > 
> > > Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> > > 
> > > 
> > > I didn't pay too much attention to the new tests, because they look
> > > good, heavily tested I suppose, and I guess the goal is not to diverge
> > > from the original ones for the moment. Still, please note that the CI
> > > reported some timing issues with tcp_zerocopy_closed.pkt when using a
> > > debug kernel config, e.g.
> > > 
> > > > tcp_zerocopy_closed.pkt:22: timing error: expected system call return at 0.100596 sec but happened at 0.109564 sec; tolerance 0.004000 sec
> > > 
> > > https://netdev.bots.linux.dev/contest.html?executor=vmksft-packetdrill-dbg&test=tcp-zerocopy-closed-pkt
> > 
> > Thanks Matthieu. I did not run the dbg variant often enough to observe
> > that. Note to self to run more times before I submit.
> > 
> > It seems to fail 2/10 times on the dbg spinner. I don't have an
> > explanation for the failure yet. The line itself has no expected delay
> > 
> > # script packet:  0.113203 S 0:0(0) <mss 1460,sackOK,TS val 0 ecr 0,nop,wscale 8>
> > # actual packet:  0.107191 S 0:0(0) win 65535 <mss 1460,sackOK,TS val 0 ecr 0,nop,wscale 8>
> > 
> >    +0.1 recvmsg(4, {msg_name(...)=...,
> >                     msg_iov(1)=[{...,0}],
> >                     msg_flags=MSG_ERRQUEUE,
> >                     msg_control=[]}, MSG_ERRQUEUE) = -1 EAGAIN (Resource temporarily unavailable)
> > 
> >    +0...0 connect(4, ..., ...) = 0
> > 
> >    +0 > S 0:0(0) <mss 1460,sackOK,TS val 0 ecr 0,nop,wscale 8>
> > 
> > I guess the expectation includes the +0.1 delay before calling recvmsg, and that
> > timer fired a bit early.
> > 
> > I previously shared a draft patch to adjust --tolerance_usecs in dbg runs.
> > May have to send that after all.
> > 
> > https://lore.kernel.org/netdev/66da5b8b27259_27bb41294c@willemb.c.googlers.com.notmuch/
> 
> Not sure you've seen, but tcp_slow_start_slow-start-after-win-update.pkt
> also just popped up on the dashboard for dbg:
> 
> # tcp_slow_start_slow-start-after-win-update.pkt:39: error handling packet: timing error: expected outbound packet in relative time range +0.600000~+0.620000
> 
> https://netdev-3.bots.linux.dev/vmksft-packetdrill-dbg/results/774981/1-tcp-slow-start-slow-start-after-win-update-pkt/stdout
> 
> Do we want to follow up with that '--tolerance_usecs=10000' you've
> mentioned above?

And more tests coming. Looks like it. I'll finish it up. Thanks for
the pointer.