mbox series

[net-next,v2,0/7] net/selftests: TCP-AO selftests updates

Message ID 20240802-tcp-ao-selftests-upd-6-12-v2-0-370c99358161@gmail.com (mailing list archive)
Headers show
Series net/selftests: TCP-AO selftests updates | expand

Message

Dmitry Safonov via B4 Relay Aug. 2, 2024, 9:23 a.m. UTC
First 4 patches are more-or-less cleanups/preparations.

Patch 5 was sent to me/contributed off-list by Mohammad, who wants 32-bit
kernels to run TCP-AO.

Patch 6 is a workaround/fix for slow VMs. Albeit, I can't reproduce
the issue, but I hope it will fix netdev flakes for connect-deny-*
tests.

And the biggest change is adding TCP-AO tracepoints to selftests.
I think it's a good addition by the following reasons:
- The related tracepoints are now tested;
- It allows tcp-ao selftests to raise expectations on the kernel
  behavior - up from the syscalls exit statuses + net counters.
- Provides tracepoints usage samples.

As tracepoints are not a stable ABI, any kernel changes done to them
will be reflected to the selftests, which also will allow users
to see how to change their code. It's quite better than parsing dmesg
(what BGP was doing pre-tracepoints, ugh).

Somewhat arguably, the code parses trace_pipe, rather than uses
libtraceevent (which any sane user should do). The reason behind that is
the same as for rt-netlink macros instead of libmnl: I'm trying
to minimize the library dependencies of the selftests. And the
performance of formatting text in kernel and parsing it again in a test
is not critical.

Current output sample:
> ok 73 Trace events matched expectations: 13 tcp_hash_md5_required[2] tcp_hash_md5_unexpected[4] tcp_hash_ao_required[3] tcp_ao_key_not_found[4]

Previously, tracepoints selftests were part of kernel tcp tracepoints
submission [1], but since then the code was quite changed:
- Now generic tracing setup is in lib/ftrace.c, separate from
  lib/ftrace-tcp.c which utilizes TCP trace points. This separation
  allows future selftests to trace non-TCP events, i.e. to find out
  an skb's drop reason, which was useful in the creation of TCP-CLOSE
  stress-test (not in this patch set, but used in attempt to reproduce
  the issue from [2]).
- Another change is that in the previous submission the trace events
  where used only to detect unexpected TCP-AO/TCP-MD5 events. In this
  version the selftests will fail if an expected trace event didn't
  appear.
  Let's see how reliable this is on the netdev bot - it obviously passes
  on my testing, but potentially may require a temporary XFAIL patch
  if it misbehaves on a slow VM.

[1] https://lore.kernel.org/lkml/20240224-tcp-ao-tracepoints-v1-0-15f31b7f30a7@arista.com/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=33700a0c9b56

Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
---
Changes in v2:
- Fixed two issues with parsing TCP-AO events: the socket state and TCP
  segment flags. Hopefully, won't fail on netdev.
- Reword patch 1 & 2 messages to be more informative and at some degree
  formal (Paolo)
- Since commit e33a02ed6a4f ("selftests: Add printf attribute to
  kselftest prints") it's possible to use __printf instead of "raw" gcc
  attribute - switch using that, as checkpatch suggests.
- Link to v1: https://lore.kernel.org/r/20240730-tcp-ao-selftests-upd-6-12-v1-0-ffd4bf15d638@gmail.com

---
Dmitry Safonov (6):
      selftests/net: Clean-up double assignment
      selftests/net: Provide test_snprintf() helper
      selftests/net: Be consistent in kconfig checks
      selftests/net: Don't forget to close nsfd after switch_save_ns()
      selftests/net: Synchronize client/server before counters checks
      selftests/net: Add trace events matching to tcp_ao

Mohammad Nassiri (1):
      selftests/tcp_ao: Fix printing format for uint64_t

 tools/testing/selftests/net/tcp_ao/Makefile        |   3 +-
 tools/testing/selftests/net/tcp_ao/bench-lookups.c |   2 +-
 tools/testing/selftests/net/tcp_ao/config          |   1 +
 tools/testing/selftests/net/tcp_ao/connect-deny.c  |  25 +-
 tools/testing/selftests/net/tcp_ao/connect.c       |   6 +-
 tools/testing/selftests/net/tcp_ao/icmps-discard.c |   2 +-
 .../testing/selftests/net/tcp_ao/key-management.c  |  18 +-
 tools/testing/selftests/net/tcp_ao/lib/aolib.h     | 176 ++++++-
 .../testing/selftests/net/tcp_ao/lib/ftrace-tcp.c  | 549 +++++++++++++++++++++
 tools/testing/selftests/net/tcp_ao/lib/ftrace.c    | 466 +++++++++++++++++
 tools/testing/selftests/net/tcp_ao/lib/kconfig.c   |  31 +-
 tools/testing/selftests/net/tcp_ao/lib/setup.c     |  15 +-
 tools/testing/selftests/net/tcp_ao/lib/sock.c      |   1 -
 tools/testing/selftests/net/tcp_ao/lib/utils.c     |  26 +
 tools/testing/selftests/net/tcp_ao/restore.c       |  30 +-
 tools/testing/selftests/net/tcp_ao/rst.c           |   2 +-
 tools/testing/selftests/net/tcp_ao/self-connect.c  |  19 +-
 tools/testing/selftests/net/tcp_ao/seq-ext.c       |  28 +-
 .../selftests/net/tcp_ao/setsockopt-closed.c       |   6 +-
 tools/testing/selftests/net/tcp_ao/unsigned-md5.c  |  35 +-
 20 files changed, 1375 insertions(+), 66 deletions(-)
---
base-commit: 3361a6eae59664ffae640ff7a838f5bd89c24461
change-id: 20240730-tcp-ao-selftests-upd-6-12-4d3e53a74f3f

Best regards,

Comments

Jakub Kicinski Aug. 2, 2024, 3:18 p.m. UTC | #1
On Fri, 02 Aug 2024 10:23:24 +0100 Dmitry Safonov via B4 Relay wrote:
> First 4 patches are more-or-less cleanups/preparations.
> 
> Patch 5 was sent to me/contributed off-list by Mohammad, who wants 32-bit
> kernels to run TCP-AO.
> 
> Patch 6 is a workaround/fix for slow VMs. Albeit, I can't reproduce
> the issue, but I hope it will fix netdev flakes for connect-deny-*
> tests.

Hm, could be a coincidence but we did hit:

# not ok 55 # error 381[unsigned-md5.c:24] Failed to add a VRF: -17
# not ok 56 # error 383[unsigned-md5.c:33] Failed to add a route to VRF: -22: Key was rejected by service

https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/710001/4-unsigned-md5-ipv6/stdout

in the first run after this got queued. But the retry worked:

https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/710001/4-unsigned-md5-ipv6-retry/stdout


Jakub Kicinski Aug. 2, 2024, 3:32 p.m. UTC | #2
On Fri, 2 Aug 2024 08:18:23 -0700 Jakub Kicinski wrote:
> On Fri, 02 Aug 2024 10:23:24 +0100 Dmitry Safonov via B4 Relay wrote:
> > First 4 patches are more-or-less cleanups/preparations.
> > 
> > Patch 5 was sent to me/contributed off-list by Mohammad, who wants 32-bit
> > kernels to run TCP-AO.
> > 
> > Patch 6 is a workaround/fix for slow VMs. Albeit, I can't reproduce
> > the issue, but I hope it will fix netdev flakes for connect-deny-*
> > tests.  
> 
> Hm, could be a coincidence but we did hit:
> 
> # not ok 55 # error 381[unsigned-md5.c:24] Failed to add a VRF: -17
> # not ok 56 # error 383[unsigned-md5.c:33] Failed to add a route to VRF: -22: Key was rejected by service
> 
> https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/710001/4-unsigned-md5-ipv6/stdout
> 
> in the first run after this got queued. But the retry worked:
> 
> https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/710001/4-unsigned-md5-ipv6-retry/stdout

oooh another run, another (different) flake:
https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/710181/11-key-management-ipv4/stdout

I'll keep it around for another run, but looking less and less 
like a coincidence :(
Dmitry Safonov Aug. 3, 2024, 12:50 a.m. UTC | #3
Hi Jakub,

On Fri, 2 Aug 2024 at 16:18, Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 02 Aug 2024 10:23:24 +0100 Dmitry Safonov via B4 Relay wrote:
> > First 4 patches are more-or-less cleanups/preparations.
> >
> > Patch 5 was sent to me/contributed off-list by Mohammad, who wants 32-bit
> > kernels to run TCP-AO.
> >
> > Patch 6 is a workaround/fix for slow VMs. Albeit, I can't reproduce
> > the issue, but I hope it will fix netdev flakes for connect-deny-*
> > tests.
>
> Hm, could be a coincidence but we did hit:
>
> # not ok 55 # error 381[unsigned-md5.c:24] Failed to add a VRF: -17
> # not ok 56 # error 383[unsigned-md5.c:33] Failed to add a route to VRF: -22: Key was rejected by service
>
> https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/710001/4-unsigned-md5-ipv6/stdout

Yeah, I think I've seen that previously on netdev as well, but quite rarely.
Let me take a look and see why adding a VRF table sometimes fails with EEXIST.

> in the first run after this got queued. But the retry worked:
>
> https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/710001/4-unsigned-md5-ipv6-retry/stdout
>
>