Message ID | 20230316175051.922550-3-aleksander.lobakin@intel.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 5640b6d894342d153b719644681b0345fd28ee96 |
Delegated to: | BPF |
Headers | show |
Series | double-fix bpf_test_run + XDP_PASS recycling | expand |
Alexander Lobakin <aleksander.lobakin@intel.com> writes: > Alexei noticed xdp_do_redirect test on BPF CI started failing on > BE systems after skb PP recycling was enabled: > > test_xdp_do_redirect:PASS:prog_run 0 nsec > test_xdp_do_redirect:PASS:pkt_count_xdp 0 nsec > test_xdp_do_redirect:PASS:pkt_count_zero 0 nsec > test_xdp_do_redirect:FAIL:pkt_count_tc unexpected pkt_count_tc: actual > 220 != expected 9998 > test_max_pkt_size:PASS:prog_run_max_size 0 nsec > test_max_pkt_size:PASS:prog_run_too_big 0 nsec > close_netns:PASS:setns 0 nsec > #289 xdp_do_redirect:FAIL > Summary: 270/1674 PASSED, 30 SKIPPED, 1 FAILED > > and it doesn't happen on LE systems. > Ilya then hunted it down to: > > #0 0x0000000000aaeee6 in neigh_hh_output (hh=0x83258df0, > skb=0x88142200) at linux/include/net/neighbour.h:503 > #1 0x0000000000ab2cda in neigh_output (skip_cache=false, > skb=0x88142200, n=<optimized out>) at linux/include/net/neighbour.h:544 > #2 ip6_finish_output2 (net=net@entry=0x88edba00, sk=sk@entry=0x0, > skb=skb@entry=0x88142200) at linux/net/ipv6/ip6_output.c:134 > #3 0x0000000000ab4cbc in __ip6_finish_output (skb=0x88142200, sk=0x0, > net=0x88edba00) at linux/net/ipv6/ip6_output.c:195 > #4 ip6_finish_output (net=0x88edba00, sk=0x0, skb=0x88142200) at > linux/net/ipv6/ip6_output.c:206 > > xdp_do_redirect test places a u32 marker (0x42) right before the Ethernet > header to check it then in the XDP program and return %XDP_ABORTED if it's > not there. Neigh xmit code likes to round up hard header length to speed > up copying the header, so it overwrites two bytes in front of the Eth > header. On LE systems, 0x42 is one byte at `data - 4`, while on BE it's > `data - 1`, what explains why it happens only there. > It didn't happen previously due to that %XDP_PASS meant the page will be > discarded and replaced by a new one, but now it can be recycled as well, > while bpf_test_run code doesn't reinitialize the content of recycled > pages. This mark is limited to this particular test and its setup though, > so there's no need to predict 1000 different possible cases. Just move > it 4 bytes to the left, still keeping it 32 bit to match on more > bytes. Wow, this must have been annoying to track down - nice work :) > Fixes: 9c94bbf9a87b ("xdp: recycle Page Pool backed skbs built from XDP frames") > Reported-by: Alexei Starovoitov <ast@kernel.org> > Link: https://lore.kernel.org/bpf/CAADnVQ+B_JOU+EpP=DKhbY9yXdN6GiRPnpTTXfEZ9sNkUeb-yQ@mail.gmail.com > Reported-by: Ilya Leoshkevich <iii@linux.ibm.com> # + debugging > Link: https://lore.kernel.org/bpf/8341c1d9f935f410438e79d3bd8a9cc50aefe105.camel@linux.ibm.com > Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
On Thu, 2023-03-16 at 18:50 +0100, Alexander Lobakin wrote: > Alexei noticed xdp_do_redirect test on BPF CI started failing on > BE systems after skb PP recycling was enabled: > > test_xdp_do_redirect:PASS:prog_run 0 nsec > test_xdp_do_redirect:PASS:pkt_count_xdp 0 nsec > test_xdp_do_redirect:PASS:pkt_count_zero 0 nsec > test_xdp_do_redirect:FAIL:pkt_count_tc unexpected pkt_count_tc: > actual > 220 != expected 9998 > test_max_pkt_size:PASS:prog_run_max_size 0 nsec > test_max_pkt_size:PASS:prog_run_too_big 0 nsec > close_netns:PASS:setns 0 nsec > #289 xdp_do_redirect:FAIL > Summary: 270/1674 PASSED, 30 SKIPPED, 1 FAILED > > and it doesn't happen on LE systems. > Ilya then hunted it down to: > > #0 0x0000000000aaeee6 in neigh_hh_output (hh=0x83258df0, > skb=0x88142200) at linux/include/net/neighbour.h:503 > #1 0x0000000000ab2cda in neigh_output (skip_cache=false, > skb=0x88142200, n=<optimized out>) at > linux/include/net/neighbour.h:544 > #2 ip6_finish_output2 (net=net@entry=0x88edba00, sk=sk@entry=0x0, > skb=skb@entry=0x88142200) at linux/net/ipv6/ip6_output.c:134 > #3 0x0000000000ab4cbc in __ip6_finish_output (skb=0x88142200, > sk=0x0, > net=0x88edba00) at linux/net/ipv6/ip6_output.c:195 > #4 ip6_finish_output (net=0x88edba00, sk=0x0, skb=0x88142200) at > linux/net/ipv6/ip6_output.c:206 > > xdp_do_redirect test places a u32 marker (0x42) right before the > Ethernet > header to check it then in the XDP program and return %XDP_ABORTED if > it's > not there. Neigh xmit code likes to round up hard header length to > speed > up copying the header, so it overwrites two bytes in front of the Eth > header. On LE systems, 0x42 is one byte at `data - 4`, while on BE > it's > `data - 1`, what explains why it happens only there. > It didn't happen previously due to that %XDP_PASS meant the page will > be > discarded and replaced by a new one, but now it can be recycled as > well, > while bpf_test_run code doesn't reinitialize the content of recycled > pages. This mark is limited to this particular test and its setup > though, > so there's no need to predict 1000 different possible cases. Just > move > it 4 bytes to the left, still keeping it 32 bit to match on more > bytes. > > Fixes: 9c94bbf9a87b ("xdp: recycle Page Pool backed skbs built from > XDP frames") > Reported-by: Alexei Starovoitov <ast@kernel.org> > Link: > https://lore.kernel.org/bpf/CAADnVQ+B_JOU+EpP=DKhbY9yXdN6GiRPnpTTXfEZ9sNkUeb-yQ@mail.gmail.com > Reported-by: Ilya Leoshkevich <iii@linux.ibm.com> # + debugging > Link: > https://lore.kernel.org/bpf/8341c1d9f935f410438e79d3bd8a9cc50aefe105.camel@linux.ibm.com > Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> > --- > tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c | 7 ++++--- > tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c | 2 +- > 2 files changed, 5 insertions(+), 4 deletions(-) > > diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c > b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c > index 856cbc29e6a1..4eaa3dcaebc8 100644 > --- a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c > +++ b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c > @@ -86,12 +86,12 @@ static void test_max_pkt_size(int fd) > void test_xdp_do_redirect(void) > { > int err, xdp_prog_fd, tc_prog_fd, ifindex_src, ifindex_dst; > - char data[sizeof(pkt_udp) + sizeof(__u32)]; > + char data[sizeof(pkt_udp) + sizeof(__u64)]; > struct test_xdp_do_redirect *skel = NULL; > struct nstoken *nstoken = NULL; > struct bpf_link *link; > LIBBPF_OPTS(bpf_xdp_query_opts, query_opts); > - struct xdp_md ctx_in = { .data = sizeof(__u32), > + struct xdp_md ctx_in = { .data = sizeof(__u64), > .data_end = sizeof(data) }; > DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts, > .data_in = &data, > @@ -105,8 +105,9 @@ void test_xdp_do_redirect(void) > DECLARE_LIBBPF_OPTS(bpf_tc_hook, tc_hook, > .attach_point = BPF_TC_INGRESS); > > - memcpy(&data[sizeof(__u32)], &pkt_udp, sizeof(pkt_udp)); > + memcpy(&data[sizeof(__u64)], &pkt_udp, sizeof(pkt_udp)); > *((__u32 *)data) = 0x42; /* metadata test value */ > + *((__u32 *)data + 4) = 0; > > skel = test_xdp_do_redirect__open(); > if (!ASSERT_OK_PTR(skel, "skel")) > diff --git a/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c > b/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c > index cd2d4e3258b8..5baaafed0d2d 100644 > --- a/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c > +++ b/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c > @@ -52,7 +52,7 @@ int xdp_redirect(struct xdp_md *xdp) > > *payload = MARK_IN; > > - if (bpf_xdp_adjust_meta(xdp, 4)) > + if (bpf_xdp_adjust_meta(xdp, sizeof(__u64))) > return XDP_ABORTED; > > if (retcode > XDP_PASS) Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
From: Toke Høiland-Jørgensen <toke@redhat.com> Date: Thu, 16 Mar 2023 21:10:02 +0100 > Alexander Lobakin <aleksander.lobakin@intel.com> writes: > >> Alexei noticed xdp_do_redirect test on BPF CI started failing on >> BE systems after skb PP recycling was enabled: >> >> test_xdp_do_redirect:PASS:prog_run 0 nsec >> test_xdp_do_redirect:PASS:pkt_count_xdp 0 nsec >> test_xdp_do_redirect:PASS:pkt_count_zero 0 nsec >> test_xdp_do_redirect:FAIL:pkt_count_tc unexpected pkt_count_tc: actual >> 220 != expected 9998 >> test_max_pkt_size:PASS:prog_run_max_size 0 nsec >> test_max_pkt_size:PASS:prog_run_too_big 0 nsec >> close_netns:PASS:setns 0 nsec >> #289 xdp_do_redirect:FAIL >> Summary: 270/1674 PASSED, 30 SKIPPED, 1 FAILED >> >> and it doesn't happen on LE systems. >> Ilya then hunted it down to: >> >> #0 0x0000000000aaeee6 in neigh_hh_output (hh=0x83258df0, >> skb=0x88142200) at linux/include/net/neighbour.h:503 >> #1 0x0000000000ab2cda in neigh_output (skip_cache=false, >> skb=0x88142200, n=<optimized out>) at linux/include/net/neighbour.h:544 >> #2 ip6_finish_output2 (net=net@entry=0x88edba00, sk=sk@entry=0x0, >> skb=skb@entry=0x88142200) at linux/net/ipv6/ip6_output.c:134 >> #3 0x0000000000ab4cbc in __ip6_finish_output (skb=0x88142200, sk=0x0, >> net=0x88edba00) at linux/net/ipv6/ip6_output.c:195 >> #4 ip6_finish_output (net=0x88edba00, sk=0x0, skb=0x88142200) at >> linux/net/ipv6/ip6_output.c:206 >> >> xdp_do_redirect test places a u32 marker (0x42) right before the Ethernet >> header to check it then in the XDP program and return %XDP_ABORTED if it's >> not there. Neigh xmit code likes to round up hard header length to speed >> up copying the header, so it overwrites two bytes in front of the Eth >> header. On LE systems, 0x42 is one byte at `data - 4`, while on BE it's >> `data - 1`, what explains why it happens only there. >> It didn't happen previously due to that %XDP_PASS meant the page will be >> discarded and replaced by a new one, but now it can be recycled as well, >> while bpf_test_run code doesn't reinitialize the content of recycled >> pages. This mark is limited to this particular test and its setup though, >> so there's no need to predict 1000 different possible cases. Just move >> it 4 bytes to the left, still keeping it 32 bit to match on more >> bytes. > > Wow, this must have been annoying to track down - nice work :) I just blinked my eyes once and Ilya already came back with the detailed stacktrace, so it's almost entirely his work <O > >> Fixes: 9c94bbf9a87b ("xdp: recycle Page Pool backed skbs built from XDP frames") >> Reported-by: Alexei Starovoitov <ast@kernel.org> >> Link: https://lore.kernel.org/bpf/CAADnVQ+B_JOU+EpP=DKhbY9yXdN6GiRPnpTTXfEZ9sNkUeb-yQ@mail.gmail.com >> Reported-by: Ilya Leoshkevich <iii@linux.ibm.com> # + debugging >> Link: https://lore.kernel.org/bpf/8341c1d9f935f410438e79d3bd8a9cc50aefe105.camel@linux.ibm.com >> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> > > Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> > Thanks! That was quick :D Olek
From: Ilya Leoshkevich <iii@linux.ibm.com> Date: Thu, 16 Mar 2023 22:22:26 +0100 > On Thu, 2023-03-16 at 18:50 +0100, Alexander Lobakin wrote: >> Alexei noticed xdp_do_redirect test on BPF CI started failing on >> BE systems after skb PP recycling was enabled: [...] >> @@ -52,7 +52,7 @@ int xdp_redirect(struct xdp_md *xdp) >> >> *payload = MARK_IN; >> >> - if (bpf_xdp_adjust_meta(xdp, 4)) >> + if (bpf_xdp_adjust_meta(xdp, sizeof(__u64))) >> return XDP_ABORTED; >> >> if (retcode > XDP_PASS) > > Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> Much thanks again for the debugging and testing! It definitely wouldn't have been quick without you :D Olek
diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c index 856cbc29e6a1..4eaa3dcaebc8 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c @@ -86,12 +86,12 @@ static void test_max_pkt_size(int fd) void test_xdp_do_redirect(void) { int err, xdp_prog_fd, tc_prog_fd, ifindex_src, ifindex_dst; - char data[sizeof(pkt_udp) + sizeof(__u32)]; + char data[sizeof(pkt_udp) + sizeof(__u64)]; struct test_xdp_do_redirect *skel = NULL; struct nstoken *nstoken = NULL; struct bpf_link *link; LIBBPF_OPTS(bpf_xdp_query_opts, query_opts); - struct xdp_md ctx_in = { .data = sizeof(__u32), + struct xdp_md ctx_in = { .data = sizeof(__u64), .data_end = sizeof(data) }; DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts, .data_in = &data, @@ -105,8 +105,9 @@ void test_xdp_do_redirect(void) DECLARE_LIBBPF_OPTS(bpf_tc_hook, tc_hook, .attach_point = BPF_TC_INGRESS); - memcpy(&data[sizeof(__u32)], &pkt_udp, sizeof(pkt_udp)); + memcpy(&data[sizeof(__u64)], &pkt_udp, sizeof(pkt_udp)); *((__u32 *)data) = 0x42; /* metadata test value */ + *((__u32 *)data + 4) = 0; skel = test_xdp_do_redirect__open(); if (!ASSERT_OK_PTR(skel, "skel")) diff --git a/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c b/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c index cd2d4e3258b8..5baaafed0d2d 100644 --- a/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c +++ b/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c @@ -52,7 +52,7 @@ int xdp_redirect(struct xdp_md *xdp) *payload = MARK_IN; - if (bpf_xdp_adjust_meta(xdp, 4)) + if (bpf_xdp_adjust_meta(xdp, sizeof(__u64))) return XDP_ABORTED; if (retcode > XDP_PASS)
Alexei noticed xdp_do_redirect test on BPF CI started failing on BE systems after skb PP recycling was enabled: test_xdp_do_redirect:PASS:prog_run 0 nsec test_xdp_do_redirect:PASS:pkt_count_xdp 0 nsec test_xdp_do_redirect:PASS:pkt_count_zero 0 nsec test_xdp_do_redirect:FAIL:pkt_count_tc unexpected pkt_count_tc: actual 220 != expected 9998 test_max_pkt_size:PASS:prog_run_max_size 0 nsec test_max_pkt_size:PASS:prog_run_too_big 0 nsec close_netns:PASS:setns 0 nsec #289 xdp_do_redirect:FAIL Summary: 270/1674 PASSED, 30 SKIPPED, 1 FAILED and it doesn't happen on LE systems. Ilya then hunted it down to: #0 0x0000000000aaeee6 in neigh_hh_output (hh=0x83258df0, skb=0x88142200) at linux/include/net/neighbour.h:503 #1 0x0000000000ab2cda in neigh_output (skip_cache=false, skb=0x88142200, n=<optimized out>) at linux/include/net/neighbour.h:544 #2 ip6_finish_output2 (net=net@entry=0x88edba00, sk=sk@entry=0x0, skb=skb@entry=0x88142200) at linux/net/ipv6/ip6_output.c:134 #3 0x0000000000ab4cbc in __ip6_finish_output (skb=0x88142200, sk=0x0, net=0x88edba00) at linux/net/ipv6/ip6_output.c:195 #4 ip6_finish_output (net=0x88edba00, sk=0x0, skb=0x88142200) at linux/net/ipv6/ip6_output.c:206 xdp_do_redirect test places a u32 marker (0x42) right before the Ethernet header to check it then in the XDP program and return %XDP_ABORTED if it's not there. Neigh xmit code likes to round up hard header length to speed up copying the header, so it overwrites two bytes in front of the Eth header. On LE systems, 0x42 is one byte at `data - 4`, while on BE it's `data - 1`, what explains why it happens only there. It didn't happen previously due to that %XDP_PASS meant the page will be discarded and replaced by a new one, but now it can be recycled as well, while bpf_test_run code doesn't reinitialize the content of recycled pages. This mark is limited to this particular test and its setup though, so there's no need to predict 1000 different possible cases. Just move it 4 bytes to the left, still keeping it 32 bit to match on more bytes. Fixes: 9c94bbf9a87b ("xdp: recycle Page Pool backed skbs built from XDP frames") Reported-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/CAADnVQ+B_JOU+EpP=DKhbY9yXdN6GiRPnpTTXfEZ9sNkUeb-yQ@mail.gmail.com Reported-by: Ilya Leoshkevich <iii@linux.ibm.com> # + debugging Link: https://lore.kernel.org/bpf/8341c1d9f935f410438e79d3bd8a9cc50aefe105.camel@linux.ibm.com Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> --- tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c | 7 ++++--- tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c | 2 +- 2 files changed, 5 insertions(+), 4 deletions(-)