| Message ID | 20230213142747.3225479-1-alexandr.lobakin@intel.com |
| --- | --- |
| State | Changes Requested |
| Delegated to | BPF |
| Series | [v2,bpf] bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES |
Alexander Lobakin <alexandr.lobakin@intel.com> writes:

> &xdp_buff and &xdp_frame are bound in a way that
>
>	xdp_buff->data_hard_start == xdp_frame
>
> This is always the case, and e.g. xdp_convert_buff_to_frame() relies on
> it. IOW, the following:
>
>	for (u32 i = 0; i < 0xdead; i++) {
>		xdpf = xdp_convert_buff_to_frame(&xdp);
>		xdp_convert_frame_to_buff(xdpf, &xdp);
>	}
>
> shouldn't ever modify @xdpf's contents or the pointer itself. However,
> the "live packet" code wrongly treats &xdp_frame as part of its context,
> placed *before* data_hard_start. With such a flow, data_hard_start is
> sizeof(*xdpf) off to the right and no longer points to the XDP frame.
>
> Instead of replacing `sizeof(ctx)` with `offsetof(ctx, xdpf)` in several
> places and praying that there are no more miscalcs left somewhere in the
> code, unionize ::frm with ::data in a flex array, so that both start
> pointing to the actual data_hard_start and the XDP frame actually becomes
> a part of it, i.e. a part of the headroom, not the context.
> A nice side effect is that the maximum frame size for this mode gets
> increased by 40 bytes, as xdp_buff::frame_sz includes everything from
> data_hard_start (-> includes xdpf already) to the end of XDP/skb shared
> info.
>
> Minor: align `&head->data` with how `head->frm` is assigned for
> consistency.
> Minor #2: rename 'frm' to 'frame' in &xdp_page_head while at it for
> clarity.
>
> (was found while testing an XDP traffic generator on ice, which calls
> xdp_convert_frame_to_buff() for each XDP frame)
>
> Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN")
> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> ---
> From v1[0]:
> - align `&head->data` with how `head->frm` is assigned for consistency
>   (Toke);
> - rename 'frm' to 'frame' in &xdp_page_head (Jakub);
> - no functional changes.
>
> [0] https://lore.kernel.org/bpf/20230209172827.874728-1-alexandr.lobakin@intel.com

Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
On 2/13/23 3:27 PM, Alexander Lobakin wrote:

[...]

> Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in
> BPF_PROG_RUN")
> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>

Could you double check BPF CI? Looks like a number of XDP related tests
are failing on your patch which I'm not seeing on other patches where runs
are green, for example test_progs on several archs report the below:

https://github.com/kernel-patches/bpf/actions/runs/4164593416/jobs/7207290499

[...]
test_xdp_do_redirect:PASS:prog_run 0 nsec
test_xdp_do_redirect:PASS:pkt_count_xdp 0 nsec
test_xdp_do_redirect:PASS:pkt_count_zero 0 nsec
test_xdp_do_redirect:PASS:pkt_count_tc 0 nsec
test_max_pkt_size:PASS:prog_run_max_size 0 nsec
test_max_pkt_size:FAIL:prog_run_too_big unexpected prog_run_too_big: actual -28 != expected -22
close_netns:PASS:setns 0 nsec
#275 xdp_do_redirect:FAIL
Summary: 273/1581 PASSED, 21 SKIPPED, 2 FAILED
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Tue, 14 Feb 2023 16:24:10 +0100

> On 2/13/23 3:27 PM, Alexander Lobakin wrote:

[...]

>> Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in
>> BPF_PROG_RUN")
>> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
>
> Could you double check BPF CI? Looks like a number of XDP related tests
> are failing on your patch which I'm not seeing on other patches where runs
> are green, for example test_progs on several archs report the below:
>
> https://github.com/kernel-patches/bpf/actions/runs/4164593416/jobs/7207290499
>
> [...]
> test_xdp_do_redirect:PASS:prog_run 0 nsec
> test_xdp_do_redirect:PASS:pkt_count_xdp 0 nsec
> test_xdp_do_redirect:PASS:pkt_count_zero 0 nsec
> test_xdp_do_redirect:PASS:pkt_count_tc 0 nsec
> test_max_pkt_size:PASS:prog_run_max_size 0 nsec
> test_max_pkt_size:FAIL:prog_run_too_big unexpected prog_run_too_big:
> actual -28 != expected -22
> close_netns:PASS:setns 0 nsec
> #275 xdp_do_redirect:FAIL
> Summary: 273/1581 PASSED, 21 SKIPPED, 2 FAILED

Ah I see. xdp_do_redirect.c test defines:

	/* The maximum permissible size is: PAGE_SIZE -
	 * sizeof(struct xdp_page_head) - sizeof(struct skb_shared_info) -
	 * XDP_PACKET_HEADROOM = 3368 bytes
	 */
	#define MAX_PKT_SIZE 3368

This needs to be updated as it now became bigger. The test checks that
this size passes and size + 1 fails, but now it doesn't.
Will send v3 in a couple minutes.

Thanks,
Olek
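For reference, the arithmetic behind the old and new limits on a typical x86_64 configuration (4 KiB pages, 64-byte cachelines, 64-bit pointers). This is a standalone sketch; the struct sizes below are the usual values for that configuration (the 40-byte xdp_frame matches the "+40 bytes" note in the patch) and are assumptions, not guaranteed constants:

```c
/* Back-of-the-envelope derivation of the selftest limit; all sizes are
 * assumed typical x86_64 values, not taken from a real kernel build.
 */
#include <stdio.h>

#define PAGE_SZ			4096
#define XDP_PACKET_HEADROOM	256
#define SIZEOF_XDP_BUFF		56	/* 6 pointers + 2 * u32 on 64-bit */
#define SIZEOF_XDP_FRAME	40	/* per the "+40 bytes" note in the patch */
#define SHINFO_ALIGNED		320	/* SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) */

/* before the fix: xdp_frame was a named member of struct xdp_page_head */
#define OLD_HEAD	(2 * SIZEOF_XDP_BUFF + SIZEOF_XDP_FRAME)		/* 152 */
#define OLD_MAX		(PAGE_SZ - OLD_HEAD - SHINFO_ALIGNED - XDP_PACKET_HEADROOM)

/* after the fix: the frame lives at the start of the headroom instead */
#define NEW_HEAD	(2 * SIZEOF_XDP_BUFF)					/* 112 */
#define NEW_MAX		(PAGE_SZ - NEW_HEAD - SHINFO_ALIGNED - XDP_PACKET_HEADROOM)

int main(void)
{
	/* prints: old max 3368, new max 3408 */
	printf("old max %d, new max %d\n", OLD_MAX, NEW_MAX);
	return 0;
}
```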
From: Alexander Lobakin <alexandr.lobakin@intel.com>
Date: Tue, 14 Feb 2023 16:39:25 +0100

> From: Daniel Borkmann <daniel@iogearbox.net>
> Date: Tue, 14 Feb 2023 16:24:10 +0100
>
>> On 2/13/23 3:27 PM, Alexander Lobakin wrote:

[...]

>>> Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in
>>> BPF_PROG_RUN")
>>> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
>>
>> Could you double check BPF CI? Looks like a number of XDP related tests
>> are failing on your patch which I'm not seeing on other patches where runs
>> are green, for example test_progs on several archs report the below:
>>
>> https://github.com/kernel-patches/bpf/actions/runs/4164593416/jobs/7207290499
>>
>> [...]
>> test_xdp_do_redirect:PASS:prog_run 0 nsec
>> test_xdp_do_redirect:PASS:pkt_count_xdp 0 nsec
>> test_xdp_do_redirect:PASS:pkt_count_zero 0 nsec
>> test_xdp_do_redirect:PASS:pkt_count_tc 0 nsec
>> test_max_pkt_size:PASS:prog_run_max_size 0 nsec
>> test_max_pkt_size:FAIL:prog_run_too_big unexpected prog_run_too_big:
>> actual -28 != expected -22
>> close_netns:PASS:setns 0 nsec
>> #275 xdp_do_redirect:FAIL
>> Summary: 273/1581 PASSED, 21 SKIPPED, 2 FAILED
> Ah I see. xdp_do_redirect.c test defines:
>
> /* The maximum permissible size is: PAGE_SIZE -
>  * sizeof(struct xdp_page_head) - sizeof(struct skb_shared_info) -
>  * XDP_PACKET_HEADROOM = 3368 bytes
>  */
> #define MAX_PKT_SIZE 3368
>
> This needs to be updated as it now became bigger. The test checks that
> this size passes and size + 1 fails, but now it doesn't.
> Will send v3 in a couple minutes.

Problem :s

This 3368/3408 assumes %L1_CACHE_BYTES is 64 and we're running on a
64-bit arch. For 32 bits the value will be bigger, also for cachelines
bigger than 64 it will be smaller (skb_shared_info has to be aligned).
Given that selftests are generic / arch-independent, how to approach
this? I added a static_assert() to test_run.c to make sure this value
is in sync to not run into the same problem in future, but then realized
it will fail on a number of architectures.

My first thought was to hardcode the worst-case value (64 bit, cacheline
is 128) in test_run.c for every architecture, but there might be more
elegant ways.

> Thanks,
> Olek

Thanks,
Olek
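For illustration, here is roughly what such a compile-time check inside net/bpf/test_run.c could look like. This is only a sketch of the idea described above, not the actual assertion Alexander wrote, and it assumes the kernel's static_assert() is usable in that file; the hardcoded 3408 only holds for 64-bit builds with 64-byte cachelines and 4 KiB pages, which is exactly why it would trip on other architectures:

```c
/* Hypothetical in-kernel check (net/bpf/test_run.c context); the value and
 * message are assumptions.  As written it is only true on 64-bit with
 * 64-byte cachelines and 4 KiB pages, so other builds would fail it.
 */
static_assert(PAGE_SIZE - sizeof(struct xdp_page_head) -
	      SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) -
	      XDP_PACKET_HEADROOM == 3408,
	      "selftest MAX_PKT_SIZE is out of sync with xdp_page_head layout");
```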
Alexander Lobakin <alexandr.lobakin@intel.com> writes:

> From: Alexander Lobakin <alexandr.lobakin@intel.com>
> Date: Tue, 14 Feb 2023 16:39:25 +0100
>
>> From: Daniel Borkmann <daniel@iogearbox.net>
>> Date: Tue, 14 Feb 2023 16:24:10 +0100

[...]

>> Ah I see. xdp_do_redirect.c test defines:
>>
>> /* The maximum permissible size is: PAGE_SIZE -
>>  * sizeof(struct xdp_page_head) - sizeof(struct skb_shared_info) -
>>  * XDP_PACKET_HEADROOM = 3368 bytes
>>  */
>> #define MAX_PKT_SIZE 3368
>>
>> This needs to be updated as it now became bigger. The test checks that
>> this size passes and size + 1 fails, but now it doesn't.
>> Will send v3 in a couple minutes.
>
> Problem :s
>
> This 3368/3408 assumes %L1_CACHE_BYTES is 64 and we're running on a
> 64-bit arch. For 32 bits the value will be bigger, also for cachelines
> bigger than 64 it will be smaller (skb_shared_info has to be aligned).
> Given that selftests are generic / arch-independent, how to approach
> this? I added a static_assert() to test_run.c to make sure this value
> is in sync to not run into the same problem in future, but then realized
> it will fail on a number of architectures.
>
> My first thought was to hardcode the worst-case value (64 bit, cacheline
> is 128) in test_run.c for every architecture, but there might be more
> elegant ways.

The 32/64-bit split should be straightforward to handle for the head;
an xdp_buff is 6*sizeof(void *)+8 bytes long, and xdp_page_head is just
two of those after this patch. The skb_shared_info size is a bit harder;
do we have the alignment / size macros available to userspace somewhere?

Hmm, the selftests generate a vmlinux.h file which would have the
structure definitions; maybe something could be generated from that? Not
straightforward to include it in a userspace application, though.

Otherwise, does anyone run the selftests on architectures that don't
have a 64-byte cache-line size? Or even on 32-bit arches? We don't
handle larger page sizes either...

-Toke
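A minimal sketch of the derivation Toke outlines, using a made-up helper name: the head portion follows directly from the pointer width, while the aligned skb_shared_info size would still have to come from somewhere else (e.g. generated from vmlinux.h, as discussed):

```c
/* Sketch only -- all names here are hypothetical.  The head size is derived
 * from the pointer width; shinfo_aligned must be supplied by the caller
 * because the selftest cannot easily compute SKB_DATA_ALIGN(...) itself.
 */
#include <stddef.h>

#define XDP_PACKET_HEADROOM	256

static size_t xdp_live_max_pkt_size(size_t page_size, size_t shinfo_aligned)
{
	size_t xdp_buff_sz = 6 * sizeof(void *) + 8;	/* per Toke's note above */
	size_t head_sz = 2 * xdp_buff_sz;		/* orig_ctx + ctx after the fix */

	return page_size - head_sz - shinfo_aligned - XDP_PACKET_HEADROOM;
}
```

On a 64-bit build, xdp_live_max_pkt_size(4096, 320) evaluates to 3408, matching the number above; on a 32-bit build the head shrinks and the limit grows, which is the arch dependence being discussed.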
From: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Tue, 14 Feb 2023 22:05:26 +0100

> Alexander Lobakin <alexandr.lobakin@intel.com> writes:
>
>> From: Alexander Lobakin <alexandr.lobakin@intel.com>
>> Date: Tue, 14 Feb 2023 16:39:25 +0100

[...]

>> My first thought was to hardcode the worst-case value (64 bit, cacheline
>> is 128) in test_run.c for every architecture, but there might be more
>> elegant ways.
>
> The 32/64-bit split should be straightforward to handle for the head;
> an xdp_buff is 6*sizeof(void *)+8 bytes long, and xdp_page_head is just
> two of those after this patch. The skb_shared_info size is a bit harder;
> do we have the alignment / size macros available to userspace somewhere?
>
> Hmm, the selftests generate a vmlinux.h file which would have the
> structure definitions; maybe something could be generated from that? Not
> straightforward to include it in a userspace application, though.
>
> Otherwise, does anyone run the selftests on architectures that don't
> have a 64-byte cache-line size? Or even on 32-bit arches? We don't
> handle larger page sizes either...

I believe nobody does that :D Everyone just uses x86_64 and ARM64 with
4k pages.

I think for this particular patch we can just change 3368 to 3408 without
trying to assert it in the kernel code. And later I'll brainstorm this :D

> -Toke

Thanks,
Olek
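A sketch of what the corresponding selftest update could look like (the exact comment wording in v3 may differ; as discussed above, the value only holds for 64-bit builds with 64-byte cachelines and 4 KiB pages):

```c
/* In the xdp_do_redirect.c selftest -- sketch only, not the final v3 text.
 *
 * The maximum permissible size is: PAGE_SIZE - sizeof(struct xdp_page_head) -
 * SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) - XDP_PACKET_HEADROOM =
 * 3408 bytes (assuming a 64-bit arch, 64-byte cachelines and 4 KiB pages).
 */
#define MAX_PKT_SIZE 3408
```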
```diff
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 2723623429ac..522869ccc007 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -97,8 +97,11 @@ static bool bpf_test_timer_continue(struct bpf_test_timer *t, int iterations,
 struct xdp_page_head {
 	struct xdp_buff orig_ctx;
 	struct xdp_buff ctx;
-	struct xdp_frame frm;
-	u8 data[];
+	union {
+		/* ::data_hard_start starts here */
+		DECLARE_FLEX_ARRAY(struct xdp_frame, frame);
+		DECLARE_FLEX_ARRAY(u8, data);
+	};
 };
 
 struct xdp_test_data {
@@ -132,8 +135,8 @@ static void xdp_test_run_init_page(struct page *page, void *arg)
 	headroom -= meta_len;
 
 	new_ctx = &head->ctx;
-	frm = &head->frm;
-	data = &head->data;
+	frm = head->frame;
+	data = head->data;
 	memcpy(data + headroom, orig_ctx->data_meta, frm_len);
 
 	xdp_init_buff(new_ctx, TEST_XDP_FRAME_SIZE, &xdp->rxq);
@@ -223,7 +226,7 @@ static void reset_ctx(struct xdp_page_head *head)
 	head->ctx.data = head->orig_ctx.data;
 	head->ctx.data_meta = head->orig_ctx.data_meta;
 	head->ctx.data_end = head->orig_ctx.data_end;
-	xdp_update_frame_from_buff(&head->ctx, &head->frm);
+	xdp_update_frame_from_buff(&head->ctx, head->frame);
 }
 
 static int xdp_recv_frames(struct xdp_frame **frames, int nframes,
@@ -285,7 +288,7 @@ static int xdp_test_run_batch(struct xdp_test_data *xdp, struct bpf_prog *prog,
 		head = phys_to_virt(page_to_phys(page));
 		reset_ctx(head);
 		ctx = &head->ctx;
-		frm = &head->frm;
+		frm = head->frame;
 		xdp->frame_cnt++;
 
 		act = bpf_prog_run_xdp(prog, ctx);
```
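As a side note on the union above: DECLARE_FLEX_ARRAY() is what lets the two flexible-array views legally share the start of the headroom. The standalone userspace sketch below illustrates the resulting aliasing with made-up placeholder types and fixed-size arrays instead of real flexible arrays; it is not kernel code and the sizes are not the real layouts:

```c
/* Standalone illustration of the layout after the patch: head->frame and
 * head->data name the same address, which is also data_hard_start.
 */
#include <stdio.h>
#include <stddef.h>

struct fake_xdp_buff  { void *p[6]; unsigned int u[2]; };	/* placeholder */
struct fake_xdp_frame { void *data; unsigned int len, headroom; };

struct fake_page_head {
	struct fake_xdp_buff orig_ctx;
	struct fake_xdp_buff ctx;
	union {				/* the kernel uses DECLARE_FLEX_ARRAY() here */
		struct fake_xdp_frame frame[1];
		unsigned char data[1];
	};
};

int main(void)
{
	/* both union views begin at the same offset -- the new data_hard_start */
	printf("frame @ %zu, data @ %zu\n",
	       offsetof(struct fake_page_head, frame),
	       offsetof(struct fake_page_head, data));
	return 0;
}
```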
&xdp_buff and &xdp_frame are bound in a way that

	xdp_buff->data_hard_start == xdp_frame

This is always the case, and e.g. xdp_convert_buff_to_frame() relies on
it. IOW, the following:

	for (u32 i = 0; i < 0xdead; i++) {
		xdpf = xdp_convert_buff_to_frame(&xdp);
		xdp_convert_frame_to_buff(xdpf, &xdp);
	}

shouldn't ever modify @xdpf's contents or the pointer itself. However,
the "live packet" code wrongly treats &xdp_frame as part of its context,
placed *before* data_hard_start. With such a flow, data_hard_start is
sizeof(*xdpf) off to the right and no longer points to the XDP frame.

Instead of replacing `sizeof(ctx)` with `offsetof(ctx, xdpf)` in several
places and praying that there are no more miscalcs left somewhere in the
code, unionize ::frm with ::data in a flex array, so that both start
pointing to the actual data_hard_start and the XDP frame actually becomes
a part of it, i.e. a part of the headroom, not the context.
A nice side effect is that the maximum frame size for this mode gets
increased by 40 bytes, as xdp_buff::frame_sz includes everything from
data_hard_start (-> includes xdpf already) to the end of XDP/skb shared
info.

Minor: align `&head->data` with how `head->frm` is assigned for
consistency.
Minor #2: rename 'frm' to 'frame' in &xdp_page_head while at it for
clarity.

(was found while testing an XDP traffic generator on ice, which calls
xdp_convert_frame_to_buff() for each XDP frame)

Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
---
From v1[0]:
- align `&head->data` with how `head->frm` is assigned for consistency
  (Toke);
- rename 'frm' to 'frame' in &xdp_page_head (Jakub);
- no functional changes.

[0] https://lore.kernel.org/bpf/20230209172827.874728-1-alexandr.lobakin@intel.com
---
 net/bpf/test_run.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)