Message ID | 20230209172827.874728-1-alexandr.lobakin@intel.com (mailing list archive)
---|---
State | Superseded
Delegated to: | BPF
Series | [bpf] bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES
From: Alexander Lobakin <alexandr.lobakin@intel.com>
Date: Thu, 9 Feb 2023 18:28:27 +0100

> &xdp_buff and &xdp_frame are bound in a way that
>
> xdp_buff->data_hard_start == xdp_frame

[...]

> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index 2723623429ac..c3cce7a8d47d 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -97,8 +97,11 @@ static bool bpf_test_timer_continue(struct bpf_test_timer *t, int iterations,
>  struct xdp_page_head {
>  	struct xdp_buff orig_ctx;
>  	struct xdp_buff ctx;
> -	struct xdp_frame frm;
> -	u8 data[];
> +	union {
> +		/* ::data_hard_start starts here */
> +		DECLARE_FLEX_ARRAY(struct xdp_frame, frm);
> +		DECLARE_FLEX_ARRAY(u8, data);
> +	};

BTW, xdp_frame here starts at 112 byte offset, i.e. in 16 bytes a
cacheline boundary is hit, so xdp_frame gets sliced into halves: 16
bytes in CL1 + 24 bytes in CL2. Maybe we'd better align this union to
%NET_SKB_PAD / %SMP_CACHE_BYTES / ... to avoid this?

(but in bpf-next probably)

>  };
>
>  struct xdp_test_data {

Thanks,
Olek
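For context on the numbers above: with 64-bit pointers, sizeof(struct xdp_buff)
is 56, so the union lands at offset 2 * 56 == 112, and the next 64-byte
cacheline boundary falls at 128, slicing the 40-byte xdp_frame into 16 + 24.
A rough sketch of the alignment Olek floats here, as a hypothetical bpf-next
follow-up rather than part of this fix (the retest later in the thread found
no measurable win):

	/* Hypothetical follow-up, NOT part of the posted patch: pad the
	 * headroom union out to its own cacheline so the 40-byte xdp_frame
	 * is not split 16/24 across the boundary at offset 128. Costs 16
	 * bytes of padding per page with 64-byte cachelines.
	 */
	struct xdp_page_head {
		struct xdp_buff orig_ctx;
		struct xdp_buff ctx;

		union {
			/* ::data_hard_start starts here */
			DECLARE_FLEX_ARRAY(struct xdp_frame, frm);
			DECLARE_FLEX_ARRAY(u8, data);
		} ____cacheline_aligned;
	};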
Alexander Lobakin <alexandr.lobakin@intel.com> writes:

> &xdp_buff and &xdp_frame are bound in a way that
>
> xdp_buff->data_hard_start == xdp_frame
>
> It's always the case and e.g. xdp_convert_buff_to_frame() relies on
> this.
> IOW, the following:
>
> 	for (u32 i = 0; i < 0xdead; i++) {
> 		xdpf = xdp_convert_buff_to_frame(&xdp);
> 		xdp_convert_frame_to_buff(xdpf, &xdp);
> 	}
>
> shouldn't ever modify @xdpf's contents or the pointer itself.
> However, "live packet" code wrongly treats &xdp_frame as part of its
> context placed *before* the data_hard_start. With such flow,
> data_hard_start is sizeof(*xdpf) off to the right and no longer points
> to the XDP frame.

Oh, nice find!

> Instead of replacing `sizeof(ctx)` with `offsetof(ctx, xdpf)` in several
> places and praying that there are no more miscalcs left somewhere in the
> code, unionize ::frm with ::data in a flex array, so that both starts
> pointing to the actual data_hard_start and the XDP frame actually starts
> being a part of it, i.e. a part of the headroom, not the context.
> A nice side effect is that the maximum frame size for this mode gets
> increased by 40 bytes, as xdp_buff::frame_sz includes everything from
> data_hard_start (-> includes xdpf already) to the end of XDP/skb shared
> info.

I like the union approach, however...

> (was found while testing XDP traffic generator on ice, which calls
> xdp_convert_frame_to_buff() for each XDP frame)
>
> Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN")
> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> ---
>  net/bpf/test_run.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index 2723623429ac..c3cce7a8d47d 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -97,8 +97,11 @@ static bool bpf_test_timer_continue(struct bpf_test_timer *t, int iterations,
>  struct xdp_page_head {
>  	struct xdp_buff orig_ctx;
>  	struct xdp_buff ctx;
> -	struct xdp_frame frm;
> -	u8 data[];
> +	union {
> +		/* ::data_hard_start starts here */
> +		DECLARE_FLEX_ARRAY(struct xdp_frame, frm);
> +		DECLARE_FLEX_ARRAY(u8, data);
> +	};

...why does the xdp_frame need to be a flex array? Shouldn't this just be:

+	union {
+		/* ::data_hard_start starts here */
+		struct xdp_frame frm;
+		DECLARE_FLEX_ARRAY(u8, data);
+	};

which would also get rid of the other three hunks of the patch?

-Toke
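Background for the flex-array question: ISO C allows a flexible array member
neither directly inside a union nor as a struct's only member, which is
exactly what DECLARE_FLEX_ARRAY() works around. Simplified from
include/linux/stddef.h of this era (treat the exact spelling as an
approximation):

	#define DECLARE_FLEX_ARRAY(TYPE, NAME)		\
		struct {				\
			struct { } __empty_ ## NAME;	\
			TYPE NAME[];			\
		}

The zero-sized __empty_##NAME member keeps the wrapper struct legal, so
inside the union both ::frm and ::data become zero-sized members aliasing
the same address.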
Alexander Lobakin <alexandr.lobakin@intel.com> writes:

> From: Alexander Lobakin <alexandr.lobakin@intel.com>
> Date: Thu, 9 Feb 2023 18:28:27 +0100
>
>> &xdp_buff and &xdp_frame are bound in a way that
>>
>> xdp_buff->data_hard_start == xdp_frame
>
> [...]
>
>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
>> index 2723623429ac..c3cce7a8d47d 100644
>> --- a/net/bpf/test_run.c
>> +++ b/net/bpf/test_run.c
>> @@ -97,8 +97,11 @@ static bool bpf_test_timer_continue(struct bpf_test_timer *t, int iterations,
>>  struct xdp_page_head {
>>  	struct xdp_buff orig_ctx;
>>  	struct xdp_buff ctx;
>> -	struct xdp_frame frm;
>> -	u8 data[];
>> +	union {
>> +		/* ::data_hard_start starts here */
>> +		DECLARE_FLEX_ARRAY(struct xdp_frame, frm);
>> +		DECLARE_FLEX_ARRAY(u8, data);
>> +	};
>
> BTW, xdp_frame here starts at 112 byte offset, i.e. in 16 bytes a
> cacheline boundary is hit, so xdp_frame gets sliced into halves: 16
> bytes in CL1 + 24 bytes in CL2. Maybe we'd better align this union to
> %NET_SKB_PAD / %SMP_CACHE_BYTES / ... to avoid this?

Hmm, IIRC my reasoning was that both those cache lines will be touched
by the code in xdp_test_run_batch(), so it wouldn't matter? But if
there's a performance benefit I don't mind adding an explicit alignment
annotation, certainly!

> (but in bpf-next probably)

Yeah...

-Toke
From: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Thu, 09 Feb 2023 21:04:38 +0100

> Alexander Lobakin <alexandr.lobakin@intel.com> writes:
>
>> &xdp_buff and &xdp_frame are bound in a way that
>>
>> xdp_buff->data_hard_start == xdp_frame
>>
>> It's always the case and e.g. xdp_convert_buff_to_frame() relies on
>> this.
>> IOW, the following:
>>
>> 	for (u32 i = 0; i < 0xdead; i++) {
>> 		xdpf = xdp_convert_buff_to_frame(&xdp);
>> 		xdp_convert_frame_to_buff(xdpf, &xdp);
>> 	}
>>
>> shouldn't ever modify @xdpf's contents or the pointer itself.
>> However, "live packet" code wrongly treats &xdp_frame as part of its
>> context placed *before* the data_hard_start. With such flow,
>> data_hard_start is sizeof(*xdpf) off to the right and no longer points
>> to the XDP frame.
>
> Oh, nice find!
>
>> Instead of replacing `sizeof(ctx)` with `offsetof(ctx, xdpf)` in several
>> places and praying that there are no more miscalcs left somewhere in the
>> code, unionize ::frm with ::data in a flex array, so that both starts
>> pointing to the actual data_hard_start and the XDP frame actually starts
>> being a part of it, i.e. a part of the headroom, not the context.
>> A nice side effect is that the maximum frame size for this mode gets
>> increased by 40 bytes, as xdp_buff::frame_sz includes everything from
>> data_hard_start (-> includes xdpf already) to the end of XDP/skb shared
>> info.
>
> I like the union approach, however...
>
>> (was found while testing XDP traffic generator on ice, which calls
>> xdp_convert_frame_to_buff() for each XDP frame)
>>
>> Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN")
>> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
>> ---
>>  net/bpf/test_run.c | 13 ++++++++-----
>>  1 file changed, 8 insertions(+), 5 deletions(-)
>>
>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
>> index 2723623429ac..c3cce7a8d47d 100644
>> --- a/net/bpf/test_run.c
>> +++ b/net/bpf/test_run.c
>> @@ -97,8 +97,11 @@ static bool bpf_test_timer_continue(struct bpf_test_timer *t, int iterations,
>>  struct xdp_page_head {
>>  	struct xdp_buff orig_ctx;
>>  	struct xdp_buff ctx;
>> -	struct xdp_frame frm;
>> -	u8 data[];
>> +	union {
>> +		/* ::data_hard_start starts here */
>> +		DECLARE_FLEX_ARRAY(struct xdp_frame, frm);
>> +		DECLARE_FLEX_ARRAY(u8, data);
>> +	};
>
> ...why does the xdp_frame need to be a flex array? Shouldn't this just be:
>
> +	union {
> +		/* ::data_hard_start starts here */
> +		struct xdp_frame frm;
> +		DECLARE_FLEX_ARRAY(u8, data);
> +	};
>
> which would also get rid of the other three hunks of the patch?

That was my first thought. However, as I mentioned in between the lines
in the commitmsg, this doesn't decrease the sizeof(ctx), so we'd have to
replace those sizeofs with offsetof() in a couple places (-> the patch
length would be the same). So I went this way to declare that frm
doesn't belong to ctx but to the headroom. I'm fine either way tho, so
up to you guys.

> -Toke

Thanks,
Olek
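To make the sizeof()/offsetof() tradeoff concrete, here is a standalone
userspace sketch with stand-in types and sizes (head_plain mirrors Toke's
suggestion, head_flex the posted patch; builds with gcc, which accepts the
kernel's zero-sized-struct idiom):

	#include <stdio.h>
	#include <stddef.h>

	/* mirrors the kernel's DECLARE_FLEX_ARRAY() wrapper */
	#define FLEX(TYPE, NAME) \
		struct { struct { } __empty_ ## NAME; TYPE NAME[]; }

	struct frame { unsigned char bytes[40]; };	/* stand-in xdp_frame */

	struct head_plain {			/* Toke's suggested variant */
		unsigned char bufs[112];	/* stand-in for the two xdp_buffs */
		union {
			struct frame frm;	/* plain member: counted in sizeof */
			FLEX(unsigned char, data);
		};
	};

	struct head_flex {			/* the posted patch's variant */
		unsigned char bufs[112];
		union {				/* zero-sized either way */
			FLEX(struct frame, frm);
			FLEX(unsigned char, data);
		};
	};

	int main(void)
	{
		printf("plain: sizeof=%zu offsetof(frm)=%zu\n",	/* 152 vs 112 */
		       sizeof(struct head_plain),
		       offsetof(struct head_plain, frm));
		printf("flex:  sizeof=%zu offsetof(frm)=%zu\n",	/* 112 == 112 */
		       sizeof(struct head_flex),
		       offsetof(struct head_flex, frm));
		return 0;
	}

With head_plain, sizeof() still covers the frame, so size calculations based
on sizeof(*head) would need to become offsetof(); with head_flex, sizeof()
ends exactly where ::data_hard_start begins, which is what lets the patch
leave those call sites alone.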
From: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Thu, 09 Feb 2023 21:58:07 +0100

> Alexander Lobakin <alexandr.lobakin@intel.com> writes:
>
>> From: Alexander Lobakin <alexandr.lobakin@intel.com>
>> Date: Thu, 9 Feb 2023 18:28:27 +0100
>>
>>> &xdp_buff and &xdp_frame are bound in a way that
>>>
>>> xdp_buff->data_hard_start == xdp_frame
>>
>> [...]
>>
>>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
>>> index 2723623429ac..c3cce7a8d47d 100644
>>> --- a/net/bpf/test_run.c
>>> +++ b/net/bpf/test_run.c
>>> @@ -97,8 +97,11 @@ static bool bpf_test_timer_continue(struct bpf_test_timer *t, int iterations,
>>>  struct xdp_page_head {
>>>  	struct xdp_buff orig_ctx;
>>>  	struct xdp_buff ctx;
>>> -	struct xdp_frame frm;
>>> -	u8 data[];
>>> +	union {
>>> +		/* ::data_hard_start starts here */
>>> +		DECLARE_FLEX_ARRAY(struct xdp_frame, frm);
>>> +		DECLARE_FLEX_ARRAY(u8, data);
>>> +	};
>>
>> BTW, xdp_frame here starts at 112 byte offset, i.e. in 16 bytes a
>> cacheline boundary is hit, so xdp_frame gets sliced into halves: 16
>> bytes in CL1 + 24 bytes in CL2. Maybe we'd better align this union to
>> %NET_SKB_PAD / %SMP_CACHE_BYTES / ... to avoid this?
>
> Hmm, IIRC my reasoning was that both those cache lines will be touched
> by the code in xdp_test_run_batch(), so it wouldn't matter? But if
> there's a performance benefit I don't mind adding an explicit alignment
> annotation, certainly!

Let me retest both ways and will see. I saw some huge CPU loads on
reading xdpf in ice_xdp_xmit(), so that was my first thought.

>
>> (but in bpf-next probably)
>
> Yeah...
>
> -Toke

Thanks,
Olek
From: Alexander Lobakin <alexandr.lobakin@intel.com>
Date: Fri, 10 Feb 2023 13:31:28 +0100

> From: Toke Høiland-Jørgensen <toke@redhat.com>
> Date: Thu, 09 Feb 2023 21:58:07 +0100
>
>> Alexander Lobakin <alexandr.lobakin@intel.com> writes:

[...]

>> Hmm, IIRC my reasoning was that both those cache lines will be touched
>> by the code in xdp_test_run_batch(), so it wouldn't matter? But if
>> there's a performance benefit I don't mind adding an explicit alignment
>> annotation, certainly!
>
> Let me retest both ways and will see. I saw some huge CPU loads on
> reading xdpf in ice_xdp_xmit(), so that was my first thought.

No visible difference in perf and CPU load... Ok, aligning isn't worth it.

>
>>
>>> (but in bpf-next probably)
>>
>> Yeah...
>>
>> -Toke
>>

Olek
Alexander Lobakin <alexandr.lobakin@intel.com> writes:

> From: Toke Høiland-Jørgensen <toke@redhat.com>
> Date: Thu, 09 Feb 2023 21:04:38 +0100
>
>> Alexander Lobakin <alexandr.lobakin@intel.com> writes:
>>
>>> &xdp_buff and &xdp_frame are bound in a way that
>>>
>>> xdp_buff->data_hard_start == xdp_frame
>>>
>>> It's always the case and e.g. xdp_convert_buff_to_frame() relies on
>>> this.
>>> IOW, the following:
>>>
>>> 	for (u32 i = 0; i < 0xdead; i++) {
>>> 		xdpf = xdp_convert_buff_to_frame(&xdp);
>>> 		xdp_convert_frame_to_buff(xdpf, &xdp);
>>> 	}
>>>
>>> shouldn't ever modify @xdpf's contents or the pointer itself.
>>> However, "live packet" code wrongly treats &xdp_frame as part of its
>>> context placed *before* the data_hard_start. With such flow,
>>> data_hard_start is sizeof(*xdpf) off to the right and no longer points
>>> to the XDP frame.
>>
>> Oh, nice find!
>>
>>> Instead of replacing `sizeof(ctx)` with `offsetof(ctx, xdpf)` in several
>>> places and praying that there are no more miscalcs left somewhere in the
>>> code, unionize ::frm with ::data in a flex array, so that both starts
>>> pointing to the actual data_hard_start and the XDP frame actually starts
>>> being a part of it, i.e. a part of the headroom, not the context.
>>> A nice side effect is that the maximum frame size for this mode gets
>>> increased by 40 bytes, as xdp_buff::frame_sz includes everything from
>>> data_hard_start (-> includes xdpf already) to the end of XDP/skb shared
>>> info.
>>
>> I like the union approach, however...
>>
>>> (was found while testing XDP traffic generator on ice, which calls
>>> xdp_convert_frame_to_buff() for each XDP frame)
>>>
>>> Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN")
>>> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
>>> ---
>>>  net/bpf/test_run.c | 13 ++++++++-----
>>>  1 file changed, 8 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
>>> index 2723623429ac..c3cce7a8d47d 100644
>>> --- a/net/bpf/test_run.c
>>> +++ b/net/bpf/test_run.c
>>> @@ -97,8 +97,11 @@ static bool bpf_test_timer_continue(struct bpf_test_timer *t, int iterations,
>>>  struct xdp_page_head {
>>>  	struct xdp_buff orig_ctx;
>>>  	struct xdp_buff ctx;
>>> -	struct xdp_frame frm;
>>> -	u8 data[];
>>> +	union {
>>> +		/* ::data_hard_start starts here */
>>> +		DECLARE_FLEX_ARRAY(struct xdp_frame, frm);
>>> +		DECLARE_FLEX_ARRAY(u8, data);
>>> +	};
>>
>> ...why does the xdp_frame need to be a flex array? Shouldn't this just be:
>>
>> +	union {
>> +		/* ::data_hard_start starts here */
>> +		struct xdp_frame frm;
>> +		DECLARE_FLEX_ARRAY(u8, data);
>> +	};
>>
>> which would also get rid of the other three hunks of the patch?
>
> That was my first thought. However, as I mentioned in between the lines
> in the commitmsg, this doesn't decrease the sizeof(ctx), so we'd have to
> replace those sizeofs with offsetof() in a couple places (-> the patch
> length would be the same). So I went this way to declare that frm
> doesn't belong to ctx but to the headroom.

Ah, right, I see! Okay, let's keep both as flex arrays, then.

One other nit, though: after your patch, we'll end up with this:

frm = head->frm;
data = &head->data;

both of those assignments refer to flex arrays, which seems a bit
inconsistent. The second one works because it's assigning to a void
pointer, so the compiler doesn't complain about the type mismatch; but
it should work with just 'data = head->data' as well, so can we update
that as well for consistency?

-Toke
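The type detail behind the nit, as a minimal userspace sketch (stand-in
struct, not the kernel code):

	#include <stdio.h>
	#include <stdlib.h>

	struct head {
		int hdr;
		unsigned char data[];	/* like ::data in xdp_page_head */
	};

	int main(void)
	{
		struct head *head = malloc(sizeof(*head) + 64);
		void *a, *b;

		if (!head)
			return 1;

		a = head->data;		/* decays to unsigned char * */
		b = &head->data;	/* unsigned char (*)[]: same address,
					 * different type; it only assigns
					 * without a cast because the target
					 * is void * */

		printf("same address: %s\n", a == b ? "yes" : "no");

		free(head);
		return 0;
	}

Both forms yield the same address, so the generated code is identical; the
request is purely about source-level consistency with the new
`frm = head->frm;` sites.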
On Thu, 9 Feb 2023 18:28:27 +0100 Alexander Lobakin wrote:
> -	struct xdp_frame frm;

BTW could this be a chance to rename this? First time I read your
commit msg I thought frm stood for "from".
From: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Fri, 10 Feb 2023 18:38:45 +0100

> Alexander Lobakin <alexandr.lobakin@intel.com> writes:
>
>> From: Toke Høiland-Jørgensen <toke@redhat.com>
>> Date: Thu, 09 Feb 2023 21:04:38 +0100

[...]

> both of those assignments refer to flex arrays, which seems a bit
> inconsistent. The second one works because it's assigning to a void
> pointer, so the compiler doesn't complain about the type mismatch; but
> it should work with just 'data = head->data' as well, so can we update
> that as well for consistency?

Aaaah, I see, you're right. Will do in a minute.

>
> -Toke

Thanks,
Olek
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 2723623429ac..c3cce7a8d47d 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -97,8 +97,11 @@ static bool bpf_test_timer_continue(struct bpf_test_timer *t, int iterations,
 struct xdp_page_head {
 	struct xdp_buff orig_ctx;
 	struct xdp_buff ctx;
-	struct xdp_frame frm;
-	u8 data[];
+	union {
+		/* ::data_hard_start starts here */
+		DECLARE_FLEX_ARRAY(struct xdp_frame, frm);
+		DECLARE_FLEX_ARRAY(u8, data);
+	};
 };
 
 struct xdp_test_data {
@@ -132,7 +135,7 @@ static void xdp_test_run_init_page(struct page *page, void *arg)
 	headroom -= meta_len;
 
 	new_ctx = &head->ctx;
-	frm = &head->frm;
+	frm = head->frm;
 	data = &head->data;
 	memcpy(data + headroom, orig_ctx->data_meta, frm_len);
 
@@ -223,7 +226,7 @@ static void reset_ctx(struct xdp_page_head *head)
 	head->ctx.data = head->orig_ctx.data;
 	head->ctx.data_meta = head->orig_ctx.data_meta;
 	head->ctx.data_end = head->orig_ctx.data_end;
-	xdp_update_frame_from_buff(&head->ctx, &head->frm);
+	xdp_update_frame_from_buff(&head->ctx, head->frm);
 }
 
 static int xdp_recv_frames(struct xdp_frame **frames, int nframes,
@@ -285,7 +288,7 @@ static int xdp_test_run_batch(struct xdp_test_data *xdp, struct bpf_prog *prog,
 		head = phys_to_virt(page_to_phys(page));
 		reset_ctx(head);
 		ctx = &head->ctx;
-		frm = &head->frm;
+		frm = head->frm;
 		xdp->frame_cnt++;
 
 		act = bpf_prog_run_xdp(prog, ctx);
&xdp_buff and &xdp_frame are bound in a way that

xdp_buff->data_hard_start == xdp_frame

It's always the case and e.g. xdp_convert_buff_to_frame() relies on
this.
IOW, the following:

	for (u32 i = 0; i < 0xdead; i++) {
		xdpf = xdp_convert_buff_to_frame(&xdp);
		xdp_convert_frame_to_buff(xdpf, &xdp);
	}

shouldn't ever modify @xdpf's contents or the pointer itself.
However, "live packet" code wrongly treats &xdp_frame as part of its
context placed *before* the data_hard_start. With such flow,
data_hard_start is sizeof(*xdpf) off to the right and no longer points
to the XDP frame.

Instead of replacing `sizeof(ctx)` with `offsetof(ctx, xdpf)` in several
places and praying that there are no more miscalcs left somewhere in the
code, unionize ::frm with ::data in a flex array, so that both starts
pointing to the actual data_hard_start and the XDP frame actually starts
being a part of it, i.e. a part of the headroom, not the context.
A nice side effect is that the maximum frame size for this mode gets
increased by 40 bytes, as xdp_buff::frame_sz includes everything from
data_hard_start (-> includes xdpf already) to the end of XDP/skb shared
info.

(was found while testing XDP traffic generator on ice, which calls
xdp_convert_frame_to_buff() for each XDP frame)

Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
---
 net/bpf/test_run.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)
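For the record, the 40-byte gain in the side-effect paragraph is
sizeof(struct xdp_frame) on 64-bit, consistent with the 16 + 24 cacheline
split mentioned elsewhere in the thread. A compile-time guard one could add
to net/bpf/test_run.c (an assumption for illustration, not part of the
posted patch, and it would only hold on 64-bit):

	#include <linux/build_bug.h>
	#include <net/xdp.h>

	/* The LIVE_FRAMES headroom gain equals sizeof(struct xdp_frame);
	 * re-check the frame_sz math if this ever changes (64-bit only).
	 */
	static_assert(sizeof(struct xdp_frame) == 40);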