
[bpf-next,v5] bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES

Message ID 20230220154627.72267-1-aleksander.lobakin@intel.com (mailing list archive)
State Superseded
Delegated to: BPF
Series [bpf-next,v5] bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 36 this patch: 36
netdev/cc_maintainers warning 12 maintainers not CCed: linux-kselftest@vger.kernel.org john.fastabend@gmail.com pabeni@redhat.com sdf@google.com shuah@kernel.org jolsa@kernel.org edumazet@google.com mykolal@fb.com haoluo@google.com yhs@fb.com kpsingh@kernel.org davem@davemloft.net
netdev/build_clang success Errors and warnings before: 1 this patch: 1
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 36 this patch: 36
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 65 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ${{ matrix.test }} on ${{ matrix.arch }} with ${{ matrix.toolchain }}
bpf/vmtest-bpf-next-VM_Test-2 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-3 fail Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-4 fail Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-5 fail Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-8 success Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-9 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-14 fail Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-15 fail Logs for test_progs on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32 on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-22 fail Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_progs_no_alu32_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_progs_no_alu32_parallel on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for test_progs_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-31 success Logs for test_progs_parallel on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-32 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-33 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-34 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-35 success Logs for test_verifier on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-36 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-37 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-38 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-PR fail PR summary

Commit Message

Alexander Lobakin Feb. 20, 2023, 3:46 p.m. UTC
&xdp_buff and &xdp_frame are bound in a way that

xdp_buff->data_hard_start == xdp_frame

This always holds, and e.g. xdp_convert_buff_to_frame() relies on it.
IOW, the following:

	for (u32 i = 0; i < 0xdead; i++) {
		xdpf = xdp_convert_buff_to_frame(&xdp);
		xdp_convert_frame_to_buff(xdpf, &xdp);
	}

shouldn't ever modify @xdpf's contents or the pointer itself.
However, the "live packet" code wrongly treats the &xdp_frame as part
of its context, placed *before* the data_hard_start. With such a
layout, data_hard_start ends up sizeof(*xdpf) off to the right and no
longer points to the XDP frame.
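
To illustrate, a rough sketch of the old layout (the actual struct is
in the diff below):

	struct xdp_page_head {
		struct xdp_buff orig_ctx;
		struct xdp_buff ctx;
		struct xdp_frame frm;	/* test_run tracks the frame here */
		u8 data[];		/* but data_hard_start points here */
	};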

Instead of replacing `sizeof(ctx)` with `offsetof(ctx, xdpf)` in several
places and praying that there are no more miscalcs left somewhere in the
code, unionize ::frm with ::data in a flex array, so that both start
pointing at the actual data_hard_start and the XDP frame actually
becomes a part of it, i.e. a part of the headroom, not the context.
A nice side effect is that the maximum frame size for this mode gets
increased by 40 bytes, as xdp_buff::frame_sz includes everything from
data_hard_start (which now already covers xdpf) to the end of the
XDP/skb shared info.
Also update %MAX_PKT_SIZE accordingly in the selftests code. Leave it
hardcoded for 64 bit && 4k pages; it can be made more flexible later on.
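
For reference, with 4k pages and 64-byte cachelines the new limit works
out as follows (a sketch; it assumes sizeof(struct xdp_buff) == 56 and
sizeof(struct xdp_frame) == 40 on 64 bit):

	PAGE_SIZE                                          4096
	- sizeof(struct xdp_page_head)                    - 112	(2 * sizeof(struct xdp_buff))
	- SKB_DATA_ALIGN(sizeof(struct skb_shared_info))  - 320	(512 with 256-byte cachelines)
	- XDP_PACKET_HEADROOM                             - 256
	                                                  = 3408	(3216 on s390x)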

Minor: align `&head->data` with how `head->frm` is assigned for
consistency.
Minor #2: rename 'frm' to 'frame' in &xdp_page_head while at it for
clarity.

(this was found while testing an XDP traffic generator on ice, which
 calls xdp_convert_frame_to_buff() for each XDP frame)

Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN")
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20230215185440.4126672-1-aleksander.lobakin@intel.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
---
 net/bpf/test_run.c                            | 19 +++++++++++++------
 .../bpf/prog_tests/xdp_do_redirect.c          |  7 ++++---
 2 files changed, 17 insertions(+), 9 deletions(-)

Comments

Alexander Lobakin Feb. 21, 2023, 12:35 p.m. UTC | #1
From: Alexander Lobakin <aleksander.lobakin@intel.com>
Date: Mon, 20 Feb 2023 16:46:27 +0100

> [...]

Sorry, maybe this could be taken directly to net-next while it's still
open? It was tested and then reverted from bpf-next only due to a
compile-time assertion that wasn't 100% reliable; I removed it in this
version. No more changes. I doubt there'll be a second PR from bpf and
would like this to hit mainline before RC1 :s

> 
> Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN")
> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
> Link: https://lore.kernel.org/r/20230215185440.4126672-1-aleksander.lobakin@intel.com
> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
(>_< those two last tags are incorrect, lemme know if I should resubmit
 it without them or you could do it if ok with taking it now)

Thanks,
Olek
Martin KaFai Lau Feb. 21, 2023, 5:52 p.m. UTC | #2
On 2/21/23 4:35 AM, Alexander Lobakin wrote:
> From: Alexander Lobakin <aleksander.lobakin@intel.com>
> Date: Mon, 20 Feb 2023 16:46:27 +0100
> 
>> [...]
> 
> Sorry, maybe this could be taken directly to net-next while it's still
> open? It was tested and then reverted from bpf-next only due to a
> compile-time assertion that wasn't 100% reliable; I removed it in this
> version. No more changes. I doubt there'll be a second PR from bpf and
> would like this to hit mainline before RC1 :s

I think this could go to bpf soon instead of bpf-next. The change is specific to 
the bpf selftest. It is better to go through bpf to get bpf CI coverage.

> 
>>
>> Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN")
>> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
>> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
>> Link: https://lore.kernel.org/r/20230215185440.4126672-1-aleksander.lobakin@intel.com
>> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
> (>_< those two last tags are incorrect, lemme know if I should resubmit
>   it without them or you could do it if ok with taking it now)

Please respin when it can be landed to the bpf tree on top of the s390 changes.
Alexander Lobakin Feb. 21, 2023, 5:55 p.m. UTC | #3
From: Martin KaFai Lau <martin.lau@linux.dev>
Date: Tue, 21 Feb 2023 09:52:52 -0800

> On 2/21/23 4:35 AM, Alexander Lobakin wrote:
>> From: Alexander Lobakin <aleksander.lobakin@intel.com>
>> Date: Mon, 20 Feb 2023 16:46:27 +0100
>>
>>> [...]
>>
>> Sorry, maybe this could be taken directly to net-next while it's still
>> open? It was tested and then reverted from bpf-next only due to a
>> compile-time assertion that wasn't 100% reliable; I removed it in this
>> version. No more changes. I doubt there'll be a second PR from bpf and
>> would like this to hit mainline before RC1 :s
> 
> I think this could go to bpf soon instead of bpf-next. The change is
> specific to the bpf selftest. It is better to go through bpf to get bpf
> CI coverage.

Ah okay, I'll resend once bpf pulls in the net-next merge from Linus'
tree.

> 
>> [...]
> 
> Please respin when it can be landed to the bpf tree on top of the s390
> changes.

Thanks,
Olek

Patch

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 6f3d654b3339..f81b24320a36 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -97,8 +97,11 @@  static bool bpf_test_timer_continue(struct bpf_test_timer *t, int iterations,
 struct xdp_page_head {
 	struct xdp_buff orig_ctx;
 	struct xdp_buff ctx;
-	struct xdp_frame frm;
-	u8 data[];
+	union {
+		/* ::data_hard_start starts here */
+		DECLARE_FLEX_ARRAY(struct xdp_frame, frame);
+		DECLARE_FLEX_ARRAY(u8, data);
+	};
 };
 
 struct xdp_test_data {
@@ -113,6 +116,10 @@  struct xdp_test_data {
 	u32 frame_cnt;
 };
 
+/* tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c:%MAX_PKT_SIZE
+ * must be updated accordingly if this gets changed, otherwise BPF selftests
+ * will fail.
+ */
 #define TEST_XDP_FRAME_SIZE (PAGE_SIZE - sizeof(struct xdp_page_head))
 #define TEST_XDP_MAX_BATCH 256
 
@@ -132,8 +139,8 @@  static void xdp_test_run_init_page(struct page *page, void *arg)
 	headroom -= meta_len;
 
 	new_ctx = &head->ctx;
-	frm = &head->frm;
-	data = &head->data;
+	frm = head->frame;
+	data = head->data;
 	memcpy(data + headroom, orig_ctx->data_meta, frm_len);
 
 	xdp_init_buff(new_ctx, TEST_XDP_FRAME_SIZE, &xdp->rxq);
@@ -223,7 +230,7 @@  static void reset_ctx(struct xdp_page_head *head)
 	head->ctx.data = head->orig_ctx.data;
 	head->ctx.data_meta = head->orig_ctx.data_meta;
 	head->ctx.data_end = head->orig_ctx.data_end;
-	xdp_update_frame_from_buff(&head->ctx, &head->frm);
+	xdp_update_frame_from_buff(&head->ctx, head->frame);
 }
 
 static int xdp_recv_frames(struct xdp_frame **frames, int nframes,
@@ -285,7 +292,7 @@  static int xdp_test_run_batch(struct xdp_test_data *xdp, struct bpf_prog *prog,
 		head = phys_to_virt(page_to_phys(page));
 		reset_ctx(head);
 		ctx = &head->ctx;
-		frm = &head->frm;
+		frm = head->frame;
 		xdp->frame_cnt++;
 
 		act = bpf_prog_run_xdp(prog, ctx);
diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
index 2666c84dbd01..7271a18ab3e2 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
@@ -65,12 +65,13 @@  static int attach_tc_prog(struct bpf_tc_hook *hook, int fd)
 }
 
 /* The maximum permissible size is: PAGE_SIZE - sizeof(struct xdp_page_head) -
- * sizeof(struct skb_shared_info) - XDP_PACKET_HEADROOM = 3368 bytes
+ * SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) - XDP_PACKET_HEADROOM =
+ * 3408 bytes for 64-byte cacheline and 3216 for 256-byte one.
  */
 #if defined(__s390x__)
-#define MAX_PKT_SIZE 3176
+#define MAX_PKT_SIZE 3216
 #else
-#define MAX_PKT_SIZE 3368
+#define MAX_PKT_SIZE 3408
 #endif
 static void test_max_pkt_size(int fd)
 {