[v5,bpf-next,10/24] xsk: add new netlink attribute dedicated for ZC max frags

Message ID 20230706204650.469087-11-maciej.fijalkowski@intel.com
State Superseded
Delegated to: BPF
Series xsk: multi-buffer support

Checks

Context Check Description
netdev/series_format fail Series longer than 15 patches (and no cover letter)
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 4172 this patch: 4172
netdev/cc_maintainers warning 17 maintainers not CCed: kuba@kernel.org hawk@kernel.org lorenzo@kernel.org yhs@fb.com kpsingh@kernel.org martin.lau@linux.dev memxor@gmail.com john.fastabend@gmail.com sdf@google.com alardam@gmail.com tariqt@nvidia.com song@kernel.org jolsa@kernel.org davem@davemloft.net pabeni@redhat.com haoluo@google.com edumazet@google.com
netdev/build_clang fail Errors and warnings before: 893 this patch: 896
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 4389 this patch: 4389
netdev/checkpatch warning WARNING: line length of 81 exceeds 80 columns
netdev/kdoc fail Errors and warnings before: 118 this patch: 119
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-8 success Logs for test_maps on s390x with gcc
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-6 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-7 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 fail Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-14 fail Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 fail Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 fail Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for veristat
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 fail Logs for test_progs on s390x with gcc

Commit Message

Maciej Fijalkowski July 6, 2023, 8:46 p.m. UTC
Introduce a new netlink attribute, NETDEV_A_DEV_XDP_ZC_MAX_SEGS, that
carries the maximum number of fragments that the underlying ZC driver is
able to handle on the TX side. It is included in the netlink response
only when the driver supports ZC. Any value higher than 1 implies
multi-buffer ZC support on the underlying device.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 Documentation/netlink/specs/netdev.yaml | 6 ++++++
 include/linux/netdevice.h               | 1 +
 include/uapi/linux/netdev.h             | 1 +
 net/core/dev.c                          | 1 +
 net/core/netdev-genl.c                  | 8 ++++++++
 tools/include/uapi/linux/netdev.h       | 1 +
 tools/lib/bpf/libbpf.h                  | 1 +
 tools/lib/bpf/netlink.c                 | 5 +++++
 8 files changed, 24 insertions(+)
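
For context, a minimal sketch of what the driver side is expected to look
like once the rest of the series wires it up - the function name and the
value 3 are purely illustrative; only dev->xdp_zc_max_segs (defaulted to 1
in alloc_netdev_mqs() below) and the NETDEV_XDP_ACT_XSK_ZEROCOPY feature
bit come from this patch:

	/* hypothetical ZC-capable driver advertising its TX frag limit */
	static void foo_set_xdp_features(struct net_device *dev)
	{
		xdp_set_features_flag(dev, NETDEV_XDP_ACT_BASIC |
					   NETDEV_XDP_ACT_REDIRECT |
					   NETDEV_XDP_ACT_XSK_ZEROCOPY);
		/* max frags the ZC TX path can chain; anything > 1 means
		 * multi-buffer ZC support, 1 (the default) means none
		 */
		dev->xdp_zc_max_segs = 3;
	}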

Comments

Alexei Starovoitov July 11, 2023, 1:09 a.m. UTC | #1
On Thu, Jul 6, 2023 at 1:47 PM Maciej Fijalkowski
<maciej.fijalkowski@intel.com> wrote:
>
> Introduce new netlink attribute NETDEV_A_DEV_XDP_ZC_MAX_SEGS that will
> carry maximum fragments that underlying ZC driver is able to handle on
> TX side. It is going to be included in netlink response only when driver
> supports ZC. Any value higher than 1 implies multi-buffer ZC support on
> underlying device.
>
> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

I suspect something in this patch makes XDP bonding test fail.
See BPF CI.

I can reproduce the failure locally as well.
test_progs -t bond
works without the series and fails with them.
Maciej Fijalkowski July 13, 2023, 6:58 p.m. UTC | #2
On Mon, Jul 10, 2023 at 06:09:28PM -0700, Alexei Starovoitov wrote:
> On Thu, Jul 6, 2023 at 1:47 PM Maciej Fijalkowski
> <maciej.fijalkowski@intel.com> wrote:
> >
> > Introduce new netlink attribute NETDEV_A_DEV_XDP_ZC_MAX_SEGS that will
> > carry maximum fragments that underlying ZC driver is able to handle on
> > TX side. It is going to be included in netlink response only when driver
> > supports ZC. Any value higher than 1 implies multi-buffer ZC support on
> > underlying device.
> >
> > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> 
> I suspect something in this patch makes XDP bonding test fail.
> See BPF CI.
> 
> I can reproduce the failure locally as well.
> test_progs -t bond
> works without the series and fails with them.

Hi Alexei,

this fails on the second bpf_xdp_query() call due to non-zero (?) contents at
the end of the bpf_xdp_query_opts struct - currently it looks as follows:

$ pahole -C bpf_xdp_query_opts libbpf.so
struct bpf_xdp_query_opts {
	size_t                     sz;                   /*     0     8 */
	__u32                      prog_id;              /*     8     4 */
	__u32                      drv_prog_id;          /*    12     4 */
	__u32                      hw_prog_id;           /*    16     4 */
	__u32                      skb_prog_id;          /*    20     4 */
	__u8                       attach_mode;          /*    24     1 */

	/* XXX 7 bytes hole, try to pack */

	__u64                      feature_flags;        /*    32     8 */
	__u32                      xdp_zc_max_segs;      /*    40     4 */

	/* size: 48, cachelines: 1, members: 8 */
	/* sum members: 37, holes: 1, sum holes: 7 */
	/* padding: 4 */
	/* last cacheline: 48 bytes */
};

The fix is either to move xdp_zc_max_segs up into the existing hole or to zero
out the struct before the bpf_xdp_query() calls, like:

	memset(&query_opts, 0, sizeof(struct bpf_xdp_query_opts));
	query_opts.sz = sizeof(struct bpf_xdp_query_opts);

I am kinda confused, as this is happening due to two things. First off,
the bonding driver sets its xdp_features to NETDEV_XDP_ACT_MASK, which
implies the ZC feature is enabled and makes xdp_zc_max_segs be included
in the response (its value is 1, as that is the default).

Then, offsetofend(struct type, type##__last_field), which is used as one
of the libbpf_validate_opts() args, gives me 40, but bpf_xdp_query_opts::sz
has 48 stored, so in the end we go through the last 8 bytes in
libbpf_is_mem_zeroed() and hit the '1' from xdp_zc_max_segs.

So, (silly) questions:
- why bonding driver defaults to all features enabled?
- why __last_field does not recognize xdp_zc_max_segs at the end?

Besides, I think I'll move xdp_zc_max_segs up into the hole. This fixes
the bonding test for me.
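
For readers following the 40-vs-48 mismatch above, this is roughly the
check that fires - a paraphrase of libbpf's opts validation, not the
verbatim source:

	/* opts_sz = offsetofend(struct TYPE, TYPE##__last_field), i.e. the
	 * last field this libbpf copy knows about; user_sz = opts->sz as set
	 * by the caller (48 here vs. an opts_sz of 40).
	 */
	if (user_sz > opts_sz &&
	    !libbpf_is_mem_zeroed((const char *)opts + opts_sz, user_sz - opts_sz)) {
		/* the stale '1' in xdp_zc_max_segs lands here and
		 * bpf_xdp_query() rejects the opts with -EINVAL
		 */
		pr_warn("%s has non-zero extra bytes\n", type_name);
		return false;
	}
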
Alexei Starovoitov July 13, 2023, 11:02 p.m. UTC | #3
On Thu, Jul 13, 2023 at 11:59 AM Maciej Fijalkowski
<maciej.fijalkowski@intel.com> wrote:
>
> On Mon, Jul 10, 2023 at 06:09:28PM -0700, Alexei Starovoitov wrote:
> > On Thu, Jul 6, 2023 at 1:47 PM Maciej Fijalkowski
> > <maciej.fijalkowski@intel.com> wrote:
> > >
> > > Introduce new netlink attribute NETDEV_A_DEV_XDP_ZC_MAX_SEGS that will
> > > carry maximum fragments that underlying ZC driver is able to handle on
> > > TX side. It is going to be included in netlink response only when driver
> > > supports ZC. Any value higher than 1 implies multi-buffer ZC support on
> > > underlying device.
> > >
> > > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> >
> > I suspect something in this patch makes XDP bonding test fail.
> > See BPF CI.
> >
> > I can reproduce the failure locally as well.
> > test_progs -t bond
> > works without the series and fails with them.
>
> Hi Alexei,
>
> this fails on second bpf_xdp_query() call due to non-zero (?) contents at the
> end of bpf_xdp_query_opts struct - currently it looks as following:
>
> $ pahole -C bpf_xdp_query_opts libbpf.so
> struct bpf_xdp_query_opts {
>         size_t                     sz;                   /*     0     8 */
>         __u32                      prog_id;              /*     8     4 */
>         __u32                      drv_prog_id;          /*    12     4 */
>         __u32                      hw_prog_id;           /*    16     4 */
>         __u32                      skb_prog_id;          /*    20     4 */
>         __u8                       attach_mode;          /*    24     1 */
>
>         /* XXX 7 bytes hole, try to pack */
>
>         __u64                      feature_flags;        /*    32     8 */
>         __u32                      xdp_zc_max_segs;      /*    40     4 */
>
>         /* size: 48, cachelines: 1, members: 8 */
>         /* sum members: 37, holes: 1, sum holes: 7 */
>         /* padding: 4 */
>         /* last cacheline: 48 bytes */
> };
>
> Fix is either to move xdp_zc_max_segs up to existing hole or to zero out
> struct before bpf_xdp_query() calls, like:
>
>         memset(&query_opts, 0, sizeof(struct bpf_xdp_query_opts));
>         query_opts.sz = sizeof(struct bpf_xdp_query_opts);

Right. That would be good to have to clear the hole,
but probably unrelated.

> I am kinda confused as this is happening due to two things. First off
> bonding driver sets its xdp_features to NETDEV_XDP_ACT_MASK and in turn
> this implies ZC feature enabled which makes xdp_zc_max_segs being included
> in the response (it's value is 1 as it's the default).
>
> Then, offsetofend(struct type, type##__last_field) that is used as one of
> libbpf_validate_opts() args gives me 40 but bpf_xdp_query_opts::sz has
> stored 48, so in the end we go through the last 8 bytes in
> libbpf_is_mem_zeroed() and we hit the '1' from xdp_zc_max_segs.

Because this patch didn't update
bpf_xdp_query_opts__last_field

It added a new field, but didn't update the macro.

> So, (silly) questions:
> - why bonding driver defaults to all features enabled?

doesn't really matter in this context.

> - why __last_field does not recognize xdp_zc_max_segs at the end?

because the patch didn't update it :)

> Besides, I think i'll move xdp_zc_max_segs above to the hole. This fixes
> the bonding test for me.

No. Keep it at the end.
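
In other words, keeping the new member where it is only needs the one-line
update the patch missed in tools/lib/bpf/libbpf.h - presumably what the v6
respin adds on top of the hunk shown in the Patch section below:

 	__u64 feature_flags;	/* output */
 	__u32 xdp_zc_max_segs;	/* output */
 	size_t :0;
 };
-#define bpf_xdp_query_opts__last_field feature_flags
+#define bpf_xdp_query_opts__last_field xdp_zc_max_segs
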
Maciej Fijalkowski July 14, 2023, 9:17 a.m. UTC | #4
On Thu, Jul 13, 2023 at 04:02:42PM -0700, Alexei Starovoitov wrote:
> On Thu, Jul 13, 2023 at 11:59 AM Maciej Fijalkowski
> <maciej.fijalkowski@intel.com> wrote:
> >
> > On Mon, Jul 10, 2023 at 06:09:28PM -0700, Alexei Starovoitov wrote:
> > > On Thu, Jul 6, 2023 at 1:47 PM Maciej Fijalkowski
> > > <maciej.fijalkowski@intel.com> wrote:
> > > >
> > > > Introduce new netlink attribute NETDEV_A_DEV_XDP_ZC_MAX_SEGS that will
> > > > carry maximum fragments that underlying ZC driver is able to handle on
> > > > TX side. It is going to be included in netlink response only when driver
> > > > supports ZC. Any value higher than 1 implies multi-buffer ZC support on
> > > > underlying device.
> > > >
> > > > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > >
> > > I suspect something in this patch makes XDP bonding test fail.
> > > See BPF CI.
> > >
> > > I can reproduce the failure locally as well.
> > > test_progs -t bond
> > > works without the series and fails with them.
> >
> > Hi Alexei,
> >
> > this fails on second bpf_xdp_query() call due to non-zero (?) contents at the
> > end of bpf_xdp_query_opts struct - currently it looks as following:
> >
> > $ pahole -C bpf_xdp_query_opts libbpf.so
> > struct bpf_xdp_query_opts {
> >         size_t                     sz;                   /*     0     8 */
> >         __u32                      prog_id;              /*     8     4 */
> >         __u32                      drv_prog_id;          /*    12     4 */
> >         __u32                      hw_prog_id;           /*    16     4 */
> >         __u32                      skb_prog_id;          /*    20     4 */
> >         __u8                       attach_mode;          /*    24     1 */
> >
> >         /* XXX 7 bytes hole, try to pack */
> >
> >         __u64                      feature_flags;        /*    32     8 */
> >         __u32                      xdp_zc_max_segs;      /*    40     4 */
> >
> >         /* size: 48, cachelines: 1, members: 8 */
> >         /* sum members: 37, holes: 1, sum holes: 7 */
> >         /* padding: 4 */
> >         /* last cacheline: 48 bytes */
> > };
> >
> > Fix is either to move xdp_zc_max_segs up to existing hole or to zero out
> > struct before bpf_xdp_query() calls, like:
> >
> >         memset(&query_opts, 0, sizeof(struct bpf_xdp_query_opts));
> >         query_opts.sz = sizeof(struct bpf_xdp_query_opts);
> 
> Right. That would be good to have to clear the hole,
> but probably unrelated.
> 
> > I am kinda confused as this is happening due to two things. First off
> > bonding driver sets its xdp_features to NETDEV_XDP_ACT_MASK and in turn
> > this implies ZC feature enabled which makes xdp_zc_max_segs being included
> > in the response (it's value is 1 as it's the default).
> >
> > Then, offsetofend(struct type, type##__last_field) that is used as one of
> > libbpf_validate_opts() args gives me 40 but bpf_xdp_query_opts::sz has
> > stored 48, so in the end we go through the last 8 bytes in
> > libbpf_is_mem_zeroed() and we hit the '1' from xdp_zc_max_segs.
> 
> Because this patch didn't update
> bpf_xdp_query_opts__last_field

Heh, bummer, this was right in front of me the whole time. I think I need
a break :) but before that let me send v6. Thanks Alexei.

> 
> It added a new field, but didn't update the macro.
> 
> > So, (silly) questions:
> > - why bonding driver defaults to all features enabled?
> 
> doesn't really matter in this context.
> 
> > - why __last_field does not recognize xdp_zc_max_segs at the end?
> 
> because the patch didn't update it :)
> 
> > Besides, I think i'll move xdp_zc_max_segs above to the hole. This fixes
> > the bonding test for me.
> 
> No. Keep it at the end.
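
As a footnote to the "clear the hole" point above: opts structs declared
via libbpf's LIBBPF_OPTS() macro are already zeroed with a memset() (so
holes are cleared too) at declaration time - the macro expands to roughly
the following - so only a struct reused after a previous bpf_xdp_query()
call has written a non-zero xdp_zc_max_segs into it needs the explicit
memset() suggested earlier in the thread:

	/* paraphrase of LIBBPF_OPTS() from tools/lib/bpf/libbpf_common.h */
	#define LIBBPF_OPTS(TYPE, NAME, ...)				\
		struct TYPE NAME = ({					\
			memset(&NAME, 0, sizeof(struct TYPE));		\
			(struct TYPE) {					\
				.sz = sizeof(struct TYPE),		\
				__VA_ARGS__				\
			};						\
		})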

Patch

diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index b99e7ffef7a1..e41015310a6e 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -62,6 +62,12 @@  attribute-sets:
         type: u64
         enum: xdp-act
         enum-as-flags: true
+      -
+        name: xdp_zc_max_segs
+        doc: max fragment count supported by ZC driver
+        type: u32
+        checks:
+          min: 1
 
 operations:
   list:
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index b828c7a75be2..b12477ea4032 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2250,6 +2250,7 @@  struct net_device {
 #define GRO_MAX_SIZE		(8 * 65535u)
 	unsigned int		gro_max_size;
 	unsigned int		gro_ipv4_max_size;
+	unsigned int		xdp_zc_max_segs;
 	rx_handler_func_t __rcu	*rx_handler;
 	void __rcu		*rx_handler_data;
 
diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index 639524b59930..bf71698a1e82 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -41,6 +41,7 @@  enum {
 	NETDEV_A_DEV_IFINDEX = 1,
 	NETDEV_A_DEV_PAD,
 	NETDEV_A_DEV_XDP_FEATURES,
+	NETDEV_A_DEV_XDP_ZC_MAX_SEGS,
 
 	__NETDEV_A_DEV_MAX,
 	NETDEV_A_DEV_MAX = (__NETDEV_A_DEV_MAX - 1)
diff --git a/net/core/dev.c b/net/core/dev.c
index 69a3e544676c..b14dd28eb51e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10617,6 +10617,7 @@  struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 	dev_net_set(dev, &init_net);
 
 	dev->gso_max_size = GSO_LEGACY_MAX_SIZE;
+	dev->xdp_zc_max_segs = 1;
 	dev->gso_max_segs = GSO_MAX_SEGS;
 	dev->gro_max_size = GRO_LEGACY_MAX_SIZE;
 	dev->gso_ipv4_max_size = GSO_LEGACY_MAX_SIZE;
diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
index a4270fafdf11..65ef4867fc49 100644
--- a/net/core/netdev-genl.c
+++ b/net/core/netdev-genl.c
@@ -25,6 +25,14 @@  netdev_nl_dev_fill(struct net_device *netdev, struct sk_buff *rsp,
 		return -EINVAL;
 	}
 
+	if (netdev->xdp_features & NETDEV_XDP_ACT_XSK_ZEROCOPY) {
+		if (nla_put_u32(rsp, NETDEV_A_DEV_XDP_ZC_MAX_SEGS,
+				netdev->xdp_zc_max_segs)) {
+			genlmsg_cancel(rsp, hdr);
+			return -EINVAL;
+		}
+	}
+
 	genlmsg_end(rsp, hdr);
 
 	return 0;
diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h
index 639524b59930..bf71698a1e82 100644
--- a/tools/include/uapi/linux/netdev.h
+++ b/tools/include/uapi/linux/netdev.h
@@ -41,6 +41,7 @@  enum {
 	NETDEV_A_DEV_IFINDEX = 1,
 	NETDEV_A_DEV_PAD,
 	NETDEV_A_DEV_XDP_FEATURES,
+	NETDEV_A_DEV_XDP_ZC_MAX_SEGS,
 
 	__NETDEV_A_DEV_MAX,
 	NETDEV_A_DEV_MAX = (__NETDEV_A_DEV_MAX - 1)
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 10642ad69d76..ca280db729d2 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -1105,6 +1105,7 @@  struct bpf_xdp_query_opts {
 	__u32 skb_prog_id;	/* output */
 	__u8 attach_mode;	/* output */
 	__u64 feature_flags;	/* output */
+	__u32 xdp_zc_max_segs;	/* output */
 	size_t :0;
 };
 #define bpf_xdp_query_opts__last_field feature_flags
diff --git a/tools/lib/bpf/netlink.c b/tools/lib/bpf/netlink.c
index 84dd5fa14905..090bcf6e3b3d 100644
--- a/tools/lib/bpf/netlink.c
+++ b/tools/lib/bpf/netlink.c
@@ -45,6 +45,7 @@  struct xdp_id_md {
 
 struct xdp_features_md {
 	int ifindex;
+	__u32 xdp_zc_max_segs;
 	__u64 flags;
 };
 
@@ -421,6 +422,9 @@  static int parse_xdp_features(struct nlmsghdr *nh, libbpf_dump_nlmsg_t fn,
 		return NL_CONT;
 
 	md->flags = libbpf_nla_getattr_u64(tb[NETDEV_A_DEV_XDP_FEATURES]);
+	if (tb[NETDEV_A_DEV_XDP_ZC_MAX_SEGS])
+		md->xdp_zc_max_segs =
+			libbpf_nla_getattr_u32(tb[NETDEV_A_DEV_XDP_ZC_MAX_SEGS]);
 	return NL_DONE;
 }
 
@@ -493,6 +497,7 @@  int bpf_xdp_query(int ifindex, int xdp_flags, struct bpf_xdp_query_opts *opts)
 		return libbpf_err(err);
 
 	opts->feature_flags = md.flags;
+	opts->xdp_zc_max_segs = md.xdp_zc_max_segs;
 
 skip_feature_flags:
 	return 0;
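
Finally, a minimal sketch of how an AF_XDP application would consume the
new value through the libbpf pieces touched above (error handling trimmed,
interface name illustrative; assumes a libbpf where
bpf_xdp_query_opts__last_field has been bumped to cover xdp_zc_max_segs,
as discussed in the thread):

	#include <bpf/libbpf.h>
	#include <net/if.h>
	#include <stdio.h>

	int main(void)
	{
		LIBBPF_OPTS(bpf_xdp_query_opts, opts);	/* zero-filled, .sz preset */
		int ifindex = if_nametoindex("eth0");

		if (!ifindex || bpf_xdp_query(ifindex, 0, &opts))
			return 1;

		/* the attribute is only emitted when the driver advertises ZC,
		 * so 0 here means it was absent; 1 means ZC without
		 * multi-buffer; anything higher is the ZC TX frag limit
		 */
		printf("xdp_zc_max_segs: %u\n", opts.xdp_zc_max_segs);
		return 0;
	}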