| Message ID | 20230706204650.469087-11-maciej.fijalkowski@intel.com (mailing list archive) |
|---|---|
| State | Superseded |
| Delegated to: | BPF |
| Series | xsk: multi-buffer support |
On Thu, Jul 6, 2023 at 1:47 PM Maciej Fijalkowski <maciej.fijalkowski@intel.com> wrote:
>
> Introduce new netlink attribute NETDEV_A_DEV_XDP_ZC_MAX_SEGS that will
> carry maximum fragments that underlying ZC driver is able to handle on
> TX side. It is going to be included in netlink response only when driver
> supports ZC. Any value higher than 1 implies multi-buffer ZC support on
> underlying device.
>
> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

I suspect something in this patch makes XDP bonding test fail.
See BPF CI.

I can reproduce the failure locally as well.
test_progs -t bond
works without the series and fails with them.
On Mon, Jul 10, 2023 at 06:09:28PM -0700, Alexei Starovoitov wrote:
> On Thu, Jul 6, 2023 at 1:47 PM Maciej Fijalkowski
> <maciej.fijalkowski@intel.com> wrote:
> >
> > Introduce new netlink attribute NETDEV_A_DEV_XDP_ZC_MAX_SEGS that will
> > carry maximum fragments that underlying ZC driver is able to handle on
> > TX side. It is going to be included in netlink response only when driver
> > supports ZC. Any value higher than 1 implies multi-buffer ZC support on
> > underlying device.
> >
> > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>
> I suspect something in this patch makes XDP bonding test fail.
> See BPF CI.
>
> I can reproduce the failure locally as well.
> test_progs -t bond
> works without the series and fails with them.

Hi Alexei,

this fails on second bpf_xdp_query() call due to non-zero (?) contents at the
end of bpf_xdp_query_opts struct - currently it looks as follows:

$ pahole -C bpf_xdp_query_opts libbpf.so
struct bpf_xdp_query_opts {
        size_t             sz;                /*     0     8 */
        __u32              prog_id;           /*     8     4 */
        __u32              drv_prog_id;       /*    12     4 */
        __u32              hw_prog_id;        /*    16     4 */
        __u32              skb_prog_id;       /*    20     4 */
        __u8               attach_mode;       /*    24     1 */

        /* XXX 7 bytes hole, try to pack */

        __u64              feature_flags;     /*    32     8 */
        __u32              xdp_zc_max_segs;   /*    40     4 */

        /* size: 48, cachelines: 1, members: 8 */
        /* sum members: 37, holes: 1, sum holes: 7 */
        /* padding: 4 */
        /* last cacheline: 48 bytes */
};

Fix is either to move xdp_zc_max_segs up to existing hole or to zero out
struct before bpf_xdp_query() calls, like:

	memset(&query_opts, 0, sizeof(struct bpf_xdp_query_opts));
	query_opts.sz = sizeof(struct bpf_xdp_query_opts);

I am kinda confused as this is happening due to two things. First off
bonding driver sets its xdp_features to NETDEV_XDP_ACT_MASK and in turn
this implies ZC feature enabled which makes xdp_zc_max_segs get included
in the response (its value is 1 as it's the default).

Then, offsetofend(struct type, type##__last_field) that is used as one of
libbpf_validate_opts() args gives me 40 but bpf_xdp_query_opts::sz has
stored 48, so in the end we go through the last 8 bytes in
libbpf_is_mem_zeroed() and we hit the '1' from xdp_zc_max_segs.

So, (silly) questions:
- why bonding driver defaults to all features enabled?
- why __last_field does not recognize xdp_zc_max_segs at the end?

Besides, I think I'll move xdp_zc_max_segs above to the hole. This fixes
the bonding test for me.
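For reference, the zero-init variant Maciej describes matches libbpf's own opts convention; a minimal caller-side sketch (the helper name and the plain 0 flags here are illustrative, not taken from the selftest) could look like this:

#include <string.h>
#include <bpf/libbpf.h>

/* Hypothetical helper; name and error handling are illustrative. */
static int query_xdp_features(int ifindex, __u64 *feature_flags)
{
	struct bpf_xdp_query_opts query_opts;
	int err;

	/* Zero the whole struct so the 7-byte hole and any members past an
	 * outdated __last_field marker do not carry stale, non-zero bytes
	 * between bpf_xdp_query() calls.
	 */
	memset(&query_opts, 0, sizeof(query_opts));
	query_opts.sz = sizeof(query_opts);

	err = bpf_xdp_query(ifindex, 0, &query_opts);
	if (err)
		return err;

	*feature_flags = query_opts.feature_flags;
	return 0;
}

LIBBPF_OPTS(bpf_xdp_query_opts, query_opts) would collapse the declaration, the memset() and the .sz assignment into a single line.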
On Thu, Jul 13, 2023 at 11:59 AM Maciej Fijalkowski <maciej.fijalkowski@intel.com> wrote:
>
> On Mon, Jul 10, 2023 at 06:09:28PM -0700, Alexei Starovoitov wrote:
> > On Thu, Jul 6, 2023 at 1:47 PM Maciej Fijalkowski
> > <maciej.fijalkowski@intel.com> wrote:
> > >
> > > Introduce new netlink attribute NETDEV_A_DEV_XDP_ZC_MAX_SEGS that will
> > > carry maximum fragments that underlying ZC driver is able to handle on
> > > TX side. It is going to be included in netlink response only when driver
> > > supports ZC. Any value higher than 1 implies multi-buffer ZC support on
> > > underlying device.
> > >
> > > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> >
> > I suspect something in this patch makes XDP bonding test fail.
> > See BPF CI.
> >
> > I can reproduce the failure locally as well.
> > test_progs -t bond
> > works without the series and fails with them.
>
> Hi Alexei,
>
> this fails on second bpf_xdp_query() call due to non-zero (?) contents at the
> end of bpf_xdp_query_opts struct - currently it looks as follows:
>
> $ pahole -C bpf_xdp_query_opts libbpf.so
> struct bpf_xdp_query_opts {
>         size_t             sz;                /*     0     8 */
>         __u32              prog_id;           /*     8     4 */
>         __u32              drv_prog_id;       /*    12     4 */
>         __u32              hw_prog_id;        /*    16     4 */
>         __u32              skb_prog_id;       /*    20     4 */
>         __u8               attach_mode;       /*    24     1 */
>
>         /* XXX 7 bytes hole, try to pack */
>
>         __u64              feature_flags;     /*    32     8 */
>         __u32              xdp_zc_max_segs;   /*    40     4 */
>
>         /* size: 48, cachelines: 1, members: 8 */
>         /* sum members: 37, holes: 1, sum holes: 7 */
>         /* padding: 4 */
>         /* last cacheline: 48 bytes */
> };
>
> Fix is either to move xdp_zc_max_segs up to existing hole or to zero out
> struct before bpf_xdp_query() calls, like:
>
> memset(&query_opts, 0, sizeof(struct bpf_xdp_query_opts));
> query_opts.sz = sizeof(struct bpf_xdp_query_opts);

Right. That would be good to have to clear the hole, but probably unrelated.

> I am kinda confused as this is happening due to two things. First off
> bonding driver sets its xdp_features to NETDEV_XDP_ACT_MASK and in turn
> this implies ZC feature enabled which makes xdp_zc_max_segs get included
> in the response (its value is 1 as it's the default).
>
> Then, offsetofend(struct type, type##__last_field) that is used as one of
> libbpf_validate_opts() args gives me 40 but bpf_xdp_query_opts::sz has
> stored 48, so in the end we go through the last 8 bytes in
> libbpf_is_mem_zeroed() and we hit the '1' from xdp_zc_max_segs.

Because this patch didn't update
bpf_xdp_query_opts__last_field

It added a new field, but didn't update the macro.

> So, (silly) questions:
> - why bonding driver defaults to all features enabled?

doesn't really matter in this context.

> - why __last_field does not recognize xdp_zc_max_segs at the end?

because the patch didn't update it :)

> Besides, I think I'll move xdp_zc_max_segs above to the hole. This fixes
> the bonding test for me.

No. Keep it at the end.
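In other words, the missing piece on the libbpf side is a one-line macro bump; presumably the v6 respin keeps the member at the end and moves the marker onto it, roughly:

/* tools/lib/bpf/libbpf.h -- sketch of the expected v6 change: the new member
 * stays last and the __last_field marker is updated, so libbpf_validate_opts()
 * treats xdp_zc_max_segs as a known field.
 */
struct bpf_xdp_query_opts {
	size_t sz;
	__u32 prog_id;		/* output */
	__u32 drv_prog_id;	/* output */
	__u32 hw_prog_id;	/* output */
	__u32 skb_prog_id;	/* output */
	__u8 attach_mode;	/* output */
	__u64 feature_flags;	/* output */
	__u32 xdp_zc_max_segs;	/* output */
	size_t :0;
};
#define bpf_xdp_query_opts__last_field xdp_zc_max_segs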
On Thu, Jul 13, 2023 at 04:02:42PM -0700, Alexei Starovoitov wrote:
> On Thu, Jul 13, 2023 at 11:59 AM Maciej Fijalkowski
> <maciej.fijalkowski@intel.com> wrote:
> >
> > On Mon, Jul 10, 2023 at 06:09:28PM -0700, Alexei Starovoitov wrote:
> > > On Thu, Jul 6, 2023 at 1:47 PM Maciej Fijalkowski
> > > <maciej.fijalkowski@intel.com> wrote:
> > > >
> > > > Introduce new netlink attribute NETDEV_A_DEV_XDP_ZC_MAX_SEGS that will
> > > > carry maximum fragments that underlying ZC driver is able to handle on
> > > > TX side. It is going to be included in netlink response only when driver
> > > > supports ZC. Any value higher than 1 implies multi-buffer ZC support on
> > > > underlying device.
> > > >
> > > > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > >
> > > I suspect something in this patch makes XDP bonding test fail.
> > > See BPF CI.
> > >
> > > I can reproduce the failure locally as well.
> > > test_progs -t bond
> > > works without the series and fails with them.
> >
> > Hi Alexei,
> >
> > this fails on second bpf_xdp_query() call due to non-zero (?) contents at the
> > end of bpf_xdp_query_opts struct - currently it looks as follows:
> >
> > $ pahole -C bpf_xdp_query_opts libbpf.so
> > struct bpf_xdp_query_opts {
> >         size_t             sz;                /*     0     8 */
> >         __u32              prog_id;           /*     8     4 */
> >         __u32              drv_prog_id;       /*    12     4 */
> >         __u32              hw_prog_id;        /*    16     4 */
> >         __u32              skb_prog_id;       /*    20     4 */
> >         __u8               attach_mode;       /*    24     1 */
> >
> >         /* XXX 7 bytes hole, try to pack */
> >
> >         __u64              feature_flags;     /*    32     8 */
> >         __u32              xdp_zc_max_segs;   /*    40     4 */
> >
> >         /* size: 48, cachelines: 1, members: 8 */
> >         /* sum members: 37, holes: 1, sum holes: 7 */
> >         /* padding: 4 */
> >         /* last cacheline: 48 bytes */
> > };
> >
> > Fix is either to move xdp_zc_max_segs up to existing hole or to zero out
> > struct before bpf_xdp_query() calls, like:
> >
> > memset(&query_opts, 0, sizeof(struct bpf_xdp_query_opts));
> > query_opts.sz = sizeof(struct bpf_xdp_query_opts);
>
> Right. That would be good to have to clear the hole, but probably unrelated.
>
> > I am kinda confused as this is happening due to two things. First off
> > bonding driver sets its xdp_features to NETDEV_XDP_ACT_MASK and in turn
> > this implies ZC feature enabled which makes xdp_zc_max_segs get included
> > in the response (its value is 1 as it's the default).
> >
> > Then, offsetofend(struct type, type##__last_field) that is used as one of
> > libbpf_validate_opts() args gives me 40 but bpf_xdp_query_opts::sz has
> > stored 48, so in the end we go through the last 8 bytes in
> > libbpf_is_mem_zeroed() and we hit the '1' from xdp_zc_max_segs.
>
> Because this patch didn't update
> bpf_xdp_query_opts__last_field

Heh, bummer, this was right in front of me the whole time. I think I need a
break :) but before that let me send v6. Thanks Alexei.

>
> It added a new field, but didn't update the macro.
>
> > So, (silly) questions:
> > - why bonding driver defaults to all features enabled?
>
> doesn't really matter in this context.
>
> > - why __last_field does not recognize xdp_zc_max_segs at the end?
>
> because the patch didn't update it :)
>
> > Besides, I think I'll move xdp_zc_max_segs above to the hole. This fixes
> > the bonding test for me.
>
> No. Keep it at the end.
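For readers following the offsetofend() discussion above, the trailing-byte check reduces to something like the sketch below (a simplified illustration, not the verbatim libbpf source; the function names are made up for this sketch):

#include <stdbool.h>
#include <stddef.h>

static bool mem_is_zeroed(const char *p, size_t len)
{
	while (len--)
		if (*p++)
			return false;
	return true;
}

/* known_sz: offsetofend(struct bpf_xdp_query_opts, <__last_field member>)
 * user_sz:  the opts->sz value filled in by the caller
 */
static bool opts_tail_ok(const char *opts, size_t known_sz, size_t user_sz)
{
	if (user_sz <= known_sz)
		return true;	/* caller knows no more fields than the library */
	/* The caller's struct is bigger than what the library recognizes, so
	 * every byte past the last known field must be zero.  With
	 * __last_field still pointing at feature_flags, known_sz is 40 while
	 * sz is 48, and the xdp_zc_max_segs value of 1 sits in that tail and
	 * fails the check.
	 */
	return mem_is_zeroed(opts + known_sz, user_sz - known_sz);
}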
diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index b99e7ffef7a1..e41015310a6e 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -62,6 +62,12 @@ attribute-sets:
         type: u64
         enum: xdp-act
         enum-as-flags: true
+      -
+        name: xdp_zc_max_segs
+        doc: max fragment count supported by ZC driver
+        type: u32
+        checks:
+          min: 1
 
 operations:
   list:
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index b828c7a75be2..b12477ea4032 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2250,6 +2250,7 @@ struct net_device {
 #define GRO_MAX_SIZE		(8 * 65535u)
 	unsigned int		gro_max_size;
 	unsigned int		gro_ipv4_max_size;
+	unsigned int		xdp_zc_max_segs;
 	rx_handler_func_t __rcu	*rx_handler;
 	void __rcu		*rx_handler_data;
 
diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
index 639524b59930..bf71698a1e82 100644
--- a/include/uapi/linux/netdev.h
+++ b/include/uapi/linux/netdev.h
@@ -41,6 +41,7 @@ enum {
 	NETDEV_A_DEV_IFINDEX = 1,
 	NETDEV_A_DEV_PAD,
 	NETDEV_A_DEV_XDP_FEATURES,
+	NETDEV_A_DEV_XDP_ZC_MAX_SEGS,
 
 	__NETDEV_A_DEV_MAX,
 	NETDEV_A_DEV_MAX = (__NETDEV_A_DEV_MAX - 1)
diff --git a/net/core/dev.c b/net/core/dev.c
index 69a3e544676c..b14dd28eb51e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10617,6 +10617,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 	dev_net_set(dev, &init_net);
 
 	dev->gso_max_size = GSO_LEGACY_MAX_SIZE;
+	dev->xdp_zc_max_segs = 1;
 	dev->gso_max_segs = GSO_MAX_SEGS;
 	dev->gro_max_size = GRO_LEGACY_MAX_SIZE;
 	dev->gso_ipv4_max_size = GSO_LEGACY_MAX_SIZE;
diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
index a4270fafdf11..65ef4867fc49 100644
--- a/net/core/netdev-genl.c
+++ b/net/core/netdev-genl.c
@@ -25,6 +25,14 @@ netdev_nl_dev_fill(struct net_device *netdev, struct sk_buff *rsp,
 		return -EINVAL;
 	}
 
+	if (netdev->xdp_features & NETDEV_XDP_ACT_XSK_ZEROCOPY) {
+		if (nla_put_u32(rsp, NETDEV_A_DEV_XDP_ZC_MAX_SEGS,
+				netdev->xdp_zc_max_segs)) {
+			genlmsg_cancel(rsp, hdr);
+			return -EINVAL;
+		}
+	}
+
 	genlmsg_end(rsp, hdr);
 
 	return 0;
diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h
index 639524b59930..bf71698a1e82 100644
--- a/tools/include/uapi/linux/netdev.h
+++ b/tools/include/uapi/linux/netdev.h
@@ -41,6 +41,7 @@ enum {
 	NETDEV_A_DEV_IFINDEX = 1,
 	NETDEV_A_DEV_PAD,
 	NETDEV_A_DEV_XDP_FEATURES,
+	NETDEV_A_DEV_XDP_ZC_MAX_SEGS,
 
 	__NETDEV_A_DEV_MAX,
 	NETDEV_A_DEV_MAX = (__NETDEV_A_DEV_MAX - 1)
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 10642ad69d76..ca280db729d2 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -1105,6 +1105,7 @@ struct bpf_xdp_query_opts {
 	__u32 skb_prog_id; /* output */
 	__u8 attach_mode; /* output */
 	__u64 feature_flags; /* output */
+	__u32 xdp_zc_max_segs; /* output */
 	size_t :0;
 };
 #define bpf_xdp_query_opts__last_field feature_flags
diff --git a/tools/lib/bpf/netlink.c b/tools/lib/bpf/netlink.c
index 84dd5fa14905..090bcf6e3b3d 100644
--- a/tools/lib/bpf/netlink.c
+++ b/tools/lib/bpf/netlink.c
@@ -45,6 +45,7 @@ struct xdp_id_md {
 
 struct xdp_features_md {
 	int ifindex;
+	__u32 xdp_zc_max_segs;
 	__u64 flags;
 };
 
@@ -421,6 +422,9 @@ static int parse_xdp_features(struct nlmsghdr *nh, libbpf_dump_nlmsg_t fn,
 		return NL_CONT;
 
 	md->flags = libbpf_nla_getattr_u64(tb[NETDEV_A_DEV_XDP_FEATURES]);
+	if (tb[NETDEV_A_DEV_XDP_ZC_MAX_SEGS])
+		md->xdp_zc_max_segs =
+			libbpf_nla_getattr_u32(tb[NETDEV_A_DEV_XDP_ZC_MAX_SEGS]);
 	return NL_DONE;
 }
 
@@ -493,6 +497,7 @@ int bpf_xdp_query(int ifindex, int xdp_flags, struct bpf_xdp_query_opts *opts)
 		return libbpf_err(err);
 
 	opts->feature_flags = md.flags;
+	opts->xdp_zc_max_segs = md.xdp_zc_max_segs;
 
 skip_feature_flags:
 	return 0;
Introduce new netlink attribute NETDEV_A_DEV_XDP_ZC_MAX_SEGS that will
carry maximum fragments that underlying ZC driver is able to handle on
TX side. It is going to be included in netlink response only when driver
supports ZC. Any value higher than 1 implies multi-buffer ZC support on
underlying device.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 Documentation/netlink/specs/netdev.yaml | 6 ++++++
 include/linux/netdevice.h               | 1 +
 include/uapi/linux/netdev.h             | 1 +
 net/core/dev.c                          | 1 +
 net/core/netdev-genl.c                  | 8 ++++++++
 tools/include/uapi/linux/netdev.h       | 1 +
 tools/lib/bpf/libbpf.h                  | 1 +
 tools/lib/bpf/netlink.c                 | 5 +++++
 8 files changed, 24 insertions(+)
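Once the respin updates bpf_xdp_query_opts__last_field as discussed above, a user-space consumer of the new attribute could be as simple as the sketch below (the interface name and error handling are illustrative):

#include <stdio.h>
#include <net/if.h>
#include <bpf/libbpf.h>

int main(void)
{
	LIBBPF_OPTS(bpf_xdp_query_opts, opts);	/* zeroes the struct and sets .sz */
	int ifindex = if_nametoindex("eth0");	/* placeholder interface name */
	int err;

	if (!ifindex)
		return 1;

	err = bpf_xdp_query(ifindex, 0, &opts);
	if (err)
		return 1;

	/* The attribute is only reported when the driver advertises
	 * NETDEV_XDP_ACT_XSK_ZEROCOPY; a value greater than 1 means the ZC
	 * driver can transmit multi-buffer frames.
	 */
	printf("xdp_zc_max_segs: %u\n", opts.xdp_zc_max_segs);
	return 0;
}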