Message ID | 20211208140702.642741-2-alexandr.lobakin@intel.com (mailing list archive) |
---|---|
State | Awaiting Upstream |
Delegated to: | Netdev Maintainers |
Series | net: intel: napi_alloc_skb() vs metadata |
On 08/12/2021 15.06, Alexander Lobakin wrote:
> {__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD
> + NET_IP_ALIGN for any skb.
> OTOH, i40e_construct_skb_zc() currently allocates and reserves
> additional `xdp->data - xdp->data_hard_start`, which is
> XDP_PACKET_HEADROOM for XSK frames.
> There's no need for that at all as the frame is post-XDP and will
> go only to the networking stack core.

I disagree with this assumption, that headroom is not needed by netstack.
Why "no need for that at all" for netstack?

Having headroom is important for netstack in general. When packet will
grow we avoid realloc of SKB. Use-case could also be cpumap or veth
redirect, or XDP-generic, that expect this headroom.

> Pass the size of the actual data only to __napi_alloc_skb() and
> don't reserve anything. This will give enough headroom for stack
> processing.
>
> Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support")
> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> index f08d19b8c554..9564906b7da8 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> @@ -245,13 +245,11 @@ static struct sk_buff *i40e_construct_skb_zc(struct i40e_ring *rx_ring,
>  	struct sk_buff *skb;
>
>  	/* allocate a skb to store the frags */
> -	skb = __napi_alloc_skb(&rx_ring->q_vector->napi,
> -			       xdp->data_end - xdp->data_hard_start,
> +	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, datasize,
>  			       GFP_ATOMIC | __GFP_NOWARN);
>  	if (unlikely(!skb))
>  		goto out;
>
> -	skb_reserve(skb, xdp->data - xdp->data_hard_start);
>  	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
>  	if (metasize)
>  		skb_metadata_set(skb, metasize);
>

From: Jesper Dangaard Brouer <jbrouer@redhat.com>
Date: Thu, 9 Dec 2021 09:19:46 +0100

> On 08/12/2021 15.06, Alexander Lobakin wrote:
> > {__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD
> > + NET_IP_ALIGN for any skb.
> > OTOH, i40e_construct_skb_zc() currently allocates and reserves
> > additional `xdp->data - xdp->data_hard_start`, which is
> > XDP_PACKET_HEADROOM for XSK frames.
> > There's no need for that at all as the frame is post-XDP and will
> > go only to the networking stack core.
>
> I disagree with this assumption, that headroom is not needed by netstack.
> Why "no need for that at all" for netstack?

napi_alloc_skb() in our particular case will reserve 64 bytes, it is
sufficient for {TCP,UDP,SCTP,...}/IPv{4,6} etc.

>
> Having headroom is important for netstack in general. When packet will
> grow we avoid realloc of SKB. Use-case could also be cpumap or veth
> redirect, or XDP-generic, that expect this headroom.

Well, those are not common cases at all. Allocating 256 bytes more
for some hypothetical usecases (and having 320 in total) is more
expensive than expanding headroom in-place.
I don't know any other drivers or ifaces which reserve
XDP_PACKET_HEADROOM just for the case of using both driver-side
and generic XDP at the same time. To be more precise, I can't
remember any driver which would check whether generic XDP is enabled
for its netdev(s).

As a second option, I was trying to get exactly XDP_PACKET_HEADROOM
of headroom, but this involves either __alloc_skb() which is slower
than napi_alloc_skb(), or

	skb = napi_alloc_skb(napi, xdp->data_end -
				   xdp->data_hard_start -
				   NET_SKB_PAD);
	skb_reserve(skb, xdp->data_meta - xdp->data_hard_start -
			 NET_SKB_PAD);

Doesn't look good for me.

We could probably introduce a version of napi_alloc_skb() which
wouldn't reserve any headroom for you to have more control over it,
but that's more global material than these local fixes I'd say.

> >
> > Pass the size of the actual data only to __napi_alloc_skb() and
> > don't reserve anything. This will give enough headroom for stack
> > processing.
> >
> > Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support")
> > Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> > Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> > ---
> >  drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > index f08d19b8c554..9564906b7da8 100644
> > --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > @@ -245,13 +245,11 @@ static struct sk_buff *i40e_construct_skb_zc(struct i40e_ring *rx_ring,
> >  	struct sk_buff *skb;
> >
> >  	/* allocate a skb to store the frags */
> > -	skb = __napi_alloc_skb(&rx_ring->q_vector->napi,
> > -			       xdp->data_end - xdp->data_hard_start,
> > +	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, datasize,
> >  			       GFP_ATOMIC | __GFP_NOWARN);
> >  	if (unlikely(!skb))
> >  		goto out;
> >
> > -	skb_reserve(skb, xdp->data - xdp->data_hard_start);
> >  	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
> >  	if (metasize)
> >  		skb_metadata_set(skb, metasize);

Thanks,
Al
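
The headroom figures quoted in this reply (64, 256, 320) can be checked with a little arithmetic. The sketch below is only a user-space illustration, not kernel code: the `ASSUMED_*` constants are assumptions for a typical x86_64 build (64-byte NET_SKB_PAD, NET_IP_ALIGN of 0, XDP_PACKET_HEADROOM of 256), and the helper names are hypothetical.

```c
#include <assert.h>

/* Assumed values for a typical x86_64 build; check your own tree. */
#define ASSUMED_NET_SKB_PAD          64u
#define ASSUMED_NET_IP_ALIGN          0u
#define ASSUMED_XDP_PACKET_HEADROOM 256u

/* Headroom that {__,}napi_alloc_skb() reserves on its own. */
static unsigned int napi_reserved(void)
{
	return ASSUMED_NET_SKB_PAD + ASSUMED_NET_IP_ALIGN;
}

/* Before the patch: skb_reserve(xdp->data - xdp->data_hard_start) on top. */
static unsigned int headroom_before(void)
{
	return napi_reserved() + ASSUMED_XDP_PACKET_HEADROOM;
}

/* After the patch: the driver reserves nothing extra. */
static unsigned int headroom_after(void)
{
	return napi_reserved();
}

/*
 * "Second option" from the reply:
 * skb_reserve(xdp->data_meta - xdp->data_hard_start - NET_SKB_PAD).
 * metasize is the number of metadata bytes placed in front of xdp->data.
 */
static unsigned int headroom_second_option(unsigned int metasize)
{
	unsigned int meta_off;

	/* The reserve amount must not go negative. */
	assert(metasize <= ASSUMED_XDP_PACKET_HEADROOM - ASSUMED_NET_SKB_PAD);
	meta_off = ASSUMED_XDP_PACKET_HEADROOM - metasize;

	return napi_reserved() + (meta_off - ASSUMED_NET_SKB_PAD);
}
```

Under these assumptions the pre-patch code ends up with 320 bytes of headroom, the patched code with 64, and the second option with exactly 256 when there is no metadata.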

On 09/12/2021 18.33, Alexander Lobakin wrote:
> From: Jesper Dangaard Brouer <jbrouer@redhat.com>
> Date: Thu, 9 Dec 2021 09:19:46 +0100
>
>> On 08/12/2021 15.06, Alexander Lobakin wrote:
>>> {__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD
>>> + NET_IP_ALIGN for any skb.
>>> OTOH, i40e_construct_skb_zc() currently allocates and reserves
>>> additional `xdp->data - xdp->data_hard_start`, which is
>>> XDP_PACKET_HEADROOM for XSK frames.
>>> There's no need for that at all as the frame is post-XDP and will
>>> go only to the networking stack core.
>>
>> I disagree with this assumption, that headroom is not needed by netstack.
>> Why "no need for that at all" for netstack?
>
> napi_alloc_skb() in our particular case will reserve 64 bytes, it is
> sufficient for {TCP,UDP,SCTP,...}/IPv{4,6} etc.

My bad, I misunderstood you.
I now see (looking at code) that (as you say) 64 bytes of headroom *is*
reserved (in bottom of __napi_alloc_skb).

Thus, the SKB *do* have headroom, so this patch should be fine.

Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>

Do watch out that 64 bytes is not always enough. Notice the define
LL_MAX_HEADER and MAX_HEADER in include/linux/netdevice.h (that tries to
determine worst-case header length) which is above 64 bytes. It is also
affected by HyperV and WiFi configs.
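
To make the closing caveat concrete, here is a rough user-space model of how the worst-case header length can exceed 64 bytes. It is a sketch under stated assumptions: the numbers (32, 96, 128, plus 48 for IP tunnels) are recalled from kernels of roughly this era and should be verified against include/linux/netdevice.h in your tree, and `struct hdr_cfg` with its fields is hypothetical, standing in for the relevant Kconfig options.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the Kconfig options that affect the defines. */
struct hdr_cfg {
	bool hyperv_net;     /* Hyper-V network driver enabled */
	bool wlan_or_ax25;   /* WiFi or AX.25 enabled */
	bool mac80211_mesh;  /* 802.11 mesh enabled */
	bool ip_tunnels;     /* any of IPIP, GRE, SIT, IPv6 tunnel enabled */
};

/* Model of LL_MAX_HEADER; values are assumptions, see the note above. */
static unsigned int ll_max_header(const struct hdr_cfg *c)
{
	if (c->hyperv_net)
		return 128;
	if (c->wlan_or_ax25)
		return c->mac80211_mesh ? 128 : 96;
	return 32;
}

/* Model of MAX_HEADER: link-layer worst case plus room for tunnels. */
static unsigned int max_header(const struct hdr_cfg *c)
{
	return ll_max_header(c) + (c->ip_tunnels ? 48 : 0);
}

/* True if the given headroom covers the modeled worst-case header. */
static bool headroom_suffices(unsigned int headroom, const struct hdr_cfg *c)
{
	assert(c);
	return headroom >= max_header(c);
}
```

With this model, 64 bytes covers the plain Ethernet case but falls short once WiFi or Hyper-V support is configured, which matches the warning above; in those cases the stack would have to expand the headroom when a larger header is pushed.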

> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Alexander Lobakin
> Sent: Wednesday, December 8, 2021 7:37 PM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Song Liu <songliubraving@fb.com>; Jesper Dangaard Brouer
> <hawk@kernel.org>; Daniel Borkmann <daniel@iogearbox.net>; Yonghong
> Song <yhs@fb.com>; Martin KaFai Lau <kafai@fb.com>; John Fastabend
> <john.fastabend@gmail.com>; Alexei Starovoitov <ast@kernel.org>; Andrii
> Nakryiko <andrii@kernel.org>; Björn Töpel <bjorn@kernel.org>;
> netdev@vger.kernel.org; Jakub Kicinski <kuba@kernel.org>; KP Singh
> <kpsingh@kernel.org>; bpf@vger.kernel.org; David S. Miller
> <davem@davemloft.net>; linux-kernel@vger.kernel.org
> Subject: [Intel-wired-lan] [PATCH v4 net-next 1/9] i40e: don't reserve
> excessive XDP_PACKET_HEADROOM on XSK Rx to skb
>
> {__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD
> + NET_IP_ALIGN for any skb.
> OTOH, i40e_construct_skb_zc() currently allocates and reserves additional
> `xdp->data - xdp->data_hard_start`, which is XDP_PACKET_HEADROOM for
> XSK frames.
> There's no need for that at all as the frame is post-XDP and will go only to the
> networking stack core.
> Pass the size of the actual data only to __napi_alloc_skb() and don't reserve
> anything. This will give enough headroom for stack processing.
>
> Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support")
> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>

Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>  A Contingent Worker at Intel

diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
index f08d19b8c554..9564906b7da8 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
@@ -245,13 +245,11 @@ static struct sk_buff *i40e_construct_skb_zc(struct i40e_ring *rx_ring,
 	struct sk_buff *skb;
 
 	/* allocate a skb to store the frags */
-	skb = __napi_alloc_skb(&rx_ring->q_vector->napi,
-			       xdp->data_end - xdp->data_hard_start,
+	skb = __napi_alloc_skb(&rx_ring->q_vector->napi, datasize,
 			       GFP_ATOMIC | __GFP_NOWARN);
 	if (unlikely(!skb))
 		goto out;
 
-	skb_reserve(skb, xdp->data - xdp->data_hard_start);
 	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
 	if (metasize)
 		skb_metadata_set(skb, metasize);
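
For readers less familiar with skb layout, the effect of the hunk can be mimicked with a toy buffer in user space. This is a minimal sketch, not the kernel implementation: `struct toy_skb` and every `toy_*` helper are hypothetical, `TOY_PAD` assumes NET_SKB_PAD + NET_IP_ALIGN equals 64, and the real allocator's alignment and trailing `struct skb_shared_info` are ignored.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAD 64u /* assumed NET_SKB_PAD + NET_IP_ALIGN */

/* Toy skb: one linear buffer with offsets measured from head. */
struct toy_skb {
	unsigned char *head;
	size_t data, tail, end;
};

/* Like napi_alloc_skb(): room for len bytes plus TOY_PAD of headroom. */
static struct toy_skb *toy_napi_alloc(size_t len)
{
	struct toy_skb *skb = malloc(sizeof(*skb));

	if (!skb)
		return NULL;
	skb->end = len + TOY_PAD;
	skb->head = malloc(skb->end);
	if (!skb->head) {
		free(skb);
		return NULL;
	}
	skb->data = skb->tail = TOY_PAD;
	return skb;
}

/* Like skb_reserve(): grow headroom of an empty skb by n bytes. */
static void toy_reserve(struct toy_skb *skb, size_t n)
{
	skb->data += n;
	skb->tail += n;
}

/* Like __skb_put(): extend the data area and return its old tail. */
static unsigned char *toy_put(struct toy_skb *skb, size_t n)
{
	unsigned char *p = skb->head + skb->tail;

	assert(skb->tail + n <= skb->end);
	skb->tail += n;
	return p;
}

static void toy_free(struct toy_skb *skb)
{
	free(skb->head);
	free(skb);
}

/*
 * frame points at data_hard_start; payload starts hard_headroom bytes in.
 * old_layout != 0 mimics the removed lines, 0 mimics the patched code.
 */
static struct toy_skb *toy_construct(const unsigned char *frame,
				     size_t hard_headroom, size_t datasize,
				     int old_layout)
{
	size_t len = old_layout ? hard_headroom + datasize : datasize;
	struct toy_skb *skb = toy_napi_alloc(len);

	if (!skb)
		return NULL;
	if (old_layout)
		toy_reserve(skb, hard_headroom);
	memcpy(toy_put(skb, datasize), frame + hard_headroom, datasize);
	return skb;
}
```

For a 60-byte payload behind 256 bytes of XSK headroom, the old layout in this model allocates a 380-byte buffer with 320 bytes of headroom, while the patched layout allocates 124 bytes with 64 bytes of headroom; the copied payload is identical in both.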