From patchwork Wed Mar 31 12:28:33 2021
X-Patchwork-Submitter: Alexander Lobakin
X-Patchwork-Id: 12175413
X-Patchwork-Delegate: bpf@iogearbox.net
Date: Wed, 31 Mar 2021 12:28:33 +0000
From: Alexander Lobakin
To: Alexei Starovoitov, Daniel Borkmann
Cc: Xuan Zhuo, Björn Töpel, Magnus Karlsson, Jonathan Lemon,
    "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer,
    John Fastabend, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
    Yonghong Song, KP Singh, Alexander Lobakin,
    netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 bpf-next 1/2] xsk: speed-up generic full-copy xmit
Message-ID: <20210331122820.6356-1-alobakin@pm.me>
In-Reply-To: <20210331122602.6000-1-alobakin@pm.me>
References: <20210331122602.6000-1-alobakin@pm.me>

A few things are known for sure at the moment of copying:

 - the allocated skb is fully linear;
 - its linear space is long enough to hold the full buffer data.

So the out-of-line skb_put() and skb_store_bits() calls, together with the
return-code check, can be replaced with a plain memcpy(__skb_put()) with no
loss of functionality. Also align the memcpy() length to sizeof(long) to
improve its performance.
Signed-off-by: Alexander Lobakin
---
 net/xdp/xsk.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

-- 
2.31.1

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index a71ed664da0a..41f8f21b3348 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -517,14 +517,9 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
 			return ERR_PTR(err);
 
 		skb_reserve(skb, hr);
-		skb_put(skb, len);
 
 		buffer = xsk_buff_raw_get_data(xs->pool, desc->addr);
-		err = skb_store_bits(skb, 0, buffer, len);
-		if (unlikely(err)) {
-			kfree_skb(skb);
-			return ERR_PTR(err);
-		}
+		memcpy(__skb_put(skb, len), buffer, ALIGN(len, sizeof(long)));
 	}
 
 	skb->dev = dev;
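[ Illustration, not part of the patch: a minimal userspace sketch of the
  copy pattern in the hunk above. ALIGN(), the buffer names and the sizes
  are local to this example; both buffers are sized to the rounded-up
  length so the word-aligned over-copy stays within owned memory, which is
  the property the change relies on for the skb linear area and the umem
  frame. ]

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* same rounding the kernel's ALIGN() does for power-of-two alignments */
#define ALIGN(x, a)	(((x) + (a) - 1) & ~((a) - 1))

int main(void)
{
	size_t len = 1500;				/* desc->len */
	size_t room = ALIGN(len, sizeof(long));		/* rounded-up length */
	unsigned char *lin = malloc(room);		/* plays the skb linear area */
	unsigned char *frame = malloc(room);		/* plays the umem frame */

	if (!lin || !frame)
		return 1;

	memset(frame, 0xab, len);			/* frame payload */

	/*
	 * Equivalent of memcpy(__skb_put(skb, len), buffer, ALIGN(len, sizeof(long))):
	 * only the first len bytes carry data, the aligned tail is scratch
	 * space both example buffers provide, so memcpy() can work in
	 * word-sized chunks.
	 */
	memcpy(lin, frame, ALIGN(len, sizeof(long)));

	printf("copied %zu bytes for a %zu-byte descriptor\n", room, len);

	free(frame);
	free(lin);
	return 0;
}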
From patchwork Wed Mar 31 12:28:40 2021
X-Patchwork-Submitter: Alexander Lobakin
X-Patchwork-Id: 12175415
X-Patchwork-Delegate: bpf@iogearbox.net
Date: Wed, 31 Mar 2021 12:28:40 +0000
From: Alexander Lobakin
To: Alexei Starovoitov, Daniel Borkmann
Cc: Xuan Zhuo, Björn Töpel, Magnus Karlsson, Jonathan Lemon,
    "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer,
    John Fastabend, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
    Yonghong Song, KP Singh, Alexander Lobakin,
    netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 bpf-next 2/2] xsk: introduce generic almost-zerocopy xmit
Message-ID: <20210331122820.6356-2-alobakin@pm.me>
In-Reply-To: <20210331122820.6356-1-alobakin@pm.me>
References: <20210331122602.6000-1-alobakin@pm.me> <20210331122820.6356-1-alobakin@pm.me>

The reasons behind IFF_TX_SKB_NO_LINEAR are:

 - most drivers expect an skb with a linear space;
 - most drivers expect the hard header to be in the linear space;
 - many drivers need some headroom to insert custom headers and/or pull
   headers from frags (pskb_may_pull() etc.).

With a small amount of overhead, all of this can be satisfied without
copying the full buffer data. Frames bigger than 128 bytes (the threshold
mitigates the allocation overhead for small frames) are now also built via
the zerocopy path, provided that the device and driver support S/G xmit,
which is almost always the case. We allocate 256* additional bytes for the
skb linear space and pull the hard header there (aligning its end to 16
bytes on platforms with a non-zero NET_IP_ALIGN). The rest of the buffer
data is simply pinned as frags, and at least 240 bytes of linear space are
left for any driver needs.

We could just pass the buffer to eth_get_headlen() to minimize the
allocation overhead and be able to copy all the headers into the linear
space, but the flow dissection procedure tends to be more expensive than
the current approach.

The IFF_TX_SKB_NO_LINEAR path remains unchanged and is still relevant and
generally faster.

* The value of 256 bytes is somewhat "magic": it can be found in lots of
  drivers and places in core code, and it is believed that 256 bytes are
  enough to hold the headers of any frame.
Cc: Xuan Zhuo
Signed-off-by: Alexander Lobakin
---
 net/xdp/xsk.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

-- 
2.31.1

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 41f8f21b3348..1d241f87422c 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -445,6 +445,9 @@ static void xsk_destruct_skb(struct sk_buff *skb)
 	sock_wfree(skb);
 }
 
+#define XSK_SKB_HEADLEN 256
+#define XSK_COPY_THRESHOLD (XSK_SKB_HEADLEN / 2)
+
 static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
 					      struct xdp_desc *desc)
 {
@@ -452,13 +455,21 @@ static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
 	u32 hr, len, ts, offset, copy, copied;
 	struct sk_buff *skb;
 	struct page *page;
+	bool need_pull;
 	void *buffer;
 	int err, i;
 	u64 addr;
 
 	hr = max(NET_SKB_PAD, L1_CACHE_ALIGN(xs->dev->needed_headroom));
+	len = hr;
+
+	need_pull = !(xs->dev->priv_flags & IFF_TX_SKB_NO_LINEAR);
+	if (need_pull) {
+		len += XSK_SKB_HEADLEN;
+		hr += NET_IP_ALIGN;
+	}
 
-	skb = sock_alloc_send_skb(&xs->sk, hr, 1, &err);
+	skb = sock_alloc_send_skb(&xs->sk, len, 1, &err);
 	if (unlikely(!skb))
 		return ERR_PTR(err);
 
@@ -488,6 +499,11 @@ static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
 	skb->data_len += len;
 	skb->truesize += ts;
 
+	if (need_pull && unlikely(!__pskb_pull_tail(skb, ETH_HLEN))) {
+		kfree_skb(skb);
+		return ERR_PTR(-ENOMEM);
+	}
+
 	refcount_add(ts, &xs->sk.sk_wmem_alloc);
 
 	return skb;
@@ -498,19 +514,20 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
 {
 	struct net_device *dev = xs->dev;
 	struct sk_buff *skb;
+	u32 len = desc->len;
 
-	if (dev->priv_flags & IFF_TX_SKB_NO_LINEAR) {
+	if ((dev->priv_flags & IFF_TX_SKB_NO_LINEAR) ||
+	    (len > XSK_COPY_THRESHOLD && likely(dev->features & NETIF_F_SG))) {
 		skb = xsk_build_skb_zerocopy(xs, desc);
 		if (IS_ERR(skb))
 			return skb;
 	} else {
-		u32 hr, tr, len;
 		void *buffer;
+		u32 hr, tr;
 		int err;
 
 		hr = max(NET_SKB_PAD, L1_CACHE_ALIGN(dev->needed_headroom));
 		tr = dev->needed_tailroom;
-		len = desc->len;
 
 		skb = sock_alloc_send_skb(&xs->sk, hr + len + tr, 1, &err);
 		if (unlikely(!skb))
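[ Illustration, not part of the patch: a small userspace sketch of the
  path-selection logic added to xsk_build_skb() above. struct fake_dev and
  its boolean flags are stand-ins for dev->priv_flags / dev->features, not
  kernel structures. Frames of up to XSK_COPY_THRESHOLD (128) bytes keep
  taking the full-copy path, bigger ones go through
  xsk_build_skb_zerocopy() when the device supports S/G, and
  IFF_TX_SKB_NO_LINEAR devices always do. The quoted "at least 240 bytes"
  is the 256-byte linear allocation minus NET_IP_ALIGN (2) and the pulled
  Ethernet header (ETH_HLEN, 14). ]

#include <stdbool.h>
#include <stdio.h>

#define XSK_SKB_HEADLEN 256
#define XSK_COPY_THRESHOLD (XSK_SKB_HEADLEN / 2)

/* stand-in for the two net_device properties the new condition looks at */
struct fake_dev {
	bool tx_skb_no_linear;	/* IFF_TX_SKB_NO_LINEAR in priv_flags */
	bool sg;		/* NETIF_F_SG in features */
};

/* mirrors the condition added to xsk_build_skb() */
static bool use_zerocopy_path(const struct fake_dev *dev, unsigned int len)
{
	return dev->tx_skb_no_linear ||
	       (len > XSK_COPY_THRESHOLD && dev->sg);
}

int main(void)
{
	const struct fake_dev sg_dev = { .tx_skb_no_linear = false, .sg = true };
	const unsigned int lens[] = { 64, 128, 129, 1514 };
	unsigned int i;

	/* 64- and 128-byte frames stay on the full-copy path, 129+ switch over */
	for (i = 0; i < sizeof(lens) / sizeof(lens[0]); i++)
		printf("len %4u -> %s\n", lens[i],
		       use_zerocopy_path(&sg_dev, lens[i]) ?
		       "almost-zerocopy" : "full copy");

	return 0;
}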