From patchwork Tue Aug 22 17:59:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jesper Dangaard Brouer X-Patchwork-Id: 13361276 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E2DE51DA54 for ; Tue, 22 Aug 2023 17:59:12 +0000 (UTC) Received: from [192.168.42.3] (194-45-78-10.static.kviknet.net [194.45.78.10]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by smtp.kernel.org (Postfix) with ESMTPSA id 785C7C433CA; Tue, 22 Aug 2023 17:59:09 +0000 (UTC) Subject: [PATCH net-next RFC v1 1/4] veth: use same bpf_xdp_adjust_head check as generic-XDP From: Jesper Dangaard Brouer To: netdev@vger.kernel.org, edumazet@google.com Cc: Jesper Dangaard Brouer , pabeni@redhat.com, kuba@kernel.org, davem@davemloft.net, lorenzo@kernel.org, Ilias Apalodimas , mtahhan@redhat.com, huangjie.albert@bytedance.com, Yunsheng Lin , Liang Chen Date: Tue, 22 Aug 2023 19:59:07 +0200 Message-ID: <169272714720.1975370.12172959079424954030.stgit@firesoul> In-Reply-To: <169272709850.1975370.16698220879817216294.stgit@firesoul> References: <169272709850.1975370.16698220879817216294.stgit@firesoul> User-Agent: StGit/1.4 Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org X-Patchwork-State: RFC This is a consistency patch, no functional change. Both veth_xdp_rcv_skb() and bpf_prog_run_generic_xdp() checks if XDP bpf_prog adjusted packet head via BPF-helper bpf_xdp_adjust_head(). The order of subtracting orig_data and xdp->data are opposite between the two functions. This is confusing when reviewing and reading the code. This patch choose to follow generic-XDP and adjust veth_xdp_rcv_skb(). In veth_xdp_rcv_skb() the skb_mac_header length have been __skb_push()'ed. Thus, we skip the skb->mac_header adjustments like bpf_prog_run_generic_xdp() and instead do a skb_reset_mac_header() as skb->data point to Eth header. Signed-off-by: Jesper Dangaard Brouer --- drivers/net/veth.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/drivers/net/veth.c b/drivers/net/veth.c index 953f6d8f8db0..be7b62f57087 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -897,12 +897,13 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, rcu_read_unlock(); /* check if bpf_xdp_adjust_head was used */ - off = orig_data - xdp->data; - if (off > 0) - __skb_push(skb, off); - else if (off < 0) - __skb_pull(skb, -off); - + off = xdp->data - orig_data; + if (off) { + if (off > 0) + __skb_pull(skb, off); + else if (off < 0) + __skb_push(skb, -off); + } skb_reset_mac_header(skb); /* check if bpf_xdp_adjust_tail was used */ From patchwork Tue Aug 22 17:59:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jesper Dangaard Brouer X-Patchwork-Id: 13361277 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 34A741D307 for ; Tue, 22 Aug 2023 17:59:19 +0000 (UTC) Received: from [192.168.42.3] (194-45-78-10.static.kviknet.net [194.45.78.10]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by smtp.kernel.org (Postfix) with ESMTPSA id 117CFC433CC; Tue, 22 Aug 2023 17:59:15 +0000 (UTC) Subject: [PATCH net-next RFC v1 2/4] veth: use generic-XDP functions when dealing with SKBs From: Jesper Dangaard Brouer To: netdev@vger.kernel.org, edumazet@google.com Cc: Jesper Dangaard Brouer , pabeni@redhat.com, kuba@kernel.org, davem@davemloft.net, lorenzo@kernel.org, Ilias Apalodimas , mtahhan@redhat.com, huangjie.albert@bytedance.com, Yunsheng Lin , Liang Chen Date: Tue, 22 Aug 2023 19:59:14 +0200 Message-ID: <169272715407.1975370.3989385869434330916.stgit@firesoul> In-Reply-To: <169272709850.1975370.16698220879817216294.stgit@firesoul> References: <169272709850.1975370.16698220879817216294.stgit@firesoul> User-Agent: StGit/1.4 Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org X-Patchwork-State: RFC The root-cause the realloc issue is that veth_xdp_rcv_skb() code path (that handles SKBs like generic-XDP) is calling a native-XDP function xdp_do_redirect(), instead of simply using xdp_do_generic_redirect() that can handle SKBs. The existing code tries to steal the packet-data from the SKB (and frees the SKB itself). This cause issues as SKBs can have different memory models that are incompatible with native-XDP call xdp_do_redirect(). For this reason the checks in veth_convert_skb_to_xdp_buff() becomes more strict. This in turn makes this a bad approach. Simply leveraging generic-XDP helpers e.g. generic_xdp_tx() and xdp_do_generic_redirect() as this resolves the issue given netstack can handle these different SKB memory models. Signed-off-by: Jesper Dangaard Brouer --- drivers/net/veth.c | 69 ++++++++++++++++++++-------------------------------- net/core/dev.c | 1 + net/core/filter.c | 1 + 3 files changed, 28 insertions(+), 43 deletions(-) diff --git a/drivers/net/veth.c b/drivers/net/veth.c index be7b62f57087..192547035194 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -713,19 +713,6 @@ static void veth_xdp_rcv_bulk_skb(struct veth_rq *rq, void **frames, } } -static void veth_xdp_get(struct xdp_buff *xdp) -{ - struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp); - int i; - - get_page(virt_to_page(xdp->data)); - if (likely(!xdp_buff_has_frags(xdp))) - return; - - for (i = 0; i < sinfo->nr_frags; i++) - __skb_frag_ref(&sinfo->frags[i]); -} - static int veth_convert_skb_to_xdp_buff(struct veth_rq *rq, struct xdp_buff *xdp, struct sk_buff **pskb) @@ -837,7 +824,7 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, struct veth_xdp_buff vxbuf; struct xdp_buff *xdp = &vxbuf.xdp; u32 act, metalen; - int off; + int off, err; skb_prepare_for_gro(skb); @@ -860,30 +847,10 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, switch (act) { case XDP_PASS: - break; case XDP_TX: - veth_xdp_get(xdp); - consume_skb(skb); - xdp->rxq->mem = rq->xdp_mem; - if (unlikely(veth_xdp_tx(rq, xdp, bq) < 0)) { - trace_xdp_exception(rq->dev, xdp_prog, act); - stats->rx_drops++; - goto err_xdp; - } - stats->xdp_tx++; - rcu_read_unlock(); - goto xdp_xmit; case XDP_REDIRECT: - veth_xdp_get(xdp); - consume_skb(skb); - xdp->rxq->mem = rq->xdp_mem; - if (xdp_do_redirect(rq->dev, xdp, xdp_prog)) { - stats->rx_drops++; - goto err_xdp; - } - stats->xdp_redirect++; - rcu_read_unlock(); - goto xdp_xmit; + /* Postpone actions to after potential SKB geometry update */ + break; default: bpf_warn_invalid_xdp_action(rq->dev, xdp_prog, act); fallthrough; @@ -894,7 +861,6 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, stats->xdp_drops++; goto xdp_drop; } - rcu_read_unlock(); /* check if bpf_xdp_adjust_head was used */ off = xdp->data - orig_data; @@ -919,11 +885,32 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, else skb->data_len = 0; - skb->protocol = eth_type_trans(skb, rq->dev); - metalen = xdp->data - xdp->data_meta; if (metalen) skb_metadata_set(skb, metalen); + + switch (act) { + case XDP_PASS: + /* This skb_pull's off mac_len, __skb_push'ed above */ + skb->protocol = eth_type_trans(skb, rq->dev); + break; + case XDP_REDIRECT: + err = xdp_do_generic_redirect(rq->dev, skb, xdp, xdp_prog); + if (unlikely(err)) { + trace_xdp_exception(rq->dev, xdp_prog, act); + goto xdp_drop; + } + stats->xdp_redirect++; + rcu_read_unlock(); + goto xdp_xmit; + case XDP_TX: + /* TODO: this can be optimized to be veth specific */ + generic_xdp_tx(skb, xdp_prog); + stats->xdp_tx++; + rcu_read_unlock(); + goto xdp_xmit; + } + rcu_read_unlock(); out: return skb; drop: @@ -931,10 +918,6 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, xdp_drop: rcu_read_unlock(); kfree_skb(skb); - return NULL; -err_xdp: - rcu_read_unlock(); - xdp_return_buff(xdp); xdp_xmit: return NULL; } diff --git a/net/core/dev.c b/net/core/dev.c index 17e6281e408c..1187bfced9ec 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -4987,6 +4987,7 @@ void generic_xdp_tx(struct sk_buff *skb, struct bpf_prog *xdp_prog) kfree_skb(skb); } } +EXPORT_SYMBOL_GPL(generic_xdp_tx); static DEFINE_STATIC_KEY_FALSE(generic_xdp_needed_key); diff --git a/net/core/filter.c b/net/core/filter.c index a094694899c9..a6fd7ba901ba 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -4443,6 +4443,7 @@ int xdp_do_generic_redirect(struct net_device *dev, struct sk_buff *skb, _trace_xdp_redirect_err(dev, xdp_prog, ri->tgt_index, err); return err; } +EXPORT_SYMBOL_GPL(xdp_do_generic_redirect); BPF_CALL_2(bpf_xdp_redirect, u32, ifindex, u64, flags) { From patchwork Tue Aug 22 17:59:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jesper Dangaard Brouer X-Patchwork-Id: 13361278 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5B8171D307 for ; Tue, 22 Aug 2023 17:59:25 +0000 (UTC) Received: from [192.168.42.3] (194-45-78-10.static.kviknet.net [194.45.78.10]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by smtp.kernel.org (Postfix) with ESMTPSA id 8D89DC433C7; Tue, 22 Aug 2023 17:59:22 +0000 (UTC) Subject: [PATCH net-next RFC v1 3/4] veth: lift skb_head_is_locked restriction for SKB based XDP From: Jesper Dangaard Brouer To: netdev@vger.kernel.org, edumazet@google.com Cc: Jesper Dangaard Brouer , pabeni@redhat.com, kuba@kernel.org, davem@davemloft.net, lorenzo@kernel.org, Ilias Apalodimas , mtahhan@redhat.com, huangjie.albert@bytedance.com, Yunsheng Lin , Liang Chen Date: Tue, 22 Aug 2023 19:59:20 +0200 Message-ID: <169272716036.1975370.9471039093816465682.stgit@firesoul> In-Reply-To: <169272709850.1975370.16698220879817216294.stgit@firesoul> References: <169272709850.1975370.16698220879817216294.stgit@firesoul> User-Agent: StGit/1.4 Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org X-Patchwork-State: RFC As veth_xdp_rcv_skb() no-longer steals SKB data and no-longer uses XDP native xdp_do_redirect(), then it is possible to handle more types of SKBs with other memory types. Replacing skb_head_is_locked() with skb_cloned(). Signed-off-by: Jesper Dangaard Brouer --- drivers/net/veth.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/veth.c b/drivers/net/veth.c index 192547035194..8e117cc44fda 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -720,7 +720,7 @@ static int veth_convert_skb_to_xdp_buff(struct veth_rq *rq, struct sk_buff *skb = *pskb; u32 frame_sz; - if (skb_shared(skb) || skb_head_is_locked(skb) || + if (skb_shared(skb) || skb_cloned(skb) || skb_shinfo(skb)->nr_frags || skb_headroom(skb) < XDP_PACKET_HEADROOM) { u32 size, len, max_head_size, off; From patchwork Tue Aug 22 17:59:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jesper Dangaard Brouer X-Patchwork-Id: 13361279 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AD9911D307 for ; Tue, 22 Aug 2023 17:59:31 +0000 (UTC) Received: from [192.168.42.3] (194-45-78-10.static.kviknet.net [194.45.78.10]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by smtp.kernel.org (Postfix) with ESMTPSA id 75C82C433C7; Tue, 22 Aug 2023 17:59:28 +0000 (UTC) Subject: [PATCH net-next RFC v1 4/4] veth: when XDP is loaded increase needed_headroom From: Jesper Dangaard Brouer To: netdev@vger.kernel.org, edumazet@google.com Cc: Jesper Dangaard Brouer , pabeni@redhat.com, kuba@kernel.org, davem@davemloft.net, lorenzo@kernel.org, Ilias Apalodimas , mtahhan@redhat.com, huangjie.albert@bytedance.com, Yunsheng Lin , Liang Chen Date: Tue, 22 Aug 2023 19:59:26 +0200 Message-ID: <169272716651.1975370.10514711233878278884.stgit@firesoul> In-Reply-To: <169272709850.1975370.16698220879817216294.stgit@firesoul> References: <169272709850.1975370.16698220879817216294.stgit@firesoul> User-Agent: StGit/1.4 Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org X-Patchwork-State: RFC When sending (sendmsg) SKBs out an veth device, the SKB headroom is too small to satisfy XDP on the receiving veth peer device. For AF_XDP (normal non-zero-copy) it is worth noticing that xsk_build_skb() adjust headroom according to dev->needed_headroom. Other parts of the kernel also take this into account (see macro LL_RESERVED_SPACE). This solves the XDP_PACKET_HEADROOM check in veth_convert_skb_to_xdp_buff(). Signed-off-by: Jesper Dangaard Brouer --- drivers/net/veth.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/veth.c b/drivers/net/veth.c index 8e117cc44fda..3630e9124071 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -1649,6 +1649,7 @@ static int veth_xdp_set(struct net_device *dev, struct bpf_prog *prog, if (!old_prog) { peer->hw_features &= ~NETIF_F_GSO_SOFTWARE; peer->max_mtu = max_mtu; + veth_set_rx_headroom(dev, XDP_PACKET_HEADROOM); } xdp_features_set_redirect_target(peer, true); @@ -1666,6 +1667,7 @@ static int veth_xdp_set(struct net_device *dev, struct bpf_prog *prog, peer->hw_features |= NETIF_F_GSO_SOFTWARE; peer->max_mtu = ETH_MAX_MTU; } + veth_set_rx_headroom(dev, -1); } bpf_prog_put(old_prog); }