From patchwork Wed Dec 14 23:25:28 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Xu
X-Patchwork-Id: 13073687
X-Patchwork-Delegate: bpf@iogearbox.net
From: Daniel Xu
To: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
    Hideaki YOSHIFUJI, David Ahern
Cc: ppenkov@aviatrix.com, dbird@aviatrix.com, netdev@vger.kernel.org,
    linux-kernel@vger.kernel.org, bpf@vger.kernel.org
Subject: [PATCH bpf-next 1/6] ip: frags: Return actual error codes from ip_check_defrag()
Date: Wed, 14 Dec 2022 16:25:28 -0700
X-Mailer: git-send-email 2.39.0
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

Once we wrap ip_check_defrag() in a kfunc, it may be useful for progs to
know the exact error condition ip_check_defrag() encountered.

Signed-off-by: Daniel Xu
---
 drivers/net/macvlan.c  |  2 +-
 net/ipv4/ip_fragment.c | 11 ++++++-----
 net/packet/af_packet.c |  2 +-
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 99a971929c8e..b8310e13d7e1 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -456,7 +456,7 @@ static rx_handler_result_t macvlan_handle_frame(struct sk_buff **pskb)
 		unsigned int hash;
 
 		skb = ip_check_defrag(dev_net(skb->dev), skb, IP_DEFRAG_MACVLAN);
-		if (!skb)
+		if (IS_ERR(skb))
 			return RX_HANDLER_CONSUMED;
 		*pskb = skb;
 		eth = eth_hdr(skb);
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index 69c00ffdcf3e..7406c6b6376d 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -514,6 +514,7 @@ struct sk_buff *ip_check_defrag(struct net *net, struct sk_buff *skb, u32 user)
 	struct iphdr iph;
 	int netoff;
 	u32 len;
+	int err;
 
 	if (skb->protocol != htons(ETH_P_IP))
 		return skb;
@@ -535,15 +536,15 @@ struct sk_buff *ip_check_defrag(struct net *net, struct sk_buff *skb, u32 user)
 		if (skb) {
 			if (!pskb_may_pull(skb, netoff + iph.ihl * 4)) {
 				kfree_skb(skb);
-				return NULL;
+				return ERR_PTR(-ENOMEM);
 			}
-			if (pskb_trim_rcsum(skb, netoff + len)) {
+			if ((err = pskb_trim_rcsum(skb, netoff + len))) {
 				kfree_skb(skb);
-				return NULL;
+				return ERR_PTR(err);
 			}
 			memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
-			if (ip_defrag(net, skb, user))
-				return NULL;
+			if ((err = ip_defrag(net, skb, user)))
+				return ERR_PTR(err);
 			skb_clear_hash(skb);
 		}
 	}
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 41c4ccc3a5d6..cf706f98448e 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1472,7 +1472,7 @@ static int packet_rcv_fanout(struct sk_buff *skb, struct net_device *dev,
 
 	if (fanout_has_flag(f, PACKET_FANOUT_FLAG_DEFRAG)) {
 		skb = ip_check_defrag(net, skb, IP_DEFRAG_AF_PACKET);
-		if (!skb)
+		if (IS_ERR(skb))
 			return 0;
 	}
 	switch (f->type) {

From patchwork Wed Dec 14 23:25:29 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Xu
X-Patchwork-Id: 13073686
X-Patchwork-Delegate: bpf@iogearbox.net
From: Daniel Xu
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
    Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
    Hao Luo, Jiri Olsa, Jonathan Corbet
Cc: ppenkov@aviatrix.com, dbird@aviatrix.com, bpf@vger.kernel.org,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH bpf-next 2/6] bpf: verifier: Support KF_CHANGES_PKT flag
Date: Wed, 14 Dec 2022 16:25:29 -0700
X-Mailer: git-send-email 2.39.0
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

KF_CHANGES_PKT indicates that the kfunc call may change packet data.
This is analogous to bpf_helper_changes_pkt_data().

Signed-off-by: Daniel Xu
---
 Documentation/bpf/kfuncs.rst | 7 +++++++
 include/linux/btf.h          | 1 +
 kernel/bpf/verifier.c        | 8 ++++++++
 3 files changed, 16 insertions(+)

diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst
index 9fd7fb539f85..061ab392a02f 100644
--- a/Documentation/bpf/kfuncs.rst
+++ b/Documentation/bpf/kfuncs.rst
@@ -200,6 +200,13 @@ single argument which must be a trusted argument or a MEM_RCU pointer.
 The argument may have reference count of 0 and the kfunc must take this into
 consideration.
 
+2.4.9 KF_CHANGES_PKT flag
+-----------------
+
+The KF_CHANGES_PKT is used for kfuncs that may change packet data.
+After calls to such kfuncs, existing packet pointers will be invalidated
+and must be revalidated before the prog can access packet data.
+
 2.5 Registering the kfuncs
 --------------------------
 
diff --git a/include/linux/btf.h b/include/linux/btf.h
index 5f628f323442..0575f530e40b 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -71,6 +71,7 @@
 #define KF_SLEEPABLE    (1 << 5) /* kfunc may sleep */
 #define KF_DESTRUCTIVE  (1 << 6) /* kfunc performs destructive actions */
 #define KF_RCU          (1 << 7) /* kfunc only takes rcu pointer arguments */
+#define KF_CHANGES_PKT  (1 << 8) /* kfunc may change packet data */
 
 /*
  * Return the name of the passed struct, if exists, or halt the build if for
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a5255a0dcbb6..0ac505cbd6ba 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -8213,6 +8213,11 @@ static bool is_kfunc_rcu(struct bpf_kfunc_call_arg_meta *meta)
 	return meta->kfunc_flags & KF_RCU;
 }
 
+static bool is_kfunc_changes_pkt(struct bpf_kfunc_call_arg_meta *meta)
+{
+	return meta->kfunc_flags & KF_CHANGES_PKT;
+}
+
 static bool is_kfunc_arg_kptr_get(struct bpf_kfunc_call_arg_meta *meta, int arg)
 {
 	return arg == 0 && (meta->kfunc_flags & KF_KPTR_GET);
@@ -9313,6 +9318,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			mark_btf_func_reg_size(env, regno, t->size);
 	}
 
+	if (is_kfunc_changes_pkt(&meta))
+		clear_all_pkt_pointers(env);
+
 	return 0;
 }
 

From patchwork Wed Dec 14 23:25:30 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Xu
X-Patchwork-Id: 13073688
X-Patchwork-Delegate: bpf@iogearbox.net
From: Daniel Xu
To: "David S. Miller", Hideaki YOSHIFUJI, David Ahern, Eric Dumazet,
    Jakub Kicinski, Paolo Abeni
Cc: ppenkov@aviatrix.com, dbird@aviatrix.com, linux-kernel@vger.kernel.org,
    netdev@vger.kernel.org, bpf@vger.kernel.org
Subject: [PATCH bpf-next 3/6] bpf, net, frags: Add bpf_ip_check_defrag() kfunc
Date: Wed, 14 Dec 2022 16:25:30 -0700
Message-Id: <1f48a340a898c4d22d65e0e445dbf15f72081b9a.1671049840.git.dxu@dxuuu.xyz>
X-Mailer: git-send-email 2.39.0
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

This kfunc is used to defragment IPv4 packets. The idea is that if you
see a fragmented packet, you call this kfunc. If the kfunc returns 0,
then the skb has been updated to contain the entire reassembled packet.

If the kfunc returns an error (most likely -EINPROGRESS), then it means
the skb is part of a yet-incomplete original packet. A reasonable
response to -EINPROGRESS is to drop the packet, as the ip defrag
infrastructure is already hanging onto the frag for future reassembly.

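As a rough illustration of the flow described above, the sketch below models
the two decisions a prog makes. It is plain userspace C, not a real BPF
program: it detects a fragment from the IPv4 frag_off field and maps the
kfunc's return value to a verdict. The sketch_* names and the verdict enum
are hypothetical and not part of this series, and dropping on errors other
than -EINPROGRESS is an assumption rather than something this patch mandates.

```c
#include <arpa/inet.h>   /* htons(), ntohs() */
#include <errno.h>       /* EINPROGRESS */
#include <stdbool.h>
#include <stdint.h>

/* Flag and mask values of the IPv4 frag_off field (RFC 791). */
#define SKETCH_IP_MF      0x2000 /* "more fragments" flag */
#define SKETCH_IP_OFFMASK 0x1fff /* fragment offset, in 8-byte units */

/* Hypothetical verdicts; a real TC prog would return TC_ACT_* values. */
enum sketch_verdict { SKETCH_PASS, SKETCH_DROP };

/* True if frag_off (network byte order) marks the packet as a fragment:
 * either more fragments follow, or this one has a non-zero offset.
 */
static bool sketch_is_fragment(uint16_t frag_off_be)
{
	return (ntohs(frag_off_be) & (SKETCH_IP_MF | SKETCH_IP_OFFMASK)) != 0;
}

/* Map the kfunc's return value to a verdict. */
static enum sketch_verdict sketch_after_defrag(int ret)
{
	if (ret == 0)
		return SKETCH_PASS; /* skb now holds the whole packet */
	/* -EINPROGRESS: the frag is queued for later reassembly, so drop.
	 * Assumption: any other error is treated as a drop as well.
	 */
	return SKETCH_DROP;
}
```

In a real prog, packet pointers would also have to be re-derived after the
call, since the kfunc is flagged KF_CHANGES_PKT.
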
Care has been taken to ensure the prog skb remains valid no matter what
the underlying ip_check_defrag() call does. This is in contrast to
ip_defrag(), which may consume the skb if the skb is part of a
yet-incomplete original packet.

So far this kfunc is only callable from TC clsact progs.

Signed-off-by: Daniel Xu
---
 include/net/ip.h           | 11 +++++
 net/ipv4/Makefile          |  1 +
 net/ipv4/ip_fragment.c     |  2 +
 net/ipv4/ip_fragment_bpf.c | 98 ++++++++++++++++++++++++++++++++++++++
 4 files changed, 112 insertions(+)
 create mode 100644 net/ipv4/ip_fragment_bpf.c

diff --git a/include/net/ip.h b/include/net/ip.h
index 144bdfbb25af..14f1e69a6523 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -679,6 +679,7 @@ enum ip_defrag_users {
 	IP_DEFRAG_VS_FWD,
 	IP_DEFRAG_AF_PACKET,
 	IP_DEFRAG_MACVLAN,
+	IP_DEFRAG_BPF,
 };
 
 /* Return true if the value of 'user' is between 'lower_bond'
@@ -692,6 +693,16 @@ static inline bool ip_defrag_user_in_between(u32 user,
 }
 
 int ip_defrag(struct net *net, struct sk_buff *skb, u32 user);
+
+#ifdef CONFIG_DEBUG_INFO_BTF
+int register_ip_frag_bpf(void);
+#else
+static inline int register_ip_frag_bpf(void)
+{
+	return 0;
+}
+#endif
+
 #ifdef CONFIG_INET
 struct sk_buff *ip_check_defrag(struct net *net, struct sk_buff *skb, u32 user);
 #else
diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile
index af7d2cf490fb..749da1599933 100644
--- a/net/ipv4/Makefile
+++ b/net/ipv4/Makefile
@@ -64,6 +64,7 @@ obj-$(CONFIG_TCP_CONG_ILLINOIS) += tcp_illinois.o
 obj-$(CONFIG_NET_SOCK_MSG) += tcp_bpf.o
 obj-$(CONFIG_BPF_SYSCALL) += udp_bpf.o
 obj-$(CONFIG_NETLABEL) += cipso_ipv4.o
+obj-$(CONFIG_DEBUG_INFO_BTF) += ip_fragment_bpf.o
 
 obj-$(CONFIG_XFRM) += xfrm4_policy.o xfrm4_state.o xfrm4_input.o \
 		      xfrm4_output.o xfrm4_protocol.o
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index 7406c6b6376d..467aa8ace9fb 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -757,5 +757,7 @@ void __init ipfrag_init(void)
 	if (inet_frags_init(&ip4_frags))
 		panic("IP: failed to allocate ip4_frags cache\n");
 	ip4_frags_ctl_register();
+	if (register_ip_frag_bpf())
+		panic("IP: bpf: failed to register ip_frag_bpf\n");
 	register_pernet_subsys(&ip4_frags_ops);
 }
diff --git a/net/ipv4/ip_fragment_bpf.c b/net/ipv4/ip_fragment_bpf.c
new file mode 100644
index 000000000000..a9e5908ed216
--- /dev/null
+++ b/net/ipv4/ip_fragment_bpf.c
@@ -0,0 +1,98 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Unstable ipv4 fragmentation helpers for TC-BPF hook
+ *
+ * These are called from SCHED_CLS BPF programs. Note that it is allowed to
+ * break compatibility for these functions since the interface they are exposed
+ * through to BPF programs is explicitly unstable.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+__diag_push();
+__diag_ignore_all("-Wmissing-prototypes",
+		  "Global functions as their definitions will be in ip_fragment BTF");
+
+/* bpf_ip_check_defrag - Defragment an ipv4 packet
+ *
+ * This helper takes an skb as input. If this skb successfully reassembles
+ * the original packet, the skb is updated to contain the original, reassembled
+ * packet.
+ *
+ * Otherwise (on error or incomplete reassembly), the input skb remains
+ * unmodified.
+ *
+ * Parameters:
+ * @ctx		- Pointer to program context (skb)
+ * @netns	- Child network namespace id. If value is a negative signed
+ *		  32-bit integer, the netns of the device in the skb is used.
+ *
+ * Return:
+ * 0 on successfully reassembly or non-fragmented packet. Negative value on
+ * error or incomplete reassembly.
+ */
+int bpf_ip_check_defrag(struct __sk_buff *ctx, u64 netns)
+{
+	struct sk_buff *skb = (struct sk_buff *)ctx;
+	struct sk_buff *skb_cpy, *skb_out;
+	struct net *caller_net;
+	struct net *net;
+	int mac_len;
+	void *mac;
+
+	if (unlikely(!((s32)netns < 0 || netns <= S32_MAX)))
+		return -EINVAL;
+
+	caller_net = skb->dev ? dev_net(skb->dev) : sock_net(skb->sk);
+	if ((s32)netns < 0) {
+		net = caller_net;
+	} else {
+		net = get_net_ns_by_id(caller_net, netns);
+		if (unlikely(!net))
+			return -EINVAL;
+	}
+
+	mac_len = skb->mac_len;
+	skb_cpy = skb_copy(skb, GFP_ATOMIC);
+	if (!skb_cpy)
+		return -ENOMEM;
+
+	skb_out = ip_check_defrag(net, skb_cpy, IP_DEFRAG_BPF);
+	if (IS_ERR(skb_out))
+		return PTR_ERR(skb_out);
+
+	skb_morph(skb, skb_out);
+	kfree_skb(skb_out);
+
+	/* ip_check_defrag() does not maintain mac header, so push empty header
+	 * in so prog sees the correct layout. The empty mac header will be
+	 * later pulled from cls_bpf.
+	 */
+	mac = skb_push(skb, mac_len);
+	memset(mac, 0, mac_len);
+	bpf_compute_data_pointers(skb);
+
+	return 0;
+}
+
+__diag_pop()
+
+BTF_SET8_START(ip_frag_kfunc_set)
+BTF_ID_FLAGS(func, bpf_ip_check_defrag, KF_CHANGES_PKT)
+BTF_SET8_END(ip_frag_kfunc_set)
+
+static const struct btf_kfunc_id_set ip_frag_bpf_kfunc_set = {
+	.owner = THIS_MODULE,
+	.set   = &ip_frag_kfunc_set,
+};
+
+int register_ip_frag_bpf(void)
+{
+	return register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS,
+					 &ip_frag_bpf_kfunc_set);
+}

From patchwork Wed Dec 14 23:25:31 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Xu
X-Patchwork-Id: 13073699
X-Patchwork-Delegate: bpf@iogearbox.net
From: Daniel Xu
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
    Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
    Hao Luo, Jiri Olsa, Mykola Lysenko, Shuah Khan
Cc: ppenkov@aviatrix.com, dbird@aviatrix.com, bpf@vger.kernel.org,
    linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH bpf-next 4/6] bpf: selftests: Support not connecting client socket
Date: Wed, 14 Dec 2022 16:25:31 -0700
Message-Id: <3623f9fbdfe7e1be46ca6745312eb020329d23c9.1671049840.git.dxu@dxuuu.xyz>
X-Mailer: git-send-email 2.39.0
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

For connectionless protocols or raw sockets we do not want to actually
connect() to the server.

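To illustrate why skipping connect() is fine for connectionless sockets, here
is a minimal standalone sketch. It is not taken from the selftests and the
function name is hypothetical. It sends one UDP datagram over loopback with
sendto(), naming the destination per datagram instead of connecting first.
It assumes the loopback interface is up.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send one datagram over loopback without connect(). Returns the number of
 * bytes received by the server socket, or -1 on any failure.
 */
static int sketch_send_unconnected(const char *msg)
{
	struct sockaddr_in addr;
	socklen_t addrlen = sizeof(addr);
	char buf[64];
	int srv, cli, n = -1;

	srv = socket(AF_INET, SOCK_DGRAM, 0);
	if (srv < 0)
		return -1;
	cli = socket(AF_INET, SOCK_DGRAM, 0);
	if (cli < 0)
		goto out_srv;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0; /* let the kernel pick a free port */
	if (bind(srv, (struct sockaddr *)&addr, sizeof(addr)) ||
	    getsockname(srv, (struct sockaddr *)&addr, &addrlen))
		goto out;

	/* No connect(): the destination is supplied with each datagram. */
	if (sendto(cli, msg, strlen(msg), 0,
		   (struct sockaddr *)&addr, addrlen) < 0)
		goto out;
	n = recv(srv, buf, sizeof(buf), 0);
out:
	close(cli);
out_srv:
	close(srv);
	return n;
}
```
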
Signed-off-by: Daniel Xu
---
 tools/testing/selftests/bpf/network_helpers.c | 5 +++--
 tools/testing/selftests/bpf/network_helpers.h | 1 +
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c
index 01de33191226..24f5efebc7dd 100644
--- a/tools/testing/selftests/bpf/network_helpers.c
+++ b/tools/testing/selftests/bpf/network_helpers.c
@@ -301,8 +301,9 @@ int connect_to_fd_opts(int server_fd, const struct network_helper_opts *opts)
 		       strlen(opts->cc) + 1))
 		goto error_close;
 
-	if (connect_fd_to_addr(fd, &addr, addrlen, opts->must_fail))
-		goto error_close;
+	if (!opts->noconnect)
+		if (connect_fd_to_addr(fd, &addr, addrlen, opts->must_fail))
+			goto error_close;
 
 	return fd;
 
diff --git a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h
index f882c691b790..8be04cd76d8b 100644
--- a/tools/testing/selftests/bpf/network_helpers.h
+++ b/tools/testing/selftests/bpf/network_helpers.h
@@ -21,6 +21,7 @@ struct network_helper_opts {
 	const char *cc;
 	int timeout_ms;
 	bool must_fail;
+	bool noconnect;
 };
 
 /* ipv4 test vector */

From patchwork Wed Dec 14 23:25:32 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Xu
X-Patchwork-Id: 13073698
X-Patchwork-Delegate: bpf@iogearbox.net
From: Daniel Xu
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
    Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
    Hao Luo, Jiri Olsa, Mykola Lysenko, Shuah Khan
Cc: ppenkov@aviatrix.com, dbird@aviatrix.com, bpf@vger.kernel.org,
    linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH bpf-next 5/6] bpf: selftests: Support custom type and proto for client sockets
Date: Wed, 14 Dec 2022 16:25:32 -0700
Message-Id: <892b579588b4985fdae013fc752fa1b2c2315e34.1671049840.git.dxu@dxuuu.xyz>
X-Mailer: git-send-email 2.39.0
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

Extend connect_to_fd_opts() to take optional type and protocol
parameters for the client socket. These parameters are useful when
opening a raw socket to send IP fragments.

Signed-off-by: Daniel Xu
---
 tools/testing/selftests/bpf/network_helpers.c | 21 +++++++++++++------
 tools/testing/selftests/bpf/network_helpers.h |  2 ++
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c
index 24f5efebc7dd..4f9ba90b1b7e 100644
--- a/tools/testing/selftests/bpf/network_helpers.c
+++ b/tools/testing/selftests/bpf/network_helpers.c
@@ -270,14 +270,23 @@ int connect_to_fd_opts(int server_fd, const struct network_helper_opts *opts)
 		opts = &default_opts;
 
 	optlen = sizeof(type);
-	if (getsockopt(server_fd, SOL_SOCKET, SO_TYPE, &type, &optlen)) {
-		log_err("getsockopt(SOL_TYPE)");
-		return -1;
+
+	if (opts->type) {
+		type = opts->type;
+	} else {
+		if (getsockopt(server_fd, SOL_SOCKET, SO_TYPE, &type, &optlen)) {
+			log_err("getsockopt(SOL_TYPE)");
+			return -1;
+		}
 	}
 
-	if (getsockopt(server_fd, SOL_SOCKET, SO_PROTOCOL, &protocol, &optlen)) {
-		log_err("getsockopt(SOL_PROTOCOL)");
-		return -1;
+	if (opts->proto) {
+		protocol = opts->proto;
+	} else {
+		if (getsockopt(server_fd, SOL_SOCKET, SO_PROTOCOL, &protocol, &optlen)) {
+			log_err("getsockopt(SOL_PROTOCOL)");
+			return -1;
+		}
 	}
 
 	addrlen = sizeof(addr);
diff --git a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h
index 8be04cd76d8b..7119804ea79b 100644
--- a/tools/testing/selftests/bpf/network_helpers.h
+++ b/tools/testing/selftests/bpf/network_helpers.h
@@ -22,6 +22,8 @@ struct network_helper_opts {
 	int timeout_ms;
 	bool must_fail;
 	bool noconnect;
+	int type;
+	int proto;
 };
 
 /* ipv4 test vector */

From patchwork Wed Dec 14 23:25:33 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Xu
X-Patchwork-Id: 13073700
X-Patchwork-Delegate: bpf@iogearbox.net
vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7BA28C001B2 for ; Wed, 14 Dec 2022 23:28:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229655AbiLNX2g (ORCPT ); Wed, 14 Dec 2022 18:28:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58664 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229723AbiLNX2I (ORCPT ); Wed, 14 Dec 2022 18:28:08 -0500 Received: from wout4-smtp.messagingengine.com (wout4-smtp.messagingengine.com [64.147.123.20]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7584549B74; Wed, 14 Dec 2022 15:26:25 -0800 (PST) Received: from compute3.internal (compute3.nyi.internal [10.202.2.43]) by mailout.west.internal (Postfix) with ESMTP id BFF78320016F; Wed, 14 Dec 2022 18:26:23 -0500 (EST) Received: from mailfrontend1 ([10.202.2.162]) by compute3.internal (MEProxy); Wed, 14 Dec 2022 18:26:25 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dxuuu.xyz; h=cc :cc:content-transfer-encoding:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to; s=fm1; t=1671060383; x=1671146783; bh=yv 6JU9opx4TWQWD6Z2v6I1MGb71Y/3jWyXaKtRmlwEo=; b=M0VBR5o1pLChjaM1Hc uNKPb715Ijyi8Vpgn//F9Ik0StG0AdRAbqdwKD4YPyowrWjdF56/j1v/YthP9iN6 ltQ66f+R/axHlobmUQHGDFJQ7i9+v0+Jub76K1iu/rXBy94uS21NXJqzcoyeufIE ZJfbOB2IjS7TdOU+n37Ql6ppExfMbYs+41LzwpB/IMcsuvACsClkChWWtIjOg+Vq bqyPmt8F7TuGjSAp5w8veiHOjEPGSX2gTsB4pzVyz/r15r2yEb8U2bO7tEt0JaFL xfk9AuUwAP+f/y2mZjea/Dmt09Gy4c0S6DLxMvgmfKbNBRfwZXVLzZjpy6lSgY4G BPXQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:sender:subject :subject:to:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm2; t=1671060383; x=1671146783; 
bh=yv6JU9opx4TWQ WD6Z2v6I1MGb71Y/3jWyXaKtRmlwEo=; b=b0r25TMe7o48HtDF3KThnGZz55t6t qai0wTQii5zNAlKW8M1PJiM2HxYbLUmKpF7hIM7nx3/jCiwIb/bHIUz0Nl8YUVib LNjTPS7knRijpGsesFQgydkcmFVaOk2WhAgrDIG5fcSPng41k2FrAAQzPN7bnux7 BVqCwWNcpCoKeXpW6w093pJ9SMV6PAIV998sgJ4lDwTd2GxdbkycOJAtOm2g5m52 dNBuaVMU/DUjPxYtX+AqPxr8EZuHnjW4yPDUocdcwax265YvAjANK+JcwhAY/LG8 WldW38lL88iQoja+HK+fU0PHqgLPhwde34H/LWLVom0+l+fjV/cp6EvsQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvhedrfeeggddtjecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenfg hrlhcuvffnffculdejtddmnecujfgurhephffvvefufffkofgjfhgggfestdekredtredt tdenucfhrhhomhepffgrnhhivghlucgiuhcuoegugihusegugihuuhhurdighiiiqeenuc ggtffrrghtthgvrhhnpefgfefggeejhfduieekvdeuteffleeifeeuvdfhheejleejjeek gfffgefhtddtteenucevlhhushhtvghrufhiiigvpedunecurfgrrhgrmhepmhgrihhlfh hrohhmpegugihusegugihuuhhurdighiii X-ME-Proxy: Feedback-ID: i6a694271:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 14 Dec 2022 18:26:21 -0500 (EST) From: Daniel Xu To: Andrii Nakryiko , Mykola Lysenko , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Shuah Khan Cc: ppenkov@aviatrix.com, dbird@aviatrix.com, linux-kernel@vger.kernel.org, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org Subject: [PATCH bpf-next 6/6] bpf: selftests: Add bpf_ip_check_defrag() selftest Date: Wed, 14 Dec 2022 16:25:33 -0700 Message-Id: <48b0ce1f1f11ba7244ec4df7e990d79c634fa52e.1671049840.git.dxu@dxuuu.xyz> X-Mailer: git-send-email 2.39.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net This selftest covers two major scenarios: that BPF-based defragmentation can be done successfully, and that packet pointers are invalidated after calls
to the kfunc. In the first scenario, we create a UDP client and UDP echo server. The server side is fairly straightforward: we attach the prog and simply echo back the message. On the client side, we send fragmented packets to the server and expect the reassembled message back. In the second scenario, we load a prog that dereferences a packet pointer after calling the kfunc without revalidating it, and expect the verifier to reject it. Signed-off-by: Daniel Xu --- .../selftests/bpf/generate_udp_fragments.py | 52 +++ .../bpf/prog_tests/ip_check_defrag.c | 296 ++++++++++++++++++ .../selftests/bpf/progs/bpf_tracing_net.h | 1 + .../selftests/bpf/progs/ip_check_defrag.c | 83 +++++ 4 files changed, 432 insertions(+) create mode 100755 tools/testing/selftests/bpf/generate_udp_fragments.py create mode 100644 tools/testing/selftests/bpf/prog_tests/ip_check_defrag.c create mode 100644 tools/testing/selftests/bpf/progs/ip_check_defrag.c diff --git a/tools/testing/selftests/bpf/generate_udp_fragments.py b/tools/testing/selftests/bpf/generate_udp_fragments.py new file mode 100755 index 000000000000..b7ee3f7b42b4 --- /dev/null +++ b/tools/testing/selftests/bpf/generate_udp_fragments.py @@ -0,0 +1,52 @@ +#!/bin/env python3 + +""" +This script helps generate fragmented UDP packets. + +While it is technically possible to dynamically generate +fragmented packets in C, it is much harder to read and write +said code. `scapy` is relatively industry standard and really +easy to read / write. + +So we choose to write this script that generates valid C code.
+""" + +import argparse +from scapy.all import * + +def print_frags(frags): + for idx, frag in enumerate(frags): + # 10 bytes per line to keep width in check + chunks = [frag[i: i+10] for i in range(0, len(frag), 10)] + chunks_fmted = [", ".join([str(hex(b)) for b in chunk]) for chunk in chunks] + + print(f"static uint8_t frag{idx}[] = {{") + for chunk in chunks_fmted: + print(f"\t{chunk},") + print(f"}};") + + +def main(args): + # srcip of 0 is filled in by IP_HDRINCL + sip = "0.0.0.0" + dip = args.dst_ip + sport = args.src_port + dport = args.dst_port + payload = args.payload.encode() + + # Disable UDP checksums to keep code simpler + pkt = IP(src=sip,dst=dip) / UDP(sport=sport,dport=dport,chksum=0) / Raw(load=payload) + + frags = [f.build() for f in pkt.fragment(24)] + print_frags(frags) + + +if __name__ == "__main__": + parser = argparse.ArgumentParser() + parser.add_argument("dst_ip") + parser.add_argument("src_port", type=int) + parser.add_argument("dst_port", type=int) + parser.add_argument("payload") + args = parser.parse_args() + + main(args) diff --git a/tools/testing/selftests/bpf/prog_tests/ip_check_defrag.c b/tools/testing/selftests/bpf/prog_tests/ip_check_defrag.c new file mode 100644 index 000000000000..ed078e8265de --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/ip_check_defrag.c @@ -0,0 +1,296 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include "ip_check_defrag.skel.h" + +/* + * This selftest spins up a client and an echo server, each in their own + * network namespace. The server will receive fragmented messages which + * the attached BPF prog should reassemble. We verify that reassembly + * occurred by checking the original (fragmented) message is received + * in whole. 
+ * + * Topology: + * ========= + * NS0 | NS1 + * | + * client | server + * ---------- | ---------- + * | veth0 | --------- | veth1 | + * ---------- peer ---------- + * | + * | with bpf + */ + +#define NS0 "defrag_ns0" +#define NS1 "defrag_ns1" +#define VETH0 "veth0" +#define VETH1 "veth1" +#define VETH0_ADDR "172.16.1.100" +#define VETH1_ADDR "172.16.1.200" +#define CLIENT_PORT 48878 +#define SERVER_PORT 48879 +#define MAGIC_MESSAGE "THIS IS THE ORIGINAL MESSAGE, PLEASE REASSEMBLE ME" + +static char log_buf[1024 * 1024]; + +#define SYS(fmt, ...) \ + ({ \ + char cmd[1024]; \ + snprintf(cmd, sizeof(cmd), fmt, ##__VA_ARGS__); \ + if (!ASSERT_OK(system(cmd), cmd)) \ + goto fail; \ + }) + +#define SYS_NOFAIL(fmt, ...) \ + ({ \ + char cmd[1024]; \ + snprintf(cmd, sizeof(cmd), fmt, ##__VA_ARGS__); \ + system(cmd); \ + }) + +/* + * The following fragments are generated with this script invocation: + * + * ./generate_udp_fragments.py $VETH1_ADDR $CLIENT_PORT $SERVER_PORT $MAGIC_MESSAGE + * + * where the `$` indicates replacement with preprocessor macro.
+ */ +static uint8_t frag0[] = { + 0x45, 0x0, 0x0, 0x2c, 0x0, 0x1, 0x20, 0x0, 0x40, 0x11, + 0xac, 0xe8, 0x0, 0x0, 0x0, 0x0, 0xac, 0x10, 0x1, 0xc8, + 0xbe, 0xee, 0xbe, 0xef, 0x0, 0x3a, 0x0, 0x0, 0x54, 0x48, + 0x49, 0x53, 0x20, 0x49, 0x53, 0x20, 0x54, 0x48, 0x45, 0x20, + 0x4f, 0x52, 0x49, 0x47, +}; +static uint8_t frag1[] = { + 0x45, 0x0, 0x0, 0x2c, 0x0, 0x1, 0x20, 0x3, 0x40, 0x11, + 0xac, 0xe5, 0x0, 0x0, 0x0, 0x0, 0xac, 0x10, 0x1, 0xc8, + 0x49, 0x4e, 0x41, 0x4c, 0x20, 0x4d, 0x45, 0x53, 0x53, 0x41, + 0x47, 0x45, 0x2c, 0x20, 0x50, 0x4c, 0x45, 0x41, 0x53, 0x45, + 0x20, 0x52, 0x45, 0x41, +}; +static uint8_t frag2[] = { + 0x45, 0x0, 0x0, 0x1e, 0x0, 0x1, 0x0, 0x6, 0x40, 0x11, + 0xcc, 0xf0, 0x0, 0x0, 0x0, 0x0, 0xac, 0x10, 0x1, 0xc8, + 0x53, 0x53, 0x45, 0x4d, 0x42, 0x4c, 0x45, 0x20, 0x4d, 0x45, +}; + +static int setup_topology(void) +{ + SYS("ip netns add " NS0); + SYS("ip netns add " NS1); + SYS("ip link add " VETH0 " netns " NS0 " type veth peer name " VETH1 " netns " NS1); + SYS("ip -net " NS0 " addr add " VETH0_ADDR "/24 dev " VETH0); + SYS("ip -net " NS0 " link set dev " VETH0 " up"); + SYS("ip -net " NS1 " addr add " VETH1_ADDR "/24 dev " VETH1); + SYS("ip -net " NS1 " link set dev " VETH1 " up"); + + return 0; +fail: + return -1; +} + +static void cleanup_topology(void) +{ + SYS_NOFAIL("test -f /var/run/netns/" NS0 " && ip netns delete " NS0); + SYS_NOFAIL("test -f /var/run/netns/" NS1 " && ip netns delete " NS1); +} + +static int attach(struct ip_check_defrag *skel) +{ + LIBBPF_OPTS(bpf_tc_hook, tc_hook, + .attach_point = BPF_TC_INGRESS); + LIBBPF_OPTS(bpf_tc_opts, tc_attach, + .prog_fd = bpf_program__fd(skel->progs.defrag)); + struct nstoken *nstoken; + int err = -1; + + nstoken = open_netns(NS1); + + tc_hook.ifindex = if_nametoindex(VETH1); + if (!ASSERT_OK(bpf_tc_hook_create(&tc_hook), "bpf_tc_hook_create")) + goto out; + + if (!ASSERT_OK(bpf_tc_attach(&tc_hook, &tc_attach), "bpf_tc_attach")) + goto out; + + err = 0; +out: + close_netns(nstoken); + return err; +} 
+ +static int send_frags(int client) +{ + struct sockaddr_storage saddr; + struct sockaddr *saddr_p; + socklen_t saddr_len; + int err; + + saddr_p = (struct sockaddr*)&saddr; + err = make_sockaddr(AF_INET, VETH1_ADDR, SERVER_PORT, &saddr, &saddr_len); + if (!ASSERT_OK(err, "make_sockaddr")) + return -1; + + err = sendto(client, frag0, sizeof(frag0), 0, saddr_p, saddr_len); + if (!ASSERT_GE(err, 0, "sendto frag0")) + return -1; + + err = sendto(client, frag1, sizeof(frag1), 0, saddr_p, saddr_len); + if (!ASSERT_GE(err, 0, "sendto frag1")) + return -1; + + err = sendto(client, frag2, sizeof(frag2), 0, saddr_p, saddr_len); + if (!ASSERT_GE(err, 0, "sendto frag2")) + return -1; + + return 0; +} + +void test_bpf_ip_check_defrag_ok(void) +{ + struct network_helper_opts rx_opts = { + .timeout_ms = 1000, + .noconnect = true, + }; + struct network_helper_opts tx_ops = { + .timeout_ms = 1000, + .type = SOCK_RAW, + .proto = IPPROTO_RAW, + .noconnect = true, + }; + struct ip_check_defrag *skel; + struct sockaddr_in caddr; + struct nstoken *nstoken; + int client_tx_fd = -1; + int client_rx_fd = -1; + socklen_t caddr_len = sizeof(caddr); + int srv_fd = -1; + char buf[1024]; + int len, err; + + skel = ip_check_defrag__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + return; + + if (!ASSERT_OK(setup_topology(), "setup_topology")) + goto out; + + if (!ASSERT_OK(attach(skel), "attach")) + goto out; + + /* Start server in ns1 */ + nstoken = open_netns(NS1); + if (!ASSERT_OK_PTR(nstoken, "setns ns1")) + goto out; + srv_fd = start_server(AF_INET, SOCK_DGRAM, NULL, SERVER_PORT, 0); + close_netns(nstoken); + if (!ASSERT_GE(srv_fd, 0, "start_server")) + goto out; + + /* Open tx raw socket in ns0 */ + nstoken = open_netns(NS0); + if (!ASSERT_OK_PTR(nstoken, "setns ns0")) + goto out; + client_tx_fd = connect_to_fd_opts(srv_fd, &tx_ops); + close_netns(nstoken); + if (!ASSERT_GE(client_tx_fd, 0, "connect_to_fd_opts")) + goto out; + + /* Open rx socket in ns0 */ + nstoken = open_netns(NS0); + if 
(!ASSERT_OK_PTR(nstoken, "setns ns0")) + goto out; + client_rx_fd = connect_to_fd_opts(srv_fd, &rx_opts); + close_netns(nstoken); + if (!ASSERT_GE(client_rx_fd, 0, "connect_to_fd_opts")) + goto out; + + /* Bind rx socket to a premeditated port */ + memset(&caddr, 0, sizeof(caddr)); + caddr.sin_family = AF_INET; + inet_pton(AF_INET, VETH0_ADDR, &caddr.sin_addr); + caddr.sin_port = htons(CLIENT_PORT); + nstoken = open_netns(NS0); + err = bind(client_rx_fd, (struct sockaddr *)&caddr, sizeof(caddr)); + close_netns(nstoken); + if (!ASSERT_OK(err, "bind")) + goto out; + + /* Send message in fragments */ + if (!ASSERT_OK(send_frags(client_tx_fd), "send_frags")) + goto out; + + if (!ASSERT_EQ(skel->bss->frags_seen, 3, "frags_seen")) + goto out; + + if (!ASSERT_FALSE(skel->data->is_final_frag, "is_final_frag")) + goto out; + + /* Receive reassembled msg on server and echo back to client */ + len = recvfrom(srv_fd, buf, sizeof(buf), 0, (struct sockaddr *)&caddr, &caddr_len); + if (!ASSERT_GE(len, 0, "server recvfrom")) + goto out; + len = sendto(srv_fd, buf, len, 0, (struct sockaddr *)&caddr, caddr_len); + if (!ASSERT_GE(len, 0, "server sendto")) + goto out; + + /* Expect reassembled message to be echoed back */ + len = recvfrom(client_rx_fd, buf, sizeof(buf), 0, NULL, NULL); + if (!ASSERT_EQ(len, sizeof(MAGIC_MESSAGE) - 1, "client short read")) + goto out; + +out: + if (client_rx_fd != -1) + close(client_rx_fd); + if (client_tx_fd != -1) + close(client_tx_fd); + if (srv_fd != -1) + close(srv_fd); + cleanup_topology(); + ip_check_defrag__destroy(skel); +} + +void test_bpf_ip_check_defrag_fail(void) +{ + const char *err_msg = "invalid mem access 'scalar'"; + LIBBPF_OPTS(bpf_object_open_opts, opts, + .kernel_log_buf = log_buf, + .kernel_log_size = sizeof(log_buf), + .kernel_log_level = 1); + struct ip_check_defrag *skel; + struct bpf_program *prog; + int err; + + skel = ip_check_defrag__open_opts(&opts); + if (!ASSERT_OK_PTR(skel, "ip_check_defrag__open_opts")) + return; + + 
prog = bpf_object__find_program_by_name(skel->obj, "defrag_fail"); + if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name")) + goto out; + + bpf_program__set_autoload(prog, true); + + err = ip_check_defrag__load(skel); + if (!ASSERT_ERR(err, "ip_check_defrag__load must fail")) + goto out; + + if (!ASSERT_OK_PTR(strstr(log_buf, err_msg), "expected error message")) { + fprintf(stderr, "Expected: %s\n", err_msg); + fprintf(stderr, "Verifier: %s\n", log_buf); + } + +out: + ip_check_defrag__destroy(skel); +} + +void test_bpf_ip_check_defrag(void) +{ + if (test__start_subtest("ok")) + test_bpf_ip_check_defrag_ok(); + if (test__start_subtest("fail")) + test_bpf_ip_check_defrag_fail(); +} diff --git a/tools/testing/selftests/bpf/progs/bpf_tracing_net.h b/tools/testing/selftests/bpf/progs/bpf_tracing_net.h index b394817126cf..a1d6cc1f2ef8 100644 --- a/tools/testing/selftests/bpf/progs/bpf_tracing_net.h +++ b/tools/testing/selftests/bpf/progs/bpf_tracing_net.h @@ -26,6 +26,7 @@ #define IPV6_AUTOFLOWLABEL 70 #define TC_ACT_UNSPEC (-1) +#define TC_ACT_OK 0 #define TC_ACT_SHOT 2 #define SOL_TCP 6 diff --git a/tools/testing/selftests/bpf/progs/ip_check_defrag.c b/tools/testing/selftests/bpf/progs/ip_check_defrag.c new file mode 100644 index 000000000000..71300b77a43f --- /dev/null +++ b/tools/testing/selftests/bpf/progs/ip_check_defrag.c @@ -0,0 +1,83 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include "vmlinux.h" +#include +#include +#include "bpf_tracing_net.h" + +#define ETH_P_IP 0x0800 +#define IP_DF 0x4000 +#define IP_MF 0x2000 +#define IP_OFFSET 0x1FFF +#define ctx_ptr(field) (void *)(long)(field) + +int bpf_ip_check_defrag(struct __sk_buff *ctx, u64 netns) __ksym; + +volatile int frags_seen = 0; +volatile bool is_final_frag = true; + +static inline bool is_frag(struct iphdr *iph) +{ + int offset; + int flags; + + offset = bpf_ntohs(iph->frag_off); + flags = offset & ~IP_OFFSET; + offset &= IP_OFFSET; + offset <<= 3; + + return (flags & IP_MF) || offset; +} + 
+SEC("tc") +int defrag(struct __sk_buff *skb) +{ + void *data_end = ctx_ptr(skb->data_end); + void *data = ctx_ptr(skb->data); + struct iphdr *iph; + + if (skb->protocol != bpf_htons(ETH_P_IP)) + return TC_ACT_OK; + + iph = data + sizeof(struct ethhdr); + if (iph + 1 > data_end) + return TC_ACT_SHOT; + + if (!is_frag(iph)) + return TC_ACT_OK; + + frags_seen++; + if (bpf_ip_check_defrag(skb, BPF_F_CURRENT_NETNS)) + return TC_ACT_SHOT; + + data_end = ctx_ptr(skb->data_end); + data = ctx_ptr(skb->data); + iph = data + sizeof(struct ethhdr); + if (iph + 1 > data_end) + return TC_ACT_SHOT; + is_final_frag = is_frag(iph); + + return TC_ACT_OK; +} + +SEC("?tc") +int defrag_fail(struct __sk_buff *skb) +{ + void *data_end = ctx_ptr(skb->data_end); + void *data = ctx_ptr(skb->data); + struct iphdr *iph; + + if (skb->protocol != bpf_htons(ETH_P_IP)) + return TC_ACT_OK; + + iph = data + sizeof(struct ethhdr); + if (iph + 1 > data_end) + return TC_ACT_SHOT; + + if (bpf_ip_check_defrag(skb, BPF_F_CURRENT_NETNS)) + return TC_ACT_SHOT; + + /* Boom. Must revalidate pkt ptrs */ + return iph->ttl ? TC_ACT_OK : TC_ACT_SHOT; +} + +char _license[] SEC("license") = "GPL";