From patchwork Thu Jul 6 23:25:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Xu X-Patchwork-Id: 13304299 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DD22063BA; Thu, 6 Jul 2023 23:26:21 +0000 (UTC) Received: from out3-smtp.messagingengine.com (out3-smtp.messagingengine.com [66.111.4.27]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 51F811BC3; Thu, 6 Jul 2023 16:26:20 -0700 (PDT) Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailout.nyi.internal (Postfix) with ESMTP id 732195C00C8; Thu, 6 Jul 2023 19:26:18 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute5.internal (MEProxy); Thu, 06 Jul 2023 19:26:18 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dxuuu.xyz; h=cc :cc:content-transfer-encoding:content-type:date:date:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:sender:subject:subject:to:to; s=fm1; t=1688685978; x= 1688772378; bh=50Mq7yDH4SaalwjOr9VWbgSV3HaqQVn5wwG7OgrzEBE=; b=i MmC7YyfNAqXjJdbhv9QlzlkYkxFoTRyeKZBYp77nwt/S6s3qqC9kHyn78vNhSDVd c41e1H2KeMcvcYxUi6JneGQRBjk5H4y2DxH9SdxQqBjA3gv9RzxbeNHlOUhXr/0m 37CYrGVXvVQCJhbLotc0TBfclhde8hLFdjJQ1QD/8xQvb22X4T+nQZVt+OQrbmzS Exwy0E57rSuZdd6aYn7y3euafKy6aylOCB0ZQtdvQjcK1ftZBzegpYRXvndmdUdm 0ELoYwCVhGq9Cytz4N2eaAZpdYkhLfYMnXXh0bJtnZnmcbFT5bi8sv/WQgWrch9/ LFHt4st8pUW55RFmIh8Sw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:date:date:feedback-id:feedback-id:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:sender:subject:subject:to:to:x-me-proxy:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t=1688685978; x= 1688772378; bh=50Mq7yDH4SaalwjOr9VWbgSV3HaqQVn5wwG7OgrzEBE=; b=I yEO8iEM+w1uIEu2/YEr9VFqi4sYs4mbDj4ZZfHaiEmEkjdoDvq2euyIEActD2KVS Wfu2MbTzYFzygYgPX1rZJ8+Mfo4fZYOB/sYbppMq/mVqVT5RMhmqyTQ6sqhHwcQi XrNCWQEGRlNg8+FGpz9P2q3gZ0iDZ1box8M7/tMwB10svu2S4M1Ai6gA7TLXDpjq SX9HDGX/HWtAx2Mcd8Oq1zw1OCQ/AAQ7+VwXVizPqHedai+LFgO3cJPvSumdmPB9 /KNgepLZkRSLVp+wDz0qH7XH+KnWM6LeCWBXMhfQtsZaO4JVuhTw0phzJtSUhprj 5iq7kMnu2hJ8HLPgVwFsg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedviedrvddtgddvudcutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecufghrlhcuvffnffculdejtddmnecujfgurhephffvve fufffkofgjfhgggfestdekredtredttdenucfhrhhomhepffgrnhhivghlucgiuhcuoegu gihusegugihuuhhurdighiiiqeenucggtffrrghtthgvrhhnpefgfefggeejhfduieekvd euteffleeifeeuvdfhheejleejjeekgfffgefhtddtteenucevlhhushhtvghrufhiiigv pedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpegugihusegugihuuhhurdighiii X-ME-Proxy: Feedback-ID: i6a694271:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Thu, 6 Jul 2023 19:26:17 -0400 (EDT) From: Daniel Xu To: pabeni@redhat.com, edumazet@google.com, kuba@kernel.org, fw@strlen.de, davem@davemloft.net, pablo@netfilter.org, kadlec@netfilter.org, dsahern@kernel.org, daniel@iogearbox.net Cc: netfilter-devel@vger.kernel.org, coreteam@netfilter.org, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org Subject: [PATCH bpf-next v2 1/6] netfilter: defrag: Add glue hooks for enabling/disabling defrag Date: Thu, 6 Jul 2023 17:25:48 -0600 Message-ID: <20411df83bf659ac202af13923d5a6d82bfd4087.1688685338.git.dxu@dxuuu.xyz> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW,SPF_HELO_PASS, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: bpf@iogearbox.net We want to be able to enable/disable IP packet defrag from core bpf/netfilter code. In other words, execute code from core that could possibly be built as a module. To help avoid symbol resolution errors, use glue hooks that the modules will register callbacks with during module init. Signed-off-by: Daniel Xu --- include/linux/netfilter.h | 12 ++++++++++++ net/ipv4/netfilter/nf_defrag_ipv4.c | 16 +++++++++++++++- net/ipv6/netfilter/nf_defrag_ipv6_hooks.c | 10 ++++++++++ net/netfilter/core.c | 6 ++++++ 4 files changed, 43 insertions(+), 1 deletion(-) diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h index d4fed4c508ca..77a637b681f2 100644 --- a/include/linux/netfilter.h +++ b/include/linux/netfilter.h @@ -481,6 +481,18 @@ struct nfnl_ct_hook { }; extern const struct nfnl_ct_hook __rcu *nfnl_ct_hook; +struct nf_defrag_v4_hook { + int (*enable)(struct net *net); + void (*disable)(struct net *net); +}; +extern const struct nf_defrag_v4_hook __rcu *nf_defrag_v4_hook; + +struct nf_defrag_v6_hook { + int (*enable)(struct net *net); + void (*disable)(struct net *net); +}; +extern const struct nf_defrag_v6_hook __rcu *nf_defrag_v6_hook; + /* * nf_skb_duplicated - TEE target has sent a packet * diff --git a/net/ipv4/netfilter/nf_defrag_ipv4.c b/net/ipv4/netfilter/nf_defrag_ipv4.c index e61ea428ea18..1f3e0e893b7a 100644 --- a/net/ipv4/netfilter/nf_defrag_ipv4.c +++ b/net/ipv4/netfilter/nf_defrag_ipv4.c @@ -7,6 +7,7 @@ #include #include #include +#include #include #include #include @@ -113,17 +114,30 @@ static void __net_exit defrag4_net_exit(struct net *net) } } +static const struct nf_defrag_v4_hook defrag_hook = { + .enable = nf_defrag_ipv4_enable, + .disable = nf_defrag_ipv4_disable, +}; + static struct pernet_operations defrag4_net_ops = { .exit = defrag4_net_exit, }; static int __init nf_defrag_init(void) { - return register_pernet_subsys(&defrag4_net_ops); + int err; + + err = register_pernet_subsys(&defrag4_net_ops); + if (err) + return err; + + rcu_assign_pointer(nf_defrag_v4_hook, &defrag_hook); + return err; } static void __exit nf_defrag_fini(void) { + rcu_assign_pointer(nf_defrag_v4_hook, NULL); unregister_pernet_subsys(&defrag4_net_ops); } diff --git a/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c b/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c index cb4eb1d2c620..f7c7ee31c472 100644 --- a/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c +++ b/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c @@ -10,6 +10,7 @@ #include #include #include +#include #include #include @@ -96,6 +97,11 @@ static void __net_exit defrag6_net_exit(struct net *net) } } +static const struct nf_defrag_v6_hook defrag_hook = { + .enable = nf_defrag_ipv6_enable, + .disable = nf_defrag_ipv6_disable, +}; + static struct pernet_operations defrag6_net_ops = { .exit = defrag6_net_exit, }; @@ -114,6 +120,9 @@ static int __init nf_defrag_init(void) pr_err("nf_defrag_ipv6: can't register pernet ops\n"); goto cleanup_frag6; } + + rcu_assign_pointer(nf_defrag_v6_hook, &defrag_hook); + return ret; cleanup_frag6: @@ -124,6 +133,7 @@ static int __init nf_defrag_init(void) static void __exit nf_defrag_fini(void) { + rcu_assign_pointer(nf_defrag_v6_hook, NULL); unregister_pernet_subsys(&defrag6_net_ops); nf_ct_frag6_cleanup(); } diff --git a/net/netfilter/core.c b/net/netfilter/core.c index 5f76ae86a656..34845155bb85 100644 --- a/net/netfilter/core.c +++ b/net/netfilter/core.c @@ -680,6 +680,12 @@ EXPORT_SYMBOL_GPL(nfnl_ct_hook); const struct nf_ct_hook __rcu *nf_ct_hook __read_mostly; EXPORT_SYMBOL_GPL(nf_ct_hook); +const struct nf_defrag_v4_hook __rcu *nf_defrag_v4_hook __read_mostly; +EXPORT_SYMBOL_GPL(nf_defrag_v4_hook); + +const struct nf_defrag_v6_hook __rcu *nf_defrag_v6_hook __read_mostly; +EXPORT_SYMBOL_GPL(nf_defrag_v6_hook); + #if IS_ENABLED(CONFIG_NF_CONNTRACK) u8 nf_ctnetlink_has_listener; EXPORT_SYMBOL_GPL(nf_ctnetlink_has_listener); From patchwork Thu Jul 6 23:25:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Xu X-Patchwork-Id: 13304306 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EB68FD2E9; Thu, 6 Jul 2023 23:35:53 +0000 (UTC) X-Greylist: delayed 569 seconds by postgrey-1.37 at lindbergh.monkeyblade.net; Thu, 06 Jul 2023 16:35:51 PDT Received: from new2-smtp.messagingengine.com (new2-smtp.messagingengine.com [66.111.4.224]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BE4251BC3; Thu, 6 Jul 2023 16:35:51 -0700 (PDT) Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailnew.nyi.internal (Postfix) with ESMTP id D3AC45802E5; Thu, 6 Jul 2023 19:26:20 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Thu, 06 Jul 2023 19:26:20 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dxuuu.xyz; h=cc :cc:content-transfer-encoding:content-type:date:date:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:sender:subject:subject:to:to; s=fm1; t=1688685980; x= 1688693180; bh=E1jtUQpyvb6NHwarldVNq8v8v/YlodFgEoqeEX93HuI=; b=R MjG+66Xg6VCZAJK4n3j9i0XltlhbyrQhNBnTT42GY8+bOQPBZY7zBBdtSlhvbQIU LEb6ilrGbdhiVQdBnhJ3E4l0s+mXvpkZRHCxQ6eIiuQh7kQwOW8XlfMWb2ikGuLl u7W1bkOJ1gqf3/NoG7akwhcRosOqCt4YavBgvfx/bmt5Iz5yGuxfVFeDedfO7pVF 6IHbsRYxJ5uJVZ0oSuLMBs1XM1LsY+gSkiCaG35STSE5+ngSwTvZWU9dO417noTC SeJhttYewAmZJ5ddgxYinc/45sHzQErULcQ5BUnBAQMhGrguAc7uNososSe9f92V lVvhSBMdsdVDbhEiSWS7Q== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:date:date:feedback-id:feedback-id:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:sender:subject:subject:to:to:x-me-proxy:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t=1688685980; x= 1688693180; bh=E1jtUQpyvb6NHwarldVNq8v8v/YlodFgEoqeEX93HuI=; b=Z 6Gvcy1mLAKlBLstKK1Bh3f2kRf+cHwpVUuQo84cGjsM/igpjJpwpET5gxkOQiWBZ S03I/MMssyABDccrumd3OPF95Yf4yOsvfpqhQ3AET6cK7B2o7t7bW5ZnrZGM94eq ftjKEdy7GRHlohE6OM37WjcadVQiuWdC0QHSgfMnjS2bOMrJwuAZ6aDIIxVNtUad 28jZdvX+TICUbBJ1/0MrlA6u2UCCC1YBXqrtPEK/iHa54sVFN4Qzz2gfQx7rOfJd Gm9gFdDR51d4Sr+gJNGUcufrOLyoOq2sy6RWXIk6MNwwMh64JkxMhluhPjcg9voX F6XhH5cNh8zsnKEGJlZ0g== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedviedrvddtgddvudcutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecufghrlhcuvffnffculdejtddmnecujfgurhephffvve fufffkofgjfhgggfestdekredtredttdenucfhrhhomhepffgrnhhivghlucgiuhcuoegu gihusegugihuuhhurdighiiiqeenucggtffrrghtthgvrhhnpeejgeevleejhedvveduud ffffelleevjeeugeejvdehvdehvdehtefhjeegtdeiieenucffohhmrghinhepnhgvthhf ihhlthgvrhdrphhfnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilh hfrhhomhepugiguhesugiguhhuuhdrgiihii X-ME-Proxy: Feedback-ID: i6a694271:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Thu, 6 Jul 2023 19:26:18 -0400 (EDT) From: Daniel Xu To: pabeni@redhat.com, edumazet@google.com, kuba@kernel.org, fw@strlen.de, davem@davemloft.net, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, pablo@netfilter.org, kadlec@netfilter.org Cc: martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, netfilter-devel@vger.kernel.org, coreteam@netfilter.org, netdev@vger.kernel.org, dsahern@kernel.org Subject: [PATCH bpf-next v2 2/6] netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link Date: Thu, 6 Jul 2023 17:25:49 -0600 Message-ID: X-Mailer: git-send-email 2.41.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW,SPF_HELO_PASS, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: bpf@iogearbox.net This commit adds support for enabling IP defrag using pre-existing netfilter defrag support. Basically all the flag does is bump a refcnt while the link the active. Checks are also added to ensure the prog requesting defrag support is run _after_ netfilter defrag hooks. Signed-off-by: Daniel Xu --- include/uapi/linux/bpf.h | 5 ++ net/netfilter/nf_bpf_link.c | 132 ++++++++++++++++++++++++++++++--- tools/include/uapi/linux/bpf.h | 5 ++ 3 files changed, 131 insertions(+), 11 deletions(-) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 60a9d59beeab..04ac77481583 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -1170,6 +1170,11 @@ enum bpf_link_type { */ #define BPF_F_KPROBE_MULTI_RETURN (1U << 0) +/* link_create.netfilter.flags used in LINK_CREATE command for + * BPF_PROG_TYPE_NETFILTER to enable IP packet defragmentation. + */ +#define BPF_F_NETFILTER_IP_DEFRAG (1U << 0) + /* When BPF ldimm64's insn[0].src_reg != 0 then this can have * the following extensions: * diff --git a/net/netfilter/nf_bpf_link.c b/net/netfilter/nf_bpf_link.c index c36da56d756f..765612cb2370 100644 --- a/net/netfilter/nf_bpf_link.c +++ b/net/netfilter/nf_bpf_link.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 #include #include +#include #include #include @@ -23,8 +24,101 @@ struct bpf_nf_link { struct nf_hook_ops hook_ops; struct net *net; u32 dead; + bool defrag; }; +static int bpf_nf_enable_defrag(struct bpf_nf_link *link) +{ + int err; + + switch (link->hook_ops.pf) { +#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV4) + case NFPROTO_IPV4: + const struct nf_defrag_v4_hook *v4_hook; + + rcu_read_lock(); + v4_hook = rcu_dereference(nf_defrag_v4_hook); + if (!v4_hook) { + rcu_read_unlock(); + err = request_module("nf_defrag_ipv4"); + if (err) + return err < 0 ? err : -EINVAL; + + rcu_read_lock(); + v4_hook = rcu_dereference(nf_defrag_v4_hook); + if (!v4_hook) { + WARN_ONCE(1, "nf_defrag_ipv4 bad registration"); + err = -ENOENT; + goto out_v4; + } + } + + err = v4_hook->enable(link->net); +out_v4: + rcu_read_unlock(); + return err; +#endif +#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6) + case NFPROTO_IPV6: + const struct nf_defrag_v6_hook *v6_hook; + + rcu_read_lock(); + v6_hook = rcu_dereference(nf_defrag_v6_hook); + if (!v6_hook) { + rcu_read_unlock(); + err = request_module("nf_defrag_ipv6"); + if (err) + return err < 0 ? err : -EINVAL; + + rcu_read_lock(); + v6_hook = rcu_dereference(nf_defrag_v6_hook); + if (!v6_hook) { + WARN_ONCE(1, "nf_defrag_ipv6_hooks bad registration"); + err = -ENOENT; + goto out_v6; + } + } + + err = v6_hook->enable(link->net); +out_v6: + rcu_read_unlock(); + return err; +#endif + default: + return -EAFNOSUPPORT; + } +} + +static void bpf_nf_disable_defrag(struct bpf_nf_link *link) +{ + switch (link->hook_ops.pf) { +#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV4) + case NFPROTO_IPV4: + const struct nf_defrag_v4_hook *v4_hook; + + rcu_read_lock(); + v4_hook = rcu_dereference(nf_defrag_v4_hook); + if (v4_hook) + v4_hook->disable(link->net); + rcu_read_unlock(); + + break; +#endif +#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6) + case NFPROTO_IPV6: + const struct nf_defrag_v6_hook *v6_hook; + + rcu_read_lock(); + v6_hook = rcu_dereference(nf_defrag_v6_hook); + if (v6_hook) + v6_hook->disable(link->net); + rcu_read_unlock(); + + break; + } +#endif +} + static void bpf_nf_link_release(struct bpf_link *link) { struct bpf_nf_link *nf_link = container_of(link, struct bpf_nf_link, link); @@ -37,6 +131,9 @@ static void bpf_nf_link_release(struct bpf_link *link) */ if (!cmpxchg(&nf_link->dead, 0, 1)) nf_unregister_net_hook(nf_link->net, &nf_link->hook_ops); + + if (nf_link->defrag) + bpf_nf_disable_defrag(nf_link); } static void bpf_nf_link_dealloc(struct bpf_link *link) @@ -92,6 +189,8 @@ static const struct bpf_link_ops bpf_nf_link_lops = { static int bpf_nf_check_pf_and_hooks(const union bpf_attr *attr) { + int prio; + switch (attr->link_create.netfilter.pf) { case NFPROTO_IPV4: case NFPROTO_IPV6: @@ -102,19 +201,18 @@ static int bpf_nf_check_pf_and_hooks(const union bpf_attr *attr) return -EAFNOSUPPORT; } - if (attr->link_create.netfilter.flags) + if (attr->link_create.netfilter.flags & ~BPF_F_NETFILTER_IP_DEFRAG) return -EOPNOTSUPP; - /* make sure conntrack confirm is always last. - * - * In the future, if userspace can e.g. request defrag, then - * "defrag_requested && prio before NF_IP_PRI_CONNTRACK_DEFRAG" - * should fail. - */ - switch (attr->link_create.netfilter.priority) { - case NF_IP_PRI_FIRST: return -ERANGE; /* sabotage_in and other warts */ - case NF_IP_PRI_LAST: return -ERANGE; /* e.g. conntrack confirm */ - } + /* make sure conntrack confirm is always last */ + prio = attr->link_create.netfilter.priority; + if (prio == NF_IP_PRI_FIRST) + return -ERANGE; /* sabotage_in and other warts */ + else if (prio == NF_IP_PRI_LAST) + return -ERANGE; /* e.g. conntrack confirm */ + else if ((attr->link_create.netfilter.flags & BPF_F_NETFILTER_IP_DEFRAG) && + prio <= NF_IP_PRI_CONNTRACK_DEFRAG) + return -ERANGE; /* cannot use defrag if prog runs before nf_defrag */ return 0; } @@ -156,6 +254,18 @@ int bpf_nf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog) return err; } + if (attr->link_create.netfilter.flags & BPF_F_NETFILTER_IP_DEFRAG) { + err = bpf_nf_enable_defrag(link); + if (err) { + bpf_link_cleanup(&link_primer); + return err; + } + /* only mark defrag enabled if enabling succeeds so cleanup path + * doesn't disable without a corresponding enable + */ + link->defrag = true; + } + err = nf_register_net_hook(net, &link->hook_ops); if (err) { bpf_link_cleanup(&link_primer); diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 60a9d59beeab..04ac77481583 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -1170,6 +1170,11 @@ enum bpf_link_type { */ #define BPF_F_KPROBE_MULTI_RETURN (1U << 0) +/* link_create.netfilter.flags used in LINK_CREATE command for + * BPF_PROG_TYPE_NETFILTER to enable IP packet defragmentation. + */ +#define BPF_F_NETFILTER_IP_DEFRAG (1U << 0) + /* When BPF ldimm64's insn[0].src_reg != 0 then this can have * the following extensions: * From patchwork Thu Jul 6 23:25:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Xu X-Patchwork-Id: 13304300 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DC27B171A7; Thu, 6 Jul 2023 23:26:23 +0000 (UTC) Received: from out3-smtp.messagingengine.com (out3-smtp.messagingengine.com [66.111.4.27]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AC5981BC3; Thu, 6 Jul 2023 16:26:22 -0700 (PDT) Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailout.nyi.internal (Postfix) with ESMTP id 2901B5C00E1; Thu, 6 Jul 2023 19:26:22 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute5.internal (MEProxy); Thu, 06 Jul 2023 19:26:22 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dxuuu.xyz; h=cc :cc:content-transfer-encoding:content-type:date:date:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:sender:subject:subject:to:to; s=fm1; t=1688685982; x= 1688772382; bh=q+GZB03bv8FWk8Fi1TVDaCiqNoaHfJjJsc6EY7aLdqg=; b=d ajqFc2+WLI+82TWCjXDI3fUTiyA1e5YSIVr/195+w77LRdYDkSQg2IP8uvcqANBQ Bf99hzP2pDhGj0+NtdqC3xgMiFoTPSLDPSpe1+ZIfWuhVsXP5TNoVmCBsIkdaIBo y/aCpmPlftxp+Vg68nYAvAExJg6mEmuWKZg2H4tzDYHAlCU9HZjZl5K3YM+/UOz3 0EQzI2mT9rvzwqKpf5pjRYD7bv+t04AEoWizCZ/8WqN12/GGYUx0q3Ey81UHJ6ga UMiMcqEIRd9Nt4rSSzuqEtwmpP1EgBAuAo1d/cNr0tEGu5AD9519/8dodbPrpr8P deH5gpN3JF3fRLH5g3SmA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:date:date:feedback-id:feedback-id:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:sender:subject:subject:to:to:x-me-proxy:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t=1688685982; x= 1688772382; bh=q+GZB03bv8FWk8Fi1TVDaCiqNoaHfJjJsc6EY7aLdqg=; b=Z PRPVuy1Nnpk7552fuze8F7nRqZCIouBMxVE8pOBjAGtUcDDnb0bUJxw2DbPvx1OK JXwSPAOT00maCUBb+5gH3UKwQhsMrm4B9h7LpHJUsO38IrBN9bHBjKXaImqqciG+ LV0W8q8K4cgs+ewOKK+akonksj3bT6G+rwrrJmEbP3lbiOBoMbPgHDecdxvvgm9A n2sEapu2fAqC2TDbM8IjbjvDZkehGb10fOHFDt0C23sp0jPOaz6sINsYb/KUYAUk jiMCa698yq2I+FPl5DfjKC5VALPC6EyNrai5N7xzOx0jOfBEnSW0U4jCZsgR4Gtu x0ywPn1escick0aqFHetg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedviedrvddtgddvudcutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecufghrlhcuvffnffculdejtddmnecujfgurhephffvve fufffkofgjfhgggfestdekredtredttdenucfhrhhomhepffgrnhhivghlucgiuhcuoegu gihusegugihuuhhurdighiiiqeenucggtffrrghtthgvrhhnpefgfefggeejhfduieekvd euteffleeifeeuvdfhheejleejjeekgfffgefhtddtteenucevlhhushhtvghrufhiiigv pedunecurfgrrhgrmhepmhgrihhlfhhrohhmpegugihusegugihuuhhurdighiii X-ME-Proxy: Feedback-ID: i6a694271:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Thu, 6 Jul 2023 19:26:20 -0400 (EDT) From: Daniel Xu To: pabeni@redhat.com, edumazet@google.com, kuba@kernel.org, fw@strlen.de, davem@davemloft.net, pablo@netfilter.org, kadlec@netfilter.org, dsahern@kernel.org, daniel@iogearbox.net Cc: netfilter-devel@vger.kernel.org, coreteam@netfilter.org, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org Subject: [PATCH bpf-next v2 3/6] netfilter: bpf: Prevent defrag module unload while link active Date: Thu, 6 Jul 2023 17:25:50 -0600 Message-ID: <81ede90e3f1468763ea5b0b6ec2971b7b1b870c1.1688685338.git.dxu@dxuuu.xyz> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW,SPF_HELO_PASS, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: bpf@iogearbox.net While in practice we could handle the module being unloaded while a netfilter link (that requested defrag) was active, it's a better user experience to prevent the defrag module from going away. It would violate user expectations if fragmented packets started showing up if some other part of the system tried to unload defrag module. Signed-off-by: Daniel Xu --- include/linux/netfilter.h | 3 +++ net/ipv4/netfilter/nf_defrag_ipv4.c | 1 + net/ipv6/netfilter/nf_defrag_ipv6_hooks.c | 1 + net/netfilter/nf_bpf_link.c | 21 +++++++++++++++++++-- 4 files changed, 24 insertions(+), 2 deletions(-) diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h index 77a637b681f2..a160dc1e23bf 100644 --- a/include/linux/netfilter.h +++ b/include/linux/netfilter.h @@ -11,6 +11,7 @@ #include #include #include +#include #include #include #include @@ -482,12 +483,14 @@ struct nfnl_ct_hook { extern const struct nfnl_ct_hook __rcu *nfnl_ct_hook; struct nf_defrag_v4_hook { + struct module *owner; int (*enable)(struct net *net); void (*disable)(struct net *net); }; extern const struct nf_defrag_v4_hook __rcu *nf_defrag_v4_hook; struct nf_defrag_v6_hook { + struct module *owner; int (*enable)(struct net *net); void (*disable)(struct net *net); }; diff --git a/net/ipv4/netfilter/nf_defrag_ipv4.c b/net/ipv4/netfilter/nf_defrag_ipv4.c index 1f3e0e893b7a..fb133bf3131d 100644 --- a/net/ipv4/netfilter/nf_defrag_ipv4.c +++ b/net/ipv4/netfilter/nf_defrag_ipv4.c @@ -115,6 +115,7 @@ static void __net_exit defrag4_net_exit(struct net *net) } static const struct nf_defrag_v4_hook defrag_hook = { + .owner = THIS_MODULE, .enable = nf_defrag_ipv4_enable, .disable = nf_defrag_ipv4_disable, }; diff --git a/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c b/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c index f7c7ee31c472..29d31721c9c0 100644 --- a/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c +++ b/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c @@ -98,6 +98,7 @@ static void __net_exit defrag6_net_exit(struct net *net) } static const struct nf_defrag_v6_hook defrag_hook = { + .owner = THIS_MODULE, .enable = nf_defrag_ipv6_enable, .disable = nf_defrag_ipv6_disable, }; diff --git a/net/netfilter/nf_bpf_link.c b/net/netfilter/nf_bpf_link.c index 765612cb2370..8ce34696939e 100644 --- a/net/netfilter/nf_bpf_link.c +++ b/net/netfilter/nf_bpf_link.c @@ -2,6 +2,7 @@ #include #include #include +#include #include #include @@ -53,6 +54,12 @@ static int bpf_nf_enable_defrag(struct bpf_nf_link *link) } } + /* Prevent defrag module from going away while in use */ + if (!try_module_get(v4_hook->owner)) { + err = -ENOENT; + goto out_v4; + } + err = v4_hook->enable(link->net); out_v4: rcu_read_unlock(); @@ -79,6 +86,12 @@ static int bpf_nf_enable_defrag(struct bpf_nf_link *link) } } + /* Prevent defrag module from going away while in use */ + if (!try_module_get(v6_hook->owner)) { + err = -ENOENT; + goto out_v6; + } + err = v6_hook->enable(link->net); out_v6: rcu_read_unlock(); @@ -98,8 +111,10 @@ static void bpf_nf_disable_defrag(struct bpf_nf_link *link) rcu_read_lock(); v4_hook = rcu_dereference(nf_defrag_v4_hook); - if (v4_hook) + if (v4_hook) { v4_hook->disable(link->net); + module_put(v4_hook->owner); + } rcu_read_unlock(); break; @@ -110,8 +125,10 @@ static void bpf_nf_disable_defrag(struct bpf_nf_link *link) rcu_read_lock(); v6_hook = rcu_dereference(nf_defrag_v6_hook); - if (v6_hook) + if (v6_hook) { v6_hook->disable(link->net); + module_put(v6_hook->owner); + } rcu_read_unlock(); break; From patchwork Thu Jul 6 23:25:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Xu X-Patchwork-Id: 13304301 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4B30F171A7 for ; Thu, 6 Jul 2023 23:26:25 +0000 (UTC) Received: from out3-smtp.messagingengine.com (out3-smtp.messagingengine.com [66.111.4.27]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 11D881BC3; Thu, 6 Jul 2023 16:26:24 -0700 (PDT) Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailout.nyi.internal (Postfix) with ESMTP id 813BD5C00BC; Thu, 6 Jul 2023 19:26:23 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute5.internal (MEProxy); Thu, 06 Jul 2023 19:26:23 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dxuuu.xyz; h=cc :cc:content-transfer-encoding:content-type:date:date:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:sender:subject:subject:to:to; s=fm1; t=1688685983; x= 1688772383; bh=C5gTyD3ZK69kBeMGtv3ULDgMyW30wlIrQqdN4DdpOCA=; b=B c+H2ztTaE7ptsCMni+nenlvV7jKQHsFh4Ji0Eu/Adb/s/zyiRtqz+eHz5wt1vCvZ KCtuZq6qeNMoTnAGs5RI2nN2ikp4bhqMKLaEZ/pOyoY/mKyn8AdtGBvsRl8N+Wav rbuSlSkaEBPkC5UMEHPAAH+smg+MUwoHVk5yiM8D4pjtxMWpJcDaDt9QzXbtHs8k kX00fiuJS4iRtnwVMhZPRuu2cosWbI7MAsK46No9TWMksauqkeUbu2VCjFFMfzYh pwHxwc2hvD/zc6h869YwBAt6KYKa9UKDMjmO0Ui+/N3Sw4LnfxDL1pWEql1LmM1O 4ws3BFiFughd3puZ29PJw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:date:date:feedback-id:feedback-id:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:sender:subject:subject:to:to:x-me-proxy:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t=1688685983; x= 1688772383; bh=C5gTyD3ZK69kBeMGtv3ULDgMyW30wlIrQqdN4DdpOCA=; b=B J7nPJopT2ail/G1jh6hTswuDk5YFCdeJDozGn+LRbe3OBXAsLCmcpZDv0c3U4WMm UPNtgADIbvQVviYFK3YgbZxg+gOF+zRL3lw6QNYDKQvahvL5zQZNvRoMneYcdteE wHkhBn4dFNcV0KhaKbDX26PcxG8MrIv2N5exnqStS+5bTzwdGil88FL54WRWjvQO hoNYslh3Pch1tpf8VoJt8AjT976VcKgusK3ThJbinvyw2rXsKV3IatT6MiYpYS7J c0KzN/Bgv9PuPdYkoa3bO4tbOHE1UpCZZuWgzuIO1M2vmFMQQ/cOHppvgsxCP/qC R2UeqaZfVNd+4xiPAbcPg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedviedrvddtgddvudcutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecufghrlhcuvffnffculdejtddmnecujfgurhephffvve fufffkofgjfhgggfestdekredtredttdenucfhrhhomhepffgrnhhivghlucgiuhcuoegu gihusegugihuuhhurdighiiiqeenucggtffrrghtthgvrhhnpefgfefggeejhfduieekvd euteffleeifeeuvdfhheejleejjeekgfffgefhtddtteenucevlhhushhtvghrufhiiigv pedunecurfgrrhgrmhepmhgrihhlfhhrohhmpegugihusegugihuuhhurdighiii X-ME-Proxy: Feedback-ID: i6a694271:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Thu, 6 Jul 2023 19:26:22 -0400 (EDT) From: Daniel Xu To: andrii@kernel.org, ast@kernel.org, shuah@kernel.org, daniel@iogearbox.net, fw@strlen.de Cc: mykolal@fb.com, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org, netfilter-devel@vger.kernel.org, dsahern@kernel.org Subject: [PATCH bpf-next v2 4/6] bpf: selftests: Support not connecting client socket Date: Thu, 6 Jul 2023 17:25:51 -0600 Message-ID: <2ca086a7122d7a5a65090b4b90da6037c6ad3ff2.1688685338.git.dxu@dxuuu.xyz> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW,SPF_HELO_PASS, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: bpf@iogearbox.net For connectionless protocols or raw sockets we do not want to actually connect() to the server. Signed-off-by: Daniel Xu --- tools/testing/selftests/bpf/network_helpers.c | 5 +++-- tools/testing/selftests/bpf/network_helpers.h | 1 + 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c index a105c0cd008a..d5c78c08903b 100644 --- a/tools/testing/selftests/bpf/network_helpers.c +++ b/tools/testing/selftests/bpf/network_helpers.c @@ -301,8 +301,9 @@ int connect_to_fd_opts(int server_fd, const struct network_helper_opts *opts) strlen(opts->cc) + 1)) goto error_close; - if (connect_fd_to_addr(fd, &addr, addrlen, opts->must_fail)) - goto error_close; + if (!opts->noconnect) + if (connect_fd_to_addr(fd, &addr, addrlen, opts->must_fail)) + goto error_close; return fd; diff --git a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h index 694185644da6..87894dc984dd 100644 --- a/tools/testing/selftests/bpf/network_helpers.h +++ b/tools/testing/selftests/bpf/network_helpers.h @@ -21,6 +21,7 @@ struct network_helper_opts { const char *cc; int timeout_ms; bool must_fail; + bool noconnect; }; /* ipv4 test vector */ From patchwork Thu Jul 6 23:25:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Xu X-Patchwork-Id: 13304302 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 21512168A5 for ; Thu, 6 Jul 2023 23:26:27 +0000 (UTC) Received: from out3-smtp.messagingengine.com (out3-smtp.messagingengine.com [66.111.4.27]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8A4BC1BC9; Thu, 6 Jul 2023 16:26:25 -0700 (PDT) Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id 043725C00C1; Thu, 6 Jul 2023 19:26:25 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Thu, 06 Jul 2023 19:26:25 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dxuuu.xyz; h=cc :cc:content-transfer-encoding:content-type:date:date:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:sender:subject:subject:to:to; s=fm1; t=1688685985; x= 1688772385; bh=jCzr01pZ2u8uI49LnXSVMw7KSVSfUH7qENNmnpv8K+w=; b=q DE8nerLk1zdCRgbLbcNyLzu/2fSH9WE1z0b0GxZ8SHIjiixreXCpkwPakHgKFlCF 0E6t+Js8uIkuFnolE2Pji7yr4hedGHf6vjMDTZzrtAJgyiLCpTGEdP5a4GVz77He qsjrtPu3Z6JbkBx2FZzl6xny2TPfxfmzKs+GkUQjTquvCqecSPKr0pbmpONPoTQS frbqeZ58i1XmA72AdnfK1xQb+iUBPMSWgsGTqxeoim102ujAbcCO7Kk4c+WMAbT3 7VGcHgSOI+45OcW765VqwHpOCXWEBUgO0sy4GwiT9bwvMRA+BGJE2tzudAqMwQup e6VMRyyEEwtiYB/uy05Jw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:date:date:feedback-id:feedback-id:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:sender:subject:subject:to:to:x-me-proxy:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t=1688685985; x= 1688772385; bh=jCzr01pZ2u8uI49LnXSVMw7KSVSfUH7qENNmnpv8K+w=; b=O x7aiC6MDx0+rDHB+9xIAgqkTslsqDkbevuH2G8qZFU0RIwzPztjFJtMwIHxl0RcN iO1QB00iWG/CJkYxCm4IXn4rnENa+mwsY7wEXabcob5EH5Oib/x+UJv442vIv4PJ WKmgQ5162tcmT6D7fdiYXwiXkqYKu5pzQ+nnMh/dCMIQUCOVTwaT8h3qmCfhTEhJ jU+ePAtMAPaI21a41WEXOO/oTCZUx8R6kVQgK8Auphui8KxzI0yLTSHN/zxoCM+N v5S4WdacrWKpDw1P50lL2/HavxiL7cD6wXGYmSsEsAa5c/B3poSRxm33HKrsZCRM g4+j0ZWlEFHhP20XaT3sA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedviedrvddtgddvudcutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecufghrlhcuvffnffculdejtddmnecujfgurhephffvve fufffkofgjfhgggfestdekredtredttdenucfhrhhomhepffgrnhhivghlucgiuhcuoegu gihusegugihuuhhurdighiiiqeenucggtffrrghtthgvrhhnpefgfefggeejhfduieekvd euteffleeifeeuvdfhheejleejjeekgfffgefhtddtteenucevlhhushhtvghrufhiiigv pedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpegugihusegugihuuhhurdighiii X-ME-Proxy: Feedback-ID: i6a694271:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Thu, 6 Jul 2023 19:26:23 -0400 (EDT) From: Daniel Xu To: andrii@kernel.org, ast@kernel.org, shuah@kernel.org, daniel@iogearbox.net, fw@strlen.de Cc: martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, mykolal@fb.com, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org, netfilter-devel@vger.kernel.org, dsahern@kernel.org Subject: [PATCH bpf-next v2 5/6] bpf: selftests: Support custom type and proto for client sockets Date: Thu, 6 Jul 2023 17:25:52 -0600 Message-ID: <7729886a0e1ae0e07850840a7cd54530b26318d8.1688685338.git.dxu@dxuuu.xyz> X-Mailer: git-send-email 2.41.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW,SPF_HELO_PASS, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: bpf@iogearbox.net Extend connect_to_fd_opts() to take optional type and protocol parameters for the client socket. These parameters are useful when opening a raw socket to send IP fragments. Signed-off-by: Daniel Xu --- tools/testing/selftests/bpf/network_helpers.c | 21 +++++++++++++------ tools/testing/selftests/bpf/network_helpers.h | 2 ++ 2 files changed, 17 insertions(+), 6 deletions(-) diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c index d5c78c08903b..910d5d0470e6 100644 --- a/tools/testing/selftests/bpf/network_helpers.c +++ b/tools/testing/selftests/bpf/network_helpers.c @@ -270,14 +270,23 @@ int connect_to_fd_opts(int server_fd, const struct network_helper_opts *opts) opts = &default_opts; optlen = sizeof(type); - if (getsockopt(server_fd, SOL_SOCKET, SO_TYPE, &type, &optlen)) { - log_err("getsockopt(SOL_TYPE)"); - return -1; + + if (opts->type) { + type = opts->type; + } else { + if (getsockopt(server_fd, SOL_SOCKET, SO_TYPE, &type, &optlen)) { + log_err("getsockopt(SOL_TYPE)"); + return -1; + } } - if (getsockopt(server_fd, SOL_SOCKET, SO_PROTOCOL, &protocol, &optlen)) { - log_err("getsockopt(SOL_PROTOCOL)"); - return -1; + if (opts->proto) { + protocol = opts->proto; + } else { + if (getsockopt(server_fd, SOL_SOCKET, SO_PROTOCOL, &protocol, &optlen)) { + log_err("getsockopt(SOL_PROTOCOL)"); + return -1; + } } addrlen = sizeof(addr); diff --git a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h index 87894dc984dd..5eccc67d1a99 100644 --- a/tools/testing/selftests/bpf/network_helpers.h +++ b/tools/testing/selftests/bpf/network_helpers.h @@ -22,6 +22,8 @@ struct network_helper_opts { int timeout_ms; bool must_fail; bool noconnect; + int type; + int proto; }; /* ipv4 test vector */ From patchwork Thu Jul 6 23:25:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Xu X-Patchwork-Id: 13304303 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 44BEF174E8 for ; Thu, 6 Jul 2023 23:26:30 +0000 (UTC) Received: from out3-smtp.messagingengine.com (out3-smtp.messagingengine.com [66.111.4.27]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 27C791BD3; Thu, 6 Jul 2023 16:26:27 -0700 (PDT) Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id 97D465C007B; Thu, 6 Jul 2023 19:26:26 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Thu, 06 Jul 2023 19:26:26 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dxuuu.xyz; h=cc :cc:content-transfer-encoding:content-type:date:date:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:sender:subject:subject:to:to; s=fm1; t=1688685986; x= 1688772386; bh=KYTx8/AVmAwctDU/Q4rV3rnsQa0jrAaUVq9C4p0inNo=; b=D iBHk893NEkIOu3wlM1n1X+qyvWxeAwxeHGAvlHQFLBqHggSE48ytufP/HZTQhP8d 4Dyb1lKXC+WC+iNRC4sFPThm46If6qjNl0OHQeXk6zWDeiO711C9clbKkKZ3rRQ4 RdFVvqcjdeFNf4wtDaBBWa98JUqCheaWAbk8GkbGsTcfQpzBOSXJeadtOYrTwPOL CXer175IcBM0zFsWZujlGi0PovLO4cG/Kmwwkrctc++G+rXE6zWTqNXQDNHemKTx EKKCc0d//jdF2sLWZrAkXaAJvY3WftJ4AkbQxgzBuukWzvnC8lVBLp5bQoaCA1BJ lDSEXtCRerzay9URnZ0pg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:date:date:feedback-id:feedback-id:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:sender:subject:subject:to:to:x-me-proxy:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t=1688685986; x= 1688772386; bh=KYTx8/AVmAwctDU/Q4rV3rnsQa0jrAaUVq9C4p0inNo=; b=j RhNOt1tY2/0GE3uFixkdkc/g0MISb/pZXcDEDzSl8piepW951PzVsTv9NKdeXvsl Aa2GoHOlAJhzlDLA5V2koDVu00gytNl6pUtxztpf0tw343A3qbSKZCZZXPbKRTDJ IRcN55CV1IMXQBK5ZodR6RL6Z2/Yn/tOtv6pZblZPdPtL0QhZbVP6CPvlBi6k/c9 3HgvaAjKVJ4bX6gmUw1ImiTDnd728h5cDMoC5Yi20DjqrG0hkJ7siJlayv3R2sf8 del8EWJAv2RxkUfQ4yD0+h0Y+xc+6xFBSkL0vy4RXfU2yRfgHV3WqjFD3Oa1VYgF lHgzcBI2jMHVmAlm7JYZw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedviedrvddtgddvudcutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecufghrlhcuvffnffculdejtddmnecujfgurhephffvve fufffkofgjfhgggfestdekredtredttdenucfhrhhomhepffgrnhhivghlucgiuhcuoegu gihusegugihuuhhurdighiiiqeenucggtffrrghtthgvrhhnpefgfefggeejhfduieekvd euteffleeifeeuvdfhheejleejjeekgfffgefhtddtteenucevlhhushhtvghrufhiiigv pedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpegugihusegugihuuhhurdighiii X-ME-Proxy: Feedback-ID: i6a694271:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Thu, 6 Jul 2023 19:26:25 -0400 (EDT) From: Daniel Xu To: andrii@kernel.org, ast@kernel.org, shuah@kernel.org, daniel@iogearbox.net, fw@strlen.de Cc: martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, mykolal@fb.com, linux-kernel@vger.kernel.org, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, netfilter-devel@vger.kernel.org, dsahern@kernel.org Subject: [PATCH bpf-next v2 6/6] bpf: selftests: Add defrag selftests Date: Thu, 6 Jul 2023 17:25:53 -0600 Message-ID: X-Mailer: git-send-email 2.41.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW,SPF_HELO_PASS, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: bpf@iogearbox.net These selftests tests 2 major scenarios: the BPF based defragmentation can successfully be done and that packet pointers are invalidated after calls to the kfunc. The logic is similar for both ipv4 and ipv6. In the first scenario, we create a UDP client and UDP echo server. The the server side is fairly straightforward: we attach the prog and simply echo back the message. The on the client side, we send fragmented packets to and expect the reassembled message back from the server. Signed-off-by: Daniel Xu --- tools/testing/selftests/bpf/Makefile | 4 +- .../selftests/bpf/generate_udp_fragments.py | 90 ++++++ .../selftests/bpf/ip_check_defrag_frags.h | 57 ++++ .../bpf/prog_tests/ip_check_defrag.c | 282 ++++++++++++++++++ .../selftests/bpf/progs/ip_check_defrag.c | 104 +++++++ 5 files changed, 535 insertions(+), 2 deletions(-) create mode 100755 tools/testing/selftests/bpf/generate_udp_fragments.py create mode 100644 tools/testing/selftests/bpf/ip_check_defrag_frags.h create mode 100644 tools/testing/selftests/bpf/prog_tests/ip_check_defrag.c create mode 100644 tools/testing/selftests/bpf/progs/ip_check_defrag.c diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 882be03b179f..619df497fce5 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -565,8 +565,8 @@ TRUNNER_EXTRA_SOURCES := test_progs.c cgroup_helpers.c trace_helpers.c \ network_helpers.c testing_helpers.c \ btf_helpers.c flow_dissector_load.h \ cap_helpers.c test_loader.c xsk.c disasm.c \ - json_writer.c unpriv_helpers.c - + json_writer.c unpriv_helpers.c \ + ip_check_defrag_frags.h TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read $(OUTPUT)/bpf_testmod.ko \ $(OUTPUT)/liburandom_read.so \ $(OUTPUT)/xdp_synproxy \ diff --git a/tools/testing/selftests/bpf/generate_udp_fragments.py b/tools/testing/selftests/bpf/generate_udp_fragments.py new file mode 100755 index 000000000000..2b8a1187991c --- /dev/null +++ b/tools/testing/selftests/bpf/generate_udp_fragments.py @@ -0,0 +1,90 @@ +#!/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 + +""" +This script helps generate fragmented UDP packets. + +While it is technically possible to dynamically generate +fragmented packets in C, it is much harder to read and write +said code. `scapy` is relatively industry standard and really +easy to read / write. + +So we choose to write this script that generates a valid C +header. Rerun script and commit generated file after any +modifications. +""" + +import argparse +import os + +from scapy.all import * + + +# These constants must stay in sync with `ip_check_defrag.c` +VETH1_ADDR = "172.16.1.200" +VETH0_ADDR6 = "fc00::100" +VETH1_ADDR6 = "fc00::200" +CLIENT_PORT = 48878 +SERVER_PORT = 48879 +MAGIC_MESSAGE = "THIS IS THE ORIGINAL MESSAGE, PLEASE REASSEMBLE ME" + + +def print_header(f): + f.write("// SPDX-License-Identifier: GPL-2.0\n") + f.write("/* DO NOT EDIT -- this file is generated */\n") + f.write("\n") + f.write("#ifndef _IP_CHECK_DEFRAG_FRAGS_H\n") + f.write("#define _IP_CHECK_DEFRAG_FRAGS_H\n") + f.write("\n") + f.write("#include \n") + f.write("\n") + + +def print_frags(f, frags, v6): + for idx, frag in enumerate(frags): + # 10 bytes per line to keep width in check + chunks = [frag[i : i + 10] for i in range(0, len(frag), 10)] + chunks_fmted = [", ".join([str(hex(b)) for b in chunk]) for chunk in chunks] + suffix = "6" if v6 else "" + + f.write(f"static uint8_t frag{suffix}_{idx}[] = {{\n") + for chunk in chunks_fmted: + f.write(f"\t{chunk},\n") + f.write(f"}};\n") + + +def print_trailer(f): + f.write("\n") + f.write("#endif /* _IP_CHECK_DEFRAG_FRAGS_H */\n") + + +def main(f): + # srcip of 0 is filled in by IP_HDRINCL + sip = "0.0.0.0" + sip6 = VETH0_ADDR6 + dip = VETH1_ADDR + dip6 = VETH1_ADDR6 + sport = CLIENT_PORT + dport = SERVER_PORT + payload = MAGIC_MESSAGE.encode() + + # Disable UDPv4 checksums to keep code simpler + pkt = IP(src=sip,dst=dip) / UDP(sport=sport,dport=dport,chksum=0) / Raw(load=payload) + # UDPv6 requires a checksum + # Also pin the ipv6 fragment header ID, otherwise it's a random value + pkt6 = IPv6(src=sip6,dst=dip6) / IPv6ExtHdrFragment(id=0xBEEF) / UDP(sport=sport,dport=dport) / Raw(load=payload) + + frags = [f.build() for f in pkt.fragment(24)] + frags6 = [f.build() for f in fragment6(pkt6, 72)] + + print_header(f) + print_frags(f, frags, False) + print_frags(f, frags6, True) + print_trailer(f) + + +if __name__ == "__main__": + dir = os.path.dirname(os.path.realpath(__file__)) + header = f"{dir}/ip_check_defrag_frags.h" + with open(header, "w") as f: + main(f) diff --git a/tools/testing/selftests/bpf/ip_check_defrag_frags.h b/tools/testing/selftests/bpf/ip_check_defrag_frags.h new file mode 100644 index 000000000000..70ab7e9fa22b --- /dev/null +++ b/tools/testing/selftests/bpf/ip_check_defrag_frags.h @@ -0,0 +1,57 @@ +// SPDX-License-Identifier: GPL-2.0 +/* DO NOT EDIT -- this file is generated */ + +#ifndef _IP_CHECK_DEFRAG_FRAGS_H +#define _IP_CHECK_DEFRAG_FRAGS_H + +#include + +static uint8_t frag_0[] = { + 0x45, 0x0, 0x0, 0x2c, 0x0, 0x1, 0x20, 0x0, 0x40, 0x11, + 0xac, 0xe8, 0x0, 0x0, 0x0, 0x0, 0xac, 0x10, 0x1, 0xc8, + 0xbe, 0xee, 0xbe, 0xef, 0x0, 0x3a, 0x0, 0x0, 0x54, 0x48, + 0x49, 0x53, 0x20, 0x49, 0x53, 0x20, 0x54, 0x48, 0x45, 0x20, + 0x4f, 0x52, 0x49, 0x47, +}; +static uint8_t frag_1[] = { + 0x45, 0x0, 0x0, 0x2c, 0x0, 0x1, 0x20, 0x3, 0x40, 0x11, + 0xac, 0xe5, 0x0, 0x0, 0x0, 0x0, 0xac, 0x10, 0x1, 0xc8, + 0x49, 0x4e, 0x41, 0x4c, 0x20, 0x4d, 0x45, 0x53, 0x53, 0x41, + 0x47, 0x45, 0x2c, 0x20, 0x50, 0x4c, 0x45, 0x41, 0x53, 0x45, + 0x20, 0x52, 0x45, 0x41, +}; +static uint8_t frag_2[] = { + 0x45, 0x0, 0x0, 0x1e, 0x0, 0x1, 0x0, 0x6, 0x40, 0x11, + 0xcc, 0xf0, 0x0, 0x0, 0x0, 0x0, 0xac, 0x10, 0x1, 0xc8, + 0x53, 0x53, 0x45, 0x4d, 0x42, 0x4c, 0x45, 0x20, 0x4d, 0x45, +}; +static uint8_t frag6_0[] = { + 0x60, 0x0, 0x0, 0x0, 0x0, 0x20, 0x2c, 0x40, 0xfc, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x1, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2, 0x0, + 0x11, 0x0, 0x0, 0x1, 0x0, 0x0, 0xbe, 0xef, 0xbe, 0xee, + 0xbe, 0xef, 0x0, 0x3a, 0xd0, 0xf8, 0x54, 0x48, 0x49, 0x53, + 0x20, 0x49, 0x53, 0x20, 0x54, 0x48, 0x45, 0x20, 0x4f, 0x52, + 0x49, 0x47, +}; +static uint8_t frag6_1[] = { + 0x60, 0x0, 0x0, 0x0, 0x0, 0x20, 0x2c, 0x40, 0xfc, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x1, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2, 0x0, + 0x11, 0x0, 0x0, 0x19, 0x0, 0x0, 0xbe, 0xef, 0x49, 0x4e, + 0x41, 0x4c, 0x20, 0x4d, 0x45, 0x53, 0x53, 0x41, 0x47, 0x45, + 0x2c, 0x20, 0x50, 0x4c, 0x45, 0x41, 0x53, 0x45, 0x20, 0x52, + 0x45, 0x41, +}; +static uint8_t frag6_2[] = { + 0x60, 0x0, 0x0, 0x0, 0x0, 0x12, 0x2c, 0x40, 0xfc, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x1, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2, 0x0, + 0x11, 0x0, 0x0, 0x30, 0x0, 0x0, 0xbe, 0xef, 0x53, 0x53, + 0x45, 0x4d, 0x42, 0x4c, 0x45, 0x20, 0x4d, 0x45, +}; + +#endif /* _IP_CHECK_DEFRAG_FRAGS_H */ diff --git a/tools/testing/selftests/bpf/prog_tests/ip_check_defrag.c b/tools/testing/selftests/bpf/prog_tests/ip_check_defrag.c new file mode 100644 index 000000000000..5cd08d6e0ebc --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/ip_check_defrag.c @@ -0,0 +1,282 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include "ip_check_defrag.skel.h" +#include "ip_check_defrag_frags.h" + +/* + * This selftest spins up a client and an echo server, each in their own + * network namespace. The client will send a fragmented message to the server. + * The prog attached to the server will shoot down any fragments. Thus, if + * the server is able to correctly echo back the message to the client, we will + * have verified that netfilter is reassembling packets for us. + * + * Topology: + * ========= + * NS0 | NS1 + * | + * client | server + * ---------- | ---------- + * | veth0 | --------- | veth1 | + * ---------- peer ---------- + * | + * | with bpf + */ + +#define NS0 "defrag_ns0" +#define NS1 "defrag_ns1" +#define VETH0 "veth0" +#define VETH1 "veth1" +#define VETH0_ADDR "172.16.1.100" +#define VETH0_ADDR6 "fc00::100" +/* The following constants must stay in sync with `generate_udp_fragments.py` */ +#define VETH1_ADDR "172.16.1.200" +#define VETH1_ADDR6 "fc00::200" +#define CLIENT_PORT 48878 +#define SERVER_PORT 48879 +#define MAGIC_MESSAGE "THIS IS THE ORIGINAL MESSAGE, PLEASE REASSEMBLE ME" + +static int setup_topology(bool ipv6) +{ + bool up; + int i; + + SYS(fail, "ip netns add " NS0); + SYS(fail, "ip netns add " NS1); + SYS(fail, "ip link add " VETH0 " netns " NS0 " type veth peer name " VETH1 " netns " NS1); + if (ipv6) { + SYS(fail, "ip -6 -net " NS0 " addr add " VETH0_ADDR6 "/64 dev " VETH0 " nodad"); + SYS(fail, "ip -6 -net " NS1 " addr add " VETH1_ADDR6 "/64 dev " VETH1 " nodad"); + } else { + SYS(fail, "ip -net " NS0 " addr add " VETH0_ADDR "/24 dev " VETH0); + SYS(fail, "ip -net " NS1 " addr add " VETH1_ADDR "/24 dev " VETH1); + } + SYS(fail, "ip -net " NS0 " link set dev " VETH0 " up"); + SYS(fail, "ip -net " NS1 " link set dev " VETH1 " up"); + + /* Wait for up to 5s for links to come up */ + for (i = 0; i < 5; ++i) { + if (ipv6) + up = !system("ip netns exec " NS0 " ping -6 -c 1 -W 1 " VETH1_ADDR6 " &>/dev/null"); + else + up = !system("ip netns exec " NS0 " ping -c 1 -W 1 " VETH1_ADDR " &>/dev/null"); + + if (up) + break; + } + + return 0; +fail: + return -1; +} + +static void cleanup_topology(void) +{ + SYS_NOFAIL("test -f /var/run/netns/" NS0 " && ip netns delete " NS0); + SYS_NOFAIL("test -f /var/run/netns/" NS1 " && ip netns delete " NS1); +} + +static int attach(struct ip_check_defrag *skel, bool ipv6) +{ + LIBBPF_OPTS(bpf_netfilter_opts, opts, + .pf = ipv6 ? NFPROTO_IPV6 : NFPROTO_IPV4, + .priority = 42, + .flags = BPF_F_NETFILTER_IP_DEFRAG); + struct nstoken *nstoken; + int err = -1; + + nstoken = open_netns(NS1); + + skel->links.defrag = bpf_program__attach_netfilter(skel->progs.defrag, &opts); + if (!ASSERT_OK_PTR(skel->links.defrag, "program attach")) + goto out; + + err = 0; +out: + close_netns(nstoken); + return err; +} + +static int send_frags(int client) +{ + struct sockaddr_storage saddr; + struct sockaddr *saddr_p; + socklen_t saddr_len; + int err; + + saddr_p = (struct sockaddr *)&saddr; + err = make_sockaddr(AF_INET, VETH1_ADDR, SERVER_PORT, &saddr, &saddr_len); + if (!ASSERT_OK(err, "make_sockaddr")) + return -1; + + err = sendto(client, frag_0, sizeof(frag_0), 0, saddr_p, saddr_len); + if (!ASSERT_GE(err, 0, "sendto frag_0")) + return -1; + + err = sendto(client, frag_1, sizeof(frag_1), 0, saddr_p, saddr_len); + if (!ASSERT_GE(err, 0, "sendto frag_1")) + return -1; + + err = sendto(client, frag_2, sizeof(frag_2), 0, saddr_p, saddr_len); + if (!ASSERT_GE(err, 0, "sendto frag_2")) + return -1; + + return 0; +} + +static int send_frags6(int client) +{ + struct sockaddr_storage saddr; + struct sockaddr *saddr_p; + socklen_t saddr_len; + int err; + + saddr_p = (struct sockaddr *)&saddr; + /* Port needs to be set to 0 for raw ipv6 socket for some reason */ + err = make_sockaddr(AF_INET6, VETH1_ADDR6, 0, &saddr, &saddr_len); + if (!ASSERT_OK(err, "make_sockaddr")) + return -1; + + err = sendto(client, frag6_0, sizeof(frag6_0), 0, saddr_p, saddr_len); + if (!ASSERT_GE(err, 0, "sendto frag6_0")) + return -1; + + err = sendto(client, frag6_1, sizeof(frag6_1), 0, saddr_p, saddr_len); + if (!ASSERT_GE(err, 0, "sendto frag6_1")) + return -1; + + err = sendto(client, frag6_2, sizeof(frag6_2), 0, saddr_p, saddr_len); + if (!ASSERT_GE(err, 0, "sendto frag6_2")) + return -1; + + return 0; +} + +void test_bpf_ip_check_defrag_ok(bool ipv6) +{ + struct network_helper_opts rx_opts = { + .timeout_ms = 1000, + .noconnect = true, + }; + struct network_helper_opts tx_ops = { + .timeout_ms = 1000, + .type = SOCK_RAW, + .proto = IPPROTO_RAW, + .noconnect = true, + }; + struct sockaddr_storage caddr; + struct ip_check_defrag *skel; + struct nstoken *nstoken; + int client_tx_fd = -1; + int client_rx_fd = -1; + socklen_t caddr_len; + int srv_fd = -1; + char buf[1024]; + int len, err; + + skel = ip_check_defrag__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + return; + + if (!ASSERT_OK(setup_topology(ipv6), "setup_topology")) + goto out; + + if (!ASSERT_OK(attach(skel, ipv6), "attach")) + goto out; + + /* Start server in ns1 */ + nstoken = open_netns(NS1); + if (!ASSERT_OK_PTR(nstoken, "setns ns1")) + goto out; + srv_fd = start_server(ipv6 ? AF_INET6 : AF_INET, SOCK_DGRAM, NULL, SERVER_PORT, 0); + close_netns(nstoken); + if (!ASSERT_GE(srv_fd, 0, "start_server")) + goto out; + + /* Open tx raw socket in ns0 */ + nstoken = open_netns(NS0); + if (!ASSERT_OK_PTR(nstoken, "setns ns0")) + goto out; + client_tx_fd = connect_to_fd_opts(srv_fd, &tx_ops); + close_netns(nstoken); + if (!ASSERT_GE(client_tx_fd, 0, "connect_to_fd_opts")) + goto out; + + /* Open rx socket in ns0 */ + nstoken = open_netns(NS0); + if (!ASSERT_OK_PTR(nstoken, "setns ns0")) + goto out; + client_rx_fd = connect_to_fd_opts(srv_fd, &rx_opts); + close_netns(nstoken); + if (!ASSERT_GE(client_rx_fd, 0, "connect_to_fd_opts")) + goto out; + + /* Bind rx socket to a premeditated port */ + memset(&caddr, 0, sizeof(caddr)); + nstoken = open_netns(NS0); + if (!ASSERT_OK_PTR(nstoken, "setns ns0")) + goto out; + if (ipv6) { + struct sockaddr_in6 *c = (struct sockaddr_in6 *)&caddr; + + c->sin6_family = AF_INET6; + inet_pton(AF_INET6, VETH0_ADDR6, &c->sin6_addr); + c->sin6_port = htons(CLIENT_PORT); + err = bind(client_rx_fd, (struct sockaddr *)c, sizeof(*c)); + } else { + struct sockaddr_in *c = (struct sockaddr_in *)&caddr; + + c->sin_family = AF_INET; + inet_pton(AF_INET, VETH0_ADDR, &c->sin_addr); + c->sin_port = htons(CLIENT_PORT); + err = bind(client_rx_fd, (struct sockaddr *)c, sizeof(*c)); + } + close_netns(nstoken); + if (!ASSERT_OK(err, "bind")) + goto out; + + /* Send message in fragments */ + if (ipv6) { + if (!ASSERT_OK(send_frags6(client_tx_fd), "send_frags6")) + goto out; + } else { + if (!ASSERT_OK(send_frags(client_tx_fd), "send_frags")) + goto out; + } + + if (!ASSERT_EQ(skel->bss->shootdowns, 0, "shootdowns")) + goto out; + + /* Receive reassembled msg on server and echo back to client */ + len = recvfrom(srv_fd, buf, sizeof(buf), 0, (struct sockaddr *)&caddr, &caddr_len); + if (!ASSERT_GE(len, 0, "server recvfrom")) + goto out; + len = sendto(srv_fd, buf, len, 0, (struct sockaddr *)&caddr, caddr_len); + if (!ASSERT_GE(len, 0, "server sendto")) + goto out; + + /* Expect reassembed message to be echoed back */ + len = recvfrom(client_rx_fd, buf, sizeof(buf), 0, NULL, NULL); + if (!ASSERT_EQ(len, sizeof(MAGIC_MESSAGE) - 1, "client short read")) + goto out; + +out: + if (client_rx_fd != -1) + close(client_rx_fd); + if (client_tx_fd != -1) + close(client_tx_fd); + if (srv_fd != -1) + close(srv_fd); + cleanup_topology(); + ip_check_defrag__destroy(skel); +} + +void test_bpf_ip_check_defrag(void) +{ + if (test__start_subtest("v4")) + test_bpf_ip_check_defrag_ok(false); + if (test__start_subtest("v6")) + test_bpf_ip_check_defrag_ok(true); +} diff --git a/tools/testing/selftests/bpf/progs/ip_check_defrag.c b/tools/testing/selftests/bpf/progs/ip_check_defrag.c new file mode 100644 index 000000000000..4259c6d59968 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/ip_check_defrag.c @@ -0,0 +1,104 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include "vmlinux.h" +#include +#include +#include "bpf_tracing_net.h" + +#define NF_DROP 0 +#define NF_ACCEPT 1 +#define ETH_P_IP 0x0800 +#define ETH_P_IPV6 0x86DD +#define IP_MF 0x2000 +#define IP_OFFSET 0x1FFF +#define NEXTHDR_FRAGMENT 44 + +extern int bpf_dynptr_from_skb(struct sk_buff *skb, __u64 flags, + struct bpf_dynptr *ptr__uninit) __ksym; +extern void *bpf_dynptr_slice(const struct bpf_dynptr *ptr, uint32_t offset, + void *buffer, uint32_t buffer__sz) __ksym; + +volatile int shootdowns = 0; + +static bool is_frag_v4(struct iphdr *iph) +{ + int offset; + int flags; + + offset = bpf_ntohs(iph->frag_off); + flags = offset & ~IP_OFFSET; + offset &= IP_OFFSET; + offset <<= 3; + + return (flags & IP_MF) || offset; +} + +static bool is_frag_v6(struct ipv6hdr *ip6h) +{ + /* Simplifying assumption that there are no extension headers + * between fixed header and fragmentation header. This assumption + * is only valid in this test case. It saves us the hassle of + * searching all potential extension headers. + */ + return ip6h->nexthdr == NEXTHDR_FRAGMENT; +} + +static int handle_v4(struct sk_buff *skb) +{ + struct bpf_dynptr ptr; + u8 iph_buf[20] = {}; + struct iphdr *iph; + + if (bpf_dynptr_from_skb(skb, 0, &ptr)) + return NF_DROP; + + iph = bpf_dynptr_slice(&ptr, 0, iph_buf, sizeof(iph_buf)); + if (!iph) + return NF_DROP; + + /* Shootdown any frags */ + if (is_frag_v4(iph)) { + shootdowns++; + return NF_DROP; + } + + return NF_ACCEPT; +} + +static int handle_v6(struct sk_buff *skb) +{ + struct bpf_dynptr ptr; + struct ipv6hdr *ip6h; + u8 ip6h_buf[40] = {}; + + if (bpf_dynptr_from_skb(skb, 0, &ptr)) + return NF_DROP; + + ip6h = bpf_dynptr_slice(&ptr, 0, ip6h_buf, sizeof(ip6h_buf)); + if (!ip6h) + return NF_DROP; + + /* Shootdown any frags */ + if (is_frag_v6(ip6h)) { + shootdowns++; + return NF_DROP; + } + + return NF_ACCEPT; +} + +SEC("netfilter") +int defrag(struct bpf_nf_ctx *ctx) +{ + struct sk_buff *skb = ctx->skb; + + switch (bpf_ntohs(skb->protocol)) { + case ETH_P_IP: + return handle_v4(skb); + case ETH_P_IPV6: + return handle_v6(skb); + default: + return NF_ACCEPT; + } +} + +char _license[] SEC("license") = "GPL";