From patchwork Mon Feb 27 19:51:03 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Xu
X-Patchwork-Id: 13154160
X-Patchwork-Delegate: bpf@iogearbox.net
From: Daniel Xu
To: kuba@kernel.org, edumazet@google.com, willemdebruijn.kernel@gmail.com, davem@davemloft.net, pabeni@redhat.com, dsahern@kernel.org
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, bpf@vger.kernel.org
Subject: [PATCH bpf-next v2 1/8] ip: frags: Return actual error codes from ip_check_defrag()
Date: Mon, 27 Feb 2023 12:51:03 -0700
X-Mailer: git-send-email 2.39.1
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

Once we
wrap ip_check_defrag() in a kfunc, it may be useful for progs to know the exact error condition ip_check_defrag() encountered. Signed-off-by: Daniel Xu --- drivers/net/macvlan.c | 2 +- net/ipv4/ip_fragment.c | 13 ++++++++----- net/packet/af_packet.c | 2 +- 3 files changed, 10 insertions(+), 7 deletions(-) diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c index 99a971929c8e..b8310e13d7e1 100644 --- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -456,7 +456,7 @@ static rx_handler_result_t macvlan_handle_frame(struct sk_buff **pskb) unsigned int hash; skb = ip_check_defrag(dev_net(skb->dev), skb, IP_DEFRAG_MACVLAN); - if (!skb) + if (IS_ERR(skb)) return RX_HANDLER_CONSUMED; *pskb = skb; eth = eth_hdr(skb); diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c index 69c00ffdcf3e..959d2c4260ea 100644 --- a/net/ipv4/ip_fragment.c +++ b/net/ipv4/ip_fragment.c @@ -514,6 +514,7 @@ struct sk_buff *ip_check_defrag(struct net *net, struct sk_buff *skb, u32 user) struct iphdr iph; int netoff; u32 len; + int err; if (skb->protocol != htons(ETH_P_IP)) return skb; @@ -535,15 +536,17 @@ struct sk_buff *ip_check_defrag(struct net *net, struct sk_buff *skb, u32 user) if (skb) { if (!pskb_may_pull(skb, netoff + iph.ihl * 4)) { kfree_skb(skb); - return NULL; + return ERR_PTR(-ENOMEM); } - if (pskb_trim_rcsum(skb, netoff + len)) { + err = pskb_trim_rcsum(skb, netoff + len); + if (err) { kfree_skb(skb); - return NULL; + return ERR_PTR(err); } memset(IPCB(skb), 0, sizeof(struct inet_skb_parm)); - if (ip_defrag(net, skb, user)) - return NULL; + err = ip_defrag(net, skb, user); + if (err) + return ERR_PTR(err); skb_clear_hash(skb); } } diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index d4e76e2ae153..1ef94828c8da 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -1470,7 +1470,7 @@ static int packet_rcv_fanout(struct sk_buff *skb, struct net_device *dev, if (fanout_has_flag(f, PACKET_FANOUT_FLAG_DEFRAG)) { skb = ip_check_defrag(net, 
skb, IP_DEFRAG_AF_PACKET); - if (!skb) + if (IS_ERR(skb)) return 0; } switch (f->type) {

From patchwork Mon Feb 27 19:51:04 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Xu
X-Patchwork-Id: 13154161
X-Patchwork-Delegate: bpf@iogearbox.net
From: Daniel Xu
To: corbet@lwn.net, daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev, ast@kernel.org
Cc: song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, bpf@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH bpf-next v2 2/8] bpf: verifier: Support KF_CHANGES_PKT flag
Date: Mon, 27 Feb 2023
12:51:04 -0700
Message-Id: <991bc64ee4013bc81d7d4ab908d541d8978595a8.1677526810.git.dxu@dxuuu.xyz>
X-Mailer: git-send-email 2.39.1
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

KF_CHANGES_PKT indicates that the kfunc call may change packet data. This is analogous to bpf_helper_changes_pkt_data().

Signed-off-by: Daniel Xu --- Documentation/bpf/kfuncs.rst | 7 +++++++ include/linux/btf.h | 1 + kernel/bpf/verifier.c | 8 ++++++++ 3 files changed, 16 insertions(+) diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst index 226313747be5..16c387ee987f 100644 --- a/Documentation/bpf/kfuncs.rst +++ b/Documentation/bpf/kfuncs.rst @@ -260,6 +260,13 @@ encouraged to make their use-cases known as early as possible, and participate in upstream discussions regarding whether to keep, change, deprecate, or remove those kfuncs if and when such discussions occur. +2.4.10 KF_CHANGES_PKT flag +-------------------------- + +The KF_CHANGES_PKT flag is used for kfuncs that may change packet data. +After calls to such kfuncs, existing packet pointers will be invalidated +and must be revalidated before the prog can access packet data. + 2.5 Registering the kfuncs -------------------------- diff --git a/include/linux/btf.h b/include/linux/btf.h index 49e0fe6d8274..ee3d6c3e6cc0 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -71,6 +71,7 @@ #define KF_SLEEPABLE (1 << 5) /* kfunc may sleep */ #define KF_DESTRUCTIVE (1 << 6) /* kfunc performs destructive actions */ #define KF_RCU (1 << 7) /* kfunc only takes rcu pointer arguments */ +#define KF_CHANGES_PKT (1 << 8) /* kfunc may change packet data */ /* * Tag marking a kernel function as a kfunc. 
This is meant to minimize the diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 5cb8b623f639..e58065498a35 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -8681,6 +8681,11 @@ static bool is_kfunc_rcu(struct bpf_kfunc_call_arg_meta *meta) return meta->kfunc_flags & KF_RCU; } +static bool is_kfunc_changes_pkt(struct bpf_kfunc_call_arg_meta *meta) +{ + return meta->kfunc_flags & KF_CHANGES_PKT; +} + static bool is_kfunc_arg_kptr_get(struct bpf_kfunc_call_arg_meta *meta, int arg) { return arg == 0 && (meta->kfunc_flags & KF_KPTR_GET); @@ -10083,6 +10088,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn, mark_btf_func_reg_size(env, regno, t->size); } + if (is_kfunc_changes_pkt(&meta)) + clear_all_pkt_pointers(env); + return 0; }

From patchwork Mon Feb 27 19:51:05 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Xu
X-Patchwork-Id: 13154162
X-Patchwork-Delegate: bpf@iogearbox.net
From: Daniel Xu
To: kuba@kernel.org, edumazet@google.com, davem@davemloft.net, dsahern@kernel.org, pabeni@redhat.com
Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org
Subject: [PATCH bpf-next v2 3/8] bpf, net, frags: Add bpf_ip_check_defrag() kfunc
Date: Mon, 27 Feb 2023 12:51:05 -0700
Message-Id: <7145c9891791db1c868a326476fef590f22b352b.1677526810.git.dxu@dxuuu.xyz>
X-Mailer: git-send-email 2.39.1
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

This kfunc is used to defragment IPv4 packets. The idea is that if you see a fragmented packet, you call this kfunc. If the kfunc returns 0, then the skb has been updated to contain the entire reassembled packet.

If the kfunc returns an error (most likely -EINPROGRESS), then it means the skb is part of a yet-incomplete original packet. A reasonable response to -EINPROGRESS is to drop the packet, as the ip defrag infrastructure is already hanging onto the frag for future reassembly.

Care has been taken to ensure the prog skb remains valid no matter what the underlying ip_check_defrag() call does. This is in contrast to ip_defrag(), which may consume the skb if the skb is part of a yet-incomplete original packet.

So far this kfunc is only callable from TC clsact progs.
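
The calling convention described above can be sketched in plain C. This is only an illustration and not part of the patch: verdict_after_defrag() is a hypothetical helper, the TC_ACT_* values are copied from include/uapi/linux/pkt_cls.h, and dropping on errors other than -EINPROGRESS is an assumed policy rather than something this series mandates.

```c
#include <errno.h>

/* TC verdicts, same values as include/uapi/linux/pkt_cls.h. */
#define TC_ACT_OK	0
#define TC_ACT_SHOT	2

/*
 * Hypothetical helper (not in this series): map the return value of
 * bpf_ip_check_defrag() to a TC verdict.
 */
static int verdict_after_defrag(int defrag_ret)
{
	/*
	 * 0: the skb now holds the whole packet, or it was never
	 * fragmented. Let it continue.
	 */
	if (defrag_ret == 0)
		return TC_ACT_OK;

	/*
	 * -EINPROGRESS: the defrag infrastructure kept its own copy of
	 * this frag, so the prog's skb can be dropped.
	 */
	if (defrag_ret == -EINPROGRESS)
		return TC_ACT_SHOT;

	/* Any other error: dropping is an assumed policy choice here. */
	return TC_ACT_SHOT;
}
```

In a real TC prog the argument would be the return value of bpf_ip_check_defrag(). On 0 the prog has to reload and bounds-check its packet pointers before touching packet data, because the kfunc is tagged KF_CHANGES_PKT.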
Signed-off-by: Daniel Xu --- include/net/ip.h | 11 +++++ net/ipv4/Makefile | 1 + net/ipv4/ip_fragment.c | 2 + net/ipv4/ip_fragment_bpf.c | 98 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 112 insertions(+) create mode 100644 net/ipv4/ip_fragment_bpf.c diff --git a/include/net/ip.h b/include/net/ip.h index c3fffaa92d6e..f3796b1b5cac 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -680,6 +680,7 @@ enum ip_defrag_users { IP_DEFRAG_VS_FWD, IP_DEFRAG_AF_PACKET, IP_DEFRAG_MACVLAN, + IP_DEFRAG_BPF, }; /* Return true if the value of 'user' is between 'lower_bond' @@ -693,6 +694,16 @@ static inline bool ip_defrag_user_in_between(u32 user, } int ip_defrag(struct net *net, struct sk_buff *skb, u32 user); + +#ifdef CONFIG_DEBUG_INFO_BTF +int register_ip_frag_bpf(void); +#else +static inline int register_ip_frag_bpf(void) +{ + return 0; +} +#endif + #ifdef CONFIG_INET struct sk_buff *ip_check_defrag(struct net *net, struct sk_buff *skb, u32 user); #else diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile index 880277c9fd07..950efb166d37 100644 --- a/net/ipv4/Makefile +++ b/net/ipv4/Makefile @@ -65,6 +65,7 @@ obj-$(CONFIG_TCP_CONG_ILLINOIS) += tcp_illinois.o obj-$(CONFIG_NET_SOCK_MSG) += tcp_bpf.o obj-$(CONFIG_BPF_SYSCALL) += udp_bpf.o obj-$(CONFIG_NETLABEL) += cipso_ipv4.o +obj-$(CONFIG_DEBUG_INFO_BTF) += ip_fragment_bpf.o obj-$(CONFIG_XFRM) += xfrm4_policy.o xfrm4_state.o xfrm4_input.o \ xfrm4_output.o xfrm4_protocol.o diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c index 959d2c4260ea..e3fda5203f09 100644 --- a/net/ipv4/ip_fragment.c +++ b/net/ipv4/ip_fragment.c @@ -759,5 +759,7 @@ void __init ipfrag_init(void) if (inet_frags_init(&ip4_frags)) panic("IP: failed to allocate ip4_frags cache\n"); ip4_frags_ctl_register(); + if (register_ip_frag_bpf()) + panic("IP: bpf: failed to register ip_frag_bpf\n"); register_pernet_subsys(&ip4_frags_ops); } diff --git a/net/ipv4/ip_fragment_bpf.c b/net/ipv4/ip_fragment_bpf.c new file mode 100644 index 
000000000000..a9e5908ed216 --- /dev/null +++ b/net/ipv4/ip_fragment_bpf.c @@ -0,0 +1,98 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Unstable ipv4 fragmentation helpers for TC-BPF hook + * + * These are called from SCHED_CLS BPF programs. Note that it is allowed to + * break compatibility for these functions since the interface they are exposed + * through to BPF programs is explicitly unstable. + */ + +#include +#include +#include +#include +#include +#include +#include + +__diag_push(); +__diag_ignore_all("-Wmissing-prototypes", + "Global functions as their definitions will be in ip_fragment BTF"); + +/* bpf_ip_check_defrag - Defragment an ipv4 packet + * + * This helper takes an skb as input. If this skb successfully reassembles + * the original packet, the skb is updated to contain the original, reassembled + * packet. + * + * Otherwise (on error or incomplete reassembly), the input skb remains + * unmodified. + * + * Parameters: + * @ctx - Pointer to program context (skb) + * @netns - Child network namespace id. If value is a negative signed + * 32-bit integer, the netns of the device in the skb is used. + * + * Return: + * 0 on successful reassembly or non-fragmented packet. Negative value on + * error or incomplete reassembly. + */ +int bpf_ip_check_defrag(struct __sk_buff *ctx, u64 netns) +{ + struct sk_buff *skb = (struct sk_buff *)ctx; + struct sk_buff *skb_cpy, *skb_out; + struct net *caller_net; + struct net *net; + int mac_len; + void *mac; + + if (unlikely(!((s32)netns < 0 || netns <= S32_MAX))) + return -EINVAL; + + caller_net = skb->dev ? 
dev_net(skb->dev) : sock_net(skb->sk); + if ((s32)netns < 0) { + net = caller_net; + } else { + net = get_net_ns_by_id(caller_net, netns); + if (unlikely(!net)) + return -EINVAL; + } + + mac_len = skb->mac_len; + skb_cpy = skb_copy(skb, GFP_ATOMIC); + if (!skb_cpy) + return -ENOMEM; + + skb_out = ip_check_defrag(net, skb_cpy, IP_DEFRAG_BPF); + if (IS_ERR(skb_out)) + return PTR_ERR(skb_out); + + skb_morph(skb, skb_out); + kfree_skb(skb_out); + + /* ip_check_defrag() does not maintain mac header, so push empty header + * in so prog sees the correct layout. The empty mac header will be + * later pulled from cls_bpf. + */ + mac = skb_push(skb, mac_len); + memset(mac, 0, mac_len); + bpf_compute_data_pointers(skb); + + return 0; +} + +__diag_pop() + +BTF_SET8_START(ip_frag_kfunc_set) +BTF_ID_FLAGS(func, bpf_ip_check_defrag, KF_CHANGES_PKT) +BTF_SET8_END(ip_frag_kfunc_set) + +static const struct btf_kfunc_id_set ip_frag_bpf_kfunc_set = { + .owner = THIS_MODULE, + .set = &ip_frag_kfunc_set, +}; + +int register_ip_frag_bpf(void) +{ + return register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, + &ip_frag_bpf_kfunc_set); +}

From patchwork Mon Feb 27 19:51:06 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Xu
X-Patchwork-Id: 13154163
X-Patchwork-Delegate: bpf@iogearbox.net
From: Daniel Xu
To: kuba@kernel.org, edumazet@google.com, davem@davemloft.net, dsahern@kernel.org, pabeni@redhat.com
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, bpf@vger.kernel.org
Subject: [PATCH bpf-next v2 4/8] net: ipv6: Factor ipv6_frag_rcv() to take netns and user
Date: Mon, 27 Feb 2023 12:51:06 -0700
Message-Id: <2928ca6d91690f04f59759bb330e01fcf3f061a7.1677526810.git.dxu@dxuuu.xyz>
X-Mailer: git-send-email 2.39.1
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

Factor _ipv6_frag_rcv() out of ipv6_frag_rcv() such that the former takes a netns and user field. We do this so that the BPF interface for ipv6 defrag can have the same semantics as ipv4 defrag (see ip_check_defrag()).
Signed-off-by: Daniel Xu --- include/net/ipv6.h | 1 + net/ipv6/reassembly.c | 16 +++++++++++----- 2 files changed, 12 insertions(+), 5 deletions(-) diff --git a/include/net/ipv6.h b/include/net/ipv6.h index 7332296eca44..9bbdf82ca6c0 100644 --- a/include/net/ipv6.h +++ b/include/net/ipv6.h @@ -1238,6 +1238,7 @@ int inet6_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, extern const struct proto_ops inet6_stream_ops; extern const struct proto_ops inet6_dgram_ops; extern const struct proto_ops inet6_sockraw_ops; +int _ipv6_frag_rcv(struct net *net, struct sk_buff *skb, u32 user); struct group_source_req; struct group_filter; diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c index 5bc8a28e67f9..5100430eb982 100644 --- a/net/ipv6/reassembly.c +++ b/net/ipv6/reassembly.c @@ -81,13 +81,13 @@ static void ip6_frag_expire(struct timer_list *t) } static struct frag_queue * -fq_find(struct net *net, __be32 id, const struct ipv6hdr *hdr, int iif) +fq_find(struct net *net, __be32 id, const struct ipv6hdr *hdr, int iif, u32 user) { struct frag_v6_compare_key key = { .id = id, .saddr = hdr->saddr, .daddr = hdr->daddr, - .user = IP6_DEFRAG_LOCAL_DELIVER, + .user = user, .iif = iif, }; struct inet_frag_queue *q; @@ -324,12 +324,11 @@ static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *skb, return -1; } -static int ipv6_frag_rcv(struct sk_buff *skb) +int _ipv6_frag_rcv(struct net *net, struct sk_buff *skb, u32 user) { struct frag_hdr *fhdr; struct frag_queue *fq; const struct ipv6hdr *hdr = ipv6_hdr(skb); - struct net *net = dev_net(skb_dst(skb)->dev); u8 nexthdr; int iif; @@ -377,7 +376,7 @@ static int ipv6_frag_rcv(struct sk_buff *skb) } iif = skb->dev ? 
skb->dev->ifindex : 0; - fq = fq_find(net, fhdr->identification, hdr, iif); + fq = fq_find(net, fhdr->identification, hdr, iif, user); if (fq) { u32 prob_offset = 0; int ret; @@ -410,6 +409,13 @@ static int ipv6_frag_rcv(struct sk_buff *skb) return -1; } +static int ipv6_frag_rcv(struct sk_buff *skb) +{ + struct net *net = dev_net(skb_dst(skb)->dev); + + return _ipv6_frag_rcv(net, skb, IP6_DEFRAG_LOCAL_DELIVER); +} + static const struct inet6_protocol frag_protocol = { .handler = ipv6_frag_rcv, .flags = INET6_PROTO_NOPOLICY,

From patchwork Mon Feb 27 19:51:07 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Xu
X-Patchwork-Id: 13154164
X-Patchwork-Delegate: bpf@iogearbox.net
From: Daniel Xu
To: kuba@kernel.org, edumazet@google.com, davem@davemloft.net, dsahern@kernel.org, pabeni@redhat.com
Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org
Subject: [PATCH bpf-next v2 5/8] bpf: net: ipv6: Add bpf_ipv6_frag_rcv() kfunc
Date: Mon, 27 Feb 2023 12:51:07 -0700
X-Mailer: git-send-email 2.39.1
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

This helper is used to defragment IPv6 packets. Similar to the previous bpf_ip_check_defrag() kfunc, this kfunc:

* Returns 0 on defrag + skb update success
* Returns < 0 on error
* Takes care to ensure ctx (skb) remains valid no matter what the underlying call to _ipv6_frag_rcv() does
* Is only callable from TC clsact progs

Please see bpf_ip_check_defrag() commit for more details / suggestions.

Signed-off-by: Daniel Xu --- include/net/ipv6_frag.h | 1 + include/net/transp_v6.h | 1 + net/ipv6/Makefile | 1 + net/ipv6/af_inet6.c | 4 ++ net/ipv6/reassembly_bpf.c | 143 ++++++++++++++++++++++++++++++++++++++ 5 files changed, 150 insertions(+) create mode 100644 net/ipv6/reassembly_bpf.c diff --git a/include/net/ipv6_frag.h b/include/net/ipv6_frag.h index 7321ffe3a108..cf4763cd3886 100644 --- a/include/net/ipv6_frag.h +++ b/include/net/ipv6_frag.h @@ -15,6 +15,7 @@ enum ip6_defrag_users { __IP6_DEFRAG_CONNTRACK_OUT = IP6_DEFRAG_CONNTRACK_OUT + USHRT_MAX, IP6_DEFRAG_CONNTRACK_BRIDGE_IN, __IP6_DEFRAG_CONNTRACK_BRIDGE_IN = IP6_DEFRAG_CONNTRACK_BRIDGE_IN + USHRT_MAX, + IP6_DEFRAG_BPF, }; /* diff --git a/include/net/transp_v6.h b/include/net/transp_v6.h index d27b1caf3753..244123a74349 100644 --- a/include/net/transp_v6.h +++ b/include/net/transp_v6.h @@ -20,6 +20,7 @@ int ipv6_exthdrs_init(void); void ipv6_exthdrs_exit(void); int ipv6_frag_init(void); void ipv6_frag_exit(void); +int register_ipv6_reassembly_bpf(void); /* transport protocols */ int 
pingv6_init(void); diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile index 3036a45e8a1e..6e90ff1d20c0 100644 --- a/net/ipv6/Makefile +++ b/net/ipv6/Makefile @@ -26,6 +26,7 @@ ipv6-$(CONFIG_IPV6_SEG6_LWTUNNEL) += seg6_iptunnel.o seg6_local.o ipv6-$(CONFIG_IPV6_SEG6_HMAC) += seg6_hmac.o ipv6-$(CONFIG_IPV6_RPL_LWTUNNEL) += rpl_iptunnel.o ipv6-$(CONFIG_IPV6_IOAM6_LWTUNNEL) += ioam6_iptunnel.o +ipv6-$(CONFIG_DEBUG_INFO_BTF) += reassembly_bpf.o obj-$(CONFIG_INET6_AH) += ah6.o obj-$(CONFIG_INET6_ESP) += esp6.o diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index 38689bedfce7..39663de75fbd 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -1174,6 +1174,10 @@ static int __init inet6_init(void) if (err) goto ipv6_frag_fail; + err = register_ipv6_reassembly_bpf(); + if (err) + goto ipv6_frag_fail; + /* Init v6 transport protocols. */ err = udpv6_init(); if (err) diff --git a/net/ipv6/reassembly_bpf.c b/net/ipv6/reassembly_bpf.c new file mode 100644 index 000000000000..c6c804d4f636 --- /dev/null +++ b/net/ipv6/reassembly_bpf.c @@ -0,0 +1,143 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Unstable ipv6 fragmentation helpers for TC-BPF hook + * + * These are called from SCHED_CLS BPF programs. Note that it is allowed to + * break compatibility for these functions since the interface they are exposed + * through to BPF programs is explicitly unstable. 
+ */ + +#include +#include +#include +#include +#include +#include +#include + +static int set_dst(struct sk_buff *skb, struct net *net) +{ + const struct ipv6hdr *ip6h = ipv6_hdr(skb); + struct dst_entry *dst; + + struct flowi6 fl6 = { + .flowi6_flags = FLOWI_FLAG_ANYSRC, + .flowi6_mark = skb->mark, + .flowlabel = ip6_flowinfo(ip6h), + .flowi6_iif = skb->skb_iif, + .flowi6_proto = ip6h->nexthdr, + .daddr = ip6h->daddr, + .saddr = ip6h->saddr, + }; + + dst = ipv6_stub->ipv6_dst_lookup_flow(net, NULL, &fl6, NULL); + if (IS_ERR(dst)) + return PTR_ERR(dst); + + skb_dst_set(skb, dst); + + return 0; +} + +__diag_push(); +__diag_ignore_all("-Wmissing-prototypes", + "Global functions as their definitions will be in reassembly BTF"); + +/* bpf_ipv6_frag_rcv - Defragment an ipv6 packet + * + * This helper takes an skb as input. If this skb successfully reassembles + * the original packet, the skb is updated to contain the original, reassembled + * packet. + * + * Otherwise (on error or incomplete reassembly), the input skb remains + * unmodified. + * + * Parameters: + * @ctx - Pointer to program context (skb) + * @netns - Child network namespace id. If value is a negative signed + * 32-bit integer, the netns of the device in the skb is used. + * + * Return: + * 0 on successful reassembly or non-fragmented packet. Negative value on + * error or incomplete reassembly. + */ +int bpf_ipv6_frag_rcv(struct __sk_buff *ctx, u64 netns) +{ + struct sk_buff *skb = (struct sk_buff *)ctx; + struct sk_buff *skb_cpy; + struct net *caller_net; + unsigned int foff; + struct net *net; + int mac_len; + void *mac; + int err; + + if (unlikely(!((s32)netns < 0 || netns <= S32_MAX))) + return -EINVAL; + + caller_net = skb->dev ?
dev_net(skb->dev) : sock_net(skb->sk); + if ((s32)netns < 0) { + net = caller_net; + } else { + net = get_net_ns_by_id(caller_net, netns); + if (unlikely(!net)) + return -EINVAL; + } + + err = set_dst(skb, net); + if (err < 0) + return err; + + mac_len = skb->mac_len; + skb_cpy = skb_copy(skb, GFP_ATOMIC); + if (!skb_cpy) + return -ENOMEM; + + /* _ipv6_frag_rcv() expects skb->transport_header to be set to start of + * the frag header and nhoff to be set. + */ + err = ipv6_find_hdr(skb_cpy, &foff, NEXTHDR_FRAGMENT, NULL, NULL); + if (err < 0) + return err; + skb_set_transport_header(skb_cpy, foff); + IP6CB(skb_cpy)->nhoff = offsetof(struct ipv6hdr, nexthdr); + + /* inet6_protocol handlers return >0 on success, 0 on out of band + * consumption, <0 on error. We never expect to see 0 here. + */ + err = _ipv6_frag_rcv(net, skb_cpy, IP6_DEFRAG_BPF); + if (err < 0) + return err; + else if (err == 0) + return -EINVAL; + + skb_morph(skb, skb_cpy); + kfree_skb(skb_cpy); + + /* _ipv6_frag_rcv() does not maintain mac header, so push empty header + * in so prog sees the correct layout. The empty mac header will be + * later pulled from cls_bpf. 
+ */ + skb->mac_len = mac_len; + mac = skb_push(skb, mac_len); + memset(mac, 0, mac_len); + bpf_compute_data_pointers(skb); + + return 0; +} + +__diag_pop() + +BTF_SET8_START(ipv6_reassembly_kfunc_set) +BTF_ID_FLAGS(func, bpf_ipv6_frag_rcv, KF_CHANGES_PKT) +BTF_SET8_END(ipv6_reassembly_kfunc_set) + +static const struct btf_kfunc_id_set ipv6_reassembly_bpf_kfunc_set = { + .owner = THIS_MODULE, + .set = &ipv6_reassembly_kfunc_set, +}; + +int register_ipv6_reassembly_bpf(void) +{ + return register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, + &ipv6_reassembly_bpf_kfunc_set); +} From patchwork Mon Feb 27 19:51:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Xu X-Patchwork-Id: 13154165 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 833E3C64ED8 for ; Mon, 27 Feb 2023 19:53:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230286AbjB0TxK (ORCPT ); Mon, 27 Feb 2023 14:53:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36910 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230373AbjB0Tw7 (ORCPT ); Mon, 27 Feb 2023 14:52:59 -0500 Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CB92328867; Mon, 27 Feb 2023 11:52:25 -0800 (PST) Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailout.west.internal (Postfix) with ESMTP id F39423200947; Mon, 27 Feb 2023 14:52:15 -0500 (EST) Received: from mailfrontend1 ([10.202.2.162]) by compute5.internal (MEProxy); Mon, 27 Feb 2023 14:52:17 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dxuuu.xyz; h=cc 
:cc:content-transfer-encoding:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to; s=fm3; t=1677527535; x=1677613935; bh=c8 o35w1Jj7teVzLK9zP1vbcoSLG5zTY/3S1Db25AnFg=; b=lrht50y/jtiCKp0+xK Vb21ok4ouCL6GyyPauJ75Jbh8VM78wKMZMZlyMQKyHXfNAi0G1G5AaUrIRoh1X1g SbICxrSc2immkkXH9rGoG3IIyKOwg3aSuk8XmKGn7+lbsn66vYeW1f6DKjoOaXzq muxGK97c0tclwKKxb8zj2EQPMuBk1SWGOeAXrE85Qiv8med3Uv2gMVeIQLlsOffm PRcOii8SFj9tdbNOPVaKEKBa+jJ9vrGVheXMUT/MtG7jz5yAHvi9iEEpvWl7KT8N ipWZJxY+0r/jMoCGMAWYDJtcAcfUAUrumvComhRqFsmA4dRFAwtnRueVJBG7OdaE Nthg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:sender:subject :subject:to:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm1; t=1677527535; x=1677613935; bh=c8o35w1Jj7teV zLK9zP1vbcoSLG5zTY/3S1Db25AnFg=; b=IFx0kw7RqjmJKAWOJydKaR+gZDsLb 0NfnUjqo48NAN9QEpVUGhHGzRUSoD43OWslLJurKT+8WXU6NkesgG68X3hUJDTLy 0t8b2YGmk/aLTCJ53u8URdM9Ou66oIXaxueeAcnAs73gGtrnN0VSi+vAVPE5ZSfy jqwQy71hae7WkmcNFGCDgu86z7E0RIgQfuuj43JWpDXFP46wjKVE7iTjCTOnCr6R m/ykOJ+AZq0tdNmVGcnwErbFzeJ0eJptQPAUzYXLrKD1jtSHn9Jd5sGwciyKEkm/ VaGkjfEqMjuNt83SaLu72F2kQTGyfvUxpDCFNo8/qbQK4zasS4lvdrUgw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvhedrudeltddguddvhecutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfgh necuuegrihhlohhuthemuceftddtnecufghrlhcuvffnffculdejtddmnecujfgurhephf fvvefufffkofgjfhgggfestdekredtredttdenucfhrhhomhepffgrnhhivghlucgiuhcu oegugihusegugihuuhhurdighiiiqeenucggtffrrghtthgvrhhnpefgfefggeejhfduie ekvdeuteffleeifeeuvdfhheejleejjeekgfffgefhtddtteenucevlhhushhtvghrufhi iigvpedunecurfgrrhgrmhepmhgrihhlfhhrohhmpegugihusegugihuuhhurdighiii X-ME-Proxy: Feedback-ID: i6a694271:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 27 Feb 
2023 14:52:14 -0500 (EST) From: Daniel Xu To: andrii@kernel.org, daniel@iogearbox.net, shuah@kernel.org, ast@kernel.org Cc: martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, mykolal@fb.com, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH bpf-next v2 6/8] bpf: selftests: Support not connecting client socket Date: Mon, 27 Feb 2023 12:51:08 -0700 Message-Id: <1e98c66945fdba2b4665e9b9bdf084757ca8a112.1677526810.git.dxu@dxuuu.xyz> X-Mailer: git-send-email 2.39.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net For connectionless protocols or raw sockets we do not want to actually connect() to the server. Signed-off-by: Daniel Xu --- tools/testing/selftests/bpf/network_helpers.c | 5 +++-- tools/testing/selftests/bpf/network_helpers.h | 1 + 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c index 01de33191226..24f5efebc7dd 100644 --- a/tools/testing/selftests/bpf/network_helpers.c +++ b/tools/testing/selftests/bpf/network_helpers.c @@ -301,8 +301,9 @@ int connect_to_fd_opts(int server_fd, const struct network_helper_opts *opts) strlen(opts->cc) + 1)) goto error_close; - if (connect_fd_to_addr(fd, &addr, addrlen, opts->must_fail)) - goto error_close; + if (!opts->noconnect) + if (connect_fd_to_addr(fd, &addr, addrlen, opts->must_fail)) + goto error_close; return fd; diff --git a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h index f882c691b790..8be04cd76d8b 100644 --- a/tools/testing/selftests/bpf/network_helpers.h +++ b/tools/testing/selftests/bpf/network_helpers.h @@ -21,6 +21,7 @@ struct network_helper_opts { const char *cc; int timeout_ms; bool must_fail; + bool noconnect; }; 
/* ipv4 test vector */ From patchwork Mon Feb 27 19:51:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Xu X-Patchwork-Id: 13154166 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A9D22C7EE2D for ; Mon, 27 Feb 2023 19:53:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230411AbjB0TxY (ORCPT ); Mon, 27 Feb 2023 14:53:24 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38158 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230470AbjB0TxK (ORCPT ); Mon, 27 Feb 2023 14:53:10 -0500 Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D7E3B298E2; Mon, 27 Feb 2023 11:52:46 -0800 (PST) Received: from compute3.internal (compute3.nyi.internal [10.202.2.43]) by mailout.west.internal (Postfix) with ESMTP id 4F9D23200955; Mon, 27 Feb 2023 14:52:22 -0500 (EST) Received: from mailfrontend1 ([10.202.2.162]) by compute3.internal (MEProxy); Mon, 27 Feb 2023 14:52:23 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dxuuu.xyz; h=cc :cc:content-transfer-encoding:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to; s=fm3; t=1677527541; x=1677613941; bh=pJ 73DrAYXIIgt6NXNbsY1qViBhntRO01+UwF0Z/mWJw=; b=MDIPO55uf1Rb/b4t3e jiFTioMtMkDuBgR4BulVyFNVLTKTzPBqjzkyOB42QBpmAb0736fxW8suE7MzxVQE m166WOWMxdx2ye6ajv33y5v6YOtJUKl4FjT6CDzdYpaxd87+4HZZXY3sgJfNgzKr My7z5vPM1cBZI0GM8V7gQ4RE1K0rzgmBY0arz7EVXp5CetjY1VczIrQIt1tKFWzO Y9T5ElwC4Z9hC4l5T7GMyDlcW6t2JAEncSBIQCMIW390p8BStzUS0kBzryd3dDvZ 
SAcY/wVzNhEG18P2mfERIn+8ZMuCHhll4PROWinAi0egMClEBSkHM4uqT8FL19eH Mxog== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:sender:subject :subject:to:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm1; t=1677527541; x=1677613941; bh=pJ73DrAYXIIgt 6NXNbsY1qViBhntRO01+UwF0Z/mWJw=; b=kz+0nHbGjE2FFDAJ001z+0EeK/zzD DwucsUCmtlowxtidYhNqKQfYSSyBzDvZcDmav5+abm4zUGke1tewTtUoLs7IwGNW PO7DFlNKI2I3DODBUIjW9FRPdwmSkI1WtUGGRD73xUkjrA1n5Aw9KaEhC5WrUTZ+ aAXR2GA6T4qPzWcxdaZjV53lJKBuKJlEtVWlXlSDM4EogrTbk3Nn1f5/QMP4CQEY JcU+DFWKhCAjfQDGLrhcpEhKNuudtSWxvuvkH2e4SD1cVM/KpEvipIWZzq0Pqi/S cM0CnNZjqdnqMsJoHhRAI8tY2PD+Y7zOtJRmHVTQm8SKgQgVanD+dQgog== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvhedrudeltddguddviecutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfgh necuuegrihhlohhuthemuceftddtnecufghrlhcuvffnffculdejtddmnecujfgurhephf fvvefufffkofgjfhgggfestdekredtredttdenucfhrhhomhepffgrnhhivghlucgiuhcu oegugihusegugihuuhhurdighiiiqeenucggtffrrghtthgvrhhnpefgfefggeejhfduie ekvdeuteffleeifeeuvdfhheejleejjeekgfffgefhtddtteenucevlhhushhtvghrufhi iigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpegugihusegugihuuhhurdighiii X-ME-Proxy: Feedback-ID: i6a694271:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 27 Feb 2023 14:52:20 -0500 (EST) From: Daniel Xu To: andrii@kernel.org, daniel@iogearbox.net, shuah@kernel.org, ast@kernel.org Cc: mykolal@fb.com, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH bpf-next v2 7/8] bpf: selftests: Support custom type and proto for client sockets Date: Mon, 27 Feb 2023 12:51:09 -0700 
Message-Id: <4c74acce194c9896da3c84cbaf6191f3c706845c.1677526810.git.dxu@dxuuu.xyz> X-Mailer: git-send-email 2.39.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Extend connect_to_fd_opts() to take optional type and protocol parameters for the client socket. These parameters are useful when opening a raw socket to send IP fragments. Signed-off-by: Daniel Xu --- tools/testing/selftests/bpf/network_helpers.c | 21 +++++++++++++------ tools/testing/selftests/bpf/network_helpers.h | 2 ++ 2 files changed, 17 insertions(+), 6 deletions(-) diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c index 24f5efebc7dd..4f9ba90b1b7e 100644 --- a/tools/testing/selftests/bpf/network_helpers.c +++ b/tools/testing/selftests/bpf/network_helpers.c @@ -270,14 +270,23 @@ int connect_to_fd_opts(int server_fd, const struct network_helper_opts *opts) opts = &default_opts; optlen = sizeof(type); - if (getsockopt(server_fd, SOL_SOCKET, SO_TYPE, &type, &optlen)) { - log_err("getsockopt(SOL_TYPE)"); - return -1; + + if (opts->type) { + type = opts->type; + } else { + if (getsockopt(server_fd, SOL_SOCKET, SO_TYPE, &type, &optlen)) { + log_err("getsockopt(SOL_TYPE)"); + return -1; + } } - if (getsockopt(server_fd, SOL_SOCKET, SO_PROTOCOL, &protocol, &optlen)) { - log_err("getsockopt(SOL_PROTOCOL)"); - return -1; + if (opts->proto) { + protocol = opts->proto; + } else { + if (getsockopt(server_fd, SOL_SOCKET, SO_PROTOCOL, &protocol, &optlen)) { + log_err("getsockopt(SOL_PROTOCOL)"); + return -1; + } } addrlen = sizeof(addr); diff --git a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h index 8be04cd76d8b..7119804ea79b 100644 --- a/tools/testing/selftests/bpf/network_helpers.h +++ b/tools/testing/selftests/bpf/network_helpers.h @@ -22,6 +22,8 @@ struct network_helper_opts { int timeout_ms; bool must_fail; 
bool noconnect; + int type; + int proto; }; /* ipv4 test vector */ From patchwork Mon Feb 27 19:51:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Xu X-Patchwork-Id: 13154167 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 196C6C7EE2D for ; Mon, 27 Feb 2023 19:53:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229841AbjB0Txm (ORCPT ); Mon, 27 Feb 2023 14:53:42 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38440 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230045AbjB0Txh (ORCPT ); Mon, 27 Feb 2023 14:53:37 -0500 Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 861B11FC5; Mon, 27 Feb 2023 11:53:06 -0800 (PST) Received: from compute2.internal (compute2.nyi.internal [10.202.2.46]) by mailout.west.internal (Postfix) with ESMTP id A9B8A3200951; Mon, 27 Feb 2023 14:52:37 -0500 (EST) Received: from mailfrontend1 ([10.202.2.162]) by compute2.internal (MEProxy); Mon, 27 Feb 2023 14:52:39 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dxuuu.xyz; h=cc :cc:content-transfer-encoding:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to; s=fm3; t=1677527557; x=1677613957; bh=kC /vImF3ygzXdPRqMN8FHSC+Z1VznBI2VSoBhnnxh40=; b=L43SeDFAr4a88CJ7B5 /ddQTXKm4Cff/D9fv3+iMRdt9jZFoOfSdd4tvyK+a1kWMEKFrKU61m+ZHfVRUDOz M11ZX2hMFMsczWcnicRzP0FEU/QYBfK9vaRO6zWfwFarMHCSThhfrrvswmU0C2oA zpafWdxsEoCkUH4Bx8LaqC+QVeLZt/+Gl590n+DwuK7cFbmo6S0lFyBR7Ct85zof 22oFrOvW1KlSJAtcvOb2hOSEjxI+w5HImk99I9VzpeGi7pY3pvuFP4JAaG8uWbU9 
SKpiTtwW0k68/mZl/2emvfXYN/sEwBfd4DlaVuwdwDAsEmW7l/bwoZaHR4v44X17 VV/A== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:sender:subject :subject:to:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm1; t=1677527557; x=1677613957; bh=kC/vImF3ygzXd PRqMN8FHSC+Z1VznBI2VSoBhnnxh40=; b=QlmpONWXMW+HHz5T//TDxSgjrNZx5 E9u/RJSJElddQWU86Tpmx96fFDNUbCTGLiN39rtlhkqPr80JP6Nf+EJWDLsZ3P+k Jy4JYCehVt6MAwZC1IRgu64431znkHgOXBbvPFXdNbfa0pXsqT/CaDZajT0eNcuZ GO5j7RLHw8RmApXhKKPbFZx3Xo81TIyW7veFtr29mIOJRERi5ESgrHrh4rm0ZYJ9 N2yIbImuvKhtKULxx68rEy8timGvXzB9TXPYCopT7eN7DxALbdnHEhrRPsQndlIL Viaboelp4PMeAZaQRp8id4Tx8dgvmh1BNdipbAkyH0A4a6h7NJUZtrp4Q== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvhedrudeltddguddviecutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfgh necuuegrihhlohhuthemuceftddtnecufghrlhcuvffnffculdejtddmnecujfgurhephf fvvefufffkofgjfhgggfestdekredtredttdenucfhrhhomhepffgrnhhivghlucgiuhcu oegugihusegugihuuhhurdighiiiqeenucggtffrrghtthgvrhhnpefgfefggeejhfduie ekvdeuteffleeifeeuvdfhheejleejjeekgfffgefhtddtteenucevlhhushhtvghrufhi iigvpeefnecurfgrrhgrmhepmhgrihhlfhhrohhmpegugihusegugihuuhhurdighiii X-ME-Proxy: Feedback-ID: i6a694271:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 27 Feb 2023 14:52:35 -0500 (EST) From: Daniel Xu To: andrii@kernel.org, daniel@iogearbox.net, shuah@kernel.org, ast@kernel.org Cc: martin.lau@linux.dev, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, mykolal@fb.com, linux-kernel@vger.kernel.org, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org Subject: [PATCH bpf-next v2 8/8] bpf: selftests: Add defrag selftests Date: Mon, 27 Feb 2023 12:51:10 -0700 Message-Id: 
<99ddd1e2b35f6133c1f49a0245340e1a8aaaf32f.1677526810.git.dxu@dxuuu.xyz> X-Mailer: git-send-email 2.39.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net These selftests test 2 major scenarios: that BPF based defragmentation can be done successfully and that packet pointers are invalidated after calls to the kfunc. The logic is similar for both ipv4 and ipv6. In the first scenario, we create a UDP client and UDP echo server. The server side is fairly straightforward: we attach the prog and simply echo back the message. On the client side, we send fragmented packets to the server and expect the reassembled message back from it. Signed-off-by: Daniel Xu --- tools/testing/selftests/bpf/Makefile | 3 +- .../selftests/bpf/generate_udp_fragments.py | 90 +++++ .../selftests/bpf/ip_check_defrag_frags.h | 57 +++ .../bpf/prog_tests/ip_check_defrag.c | 327 ++++++++++++++++++ .../selftests/bpf/progs/bpf_tracing_net.h | 1 + .../selftests/bpf/progs/ip_check_defrag.c | 133 +++++++ 6 files changed, 610 insertions(+), 1 deletion(-) create mode 100755 tools/testing/selftests/bpf/generate_udp_fragments.py create mode 100644 tools/testing/selftests/bpf/ip_check_defrag_frags.h create mode 100644 tools/testing/selftests/bpf/prog_tests/ip_check_defrag.c create mode 100644 tools/testing/selftests/bpf/progs/ip_check_defrag.c diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index b677dcd0b77a..979af1611139 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -558,7 +558,8 @@ TRUNNER_BPF_PROGS_DIR := progs TRUNNER_EXTRA_SOURCES := test_progs.c cgroup_helpers.c trace_helpers.c \ network_helpers.c testing_helpers.c \ btf_helpers.c flow_dissector_load.h \ - cap_helpers.c test_loader.c xsk.c + cap_helpers.c test_loader.c xsk.c \ + ip_check_defrag_frags.h TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read
$(OUTPUT)/bpf_testmod.ko \ $(OUTPUT)/liburandom_read.so \ $(OUTPUT)/xdp_synproxy \ diff --git a/tools/testing/selftests/bpf/generate_udp_fragments.py b/tools/testing/selftests/bpf/generate_udp_fragments.py new file mode 100755 index 000000000000..2b8a1187991c --- /dev/null +++ b/tools/testing/selftests/bpf/generate_udp_fragments.py @@ -0,0 +1,90 @@ +#!/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 + +""" +This script helps generate fragmented UDP packets. + +While it is technically possible to dynamically generate +fragmented packets in C, it is much harder to read and write +said code. `scapy` is relatively industry standard and really +easy to read / write. + +So we choose to write this script that generates a valid C +header. Rerun script and commit generated file after any +modifications. +""" + +import argparse +import os + +from scapy.all import * + + +# These constants must stay in sync with `ip_check_defrag.c` +VETH1_ADDR = "172.16.1.200" +VETH0_ADDR6 = "fc00::100" +VETH1_ADDR6 = "fc00::200" +CLIENT_PORT = 48878 +SERVER_PORT = 48879 +MAGIC_MESSAGE = "THIS IS THE ORIGINAL MESSAGE, PLEASE REASSEMBLE ME" + + +def print_header(f): + f.write("// SPDX-License-Identifier: GPL-2.0\n") + f.write("/* DO NOT EDIT -- this file is generated */\n") + f.write("\n") + f.write("#ifndef _IP_CHECK_DEFRAG_FRAGS_H\n") + f.write("#define _IP_CHECK_DEFRAG_FRAGS_H\n") + f.write("\n") + f.write("#include \n") + f.write("\n") + + +def print_frags(f, frags, v6): + for idx, frag in enumerate(frags): + # 10 bytes per line to keep width in check + chunks = [frag[i : i + 10] for i in range(0, len(frag), 10)] + chunks_fmted = [", ".join([str(hex(b)) for b in chunk]) for chunk in chunks] + suffix = "6" if v6 else "" + + f.write(f"static uint8_t frag{suffix}_{idx}[] = {{\n") + for chunk in chunks_fmted: + f.write(f"\t{chunk},\n") + f.write(f"}};\n") + + +def print_trailer(f): + f.write("\n") + f.write("#endif /* _IP_CHECK_DEFRAG_FRAGS_H */\n") + + +def main(f): + # srcip of 0 is 
filled in by IP_HDRINCL + sip = "0.0.0.0" + sip6 = VETH0_ADDR6 + dip = VETH1_ADDR + dip6 = VETH1_ADDR6 + sport = CLIENT_PORT + dport = SERVER_PORT + payload = MAGIC_MESSAGE.encode() + + # Disable UDPv4 checksums to keep code simpler + pkt = IP(src=sip,dst=dip) / UDP(sport=sport,dport=dport,chksum=0) / Raw(load=payload) + # UDPv6 requires a checksum + # Also pin the ipv6 fragment header ID, otherwise it's a random value + pkt6 = IPv6(src=sip6,dst=dip6) / IPv6ExtHdrFragment(id=0xBEEF) / UDP(sport=sport,dport=dport) / Raw(load=payload) + + frags = [f.build() for f in pkt.fragment(24)] + frags6 = [f.build() for f in fragment6(pkt6, 72)] + + print_header(f) + print_frags(f, frags, False) + print_frags(f, frags6, True) + print_trailer(f) + + +if __name__ == "__main__": + dir = os.path.dirname(os.path.realpath(__file__)) + header = f"{dir}/ip_check_defrag_frags.h" + with open(header, "w") as f: + main(f) diff --git a/tools/testing/selftests/bpf/ip_check_defrag_frags.h b/tools/testing/selftests/bpf/ip_check_defrag_frags.h new file mode 100644 index 000000000000..70ab7e9fa22b --- /dev/null +++ b/tools/testing/selftests/bpf/ip_check_defrag_frags.h @@ -0,0 +1,57 @@ +// SPDX-License-Identifier: GPL-2.0 +/* DO NOT EDIT -- this file is generated */ + +#ifndef _IP_CHECK_DEFRAG_FRAGS_H +#define _IP_CHECK_DEFRAG_FRAGS_H + +#include + +static uint8_t frag_0[] = { + 0x45, 0x0, 0x0, 0x2c, 0x0, 0x1, 0x20, 0x0, 0x40, 0x11, + 0xac, 0xe8, 0x0, 0x0, 0x0, 0x0, 0xac, 0x10, 0x1, 0xc8, + 0xbe, 0xee, 0xbe, 0xef, 0x0, 0x3a, 0x0, 0x0, 0x54, 0x48, + 0x49, 0x53, 0x20, 0x49, 0x53, 0x20, 0x54, 0x48, 0x45, 0x20, + 0x4f, 0x52, 0x49, 0x47, +}; +static uint8_t frag_1[] = { + 0x45, 0x0, 0x0, 0x2c, 0x0, 0x1, 0x20, 0x3, 0x40, 0x11, + 0xac, 0xe5, 0x0, 0x0, 0x0, 0x0, 0xac, 0x10, 0x1, 0xc8, + 0x49, 0x4e, 0x41, 0x4c, 0x20, 0x4d, 0x45, 0x53, 0x53, 0x41, + 0x47, 0x45, 0x2c, 0x20, 0x50, 0x4c, 0x45, 0x41, 0x53, 0x45, + 0x20, 0x52, 0x45, 0x41, +}; +static uint8_t frag_2[] = { + 0x45, 0x0, 0x0, 0x1e, 0x0, 0x1, 0x0, 
0x6, 0x40, 0x11, + 0xcc, 0xf0, 0x0, 0x0, 0x0, 0x0, 0xac, 0x10, 0x1, 0xc8, + 0x53, 0x53, 0x45, 0x4d, 0x42, 0x4c, 0x45, 0x20, 0x4d, 0x45, +}; +static uint8_t frag6_0[] = { + 0x60, 0x0, 0x0, 0x0, 0x0, 0x20, 0x2c, 0x40, 0xfc, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x1, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2, 0x0, + 0x11, 0x0, 0x0, 0x1, 0x0, 0x0, 0xbe, 0xef, 0xbe, 0xee, + 0xbe, 0xef, 0x0, 0x3a, 0xd0, 0xf8, 0x54, 0x48, 0x49, 0x53, + 0x20, 0x49, 0x53, 0x20, 0x54, 0x48, 0x45, 0x20, 0x4f, 0x52, + 0x49, 0x47, +}; +static uint8_t frag6_1[] = { + 0x60, 0x0, 0x0, 0x0, 0x0, 0x20, 0x2c, 0x40, 0xfc, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x1, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2, 0x0, + 0x11, 0x0, 0x0, 0x19, 0x0, 0x0, 0xbe, 0xef, 0x49, 0x4e, + 0x41, 0x4c, 0x20, 0x4d, 0x45, 0x53, 0x53, 0x41, 0x47, 0x45, + 0x2c, 0x20, 0x50, 0x4c, 0x45, 0x41, 0x53, 0x45, 0x20, 0x52, + 0x45, 0x41, +}; +static uint8_t frag6_2[] = { + 0x60, 0x0, 0x0, 0x0, 0x0, 0x12, 0x2c, 0x40, 0xfc, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x1, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2, 0x0, + 0x11, 0x0, 0x0, 0x30, 0x0, 0x0, 0xbe, 0xef, 0x53, 0x53, + 0x45, 0x4d, 0x42, 0x4c, 0x45, 0x20, 0x4d, 0x45, +}; + +#endif /* _IP_CHECK_DEFRAG_FRAGS_H */ diff --git a/tools/testing/selftests/bpf/prog_tests/ip_check_defrag.c b/tools/testing/selftests/bpf/prog_tests/ip_check_defrag.c new file mode 100644 index 000000000000..c79c4096aab4 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/ip_check_defrag.c @@ -0,0 +1,327 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include "ip_check_defrag.skel.h" +#include "ip_check_defrag_frags.h" + +/* + * This selftest spins up a client and an echo server, each in their own + * network namespace. 
The server will receive fragmented messages which + * the attached BPF prog should reassemble. We verify that reassembly + * occurred by checking the original (fragmented) message is received + * in whole. + * + * Topology: + * ========= + * NS0 | NS1 + * | + * client | server + * ---------- | ---------- + * | veth0 | --------- | veth1 | + * ---------- peer ---------- + * | + * | with bpf + */ + +#define NS0 "defrag_ns0" +#define NS1 "defrag_ns1" +#define VETH0 "veth0" +#define VETH1 "veth1" +#define VETH0_ADDR "172.16.1.100" +#define VETH0_ADDR6 "fc00::100" +/* The following constants must stay in sync with `generate_udp_fragments.py` */ +#define VETH1_ADDR "172.16.1.200" +#define VETH1_ADDR6 "fc00::200" +#define CLIENT_PORT 48878 +#define SERVER_PORT 48879 +#define MAGIC_MESSAGE "THIS IS THE ORIGINAL MESSAGE, PLEASE REASSEMBLE ME" + +static char log_buf[1024 * 1024]; + +static int setup_topology(bool ipv6) +{ + bool veth0_up; + bool veth1_up; + int i; + + SYS(fail, "ip netns add " NS0); + SYS(fail, "ip netns add " NS1); + SYS(fail, "ip link add " VETH0 " netns " NS0 " type veth peer name " VETH1 " netns " NS1); + if (ipv6) { + SYS(fail, "ip -6 -net " NS0 " addr add " VETH0_ADDR6 "/64 dev " VETH0 " nodad"); + SYS(fail, "ip -6 -net " NS1 " addr add " VETH1_ADDR6 "/64 dev " VETH1 " nodad"); + } else { + SYS(fail, "ip -net " NS0 " addr add " VETH0_ADDR "/24 dev " VETH0); + SYS(fail, "ip -net " NS1 " addr add " VETH1_ADDR "/24 dev " VETH1); + } + SYS(fail, "ip -net " NS0 " link set dev " VETH0 " up"); + SYS(fail, "ip -net " NS1 " link set dev " VETH1 " up"); + + /* Wait for up to 5s for links to come up */ + for (i = 0; i < 50; ++i) { + veth0_up = !system("ip -net " NS0 " link show " VETH0 " | grep 'state UP'"); + veth1_up = !system("ip -net " NS1 " link show " VETH1 " | grep 'state UP'"); + if (veth0_up && veth1_up) + break; + usleep(100000); + } + + if (!ASSERT_TRUE((veth0_up && veth1_up), "ifaces up")) + goto fail; + + return 0; +fail: + return -1; +} + +static 
void cleanup_topology(void) +{ + SYS_NOFAIL("test -f /var/run/netns/" NS0 " && ip netns delete " NS0); + SYS_NOFAIL("test -f /var/run/netns/" NS1 " && ip netns delete " NS1); +} + +static int attach(struct ip_check_defrag *skel) +{ + LIBBPF_OPTS(bpf_tc_hook, tc_hook, + .attach_point = BPF_TC_INGRESS); + LIBBPF_OPTS(bpf_tc_opts, tc_attach, + .prog_fd = bpf_program__fd(skel->progs.defrag)); + struct nstoken *nstoken; + int err = -1; + + nstoken = open_netns(NS1); + + tc_hook.ifindex = if_nametoindex(VETH1); + if (!ASSERT_OK(bpf_tc_hook_create(&tc_hook), "bpf_tc_hook_create")) + goto out; + + if (!ASSERT_OK(bpf_tc_attach(&tc_hook, &tc_attach), "bpf_tc_attach")) + goto out; + + err = 0; +out: + close_netns(nstoken); + return err; +} + +static int send_frags(int client) +{ + struct sockaddr_storage saddr; + struct sockaddr *saddr_p; + socklen_t saddr_len; + int err; + + saddr_p = (struct sockaddr *)&saddr; + err = make_sockaddr(AF_INET, VETH1_ADDR, SERVER_PORT, &saddr, &saddr_len); + if (!ASSERT_OK(err, "make_sockaddr")) + return -1; + + err = sendto(client, frag_0, sizeof(frag_0), 0, saddr_p, saddr_len); + if (!ASSERT_GE(err, 0, "sendto frag_0")) + return -1; + + err = sendto(client, frag_1, sizeof(frag_1), 0, saddr_p, saddr_len); + if (!ASSERT_GE(err, 0, "sendto frag_1")) + return -1; + + err = sendto(client, frag_2, sizeof(frag_2), 0, saddr_p, saddr_len); + if (!ASSERT_GE(err, 0, "sendto frag_2")) + return -1; + + return 0; +} + +static int send_frags6(int client) +{ + struct sockaddr_storage saddr; + struct sockaddr *saddr_p; + socklen_t saddr_len; + int err; + + saddr_p = (struct sockaddr *)&saddr; + /* Port needs to be set to 0 for raw ipv6 socket for some reason */ + err = make_sockaddr(AF_INET6, VETH1_ADDR6, 0, &saddr, &saddr_len); + if (!ASSERT_OK(err, "make_sockaddr")) + return -1; + + err = sendto(client, frag6_0, sizeof(frag6_0), 0, saddr_p, saddr_len); + if (!ASSERT_GE(err, 0, "sendto frag6_0")) + return -1; + + err = sendto(client, frag6_1, 
sizeof(frag6_1), 0, saddr_p, saddr_len); + if (!ASSERT_GE(err, 0, "sendto frag6_1")) + return -1; + + err = sendto(client, frag6_2, sizeof(frag6_2), 0, saddr_p, saddr_len); + if (!ASSERT_GE(err, 0, "sendto frag6_2")) + return -1; + + return 0; +} + +void test_bpf_ip_check_defrag_ok(bool ipv6) +{ + struct network_helper_opts rx_opts = { + .timeout_ms = 1000, + .noconnect = true, + }; + struct network_helper_opts tx_ops = { + .timeout_ms = 1000, + .type = SOCK_RAW, + .proto = IPPROTO_RAW, + .noconnect = true, + }; + struct sockaddr_storage caddr; + struct ip_check_defrag *skel; + struct nstoken *nstoken; + int client_tx_fd = -1; + int client_rx_fd = -1; + socklen_t caddr_len; + int srv_fd = -1; + char buf[1024]; + int len, err; + + skel = ip_check_defrag__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + return; + + if (!ASSERT_OK(setup_topology(ipv6), "setup_topology")) + goto out; + + if (!ASSERT_OK(attach(skel), "attach")) + goto out; + + /* Start server in ns1 */ + nstoken = open_netns(NS1); + if (!ASSERT_OK_PTR(nstoken, "setns ns1")) + goto out; + srv_fd = start_server(ipv6 ? 
AF_INET6 : AF_INET, SOCK_DGRAM, NULL, SERVER_PORT, 0); + close_netns(nstoken); + if (!ASSERT_GE(srv_fd, 0, "start_server")) + goto out; + + /* Open tx raw socket in ns0 */ + nstoken = open_netns(NS0); + if (!ASSERT_OK_PTR(nstoken, "setns ns0")) + goto out; + client_tx_fd = connect_to_fd_opts(srv_fd, &tx_ops); + close_netns(nstoken); + if (!ASSERT_GE(client_tx_fd, 0, "connect_to_fd_opts")) + goto out; + + /* Open rx socket in ns0 */ + nstoken = open_netns(NS0); + if (!ASSERT_OK_PTR(nstoken, "setns ns0")) + goto out; + client_rx_fd = connect_to_fd_opts(srv_fd, &rx_opts); + close_netns(nstoken); + if (!ASSERT_GE(client_rx_fd, 0, "connect_to_fd_opts")) + goto out; + + /* Bind rx socket to a predetermined port */ + memset(&caddr, 0, sizeof(caddr)); + nstoken = open_netns(NS0); + if (!ASSERT_OK_PTR(nstoken, "setns ns0")) + goto out; + if (ipv6) { + struct sockaddr_in6 *c = (struct sockaddr_in6 *)&caddr; + + c->sin6_family = AF_INET6; + inet_pton(AF_INET6, VETH0_ADDR6, &c->sin6_addr); + c->sin6_port = htons(CLIENT_PORT); + err = bind(client_rx_fd, (struct sockaddr *)c, sizeof(*c)); + } else { + struct sockaddr_in *c = (struct sockaddr_in *)&caddr; + + c->sin_family = AF_INET; + inet_pton(AF_INET, VETH0_ADDR, &c->sin_addr); + c->sin_port = htons(CLIENT_PORT); + err = bind(client_rx_fd, (struct sockaddr *)c, sizeof(*c)); + } + close_netns(nstoken); + if (!ASSERT_OK(err, "bind")) + goto out; + + /* Send message in fragments */ + if (ipv6) { + if (!ASSERT_OK(send_frags6(client_tx_fd), "send_frags6")) + goto out; + } else { + if (!ASSERT_OK(send_frags(client_tx_fd), "send_frags")) + goto out; + } + + if (!ASSERT_EQ(skel->bss->frags_seen, 3, "frags_seen")) + goto out; + + if (!ASSERT_FALSE(skel->data->is_final_frag, "is_final_frag")) + goto out; + + /* Receive reassembled msg on server and echo back to client */ + caddr_len = sizeof(caddr); + len = recvfrom(srv_fd, buf, sizeof(buf), 0, (struct sockaddr *)&caddr, &caddr_len); + if (!ASSERT_GE(len, 0, "server recvfrom")) + goto out; + len = sendto(srv_fd, 
buf, len, 0, (struct sockaddr *)&caddr, caddr_len); + if (!ASSERT_GE(len, 0, "server sendto")) + goto out; + + /* Expect reassembled message to be echoed back */ + len = recvfrom(client_rx_fd, buf, sizeof(buf), 0, NULL, NULL); + if (!ASSERT_EQ(len, sizeof(MAGIC_MESSAGE) - 1, "client short read")) + goto out; + +out: + if (client_rx_fd != -1) + close(client_rx_fd); + if (client_tx_fd != -1) + close(client_tx_fd); + if (srv_fd != -1) + close(srv_fd); + cleanup_topology(); + ip_check_defrag__destroy(skel); +} + +void test_bpf_ip_check_defrag_fail(void) +{ + const char *err_msg = "invalid mem access 'scalar'"; + LIBBPF_OPTS(bpf_object_open_opts, opts, + .kernel_log_buf = log_buf, + .kernel_log_size = sizeof(log_buf), + .kernel_log_level = 1); + struct ip_check_defrag *skel; + struct bpf_program *prog; + int err; + + skel = ip_check_defrag__open_opts(&opts); + if (!ASSERT_OK_PTR(skel, "ip_check_defrag__open_opts")) + return; + + prog = bpf_object__find_program_by_name(skel->obj, "defrag_fail"); + if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name")) + goto out; + + bpf_program__set_autoload(prog, true); + + err = ip_check_defrag__load(skel); + if (!ASSERT_ERR(err, "ip_check_defrag__load must fail")) + goto out; + + if (!ASSERT_OK_PTR(strstr(log_buf, err_msg), "expected error message")) { + fprintf(stderr, "Expected: %s\n", err_msg); + fprintf(stderr, "Verifier: %s\n", log_buf); + } + +out: + ip_check_defrag__destroy(skel); +} + +void test_bpf_ip_check_defrag(void) +{ + if (test__start_subtest("ok-v4")) + test_bpf_ip_check_defrag_ok(false); + if (test__start_subtest("ok-v6")) + test_bpf_ip_check_defrag_ok(true); + if (test__start_subtest("fail")) + test_bpf_ip_check_defrag_fail(); +} diff --git a/tools/testing/selftests/bpf/progs/bpf_tracing_net.h b/tools/testing/selftests/bpf/progs/bpf_tracing_net.h index cfed4df490f3..fde688b8af16 100644 --- a/tools/testing/selftests/bpf/progs/bpf_tracing_net.h +++ b/tools/testing/selftests/bpf/progs/bpf_tracing_net.h @@ -26,6 
+26,7 @@ #define IPV6_AUTOFLOWLABEL 70 #define TC_ACT_UNSPEC (-1) +#define TC_ACT_OK 0 #define TC_ACT_SHOT 2 #define SOL_TCP 6 diff --git a/tools/testing/selftests/bpf/progs/ip_check_defrag.c b/tools/testing/selftests/bpf/progs/ip_check_defrag.c new file mode 100644 index 000000000000..5978fd2dd479 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/ip_check_defrag.c @@ -0,0 +1,133 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include "vmlinux.h" +#include +#include +#include "bpf_tracing_net.h" + +#define BPF_F_CURRENT_NETNS (-1) +#define ETH_P_IP 0x0800 +#define ETH_P_IPV6 0x86DD +#define IP_DF 0x4000 +#define IP_MF 0x2000 +#define IP_OFFSET 0x1FFF +#define NEXTHDR_FRAGMENT 44 +#define ctx_ptr(field) (void *)(long)(field) + +int bpf_ip_check_defrag(struct __sk_buff *ctx, u64 netns) __ksym; +int bpf_ipv6_frag_rcv(struct __sk_buff *ctx, u64 netns) __ksym; + +volatile int frags_seen = 0; +volatile bool is_final_frag = true; + +static bool is_frag_v4(struct iphdr *iph) +{ + int offset; + int flags; + + offset = bpf_ntohs(iph->frag_off); + flags = offset & ~IP_OFFSET; + offset &= IP_OFFSET; + offset <<= 3; + + return (flags & IP_MF) || offset; +} + +static bool is_frag_v6(struct ipv6hdr *ip6h) +{ + /* Simplifying assumption that there are no extension headers + * between fixed header and fragmentation header. This assumption + * is only valid in this test case. It saves us the hassle of + * searching all potential extension headers. 
+ */ + return ip6h->nexthdr == NEXTHDR_FRAGMENT; +} + +static int defrag_v4(struct __sk_buff *skb) +{ + void *data_end = ctx_ptr(skb->data_end); + void *data = ctx_ptr(skb->data); + struct iphdr *iph; + + iph = data + sizeof(struct ethhdr); + if (iph + 1 > data_end) + return TC_ACT_SHOT; + + if (!is_frag_v4(iph)) + return TC_ACT_OK; + + frags_seen++; + if (bpf_ip_check_defrag(skb, BPF_F_CURRENT_NETNS)) + return TC_ACT_SHOT; + + data_end = ctx_ptr(skb->data_end); + data = ctx_ptr(skb->data); + iph = data + sizeof(struct ethhdr); + if (iph + 1 > data_end) + return TC_ACT_SHOT; + is_final_frag = is_frag_v4(iph); + + return TC_ACT_OK; +} + +static int defrag_v6(struct __sk_buff *skb) +{ + void *data_end = ctx_ptr(skb->data_end); + void *data = ctx_ptr(skb->data); + struct ipv6hdr *ip6h; + + ip6h = data + sizeof(struct ethhdr); + if (ip6h + 1 > data_end) + return TC_ACT_SHOT; + + if (!is_frag_v6(ip6h)) + return TC_ACT_OK; + + frags_seen++; + if (bpf_ipv6_frag_rcv(skb, BPF_F_CURRENT_NETNS)) + return TC_ACT_SHOT; + + data_end = ctx_ptr(skb->data_end); + data = ctx_ptr(skb->data); + ip6h = data + sizeof(struct ethhdr); + if (ip6h + 1 > data_end) + return TC_ACT_SHOT; + is_final_frag = is_frag_v6(ip6h); + + return TC_ACT_OK; +} + +SEC("tc") +int defrag(struct __sk_buff *skb) +{ + switch (bpf_ntohs(skb->protocol)) { + case ETH_P_IP: + return defrag_v4(skb); + case ETH_P_IPV6: + return defrag_v6(skb); + default: + return TC_ACT_OK; + } +} + +SEC("?tc") +int defrag_fail(struct __sk_buff *skb) +{ + void *data_end = ctx_ptr(skb->data_end); + void *data = ctx_ptr(skb->data); + struct iphdr *iph; + + if (skb->protocol != bpf_htons(ETH_P_IP)) + return TC_ACT_OK; + + iph = data + sizeof(struct ethhdr); + if (iph + 1 > data_end) + return TC_ACT_SHOT; + + if (bpf_ip_check_defrag(skb, BPF_F_CURRENT_NETNS)) + return TC_ACT_SHOT; + + /* Boom. Must revalidate pkt ptrs */ + return iph->ttl ? TC_ACT_OK : TC_ACT_SHOT; +} + +char _license[] SEC("license") = "GPL";
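
Note for reviewers: the fragment test in is_frag_v4() can be exercised outside the kernel. The following is a minimal user-space sketch, not part of the patch. It assumes the frag_off field has already been converted to host byte order, and the names is_frag_v4_host() and frag_offset_bytes() are hypothetical helpers introduced only for illustration. The IP_MF and IP_OFFSET values are copied from the BPF program above.

```c
#include <stdbool.h>
#include <stdint.h>

#define IP_MF     0x2000 /* "more fragments" flag */
#define IP_OFFSET 0x1FFF /* mask for the 13-bit fragment offset */

/* Hypothetical helper: fragment offset in bytes.
 * The on-wire offset is counted in 8-byte units, hence the shift by 3.
 */
static unsigned int frag_offset_bytes(uint16_t frag_off_host)
{
	return (unsigned int)(frag_off_host & IP_OFFSET) << 3;
}

/* Hypothetical helper mirroring is_frag_v4() from the patch, but taking
 * frag_off in host byte order instead of a struct iphdr pointer.
 * A packet is a fragment if MF is set (first or middle fragment) or if
 * its offset is non-zero (middle or last fragment).
 */
static bool is_frag_v4_host(uint16_t frag_off_host)
{
	unsigned int flags = frag_off_host & ~IP_OFFSET;

	return (flags & IP_MF) || frag_offset_bytes(frag_off_host);
}
```

Under these assumptions, a packet with only DF set (0x4000) is not a fragment, while the last fragment of a datagram (MF clear, offset non-zero) still is, which is why the check cannot rely on MF alone.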