From patchwork Thu Mar 16 17:50:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Alexander Lobakin X-Patchwork-Id: 13178159 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1399CC6FD1F for ; Thu, 16 Mar 2023 17:53:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229704AbjCPRw4 (ORCPT ); Thu, 16 Mar 2023 13:52:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58834 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230031AbjCPRwz (ORCPT ); Thu, 16 Mar 2023 13:52:55 -0400 Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 380261D918; Thu, 16 Mar 2023 10:52:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678989170; x=1710525170; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=a2ZOAIhOwVTv6T+fWNrtdzmGB8PmumzxRCDd1wYy0sw=; b=UEk8gvUuk1alqRKt0OQiINQgFqXduuS5BFjZVcLXruetz3zEsvOvPV0e 1m9XqFVz+UDaSQivY3GQw3HO6jJG/xCvvfJk40Po/B0q0MJe1sBF6LQH7 URtHpnBKnX/G7yfcOcYFJ0comCOJ0OXnOLtaNcZ+j12CceqloZidy9iMd 6g4XRnXoy3HOUlXmqVc1kuVsKtL4+LwUCJOU2ixDuKlOH6PLnXOin80AH vIjXYp41ktUeIF0f/hdN/LJeoKoSJrzg+R86DY+SJEOLUIu4jAXgIGQnj bzW7aPOLOY/hTquAmra71zTFqoSi1glwsNREEoZfaX0nAzWYcPYB2OwIL g==; X-IronPort-AV: E=McAfee;i="6600,9927,10651"; a="317721448" X-IronPort-AV: E=Sophos;i="5.98,265,1673942400"; d="scan'208";a="317721448" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Mar 2023 10:52:09 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10651"; a="823351306" X-IronPort-AV: E=Sophos;i="5.98,265,1673942400"; d="scan'208";a="823351306" Received: from newjersey.igk.intel.com ([10.102.20.203]) by fmsmga001.fm.intel.com with ESMTP; 16 Mar 2023 10:52:05 -0700 From: Alexander Lobakin To: Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau Cc: Alexander Lobakin , Maciej Fijalkowski , Larysa Zaremba , =?utf-8?q?Toke_H=C3=B8iland-J?= =?utf-8?q?=C3=B8rgensen?= , Ilya Leoshkevich , Song Liu , Jesper Dangaard Brouer , Jakub Kicinski , bpf@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, syzbot+e1d1b65f7c32f2a86a9f@syzkaller.appspotmail.com Subject: [PATCH bpf-next 1/2] bpf, test_run: fix crashes due to XDP frame overwriting/corruption Date: Thu, 16 Mar 2023 18:50:50 +0100 Message-Id: <20230316175051.922550-2-aleksander.lobakin@intel.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230316175051.922550-1-aleksander.lobakin@intel.com> References: <20230316175051.922550-1-aleksander.lobakin@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net syzbot and Ilya faced the splats when %XDP_PASS happens for bpf_test_run after skb PP recycling was enabled for {__,}xdp_build_skb_from_frame(): BUG: kernel NULL pointer dereference, address: 0000000000000d28 RIP: 0010:memset_erms+0xd/0x20 arch/x86/lib/memset_64.S:66 [...] Call Trace: __finalize_skb_around net/core/skbuff.c:321 [inline] __build_skb_around+0x232/0x3a0 net/core/skbuff.c:379 build_skb_around+0x32/0x290 net/core/skbuff.c:444 __xdp_build_skb_from_frame+0x121/0x760 net/core/xdp.c:622 xdp_recv_frames net/bpf/test_run.c:248 [inline] xdp_test_run_batch net/bpf/test_run.c:334 [inline] bpf_test_run_xdp_live+0x1289/0x1930 net/bpf/test_run.c:362 bpf_prog_test_run_xdp+0xa05/0x14e0 net/bpf/test_run.c:1418 [...] This happens due to that it calls xdp_scrub_frame(), which nullifies xdpf->data. bpf_test_run code doesn't reinit the frame when the XDP program doesn't adjust head or tail. Previously, %XDP_PASS meant the page will be released from the pool and returned to the MM layer, but now it does return to the Pool with the nullified xdpf->data, which doesn't get reinitialized then. So, in addition to checking whether the head and/or tail have been adjusted, check also for a potential XDP frame corruption. xdpf->data is 100% affected and also xdpf->flags is the field closest to the metadata / frame start. Checking for these two should be enough for non-extreme cases. Fixes: 9c94bbf9a87b ("xdp: recycle Page Pool backed skbs built from XDP frames") Reported-by: syzbot+e1d1b65f7c32f2a86a9f@syzkaller.appspotmail.com Link: https://lore.kernel.org/bpf/000000000000f1985705f6ef2243@google.com Reported-by: Ilya Leoshkevich Link: https://lore.kernel.org/bpf/e07dd94022ad5731705891b9487cc9ed66328b94.camel@linux.ibm.com Signed-off-by: Alexander Lobakin Acked-by: Toke Høiland-Jørgensen Tested-by: Ilya Leoshkevich --- net/bpf/test_run.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c index 71226f68270d..8d6b31209bd6 100644 --- a/net/bpf/test_run.c +++ b/net/bpf/test_run.c @@ -208,6 +208,16 @@ static void xdp_test_run_teardown(struct xdp_test_data *xdp) kfree(xdp->skbs); } +static bool frame_was_changed(const struct xdp_page_head *head) +{ + /* xdp_scrub_frame() zeroes the data pointer, flags is the last field, + * i.e. has the highest chances to be overwritten. If those two are + * untouched, it's most likely safe to skip the context reset. + */ + return head->frm.data != head->orig_ctx.data || + head->frm.flags != head->orig_ctx.flags; +} + static bool ctx_was_changed(struct xdp_page_head *head) { return head->orig_ctx.data != head->ctx.data || @@ -217,7 +227,7 @@ static bool ctx_was_changed(struct xdp_page_head *head) static void reset_ctx(struct xdp_page_head *head) { - if (likely(!ctx_was_changed(head))) + if (likely(!frame_was_changed(head) && !ctx_was_changed(head))) return; head->ctx.data = head->orig_ctx.data; From patchwork Thu Mar 16 17:50:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Alexander Lobakin X-Patchwork-Id: 13178160 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AD17DC7618E for ; Thu, 16 Mar 2023 17:53:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230289AbjCPRxF (ORCPT ); Thu, 16 Mar 2023 13:53:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58862 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230235AbjCPRw4 (ORCPT ); Thu, 16 Mar 2023 13:52:56 -0400 Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 48DAC1EFC1; Thu, 16 Mar 2023 10:52:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678989171; x=1710525171; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=QRih3njZUr377Ux29J6E8DuVOV2jEWZRLenNqTC3HQw=; b=FNIx2AaUemVDBoHyq/v0YRfT15JHboRqBlN9pCf3Msmv7k2dsZeQ1MIC icHyBHo/V7/sY3LqCg7H/DAkjSnCp3hwdxn3kNkE/PTh2n55gFJtQ3WzW Rhp5Jyww7AJbCKc95r3Y6AgEcGzwNHR4F7Y7DVZBa5IWblO7XHfBnnMMh rDBrEDGRsU0trqPgRQ4JeL8BTbKBuXhXTa4H54nIqyJu2SGMUM+ewcMvc 8mNZ0BdpmxfZiBvoUrOftPtIxrwipqz42z5vv3N0fs/zFeEboMxU5gzgN YF4fT8UcDVlmi7T4wSNpTe7m5Y6gYFys3Pf+zVd+jkcLLKn8054zmvrsy Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10651"; a="317721473" X-IronPort-AV: E=Sophos;i="5.98,265,1673942400"; d="scan'208";a="317721473" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Mar 2023 10:52:12 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10651"; a="823351351" X-IronPort-AV: E=Sophos;i="5.98,265,1673942400"; d="scan'208";a="823351351" Received: from newjersey.igk.intel.com ([10.102.20.203]) by fmsmga001.fm.intel.com with ESMTP; 16 Mar 2023 10:52:09 -0700 From: Alexander Lobakin To: Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau Cc: Alexander Lobakin , Maciej Fijalkowski , Larysa Zaremba , =?utf-8?q?Toke_H=C3=B8iland-J?= =?utf-8?q?=C3=B8rgensen?= , Ilya Leoshkevich , Song Liu , Jesper Dangaard Brouer , Jakub Kicinski , bpf@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH bpf-next 2/2] selftests/bpf: fix "metadata marker" getting overwritten by the netstack Date: Thu, 16 Mar 2023 18:50:51 +0100 Message-Id: <20230316175051.922550-3-aleksander.lobakin@intel.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230316175051.922550-1-aleksander.lobakin@intel.com> References: <20230316175051.922550-1-aleksander.lobakin@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Alexei noticed xdp_do_redirect test on BPF CI started failing on BE systems after skb PP recycling was enabled: test_xdp_do_redirect:PASS:prog_run 0 nsec test_xdp_do_redirect:PASS:pkt_count_xdp 0 nsec test_xdp_do_redirect:PASS:pkt_count_zero 0 nsec test_xdp_do_redirect:FAIL:pkt_count_tc unexpected pkt_count_tc: actual 220 != expected 9998 test_max_pkt_size:PASS:prog_run_max_size 0 nsec test_max_pkt_size:PASS:prog_run_too_big 0 nsec close_netns:PASS:setns 0 nsec #289 xdp_do_redirect:FAIL Summary: 270/1674 PASSED, 30 SKIPPED, 1 FAILED and it doesn't happen on LE systems. Ilya then hunted it down to: #0 0x0000000000aaeee6 in neigh_hh_output (hh=0x83258df0, skb=0x88142200) at linux/include/net/neighbour.h:503 #1 0x0000000000ab2cda in neigh_output (skip_cache=false, skb=0x88142200, n=) at linux/include/net/neighbour.h:544 #2 ip6_finish_output2 (net=net@entry=0x88edba00, sk=sk@entry=0x0, skb=skb@entry=0x88142200) at linux/net/ipv6/ip6_output.c:134 #3 0x0000000000ab4cbc in __ip6_finish_output (skb=0x88142200, sk=0x0, net=0x88edba00) at linux/net/ipv6/ip6_output.c:195 #4 ip6_finish_output (net=0x88edba00, sk=0x0, skb=0x88142200) at linux/net/ipv6/ip6_output.c:206 xdp_do_redirect test places a u32 marker (0x42) right before the Ethernet header to check it then in the XDP program and return %XDP_ABORTED if it's not there. Neigh xmit code likes to round up hard header length to speed up copying the header, so it overwrites two bytes in front of the Eth header. On LE systems, 0x42 is one byte at `data - 4`, while on BE it's `data - 1`, what explains why it happens only there. It didn't happen previously due to that %XDP_PASS meant the page will be discarded and replaced by a new one, but now it can be recycled as well, while bpf_test_run code doesn't reinitialize the content of recycled pages. This mark is limited to this particular test and its setup though, so there's no need to predict 1000 different possible cases. Just move it 4 bytes to the left, still keeping it 32 bit to match on more bytes. Fixes: 9c94bbf9a87b ("xdp: recycle Page Pool backed skbs built from XDP frames") Reported-by: Alexei Starovoitov Link: https://lore.kernel.org/bpf/CAADnVQ+B_JOU+EpP=DKhbY9yXdN6GiRPnpTTXfEZ9sNkUeb-yQ@mail.gmail.com Reported-by: Ilya Leoshkevich # + debugging Link: https://lore.kernel.org/bpf/8341c1d9f935f410438e79d3bd8a9cc50aefe105.camel@linux.ibm.com Signed-off-by: Alexander Lobakin Acked-by: Toke Høiland-Jørgensen Tested-by: Ilya Leoshkevich --- tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c | 7 ++++--- tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c | 2 +- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c index 856cbc29e6a1..4eaa3dcaebc8 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c @@ -86,12 +86,12 @@ static void test_max_pkt_size(int fd) void test_xdp_do_redirect(void) { int err, xdp_prog_fd, tc_prog_fd, ifindex_src, ifindex_dst; - char data[sizeof(pkt_udp) + sizeof(__u32)]; + char data[sizeof(pkt_udp) + sizeof(__u64)]; struct test_xdp_do_redirect *skel = NULL; struct nstoken *nstoken = NULL; struct bpf_link *link; LIBBPF_OPTS(bpf_xdp_query_opts, query_opts); - struct xdp_md ctx_in = { .data = sizeof(__u32), + struct xdp_md ctx_in = { .data = sizeof(__u64), .data_end = sizeof(data) }; DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts, .data_in = &data, @@ -105,8 +105,9 @@ void test_xdp_do_redirect(void) DECLARE_LIBBPF_OPTS(bpf_tc_hook, tc_hook, .attach_point = BPF_TC_INGRESS); - memcpy(&data[sizeof(__u32)], &pkt_udp, sizeof(pkt_udp)); + memcpy(&data[sizeof(__u64)], &pkt_udp, sizeof(pkt_udp)); *((__u32 *)data) = 0x42; /* metadata test value */ + *((__u32 *)data + 4) = 0; skel = test_xdp_do_redirect__open(); if (!ASSERT_OK_PTR(skel, "skel")) diff --git a/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c b/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c index cd2d4e3258b8..5baaafed0d2d 100644 --- a/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c +++ b/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c @@ -52,7 +52,7 @@ int xdp_redirect(struct xdp_md *xdp) *payload = MARK_IN; - if (bpf_xdp_adjust_meta(xdp, 4)) + if (bpf_xdp_adjust_meta(xdp, sizeof(__u64))) return XDP_ABORTED; if (retcode > XDP_PASS)