From patchwork Tue Jul 11 17:59:40 2023
From: Dave Marchevsky
Subject: [PATCH bpf-next 1/6] [DONOTAPPLY] Revert "bpf: Disable bpf_refcount_acquire kfunc calls until race conditions are fixed"
Date: Tue, 11 Jul 2023 10:59:40 -0700
Message-ID: <20230711175945.3298231-2-davemarchevsky@fb.com>
In-Reply-To: <20230711175945.3298231-1-davemarchevsky@fb.com>
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Kernel Team, Dave Marchevsky

This is included in the series so that the test added in this series is run by CI.
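For context, bpf_refcount_acquire is what lets a program take a second owning reference on an allocated object so the same object can be linked into more than one collection. A schematic example of the usage pattern the rest of this series is concerned with - modeled on progs/refcounted_kptr.c later in the series, whose node_data type, head/lock and broot/block collections, and less() callback are assumed here (the function name itself is made up) - looks like:

  SEC("tc")
  long add_node_to_list_and_rbtree(void *ctx)
  {
  	struct node_data *n, *m;

  	n = bpf_obj_new(typeof(*n));
  	if (!n)
  		return 1;
  	/* second owning reference; this is the kfunc the revert re-enables */
  	m = bpf_refcount_acquire(n);

  	bpf_spin_lock(&lock);
  	bpf_list_push_back(&head, &n->l);	/* consumes n's reference */
  	bpf_spin_unlock(&lock);

  	bpf_spin_lock(&block);
  	bpf_rbtree_add(&broot, &m->r, less);	/* consumes m's reference */
  	bpf_spin_unlock(&block);
  	return 0;
  }

With two owning references outstanding, both collections can legitimately end up pointing at the same object, which is exactly the shared-ownership situation the 'owner' patches later in this series make safe.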
Signed-off-by: Dave Marchevsky
--- kernel/bpf/verifier.c | 5 +---- tools/testing/selftests/bpf/prog_tests/refcounted_kptr.c | 2 ++ 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 11e54dd8b6dd..a4949e2442ad 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -11028,10 +11028,7 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_ verbose(env, "arg#%d doesn't point to a type with bpf_refcount field\n", i); return -EINVAL; } - if (rec->refcount_off >= 0) { - verbose(env, "bpf_refcount_acquire calls are disabled for now\n"); - return -EINVAL; - } + meta->arg_btf = reg->btf; meta->arg_btf_id = reg->btf_id; break; diff --git a/tools/testing/selftests/bpf/prog_tests/refcounted_kptr.c b/tools/testing/selftests/bpf/prog_tests/refcounted_kptr.c index 595cbf92bff5..2ab23832062d 100644 --- a/tools/testing/selftests/bpf/prog_tests/refcounted_kptr.c +++ b/tools/testing/selftests/bpf/prog_tests/refcounted_kptr.c @@ -9,8 +9,10 @@ void test_refcounted_kptr(void) { + RUN_TESTS(refcounted_kptr); } void test_refcounted_kptr_fail(void) { + RUN_TESTS(refcounted_kptr_fail); }

From patchwork Tue Jul 11 17:59:41 2023
From: Dave Marchevsky
Subject: [PATCH bpf-next 2/6] bpf: Introduce internal definitions for UAPI-opaque bpf_{rb,list}_node
Date: Tue, 11 Jul 2023 10:59:41 -0700
Message-ID: <20230711175945.3298231-3-davemarchevsky@fb.com>
In-Reply-To: <20230711175945.3298231-1-davemarchevsky@fb.com>
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Kernel Team, Dave Marchevsky
Structs bpf_rb_node and bpf_list_node are opaquely defined in uapi/linux/bpf.h, as BPF program writers are not expected to touch their fields - nor does the verifier allow them to do so.

Currently these structs are simple wrappers around structs rb_node and list_head, and the bpf linked_list / rbtree implementations just cast them and pass them to the library functions for those data structures. Later patches in this series, though, will add an "owner" field to bpf_{rb,list}_node, such that they're not just wrapping an underlying node type. Moreover, the bpf linked_list and rbtree implementations will deal with these owner pointers directly in a few different places.

To avoid having to do void *owner = (void*)bpf_list_node + sizeof(struct list_head) with opaque UAPI node types, add bpf_{list,rb}_node_internal struct definitions to internal headers and modify linked_list and rbtree to use the internal types where appropriate.

Signed-off-by: Dave Marchevsky
--- include/linux/bpf.h | 10 ++++++++++ kernel/bpf/helpers.c | 23 +++++++++++++---------- 2 files changed, 23 insertions(+), 10 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 360433f14496..d5841059fd2f 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -228,6 +228,16 @@ struct btf_record { struct btf_field fields[]; }; +/* Non-opaque version of bpf_rb_node in uapi/linux/bpf.h */ +struct bpf_rb_node_internal { + struct rb_node rb_node; +} __attribute__((aligned(8))); + +/* Non-opaque version of bpf_list_node in uapi/linux/bpf.h */ +struct bpf_list_node_internal { + struct list_head list_head; +} __attribute__((aligned(8))); + struct bpf_map { /* The first two cachelines with read-mostly members of which some * are also accessed in fast-path (e.g. ops, max_entries).
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 9e80efa59a5d..f059adefbe82 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -1942,10 +1942,11 @@ __bpf_kfunc void *bpf_refcount_acquire_impl(void *p__refcounted_kptr, void *meta return (void *)p__refcounted_kptr; } -static int __bpf_list_add(struct bpf_list_node *node, struct bpf_list_head *head, +static int __bpf_list_add(struct bpf_list_node_internal *node, + struct bpf_list_head *head, bool tail, struct btf_record *rec, u64 off) { - struct list_head *n = (void *)node, *h = (void *)head; + struct list_head *n = &node->list_head, *h = (void *)head; /* If list_head was 0-initialized by map, bpf_obj_init_field wasn't * called on its fields, so init here @@ -1967,20 +1968,20 @@ __bpf_kfunc int bpf_list_push_front_impl(struct bpf_list_head *head, struct bpf_list_node *node, void *meta__ign, u64 off) { + struct bpf_list_node_internal *n = (void *)node; struct btf_struct_meta *meta = meta__ign; - return __bpf_list_add(node, head, false, - meta ? meta->record : NULL, off); + return __bpf_list_add(n, head, false, meta ? meta->record : NULL, off); } __bpf_kfunc int bpf_list_push_back_impl(struct bpf_list_head *head, struct bpf_list_node *node, void *meta__ign, u64 off) { + struct bpf_list_node_internal *n = (void *)node; struct btf_struct_meta *meta = meta__ign; - return __bpf_list_add(node, head, true, - meta ? meta->record : NULL, off); + return __bpf_list_add(n, head, true, meta ? meta->record : NULL, off); } static struct bpf_list_node *__bpf_list_del(struct bpf_list_head *head, bool tail) @@ -2013,7 +2014,7 @@ __bpf_kfunc struct bpf_rb_node *bpf_rbtree_remove(struct bpf_rb_root *root, struct bpf_rb_node *node) { struct rb_root_cached *r = (struct rb_root_cached *)root; - struct rb_node *n = (struct rb_node *)node; + struct rb_node *n = &((struct bpf_rb_node_internal *)node)->rb_node; if (RB_EMPTY_NODE(n)) return NULL; @@ -2026,11 +2027,12 @@ __bpf_kfunc struct bpf_rb_node *bpf_rbtree_remove(struct bpf_rb_root *root, /* Need to copy rbtree_add_cached's logic here because our 'less' is a BPF * program */ -static int __bpf_rbtree_add(struct bpf_rb_root *root, struct bpf_rb_node *node, +static int __bpf_rbtree_add(struct bpf_rb_root *root, + struct bpf_rb_node_internal *node, void *less, struct btf_record *rec, u64 off) { struct rb_node **link = &((struct rb_root_cached *)root)->rb_root.rb_node; - struct rb_node *parent = NULL, *n = (struct rb_node *)node; + struct rb_node *parent = NULL, *n = &node->rb_node; bpf_callback_t cb = (bpf_callback_t)less; bool leftmost = true; @@ -2060,8 +2062,9 @@ __bpf_kfunc int bpf_rbtree_add_impl(struct bpf_rb_root *root, struct bpf_rb_node void *meta__ign, u64 off) { struct btf_struct_meta *meta = meta__ign; + struct bpf_rb_node_internal *n = (void *)node; - return __bpf_rbtree_add(root, node, (void *)less, meta ? meta->record : NULL, off); + return __bpf_rbtree_add(root, n, (void *)less, meta ? 
meta->record : NULL, off); } __bpf_kfunc struct bpf_rb_node *bpf_rbtree_first(struct bpf_rb_root *root)

From patchwork Tue Jul 11 17:59:42 2023
From: Dave Marchevsky
Subject: [PATCH bpf-next 3/6] bpf: Add 'owner' field to bpf_{list,rb}_node
Date: Tue, 11 Jul 2023 10:59:42 -0700
Message-ID: <20230711175945.3298231-4-davemarchevsky@fb.com>
In-Reply-To: <20230711175945.3298231-1-davemarchevsky@fb.com>
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Kernel Team, Dave Marchevsky, Kumar Kartikeya Dwivedi

As described by Kumar in [0], in shared ownership scenarios it is necessary to do runtime tracking of {rb,list}
node ownership - and synchronize updates using this ownership information - in order to prevent races. This patch adds an 'owner' field to struct bpf_list_node and bpf_rb_node to implement such runtime tracking. The owner field is a void * that describes the ownership state of a node. It can have the following values: NULL - the node is not owned by any data structure BPF_PTR_POISON - the node is in the process of being added to a data structure ptr_to_root - the pointee is a data structure 'root' (bpf_rb_root / bpf_list_head) which owns this node The field is initially NULL (set by bpf_obj_init_field default behavior) and transitions states in the following sequence: Insertion: NULL -> BPF_PTR_POISON -> ptr_to_root Removal: ptr_to_root -> NULL Before a node has been successfully inserted, it is not protected by any root's lock, and therefore two programs can attempt to add the same node to different roots simultaneously. For this reason the intermediate BPF_PTR_POISON state is necessary. For removal, the node is protected by some root's lock so this intermediate hop isn't necessary. Note that bpf_list_pop_{front,back} helpers don't need to check owner before removing as the node-to-be-removed is not passed in as input and is instead taken directly from the list. Selftest changes in this patch are entirely mechanical: some BTF tests have hardcoded struct sizes for structs that contain bpf_{list,rb}_node fields, those were adjusted to account for the new sizes. Selftest additions to validate the owner field are added in a further patch in the series. [0]: https://lore.kernel.org/bpf/d7hyspcow5wtjcmw4fugdgyp3fwhljwuscp3xyut5qnwivyeru@ysdq543otzv2 Signed-off-by: Dave Marchevsky Suggested-by: Kumar Kartikeya Dwivedi --- include/linux/bpf.h | 2 + include/uapi/linux/bpf.h | 2 + kernel/bpf/helpers.c | 27 ++++++- .../selftests/bpf/prog_tests/linked_list.c | 78 +++++++++---------- 4 files changed, 66 insertions(+), 43 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index d5841059fd2f..dd371ebf0322 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -231,11 +231,13 @@ struct btf_record { /* Non-opaque version of bpf_rb_node in uapi/linux/bpf.h */ struct bpf_rb_node_internal { struct rb_node rb_node; + void *owner; } __attribute__((aligned(8))); /* Non-opaque version of bpf_list_node in uapi/linux/bpf.h */ struct bpf_list_node_internal { struct list_head list_head; + void *owner; } __attribute__((aligned(8))); struct bpf_map { diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 60a9d59beeab..542c957a3cdd 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -7012,6 +7012,7 @@ struct bpf_list_head { struct bpf_list_node { __u64 :64; __u64 :64; + __u64 :64; } __attribute__((aligned(8))); struct bpf_rb_root { @@ -7023,6 +7024,7 @@ struct bpf_rb_node { __u64 :64; __u64 :64; __u64 :64; + __u64 :64; } __attribute__((aligned(8))); struct bpf_refcount { diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index f059adefbe82..61b4720c1ceb 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -1953,13 +1953,18 @@ static int __bpf_list_add(struct bpf_list_node_internal *node, */ if (unlikely(!h->next)) INIT_LIST_HEAD(h); - if (!list_empty(n)) { + + /* node->owner != NULL implies !list_empty(n), no need to separately + * check the latter + */ + if (cmpxchg(&node->owner, NULL, BPF_PTR_POISON)) { /* Only called from BPF prog, no need to migrate_disable */ __bpf_obj_drop_impl((void *)n - off, rec); return -EINVAL; } tail ? 
list_add_tail(n, h) : list_add(n, h); + WRITE_ONCE(node->owner, head); return 0; } @@ -1987,6 +1992,7 @@ __bpf_kfunc int bpf_list_push_back_impl(struct bpf_list_head *head, static struct bpf_list_node *__bpf_list_del(struct bpf_list_head *head, bool tail) { struct list_head *n, *h = (void *)head; + struct bpf_list_node_internal *node; /* If list_head was 0-initialized by map, bpf_obj_init_field wasn't * called on its fields, so init here @@ -1995,8 +2001,12 @@ static struct bpf_list_node *__bpf_list_del(struct bpf_list_head *head, bool tai INIT_LIST_HEAD(h); if (list_empty(h)) return NULL; + n = tail ? h->prev : h->next; + node = container_of(n, struct bpf_list_node_internal, list_head); + list_del_init(n); + WRITE_ONCE(node->owner, NULL); return (struct bpf_list_node *)n; } @@ -2013,14 +2023,19 @@ __bpf_kfunc struct bpf_list_node *bpf_list_pop_back(struct bpf_list_head *head) __bpf_kfunc struct bpf_rb_node *bpf_rbtree_remove(struct bpf_rb_root *root, struct bpf_rb_node *node) { + struct bpf_rb_node_internal *node_internal = (struct bpf_rb_node_internal *)node; struct rb_root_cached *r = (struct rb_root_cached *)root; - struct rb_node *n = &((struct bpf_rb_node_internal *)node)->rb_node; + struct rb_node *n = &node_internal->rb_node; - if (RB_EMPTY_NODE(n)) + /* node_internal->owner != root implies either RB_EMPTY_NODE(n) or + * n is owned by some other tree. No need to check RB_EMPTY_NODE(n) + */ + if (READ_ONCE(node_internal->owner) != root) return NULL; rb_erase_cached(n, r); RB_CLEAR_NODE(n); + WRITE_ONCE(node_internal->owner, NULL); return (struct bpf_rb_node *)n; } @@ -2036,7 +2051,10 @@ static int __bpf_rbtree_add(struct bpf_rb_root *root, bpf_callback_t cb = (bpf_callback_t)less; bool leftmost = true; - if (!RB_EMPTY_NODE(n)) { + /* node->owner != NULL implies !RB_EMPTY_NODE(n), no need to separately + * check the latter + */ + if (cmpxchg(&node->owner, NULL, BPF_PTR_POISON)) { /* Only called from BPF prog, no need to migrate_disable */ __bpf_obj_drop_impl((void *)n - off, rec); return -EINVAL; @@ -2054,6 +2072,7 @@ static int __bpf_rbtree_add(struct bpf_rb_root *root, rb_link_node(n, parent, link); rb_insert_color_cached(n, (struct rb_root_cached *)root, leftmost); + WRITE_ONCE(node->owner, root); return 0; } diff --git a/tools/testing/selftests/bpf/prog_tests/linked_list.c b/tools/testing/selftests/bpf/prog_tests/linked_list.c index f63309fd0e28..18cf7b17463d 100644 --- a/tools/testing/selftests/bpf/prog_tests/linked_list.c +++ b/tools/testing/selftests/bpf/prog_tests/linked_list.c @@ -23,7 +23,7 @@ static struct { "bpf_spin_lock at off=" #off " must be held for bpf_list_head" }, \ { #test "_missing_lock_pop_back", \ "bpf_spin_lock at off=" #off " must be held for bpf_list_head" }, - TEST(kptr, 32) + TEST(kptr, 40) TEST(global, 16) TEST(map, 0) TEST(inner_map, 0) @@ -31,7 +31,7 @@ static struct { #define TEST(test, op) \ { #test "_kptr_incorrect_lock_" #op, \ "held lock and object are not in the same allocation\n" \ - "bpf_spin_lock at off=32 must be held for bpf_list_head" }, \ + "bpf_spin_lock at off=40 must be held for bpf_list_head" }, \ { #test "_global_incorrect_lock_" #op, \ "held lock and object are not in the same allocation\n" \ "bpf_spin_lock at off=16 must be held for bpf_list_head" }, \ @@ -84,23 +84,23 @@ static struct { { "double_push_back", "arg#1 expected pointer to allocated object" }, { "no_node_value_type", "bpf_list_node not found at offset=0" }, { "incorrect_value_type", - "operation on bpf_list_head expects arg#1 bpf_list_node at offset=40 in struct foo, " 
+ "operation on bpf_list_head expects arg#1 bpf_list_node at offset=48 in struct foo, " "but arg is at offset=0 in struct bar" }, { "incorrect_node_var_off", "variable ptr_ access var_off=(0x0; 0xffffffff) disallowed" }, - { "incorrect_node_off1", "bpf_list_node not found at offset=41" }, - { "incorrect_node_off2", "arg#1 offset=0, but expected bpf_list_node at offset=40 in struct foo" }, + { "incorrect_node_off1", "bpf_list_node not found at offset=49" }, + { "incorrect_node_off2", "arg#1 offset=0, but expected bpf_list_node at offset=48 in struct foo" }, { "no_head_type", "bpf_list_head not found at offset=0" }, { "incorrect_head_var_off1", "R1 doesn't have constant offset" }, { "incorrect_head_var_off2", "variable ptr_ access var_off=(0x0; 0xffffffff) disallowed" }, - { "incorrect_head_off1", "bpf_list_head not found at offset=17" }, + { "incorrect_head_off1", "bpf_list_head not found at offset=25" }, { "incorrect_head_off2", "bpf_list_head not found at offset=1" }, { "pop_front_off", - "15: (bf) r1 = r6 ; R1_w=ptr_or_null_foo(id=4,ref_obj_id=4,off=40,imm=0) " - "R6_w=ptr_or_null_foo(id=4,ref_obj_id=4,off=40,imm=0) refs=2,4\n" + "15: (bf) r1 = r6 ; R1_w=ptr_or_null_foo(id=4,ref_obj_id=4,off=48,imm=0) " + "R6_w=ptr_or_null_foo(id=4,ref_obj_id=4,off=48,imm=0) refs=2,4\n" "16: (85) call bpf_this_cpu_ptr#154\nR1 type=ptr_or_null_ expected=percpu_ptr_" }, { "pop_back_off", - "15: (bf) r1 = r6 ; R1_w=ptr_or_null_foo(id=4,ref_obj_id=4,off=40,imm=0) " - "R6_w=ptr_or_null_foo(id=4,ref_obj_id=4,off=40,imm=0) refs=2,4\n" + "15: (bf) r1 = r6 ; R1_w=ptr_or_null_foo(id=4,ref_obj_id=4,off=48,imm=0) " + "R6_w=ptr_or_null_foo(id=4,ref_obj_id=4,off=48,imm=0) refs=2,4\n" "16: (85) call bpf_this_cpu_ptr#154\nR1 type=ptr_or_null_ expected=percpu_ptr_" }, }; @@ -257,7 +257,7 @@ static struct btf *init_btf(void) hid = btf__add_struct(btf, "bpf_list_head", 16); if (!ASSERT_EQ(hid, LIST_HEAD, "btf__add_struct bpf_list_head")) goto end; - nid = btf__add_struct(btf, "bpf_list_node", 16); + nid = btf__add_struct(btf, "bpf_list_node", 24); if (!ASSERT_EQ(nid, LIST_NODE, "btf__add_struct bpf_list_node")) goto end; return btf; @@ -276,7 +276,7 @@ static void list_and_rb_node_same_struct(bool refcount_field) if (!ASSERT_OK_PTR(btf, "init_btf")) return; - bpf_rb_node_btf_id = btf__add_struct(btf, "bpf_rb_node", 24); + bpf_rb_node_btf_id = btf__add_struct(btf, "bpf_rb_node", 32); if (!ASSERT_GT(bpf_rb_node_btf_id, 0, "btf__add_struct bpf_rb_node")) return; @@ -286,17 +286,17 @@ static void list_and_rb_node_same_struct(bool refcount_field) return; } - id = btf__add_struct(btf, "bar", refcount_field ? 44 : 40); + id = btf__add_struct(btf, "bar", refcount_field ? 
60 : 56); if (!ASSERT_GT(id, 0, "btf__add_struct bar")) return; err = btf__add_field(btf, "a", LIST_NODE, 0, 0); if (!ASSERT_OK(err, "btf__add_field bar::a")) return; - err = btf__add_field(btf, "c", bpf_rb_node_btf_id, 128, 0); + err = btf__add_field(btf, "c", bpf_rb_node_btf_id, 192, 0); if (!ASSERT_OK(err, "btf__add_field bar::c")) return; if (refcount_field) { - err = btf__add_field(btf, "ref", bpf_refcount_btf_id, 320, 0); + err = btf__add_field(btf, "ref", bpf_refcount_btf_id, 448, 0); if (!ASSERT_OK(err, "btf__add_field bar::ref")) return; } @@ -527,7 +527,7 @@ static void test_btf(void) btf = init_btf(); if (!ASSERT_OK_PTR(btf, "init_btf")) break; - id = btf__add_struct(btf, "foo", 36); + id = btf__add_struct(btf, "foo", 44); if (!ASSERT_EQ(id, 5, "btf__add_struct foo")) break; err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); @@ -536,7 +536,7 @@ static void test_btf(void) err = btf__add_field(btf, "b", LIST_NODE, 128, 0); if (!ASSERT_OK(err, "btf__add_field foo::b")) break; - err = btf__add_field(btf, "c", SPIN_LOCK, 256, 0); + err = btf__add_field(btf, "c", SPIN_LOCK, 320, 0); if (!ASSERT_OK(err, "btf__add_field foo::c")) break; id = btf__add_decl_tag(btf, "contains:foo:b", 5, 0); @@ -553,7 +553,7 @@ static void test_btf(void) btf = init_btf(); if (!ASSERT_OK_PTR(btf, "init_btf")) break; - id = btf__add_struct(btf, "foo", 36); + id = btf__add_struct(btf, "foo", 44); if (!ASSERT_EQ(id, 5, "btf__add_struct foo")) break; err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); @@ -562,13 +562,13 @@ static void test_btf(void) err = btf__add_field(btf, "b", LIST_NODE, 128, 0); if (!ASSERT_OK(err, "btf__add_field foo::b")) break; - err = btf__add_field(btf, "c", SPIN_LOCK, 256, 0); + err = btf__add_field(btf, "c", SPIN_LOCK, 320, 0); if (!ASSERT_OK(err, "btf__add_field foo::c")) break; id = btf__add_decl_tag(btf, "contains:bar:b", 5, 0); if (!ASSERT_EQ(id, 6, "btf__add_decl_tag contains:bar:b")) break; - id = btf__add_struct(btf, "bar", 36); + id = btf__add_struct(btf, "bar", 44); if (!ASSERT_EQ(id, 7, "btf__add_struct bar")) break; err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); @@ -577,7 +577,7 @@ static void test_btf(void) err = btf__add_field(btf, "b", LIST_NODE, 128, 0); if (!ASSERT_OK(err, "btf__add_field bar::b")) break; - err = btf__add_field(btf, "c", SPIN_LOCK, 256, 0); + err = btf__add_field(btf, "c", SPIN_LOCK, 320, 0); if (!ASSERT_OK(err, "btf__add_field bar::c")) break; id = btf__add_decl_tag(btf, "contains:foo:b", 7, 0); @@ -594,19 +594,19 @@ static void test_btf(void) btf = init_btf(); if (!ASSERT_OK_PTR(btf, "init_btf")) break; - id = btf__add_struct(btf, "foo", 20); + id = btf__add_struct(btf, "foo", 28); if (!ASSERT_EQ(id, 5, "btf__add_struct foo")) break; err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); if (!ASSERT_OK(err, "btf__add_field foo::a")) break; - err = btf__add_field(btf, "b", SPIN_LOCK, 128, 0); + err = btf__add_field(btf, "b", SPIN_LOCK, 192, 0); if (!ASSERT_OK(err, "btf__add_field foo::b")) break; id = btf__add_decl_tag(btf, "contains:bar:a", 5, 0); if (!ASSERT_EQ(id, 6, "btf__add_decl_tag contains:bar:a")) break; - id = btf__add_struct(btf, "bar", 16); + id = btf__add_struct(btf, "bar", 24); if (!ASSERT_EQ(id, 7, "btf__add_struct bar")) break; err = btf__add_field(btf, "a", LIST_NODE, 0, 0); @@ -623,19 +623,19 @@ static void test_btf(void) btf = init_btf(); if (!ASSERT_OK_PTR(btf, "init_btf")) break; - id = btf__add_struct(btf, "foo", 20); + id = btf__add_struct(btf, "foo", 28); if (!ASSERT_EQ(id, 5, "btf__add_struct foo")) break; err = btf__add_field(btf, 
"a", LIST_HEAD, 0, 0); if (!ASSERT_OK(err, "btf__add_field foo::a")) break; - err = btf__add_field(btf, "b", SPIN_LOCK, 128, 0); + err = btf__add_field(btf, "b", SPIN_LOCK, 192, 0); if (!ASSERT_OK(err, "btf__add_field foo::b")) break; id = btf__add_decl_tag(btf, "contains:bar:b", 5, 0); if (!ASSERT_EQ(id, 6, "btf__add_decl_tag contains:bar:b")) break; - id = btf__add_struct(btf, "bar", 36); + id = btf__add_struct(btf, "bar", 44); if (!ASSERT_EQ(id, 7, "btf__add_struct bar")) break; err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); @@ -644,13 +644,13 @@ static void test_btf(void) err = btf__add_field(btf, "b", LIST_NODE, 128, 0); if (!ASSERT_OK(err, "btf__add_field bar::b")) break; - err = btf__add_field(btf, "c", SPIN_LOCK, 256, 0); + err = btf__add_field(btf, "c", SPIN_LOCK, 320, 0); if (!ASSERT_OK(err, "btf__add_field bar::c")) break; id = btf__add_decl_tag(btf, "contains:baz:a", 7, 0); if (!ASSERT_EQ(id, 8, "btf__add_decl_tag contains:baz:a")) break; - id = btf__add_struct(btf, "baz", 16); + id = btf__add_struct(btf, "baz", 24); if (!ASSERT_EQ(id, 9, "btf__add_struct baz")) break; err = btf__add_field(btf, "a", LIST_NODE, 0, 0); @@ -667,7 +667,7 @@ static void test_btf(void) btf = init_btf(); if (!ASSERT_OK_PTR(btf, "init_btf")) break; - id = btf__add_struct(btf, "foo", 36); + id = btf__add_struct(btf, "foo", 44); if (!ASSERT_EQ(id, 5, "btf__add_struct foo")) break; err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); @@ -676,13 +676,13 @@ static void test_btf(void) err = btf__add_field(btf, "b", LIST_NODE, 128, 0); if (!ASSERT_OK(err, "btf__add_field foo::b")) break; - err = btf__add_field(btf, "c", SPIN_LOCK, 256, 0); + err = btf__add_field(btf, "c", SPIN_LOCK, 320, 0); if (!ASSERT_OK(err, "btf__add_field foo::c")) break; id = btf__add_decl_tag(btf, "contains:bar:b", 5, 0); if (!ASSERT_EQ(id, 6, "btf__add_decl_tag contains:bar:b")) break; - id = btf__add_struct(btf, "bar", 36); + id = btf__add_struct(btf, "bar", 44); if (!ASSERT_EQ(id, 7, "btf__add_struct bar")) break; err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); @@ -691,13 +691,13 @@ static void test_btf(void) err = btf__add_field(btf, "b", LIST_NODE, 128, 0); if (!ASSERT_OK(err, "btf__add_field bar:b")) break; - err = btf__add_field(btf, "c", SPIN_LOCK, 256, 0); + err = btf__add_field(btf, "c", SPIN_LOCK, 320, 0); if (!ASSERT_OK(err, "btf__add_field bar:c")) break; id = btf__add_decl_tag(btf, "contains:baz:a", 7, 0); if (!ASSERT_EQ(id, 8, "btf__add_decl_tag contains:baz:a")) break; - id = btf__add_struct(btf, "baz", 16); + id = btf__add_struct(btf, "baz", 24); if (!ASSERT_EQ(id, 9, "btf__add_struct baz")) break; err = btf__add_field(btf, "a", LIST_NODE, 0, 0); @@ -726,7 +726,7 @@ static void test_btf(void) id = btf__add_decl_tag(btf, "contains:bar:b", 5, 0); if (!ASSERT_EQ(id, 6, "btf__add_decl_tag contains:bar:b")) break; - id = btf__add_struct(btf, "bar", 36); + id = btf__add_struct(btf, "bar", 44); if (!ASSERT_EQ(id, 7, "btf__add_struct bar")) break; err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); @@ -735,13 +735,13 @@ static void test_btf(void) err = btf__add_field(btf, "b", LIST_NODE, 128, 0); if (!ASSERT_OK(err, "btf__add_field bar::b")) break; - err = btf__add_field(btf, "c", SPIN_LOCK, 256, 0); + err = btf__add_field(btf, "c", SPIN_LOCK, 320, 0); if (!ASSERT_OK(err, "btf__add_field bar::c")) break; id = btf__add_decl_tag(btf, "contains:baz:b", 7, 0); if (!ASSERT_EQ(id, 8, "btf__add_decl_tag")) break; - id = btf__add_struct(btf, "baz", 36); + id = btf__add_struct(btf, "baz", 44); if (!ASSERT_EQ(id, 9, "btf__add_struct baz")) 
break; err = btf__add_field(btf, "a", LIST_HEAD, 0, 0); @@ -750,13 +750,13 @@ static void test_btf(void) err = btf__add_field(btf, "b", LIST_NODE, 128, 0); if (!ASSERT_OK(err, "btf__add_field bar::b")) break; - err = btf__add_field(btf, "c", SPIN_LOCK, 256, 0); + err = btf__add_field(btf, "c", SPIN_LOCK, 320, 0); if (!ASSERT_OK(err, "btf__add_field bar::c")) break; id = btf__add_decl_tag(btf, "contains:bam:a", 9, 0); if (!ASSERT_EQ(id, 10, "btf__add_decl_tag contains:bam:a")) break; - id = btf__add_struct(btf, "bam", 16); + id = btf__add_struct(btf, "bam", 24); if (!ASSERT_EQ(id, 11, "btf__add_struct bam")) break; err = btf__add_field(btf, "a", LIST_NODE, 0, 0); From patchwork Tue Jul 11 17:59:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Marchevsky X-Patchwork-Id: 13309296 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6770C1E508 for ; Tue, 11 Jul 2023 18:00:24 +0000 (UTC) Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 679D21739 for ; Tue, 11 Jul 2023 11:00:22 -0700 (PDT) Received: from pps.filterd (m0044010.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 36BHhItt004429 for ; Tue, 11 Jul 2023 11:00:22 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-type : content-transfer-encoding : mime-version; s=facebook; bh=Y3Wg+Px0GnG/bjmqKhLr6j/fe+eRbIw8UpzIx0+oFKc=; b=AGaHQha8uEK/xy1VqJqj8JHeSLpphDh8UhlJhamGDxP/x8whpVGztTHDCWB8m1XRc/7O Yzek9IOgaPK2wFs4z36UwX3CGg3szmjMRD3GxJjtlEPRs8Y+RayPjZxhDKJJ2ImjKXWH DdH00+JdpEwSOXKRakJwaTpJh7LdEAgL2/A= Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3rrwqnp9d5-3 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Tue, 11 Jul 2023 11:00:20 -0700 Received: from ash-exhub204.TheFacebook.com (2620:10d:c0a8:83::4) by ash-exhub201.TheFacebook.com (2620:10d:c0a8:83::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Tue, 11 Jul 2023 10:59:59 -0700 Received: from twshared52232.38.frc1.facebook.com (2620:10d:c0a8:1b::2d) by mail.thefacebook.com (2620:10d:c0a8:83::4) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Tue, 11 Jul 2023 10:59:59 -0700 Received: by devbig077.ldc1.facebook.com (Postfix, from userid 158236) id 04E1C20EAEECD; Tue, 11 Jul 2023 10:59:49 -0700 (PDT) From: Dave Marchevsky To: CC: Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau , Kernel Team , Dave Marchevsky , Kumar Kartikeya Dwivedi Subject: [PATCH bpf-next 4/6] selftests/bpf: Add rbtree test exercising race which 'owner' field prevents Date: Tue, 11 Jul 2023 10:59:43 -0700 Message-ID: <20230711175945.3298231-5-davemarchevsky@fb.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230711175945.3298231-1-davemarchevsky@fb.com> References: <20230711175945.3298231-1-davemarchevsky@fb.com> X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: FE0RuMhvz9_CUHqhU2jT6cVVAEpO-Z6d X-Proofpoint-GUID: FE0RuMhvz9_CUHqhU2jT6cVVAEpO-Z6d 
This patch adds a runnable version of one of the races described by Kumar in [0]. Specifically, this interleaving:

  (rbtree1 and list head protected by lock1, rbtree2 protected by lock2)

  Prog A                                Prog B
  ======================================
  n = bpf_obj_new(...)
  m = bpf_refcount_acquire(n)
  kptr_xchg(map, m)

                                        m = kptr_xchg(map, NULL)
                                        lock(lock2)
                                        bpf_rbtree_add(rbtree2, m->r, less)
                                        unlock(lock2)

  lock(lock1)
  bpf_list_push_back(head, n->l)
  /* make n non-owning ref */
  bpf_rbtree_remove(rbtree1, n->r)
  unlock(lock1)

In the above interleaving, the node's struct bpf_rb_node *r can be used to add it to either rbtree1 or rbtree2, which are protected by different locks. If the node has been added to rbtree2, we should not be allowed to remove it while holding rbtree1's lock.

Before the changes in the previous patch in this series, the rbtree_remove in the second part of Prog A would succeed, as the verifier has no way of knowing which tree owns a particular node at verification time. The addition of the 'owner' field results in bpf_rbtree_remove correctly failing.

The test added in this patch splits "Prog A" above into two separate BPF programs - A1 and A2 - and uses a second mapval + kptr_xchg to pass n from A1 to A2, similarly to the pass from A1 to B. If the test is run without the fix applied, the remove will succeed.

Kumar's example had the two programs running on separate CPUs. This patch doesn't do that, as it's not necessary in order to exercise the broken behavior / validate the fixed behavior.
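The owner state machine the previous patch adds (NULL -> BPF_PTR_POISON -> owning root -> NULL) can also be modeled as a small standalone userspace program. The sketch below is illustrative only - it is not kernel code and all names in it are made up - but it follows the interleaving above and shows why the final rbtree_remove(rbtree1, n->r) is refused once the rb_node's owner points at rbtree2:

  #include <stdatomic.h>
  #include <stdio.h>

  #define PTR_POISON ((void *)0x1)	/* stand-in for BPF_PTR_POISON */

  struct root { int dummy; };

  struct object {
  	_Atomic(void *) rb_owner;	/* models bpf_rb_node.owner   */
  	_Atomic(void *) list_owner;	/* models bpf_list_node.owner */
  };

  /* NULL -> PTR_POISON -> root, as in __bpf_rbtree_add() / __bpf_list_add() */
  static int model_add(void *root, _Atomic(void *) *owner)
  {
  	void *expected = NULL;

  	if (!atomic_compare_exchange_strong(owner, &expected, PTR_POISON))
  		return -1;	/* node already owned, or mid-insert elsewhere */
  	/* ... the real kfunc links the node into root's structure here ... */
  	atomic_store(owner, root);
  	return 0;
  }

  /* root -> NULL, as in bpf_rbtree_remove(): only the owning root may remove */
  static int model_remove(void *root, _Atomic(void *) *owner)
  {
  	if (atomic_load(owner) != root)
  		return -1;
  	/* ... the real kfunc unlinks the node here ... */
  	atomic_store(owner, NULL);
  	return 0;
  }

  int main(void)
  {
  	struct root rbtree1, rbtree2, list_head1;
  	struct object n = {0};

  	/* Prog B: adds the node's bpf_rb_node to rbtree2 under lock2 */
  	printf("B: rbtree_add(rbtree2)    -> %d\n", model_add(&rbtree2, &n.rb_owner));
  	/* Prog A: pushes the node's bpf_list_node onto the list under lock1 */
  	printf("A: list_push_back(head)   -> %d\n", model_add(&list_head1, &n.list_owner));
  	/* Prog A: rbtree_remove(rbtree1, n->r) fails: rb_node is owned by rbtree2 */
  	printf("A: rbtree_remove(rbtree1) -> %d\n", model_remove(&rbtree1, &n.rb_owner));
  	return 0;
  }

Built with any C11 compiler, this prints 0, 0, -1: both adds succeed (they claim different node fields of the same object), while the removal from the non-owning root fails, matching what rbtree_wrong_owner_remove_fail_a2 below expects.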
[0]: https://lore.kernel.org/bpf/d7hyspcow5wtjcmw4fugdgyp3fwhljwuscp3xyut5qnwivyeru@ysdq543otzv2 Signed-off-by: Dave Marchevsky Suggested-by: Kumar Kartikeya Dwivedi --- .../bpf/prog_tests/refcounted_kptr.c | 28 ++++++ .../selftests/bpf/progs/refcounted_kptr.c | 94 ++++++++++++++++++- 2 files changed, 121 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/bpf/prog_tests/refcounted_kptr.c b/tools/testing/selftests/bpf/prog_tests/refcounted_kptr.c index 2ab23832062d..d6bd5e16e637 100644 --- a/tools/testing/selftests/bpf/prog_tests/refcounted_kptr.c +++ b/tools/testing/selftests/bpf/prog_tests/refcounted_kptr.c @@ -16,3 +16,31 @@ void test_refcounted_kptr_fail(void) { RUN_TESTS(refcounted_kptr_fail); } + +void test_refcounted_kptr_wrong_owner(void) +{ + LIBBPF_OPTS(bpf_test_run_opts, opts, + .data_in = &pkt_v4, + .data_size_in = sizeof(pkt_v4), + .repeat = 1, + ); + struct refcounted_kptr *skel; + int ret; + + skel = refcounted_kptr__open_and_load(); + if (!ASSERT_OK_PTR(skel, "refcounted_kptr__open_and_load")) + return; + + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.rbtree_wrong_owner_remove_fail_a1), &opts); + ASSERT_OK(ret, "rbtree_wrong_owner_remove_fail_a1"); + ASSERT_OK(opts.retval, "rbtree_wrong_owner_remove_fail_a1 retval"); + + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.rbtree_wrong_owner_remove_fail_b), &opts); + ASSERT_OK(ret, "rbtree_wrong_owner_remove_fail_b"); + ASSERT_OK(opts.retval, "rbtree_wrong_owner_remove_fail_b retval"); + + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.rbtree_wrong_owner_remove_fail_a2), &opts); + ASSERT_OK(ret, "rbtree_wrong_owner_remove_fail_a2"); + ASSERT_OK(opts.retval, "rbtree_wrong_owner_remove_fail_a2 retval"); + refcounted_kptr__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/refcounted_kptr.c b/tools/testing/selftests/bpf/progs/refcounted_kptr.c index a3da610b1e6b..c55652fdc63a 100644 --- a/tools/testing/selftests/bpf/progs/refcounted_kptr.c +++ b/tools/testing/selftests/bpf/progs/refcounted_kptr.c @@ -24,7 +24,7 @@ struct { __uint(type, BPF_MAP_TYPE_ARRAY); __type(key, int); __type(value, struct map_value); - __uint(max_entries, 1); + __uint(max_entries, 2); } stashed_nodes SEC(".maps"); struct node_acquire { @@ -42,6 +42,9 @@ private(A) struct bpf_list_head head __contains(node_data, l); private(B) struct bpf_spin_lock alock; private(B) struct bpf_rb_root aroot __contains(node_acquire, node); +private(C) struct bpf_spin_lock block; +private(C) struct bpf_rb_root broot __contains(node_data, r); + static bool less(struct bpf_rb_node *node_a, const struct bpf_rb_node *node_b) { struct node_data *a; @@ -405,4 +408,93 @@ long rbtree_refcounted_node_ref_escapes_owning_input(void *ctx) return 0; } +static long __stash_map_empty_xchg(struct node_data *n, int idx) +{ + struct map_value *mapval = bpf_map_lookup_elem(&stashed_nodes, &idx); + + if (!mapval) { + bpf_obj_drop(n); + return 1; + } + n = bpf_kptr_xchg(&mapval->node, n); + if (n) { + bpf_obj_drop(n); + return 2; + } + return 0; +} + +SEC("tc") +long rbtree_wrong_owner_remove_fail_a1(void *ctx) +{ + struct node_data *n, *m; + + n = bpf_obj_new(typeof(*n)); + if (!n) + return 1; + m = bpf_refcount_acquire(n); + + if (__stash_map_empty_xchg(n, 0)) { + bpf_obj_drop(m); + return 2; + } + + if (__stash_map_empty_xchg(m, 1)) + return 3; + + return 0; +} + +SEC("tc") +long rbtree_wrong_owner_remove_fail_b(void *ctx) +{ + struct map_value *mapval; + struct node_data *n; + int idx = 0; + + mapval = bpf_map_lookup_elem(&stashed_nodes, 
&idx); + if (!mapval) + return 1; + + n = bpf_kptr_xchg(&mapval->node, NULL); + if (!n) + return 2; + + bpf_spin_lock(&block); + + bpf_rbtree_add(&broot, &n->r, less); + + bpf_spin_unlock(&block); + return 0; +} + +SEC("tc") +long rbtree_wrong_owner_remove_fail_a2(void *ctx) +{ + struct map_value *mapval; + struct bpf_rb_node *res; + struct node_data *m; + int idx = 1; + + mapval = bpf_map_lookup_elem(&stashed_nodes, &idx); + if (!mapval) + return 1; + + m = bpf_kptr_xchg(&mapval->node, NULL); + if (!m) + return 2; + bpf_spin_lock(&lock); + + /* make m non-owning ref */ + bpf_list_push_back(&head, &m->l); + res = bpf_rbtree_remove(&root, &m->r); + + bpf_spin_unlock(&lock); + if (res) { + bpf_obj_drop(container_of(res, struct node_data, r)); + return 3; + } + return 0; +} + char _license[] SEC("license") = "GPL"; From patchwork Tue Jul 11 17:59:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Marchevsky X-Patchwork-Id: 13309293 X-Patchwork-Delegate: bpf@iogearbox.net Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 14C211E518 for ; Tue, 11 Jul 2023 18:00:04 +0000 (UTC) Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2A91310D2 for ; Tue, 11 Jul 2023 11:00:03 -0700 (PDT) Received: from pps.filterd (m0109331.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 36BHhfdO018505 for ; Tue, 11 Jul 2023 11:00:02 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=mvkctxTLmtbzDrs7XXyOf3QyCOSGxvHotv7uF5HfKGM=; b=lgtomm0mwNyh39HrX372bTcCNzF4yrGyQgWfkE0yLiBKQ7EcUg5jgFikWaIg9tOPQ+Db EwjkjwBqxsQFwbqasFcst/QKfw7Ax2D9LRFCBlFO9o4ovIIRm3yaS1rnTEp6SbxVM5P1 MSoLzs08Lz7cmcJEQoz6SLaoYe3MdZEhQlU= Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3rs42mkssb-5 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Tue, 11 Jul 2023 11:00:02 -0700 Received: from twshared52232.38.frc1.facebook.com (2620:10d:c0a8:1c::1b) by mail.thefacebook.com (2620:10d:c0a8:82::f) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Tue, 11 Jul 2023 10:59:59 -0700 Received: by devbig077.ldc1.facebook.com (Postfix, from userid 158236) id 9974F20EAEED3; Tue, 11 Jul 2023 10:59:50 -0700 (PDT) From: Dave Marchevsky To: CC: Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau , Kernel Team , Dave Marchevsky Subject: [PATCH bpf-next 5/6] selftests/bpf: Disable newly-added 'owner' field test until refcount re-enabled Date: Tue, 11 Jul 2023 10:59:44 -0700 Message-ID: <20230711175945.3298231-6-davemarchevsky@fb.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230711175945.3298231-1-davemarchevsky@fb.com> References: <20230711175945.3298231-1-davemarchevsky@fb.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: 7rkKmgf9imfkgpLqlxhcNLTobZIgDG9X X-Proofpoint-ORIG-GUID: 7rkKmgf9imfkgpLqlxhcNLTobZIgDG9X 
The test added in the previous patch will fail with bpf_refcount_acquire disabled. Until all races are fixed and bpf_refcount_acquire is re-enabled on bpf-next, disable the test so CI doesn't complain.

Signed-off-by: Dave Marchevsky
--- .../bpf/prog_tests/refcounted_kptr.c | 24 ------------------- 1 file changed, 24 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/refcounted_kptr.c b/tools/testing/selftests/bpf/prog_tests/refcounted_kptr.c index d6bd5e16e637..f6b446512958 100644 --- a/tools/testing/selftests/bpf/prog_tests/refcounted_kptr.c +++ b/tools/testing/selftests/bpf/prog_tests/refcounted_kptr.c @@ -19,28 +19,4 @@ void test_refcounted_kptr_fail(void) void test_refcounted_kptr_wrong_owner(void) { - LIBBPF_OPTS(bpf_test_run_opts, opts, - .data_in = &pkt_v4, - .data_size_in = sizeof(pkt_v4), - .repeat = 1, - ); - struct refcounted_kptr *skel; - int ret; - - skel = refcounted_kptr__open_and_load(); - if (!ASSERT_OK_PTR(skel, "refcounted_kptr__open_and_load")) - return; - - ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.rbtree_wrong_owner_remove_fail_a1), &opts); - ASSERT_OK(ret, "rbtree_wrong_owner_remove_fail_a1"); - ASSERT_OK(opts.retval, "rbtree_wrong_owner_remove_fail_a1 retval"); - - ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.rbtree_wrong_owner_remove_fail_b), &opts); - ASSERT_OK(ret, "rbtree_wrong_owner_remove_fail_b"); - ASSERT_OK(opts.retval, "rbtree_wrong_owner_remove_fail_b retval"); - - ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.rbtree_wrong_owner_remove_fail_a2), &opts); - ASSERT_OK(ret, "rbtree_wrong_owner_remove_fail_a2"); - ASSERT_OK(opts.retval, "rbtree_wrong_owner_remove_fail_a2 retval"); - refcounted_kptr__destroy(skel); }
From patchwork Tue Jul 11 17:59:45 2023
From: Dave Marchevsky
Subject: [PATCH bpf-next 6/6] [DONOTAPPLY] Revert "selftests/bpf: Disable newly-added 'owner' field test until refcount re-enabled"
Date: Tue, 11 Jul 2023 10:59:45 -0700
Message-ID: <20230711175945.3298231-7-davemarchevsky@fb.com>
In-Reply-To: <20230711175945.3298231-1-davemarchevsky@fb.com>
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Kernel Team, Dave Marchevsky

This reverts the previous patch so that CI can run the newly-added test, while keeping the series easy to apply in a test-disabled state.
Signed-off-by: Dave Marchevsky --- .../bpf/prog_tests/refcounted_kptr.c | 24 +++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/tools/testing/selftests/bpf/prog_tests/refcounted_kptr.c b/tools/testing/selftests/bpf/prog_tests/refcounted_kptr.c index f6b446512958..d6bd5e16e637 100644 --- a/tools/testing/selftests/bpf/prog_tests/refcounted_kptr.c +++ b/tools/testing/selftests/bpf/prog_tests/refcounted_kptr.c @@ -19,4 +19,28 @@ void test_refcounted_kptr_fail(void) void test_refcounted_kptr_wrong_owner(void) { + LIBBPF_OPTS(bpf_test_run_opts, opts, + .data_in = &pkt_v4, + .data_size_in = sizeof(pkt_v4), + .repeat = 1, + ); + struct refcounted_kptr *skel; + int ret; + + skel = refcounted_kptr__open_and_load(); + if (!ASSERT_OK_PTR(skel, "refcounted_kptr__open_and_load")) + return; + + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.rbtree_wrong_owner_remove_fail_a1), &opts); + ASSERT_OK(ret, "rbtree_wrong_owner_remove_fail_a1"); + ASSERT_OK(opts.retval, "rbtree_wrong_owner_remove_fail_a1 retval"); + + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.rbtree_wrong_owner_remove_fail_b), &opts); + ASSERT_OK(ret, "rbtree_wrong_owner_remove_fail_b"); + ASSERT_OK(opts.retval, "rbtree_wrong_owner_remove_fail_b retval"); + + ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.rbtree_wrong_owner_remove_fail_a2), &opts); + ASSERT_OK(ret, "rbtree_wrong_owner_remove_fail_a2"); + ASSERT_OK(opts.retval, "rbtree_wrong_owner_remove_fail_a2 retval"); + refcounted_kptr__destroy(skel); }
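A closing note on the mechanical BTF churn in patch 3's selftests: the extra __u64 reservation grows bpf_list_node from 16 to 24 bytes and bpf_rb_node from 24 to 32 bytes, which is where the updated sizes and offsets in linked_list.c (24, 32, 44, the 320-bit spin_lock offsets, and so on) come from. The tiny standalone check below makes that arithmetic explicit; it uses layout models only, not the real uapi structs:

  #include <assert.h>
  #include <stdio.h>

  /* Layout models of the UAPI node types after this series; the real
   * definitions live in uapi/linux/bpf.h as anonymous __u64 reservations.
   */
  struct list_node_model {
  	unsigned long long pad[2];	/* pre-series bpf_list_node: 2 x __u64 */
  	unsigned long long owner;	/* reservation added by this series    */
  } __attribute__((aligned(8)));

  struct rb_node_model {
  	unsigned long long pad[3];	/* pre-series bpf_rb_node: 3 x __u64 */
  	unsigned long long owner;	/* reservation added by this series  */
  } __attribute__((aligned(8)));

  static_assert(sizeof(struct list_node_model) == 24, "bpf_list_node grows 16 -> 24");
  static_assert(sizeof(struct rb_node_model) == 32, "bpf_rb_node grows 24 -> 32");

  int main(void)
  {
  	printf("bpf_list_node: %zu bytes, bpf_rb_node: %zu bytes\n",
  	       sizeof(struct list_node_model), sizeof(struct rb_node_model));
  	return 0;
  }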