From patchwork Tue Dec 6 23:09:48 2022
From: Dave Marchevsky
Subject: [PATCH bpf-next 01/13] bpf: Loosen alloc obj test in verifier's reg_btf_record
Date: Tue, 6 Dec 2022 15:09:48 -0800
Message-ID: <20221206231000.3180914-2-davemarchevsky@fb.com>

btf->struct_meta_tab is populated by btf_parse_struct_metas in btf.c.
There, a BTF record is created for any type containing a spin_lock or
any next-gen datastructure node/head.

Currently, for non-MAP_VALUE types, reg_btf_record will only search for
a record using struct_meta_tab if the reg->type exactly matches
(PTR_TO_BTF_ID | MEM_ALLOC).
This exact match is too strict: an "allocated obj" type - returned from
bpf_obj_new - might pick up other flags while working its way through
the program. Loosen the check to be exact for base_type and just use
the MEM_ALLOC mask for type_flag.

This patch is marked Fixes as the original intent of reg_btf_record was
unlikely to have been to fail finding btf_record for valid alloc obj
types with additional flags, some of which (e.g. PTR_UNTRUSTED) are
valid register type states for alloc obj independent of this series.
However, I didn't find a specific broken repro case outside of this
series' added functionality, so it's possible that nothing was
triggering this logic error before.

Signed-off-by: Dave Marchevsky
cc: Kumar Kartikeya Dwivedi
Fixes: 4e814da0d599 ("bpf: Allow locking bpf_spin_lock in allocated objects")
---
 kernel/bpf/verifier.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1d51bd9596da..67a13110bc22 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -451,6 +451,11 @@ static bool reg_type_not_null(enum bpf_reg_type type)
 	       type == PTR_TO_SOCK_COMMON;
 }
 
+static bool type_is_ptr_alloc_obj(u32 type)
+{
+	return base_type(type) == PTR_TO_BTF_ID && type_flag(type) & MEM_ALLOC;
+}
+
 static struct btf_record *reg_btf_record(const struct bpf_reg_state *reg)
 {
 	struct btf_record *rec = NULL;
@@ -458,7 +463,7 @@ static struct btf_record *reg_btf_record(const struct bpf_reg_state *reg)
 
 	if (reg->type == PTR_TO_MAP_VALUE) {
 		rec = reg->map_ptr->record;
-	} else if (reg->type == (PTR_TO_BTF_ID | MEM_ALLOC)) {
+	} else if (type_is_ptr_alloc_obj(reg->type)) {
 		meta = btf_find_struct_meta(reg->btf, reg->btf_id);
 		if (meta)
 			rec = meta->record;
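[To make the loosened check concrete, here is a small standalone C model
of the masking logic - an editorial illustration, not part of the patch.
PTR_TO_BTF_ID, MEM_ALLOC, PTR_UNTRUSTED and the base_type()/type_flag()
helpers below are simplified stand-ins for the verifier's actual
encoding.]

#include <assert.h>
#include <stdbool.h>

typedef unsigned int u32;

/* Simplified encoding: low byte holds the base type, upper bits hold
 * modifier flags, loosely mirroring bpf_reg_type's layout.
 */
enum { PTR_TO_BTF_ID = 1 };
enum { MEM_ALLOC = 1 << 8, PTR_UNTRUSTED = 1 << 9 };

static u32 base_type(u32 type) { return type & 0xff; }
static u32 type_flag(u32 type) { return type & ~0xffu; }

static bool type_is_ptr_alloc_obj(u32 type)
{
	return base_type(type) == PTR_TO_BTF_ID && type_flag(type) & MEM_ALLOC;
}

int main(void)
{
	/* The exact combination still matches. */
	assert(type_is_ptr_alloc_obj(PTR_TO_BTF_ID | MEM_ALLOC));
	/* An alloc obj that picked up PTR_UNTRUSTED now also matches,
	 * where the old equality check would have failed.
	 */
	assert(type_is_ptr_alloc_obj(PTR_TO_BTF_ID | MEM_ALLOC | PTR_UNTRUSTED));
	/* A non-allocated pointer is still rejected. */
	assert(!type_is_ptr_alloc_obj(PTR_TO_BTF_ID | PTR_UNTRUSTED));
	return 0;
}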
From patchwork Tue Dec 6 23:09:49 2022
From: Dave Marchevsky
Subject: [PATCH bpf-next 02/13] bpf: map_check_btf should fail if btf_parse_fields fails
Date: Tue, 6 Dec 2022 15:09:49 -0800
Message-ID: <20221206231000.3180914-3-davemarchevsky@fb.com>

map_check_btf calls btf_parse_fields to create a btf_record for its
value_type. If there are no special fields in the value_type,
btf_parse_fields returns NULL, whereas if there are special value_type
fields but they are invalid in some way, an error is returned.

An example invalid state would be:

  struct node_data {
    struct bpf_rb_node node;
    int data;
  };

  private(A) struct bpf_spin_lock glock;
  private(A) struct bpf_list_head ghead __contains(node_data, node);

ghead should be invalid as its __contains tag points to a field with
type != "bpf_list_node".

Before this patch, such a scenario would result in btf_parse_fields
returning an error ptr, the subsequent !IS_ERR_OR_NULL check failing,
and btf_check_and_fixup_fields returning 0, which would then be
returned by map_check_btf.

After this patch's changes, -EINVAL is returned by map_check_btf and
the map correctly fails to load.
Signed-off-by: Dave Marchevsky
cc: Kumar Kartikeya Dwivedi
Fixes: aa3496accc41 ("bpf: Refactor kptr_off_tab into btf_record")
---
 kernel/bpf/syscall.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 35972afb6850..c3599a7902f0 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1007,7 +1007,10 @@ static int map_check_btf(struct bpf_map *map, const struct btf *btf,
 	map->record = btf_parse_fields(btf, value_type,
 				       BPF_SPIN_LOCK | BPF_TIMER | BPF_KPTR | BPF_LIST_HEAD,
 				       map->value_size);
-	if (!IS_ERR_OR_NULL(map->record)) {
+	if (IS_ERR(map->record))
+		return -EINVAL;
+
+	if (map->record) {
 		int i;
 
 		if (!bpf_capable()) {
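[For readers less familiar with the kernel's error-pointer idiom, this
standalone sketch shows why splitting the IS_ERR() case out of
!IS_ERR_OR_NULL() matters. Illustrative only; ERR_PTR/IS_ERR below
mimic include/linux/err.h.]

#include <stdio.h>

/* An "error pointer" encodes a negative errno in the pointer value
 * itself, in the last page of the address space.
 */
#define MAX_ERRNO 4095
#define ERR_PTR(err)        ((void *)(long)(err))
#define IS_ERR(ptr)         ((unsigned long)(ptr) >= (unsigned long)-MAX_ERRNO)
#define IS_ERR_OR_NULL(ptr) (!(ptr) || IS_ERR(ptr))

int main(void)
{
	int dummy;
	void *record = &dummy;       /* valid btf_record */
	void *none   = NULL;         /* no special fields: fine */
	void *bad    = ERR_PTR(-22); /* invalid fields: should be -EINVAL */

	/* With only "if (!IS_ERR_OR_NULL(record)) { ... }", both the NULL
	 * and the ERR_PTR case silently skip the body, so an invalid
	 * value_type slipped through. Checking IS_ERR() first lets the
	 * error case fail the map load.
	 */
	printf("record: IS_ERR=%d\n", IS_ERR(record)); /* 0 */
	printf("none:   IS_ERR=%d\n", IS_ERR(none));   /* 0 */
	printf("bad:    IS_ERR=%d\n", IS_ERR(bad));    /* 1 -> return -EINVAL */
	return 0;
}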
From patchwork Tue Dec 6 23:09:50 2022
From: Dave Marchevsky
Subject: [PATCH bpf-next 03/13] bpf: Minor refactor of ref_set_release_on_unlock
Date: Tue, 6 Dec 2022 15:09:50 -0800
Message-ID: <20221206231000.3180914-4-davemarchevsky@fb.com>

This is mostly a nonfunctional change. The verifier log message
"expected false release_on_unlock" was missing a newline, so add it and
move some checks around to reduce indentation level.

Signed-off-by: Dave Marchevsky
---
 kernel/bpf/verifier.c | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 67a13110bc22..6f0aac837d77 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -8438,19 +8438,21 @@ static int ref_set_release_on_unlock(struct bpf_verifier_env *env, u32 ref_obj_i
 		return -EFAULT;
 	}
 	for (i = 0; i < state->acquired_refs; i++) {
-		if (state->refs[i].id == ref_obj_id) {
-			if (state->refs[i].release_on_unlock) {
-				verbose(env, "verifier internal error: expected false release_on_unlock");
-				return -EFAULT;
-			}
-			state->refs[i].release_on_unlock = true;
-			/* Now mark everyone sharing same ref_obj_id as untrusted */
-			bpf_for_each_reg_in_vstate(env->cur_state, state, reg, ({
-				if (reg->ref_obj_id == ref_obj_id)
-					reg->type |= PTR_UNTRUSTED;
-			}));
-			return 0;
+		if (state->refs[i].id != ref_obj_id)
+			continue;
+
+		if (state->refs[i].release_on_unlock) {
+			verbose(env, "verifier internal error: expected false release_on_unlock\n");
+			return -EFAULT;
 		}
+
+		state->refs[i].release_on_unlock = true;
+		/* Now mark everyone sharing same ref_obj_id as untrusted */
+		bpf_for_each_reg_in_vstate(env->cur_state, state, reg, ({
+			if (reg->ref_obj_id == ref_obj_id)
+				reg->type |= PTR_UNTRUSTED;
+		}));
+		return 0;
 	}
 	verbose(env, "verifier internal error: ref state missing for ref_obj_id\n");
 	return -EFAULT;
From patchwork Tue Dec 6 23:09:51 2022
From: Dave Marchevsky
Subject: [PATCH bpf-next 04/13] bpf: rename list_head -> datastructure_head in field info types
Date: Tue, 6 Dec 2022 15:09:51 -0800
Message-ID: <20221206231000.3180914-5-davemarchevsky@fb.com>

Many of the structs recently added to track field info for linked-list
head are useful as-is for rbtree root. So let's do a mechanical
renaming of list_head-related types and fields:

include/linux/bpf.h:
  struct btf_field_list_head -> struct btf_field_datastructure_head
  list_head -> datastructure_head in struct btf_field union
kernel/bpf/btf.c:
  list_head -> datastructure_head in struct btf_field_info

This is a nonfunctional change; functionality to actually use these
fields for rbtree will be added in further patches.
Signed-off-by: Dave Marchevsky
---
 include/linux/bpf.h   |  4 ++--
 kernel/bpf/btf.c      | 21 +++++++++++----------
 kernel/bpf/helpers.c  |  4 ++--
 kernel/bpf/verifier.c | 21 +++++++++++----------
 4 files changed, 26 insertions(+), 24 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 4920ac252754..9e8b12c7061e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -189,7 +189,7 @@ struct btf_field_kptr {
 	u32 btf_id;
 };
 
-struct btf_field_list_head {
+struct btf_field_datastructure_head {
 	struct btf *btf;
 	u32 value_btf_id;
 	u32 node_offset;
@@ -201,7 +201,7 @@ struct btf_field {
 	enum btf_field_type type;
 	union {
 		struct btf_field_kptr kptr;
-		struct btf_field_list_head list_head;
+		struct btf_field_datastructure_head datastructure_head;
 	};
 };
 
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index c80bd8709e69..284e3e4b76b7 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -3227,7 +3227,7 @@ struct btf_field_info {
 		struct {
 			const char *node_name;
 			u32 value_btf_id;
-		} list_head;
+		} datastructure_head;
 	};
 };
 
@@ -3334,8 +3334,8 @@ static int btf_find_list_head(const struct btf *btf, const struct btf_type *pt,
 		return -EINVAL;
 	info->type = BPF_LIST_HEAD;
 	info->off = off;
-	info->list_head.value_btf_id = id;
-	info->list_head.node_name = list_node;
+	info->datastructure_head.value_btf_id = id;
+	info->datastructure_head.node_name = list_node;
 	return BTF_FIELD_FOUND;
 }
 
@@ -3603,13 +3603,14 @@ static int btf_parse_list_head(const struct btf *btf, struct btf_field *field,
 	u32 offset;
 	int i;
 
-	t = btf_type_by_id(btf, info->list_head.value_btf_id);
+	t = btf_type_by_id(btf, info->datastructure_head.value_btf_id);
 	/* We've already checked that value_btf_id is a struct type. We
 	 * just need to figure out the offset of the list_node, and
 	 * verify its type.
	 */
	for_each_member(i, t, member) {
-		if (strcmp(info->list_head.node_name, __btf_name_by_offset(btf, member->name_off)))
+		if (strcmp(info->datastructure_head.node_name,
+			   __btf_name_by_offset(btf, member->name_off)))
 			continue;
 		/* Invalid BTF, two members with same name */
 		if (n)
@@ -3626,9 +3627,9 @@ static int btf_parse_list_head(const struct btf *btf, struct btf_field *field,
 	if (offset % __alignof__(struct bpf_list_node))
 		return -EINVAL;
 
-	field->list_head.btf = (struct btf *)btf;
-	field->list_head.value_btf_id = info->list_head.value_btf_id;
-	field->list_head.node_offset = offset;
+	field->datastructure_head.btf = (struct btf *)btf;
+	field->datastructure_head.value_btf_id = info->datastructure_head.value_btf_id;
+	field->datastructure_head.node_offset = offset;
 	}
 	if (!n)
 		return -ENOENT;
@@ -3735,11 +3736,11 @@ int btf_check_and_fixup_fields(const struct btf *btf, struct btf_record *rec)
 		if (!(rec->fields[i].type & BPF_LIST_HEAD))
 			continue;
-		btf_id = rec->fields[i].list_head.value_btf_id;
+		btf_id = rec->fields[i].datastructure_head.value_btf_id;
 		meta = btf_find_struct_meta(btf, btf_id);
 		if (!meta)
 			return -EFAULT;
-		rec->fields[i].list_head.value_rec = meta->record;
+		rec->fields[i].datastructure_head.value_rec = meta->record;
 
 		if (!(rec->field_mask & BPF_LIST_NODE))
 			continue;
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index cca642358e80..6c67740222c2 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1737,12 +1737,12 @@ void bpf_list_head_free(const struct btf_field *field, void *list_head,
 	while (head != orig_head) {
 		void *obj = head;
 
-		obj -= field->list_head.node_offset;
+		obj -= field->datastructure_head.node_offset;
 		head = head->next;
 		/* The contained type can also have resources, including a
 		 * bpf_list_head which needs to be freed.
 		 */
-		bpf_obj_free_fields(field->list_head.value_rec, obj);
+		bpf_obj_free_fields(field->datastructure_head.value_rec, obj);
 		/* bpf_mem_free requires migrate_disable(), since we can be
 		 * called from map free path as well apart from BPF program (as
 		 * part of map ops doing bpf_obj_free_fields).
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 6f0aac837d77..bc80b4c4377b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -8615,21 +8615,22 @@ static int process_kf_arg_ptr_to_list_node(struct bpf_verifier_env *env,
 
 	field = meta->arg_list_head.field;
 
-	et = btf_type_by_id(field->list_head.btf, field->list_head.value_btf_id);
+	et = btf_type_by_id(field->datastructure_head.btf, field->datastructure_head.value_btf_id);
 	t = btf_type_by_id(reg->btf, reg->btf_id);
-	if (!btf_struct_ids_match(&env->log, reg->btf, reg->btf_id, 0, field->list_head.btf,
-				  field->list_head.value_btf_id, true)) {
+	if (!btf_struct_ids_match(&env->log, reg->btf, reg->btf_id, 0, field->datastructure_head.btf,
+				  field->datastructure_head.value_btf_id, true)) {
 		verbose(env, "operation on bpf_list_head expects arg#1 bpf_list_node at offset=%d "
 			"in struct %s, but arg is at offset=%d in struct %s\n",
-			field->list_head.node_offset, btf_name_by_offset(field->list_head.btf, et->name_off),
+			field->datastructure_head.node_offset,
+			btf_name_by_offset(field->datastructure_head.btf, et->name_off),
 			list_node_off, btf_name_by_offset(reg->btf, t->name_off));
 		return -EINVAL;
 	}
-	if (list_node_off != field->list_head.node_offset) {
+	if (list_node_off != field->datastructure_head.node_offset) {
 		verbose(env, "arg#1 offset=%d, but expected bpf_list_node at offset=%d in struct %s\n",
-			list_node_off, field->list_head.node_offset,
-			btf_name_by_offset(field->list_head.btf, et->name_off));
+			list_node_off, field->datastructure_head.node_offset,
+			btf_name_by_offset(field->datastructure_head.btf, et->name_off));
 		return -EINVAL;
 	}
 	/* Set arg#1 for expiration after unlock */
@@ -9078,9 +9079,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 		mark_reg_known_zero(env, regs, BPF_REG_0);
 		regs[BPF_REG_0].type = PTR_TO_BTF_ID | MEM_ALLOC;
-		regs[BPF_REG_0].btf = field->list_head.btf;
-		regs[BPF_REG_0].btf_id = field->list_head.value_btf_id;
-		regs[BPF_REG_0].off = field->list_head.node_offset;
+		regs[BPF_REG_0].btf = field->datastructure_head.btf;
+		regs[BPF_REG_0].btf_id = field->datastructure_head.value_btf_id;
+		regs[BPF_REG_0].off = field->datastructure_head.node_offset;
 	} else if (meta.func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx]) {
 		mark_reg_known_zero(env, regs, BPF_REG_0);
 		regs[BPF_REG_0].type = PTR_TO_BTF_ID | PTR_TRUSTED;
From patchwork Tue Dec 6 23:09:52 2022
From: Dave Marchevsky
Subject: [PATCH bpf-next 05/13] bpf: Add basic bpf_rb_{root,node} support
Date: Tue, 6 Dec 2022 15:09:52 -0800
Message-ID: <20221206231000.3180914-6-davemarchevsky@fb.com>

This patch adds special BPF_RB_{ROOT,NODE} btf_field_types similar to
BPF_LIST_{HEAD,NODE}, adds the necessary plumbing to detect the new
types, and adds a bpf_rb_root_free function for freeing bpf_rb_root in
map_values.

structs bpf_rb_root and bpf_rb_node are opaque types meant to obscure
structs rb_root_cached and rb_node, respectively. btf_struct_access
will prevent BPF programs from touching these special fields
automatically now that they're recognized.

btf_check_and_fixup_fields now groups list_head and rb_root together as
"owner" fields and {list,rb}_node as "ownee", and does the same
ownership cycle checking as before. Note that this function does _not_
prevent ownership type mixups (e.g. rb_root owning list_node) - that's
handled by btf_parse_datastructure_head.

After this patch, a BPF program can have a struct bpf_rb_root in a
map_value, but not add anything to it nor do anything useful with it.
Signed-off-by: Dave Marchevsky
---
 include/linux/bpf.h                                |  17 ++
 include/uapi/linux/bpf.h                           |  11 ++
 kernel/bpf/btf.c                                   | 162 ++++++++++++------
 kernel/bpf/helpers.c                               |  40 +++++
 kernel/bpf/syscall.c                               |  28 ++-
 kernel/bpf/verifier.c                              |   5 +-
 tools/include/uapi/linux/bpf.h                     |  11 ++
 .../selftests/bpf/prog_tests/linked_list.c         |  12 +-
 8 files changed, 214 insertions(+), 72 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 9e8b12c7061e..2f8c4960390e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -180,6 +180,8 @@ enum btf_field_type {
 	BPF_KPTR       = BPF_KPTR_UNREF | BPF_KPTR_REF,
 	BPF_LIST_HEAD  = (1 << 4),
 	BPF_LIST_NODE  = (1 << 5),
+	BPF_RB_ROOT    = (1 << 6),
+	BPF_RB_NODE    = (1 << 7),
 };
 
 struct btf_field_kptr {
 	u32 btf_id;
@@ -283,6 +285,10 @@ static inline const char *btf_field_type_name(enum btf_field_type type)
 		return "bpf_list_head";
 	case BPF_LIST_NODE:
 		return "bpf_list_node";
+	case BPF_RB_ROOT:
+		return "bpf_rb_root";
+	case BPF_RB_NODE:
+		return "bpf_rb_node";
 	default:
 		WARN_ON_ONCE(1);
 		return "unknown";
@@ -303,6 +309,10 @@ static inline u32 btf_field_type_size(enum btf_field_type type)
 		return sizeof(struct bpf_list_head);
 	case BPF_LIST_NODE:
 		return sizeof(struct bpf_list_node);
+	case BPF_RB_ROOT:
+		return sizeof(struct bpf_rb_root);
+	case BPF_RB_NODE:
+		return sizeof(struct bpf_rb_node);
 	default:
 		WARN_ON_ONCE(1);
 		return 0;
@@ -323,6 +333,10 @@ static inline u32 btf_field_type_align(enum btf_field_type type)
 		return __alignof__(struct bpf_list_head);
 	case BPF_LIST_NODE:
 		return __alignof__(struct bpf_list_node);
+	case BPF_RB_ROOT:
+		return __alignof__(struct bpf_rb_root);
+	case BPF_RB_NODE:
+		return __alignof__(struct bpf_rb_node);
 	default:
 		WARN_ON_ONCE(1);
 		return 0;
@@ -433,6 +447,9 @@ void copy_map_value_locked(struct bpf_map *map, void *dst, void *src,
 void bpf_timer_cancel_and_free(void *timer);
 void bpf_list_head_free(const struct btf_field *field, void *list_head,
 			struct bpf_spin_lock *spin_lock);
+void bpf_rb_root_free(const struct btf_field *field, void *rb_root,
+		      struct bpf_spin_lock *spin_lock);
+
 int bpf_obj_name_cpy(char *dst, const char *src, unsigned int size);
 
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index f89de51a45db..02e68c352372 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -6901,6 +6901,17 @@ struct bpf_list_node {
 	__u64 :64;
 } __attribute__((aligned(8)));
 
+struct bpf_rb_root {
+	__u64 :64;
+	__u64 :64;
+} __attribute__((aligned(8)));
+
+struct bpf_rb_node {
+	__u64 :64;
+	__u64 :64;
+	__u64 :64;
+} __attribute__((aligned(8)));
+
 struct bpf_sysctl {
 	__u32	write;		/* Sysctl is being read (= 0) or written (= 1).
 				 * Allows 1,2,4-byte read, but no write.
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 284e3e4b76b7..a42f67031963 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -3304,12 +3304,14 @@ static const char *btf_find_decl_tag_value(const struct btf *btf,
 	return NULL;
 }
 
-static int btf_find_list_head(const struct btf *btf, const struct btf_type *pt,
-			      const struct btf_type *t, int comp_idx,
-			      u32 off, int sz, struct btf_field_info *info)
+static int
+btf_find_datastructure_head(const struct btf *btf, const struct btf_type *pt,
+			    const struct btf_type *t, int comp_idx, u32 off,
+			    int sz, struct btf_field_info *info,
+			    enum btf_field_type head_type)
 {
+	const char *node_field_name;
 	const char *value_type;
-	const char *list_node;
 	s32 id;
 
 	if (!__btf_type_is_struct(t))
@@ -3319,26 +3321,32 @@ static int btf_find_list_head(const struct btf *btf, const struct btf_type *pt,
 	value_type = btf_find_decl_tag_value(btf, pt, comp_idx, "contains:");
 	if (!value_type)
 		return -EINVAL;
-	list_node = strstr(value_type, ":");
-	if (!list_node)
+	node_field_name = strstr(value_type, ":");
+	if (!node_field_name)
 		return -EINVAL;
-	value_type = kstrndup(value_type, list_node - value_type, GFP_KERNEL | __GFP_NOWARN);
+	value_type = kstrndup(value_type, node_field_name - value_type, GFP_KERNEL | __GFP_NOWARN);
 	if (!value_type)
 		return -ENOMEM;
 	id = btf_find_by_name_kind(btf, value_type, BTF_KIND_STRUCT);
 	kfree(value_type);
 	if (id < 0)
 		return id;
-	list_node++;
-	if (str_is_empty(list_node))
+	node_field_name++;
+	if (str_is_empty(node_field_name))
 		return -EINVAL;
-	info->type = BPF_LIST_HEAD;
+	info->type = head_type;
 	info->off = off;
 	info->datastructure_head.value_btf_id = id;
-	info->datastructure_head.node_name = list_node;
+	info->datastructure_head.node_name = node_field_name;
 	return BTF_FIELD_FOUND;
 }
 
+#define field_mask_test_name(field_type, field_type_str) \
+	if (field_mask & field_type && !strcmp(name, field_type_str)) { \
+		type = field_type;				\
+		goto end;					\
+	}
+
 static int btf_get_field_type(const char *name, u32 field_mask, u32 *seen_mask,
 			      int *align, int *sz)
 {
@@ -3362,18 +3370,11 @@ static int btf_get_field_type(const char *name, u32 field_mask, u32 *seen_mask,
 			goto end;
 		}
 	}
-	if (field_mask & BPF_LIST_HEAD) {
-		if (!strcmp(name, "bpf_list_head")) {
-			type = BPF_LIST_HEAD;
-			goto end;
-		}
-	}
-	if (field_mask & BPF_LIST_NODE) {
-		if (!strcmp(name, "bpf_list_node")) {
-			type = BPF_LIST_NODE;
-			goto end;
-		}
-	}
+	field_mask_test_name(BPF_LIST_HEAD, "bpf_list_head");
+	field_mask_test_name(BPF_LIST_NODE, "bpf_list_node");
+	field_mask_test_name(BPF_RB_ROOT,   "bpf_rb_root");
+	field_mask_test_name(BPF_RB_NODE,   "bpf_rb_node");
+
 	/* Only return BPF_KPTR when all other types with matchable names fail */
 	if (field_mask & BPF_KPTR) {
 		type = BPF_KPTR_REF;
@@ -3386,6 +3387,8 @@ static int btf_get_field_type(const char *name, u32 field_mask, u32 *seen_mask,
 	return type;
 }
 
+#undef field_mask_test_name
+
 static int btf_find_struct_field(const struct btf *btf,
 				 const struct btf_type *t, u32 field_mask,
 				 struct btf_field_info *info, int info_cnt)
@@ -3418,6 +3421,7 @@ static int btf_find_struct_field(const struct btf *btf,
 		case BPF_SPIN_LOCK:
 		case BPF_TIMER:
 		case BPF_LIST_NODE:
+		case BPF_RB_NODE:
 			ret = btf_find_struct(btf, member_type, off, sz, field_type,
 					      idx < info_cnt ? &info[idx] : &tmp);
 			if (ret < 0)
@@ -3431,8 +3435,11 @@ static int btf_find_struct_field(const struct btf *btf,
 				return ret;
 			break;
 		case BPF_LIST_HEAD:
-			ret = btf_find_list_head(btf, t, member_type, i, off, sz,
-						 idx < info_cnt ? &info[idx] : &tmp);
+		case BPF_RB_ROOT:
+			ret = btf_find_datastructure_head(btf, t, member_type,
+							  i, off, sz,
+							  idx < info_cnt ? &info[idx] : &tmp,
+							  field_type);
 			if (ret < 0)
 				return ret;
 			break;
@@ -3479,6 +3486,7 @@ static int btf_find_datasec_var(const struct btf *btf, const struct btf_type *t,
 		case BPF_SPIN_LOCK:
 		case BPF_TIMER:
 		case BPF_LIST_NODE:
+		case BPF_RB_NODE:
 			ret = btf_find_struct(btf, var_type, off, sz, field_type,
 					      idx < info_cnt ? &info[idx] : &tmp);
 			if (ret < 0)
@@ -3492,8 +3500,11 @@ static int btf_find_datasec_var(const struct btf *btf, const struct btf_type *t,
 				return ret;
 			break;
 		case BPF_LIST_HEAD:
-			ret = btf_find_list_head(btf, var, var_type, -1, off, sz,
-						 idx < info_cnt ? &info[idx] : &tmp);
+		case BPF_RB_ROOT:
+			ret = btf_find_datastructure_head(btf, var, var_type,
+							  -1, off, sz,
+							  idx < info_cnt ? &info[idx] : &tmp,
+							  field_type);
 			if (ret < 0)
 				return ret;
 			break;
@@ -3595,8 +3606,11 @@ static int btf_parse_kptr(const struct btf *btf, struct btf_field *field,
 	return ret;
 }
 
-static int btf_parse_list_head(const struct btf *btf, struct btf_field *field,
-			       struct btf_field_info *info)
+static int btf_parse_datastructure_head(const struct btf *btf,
+					struct btf_field *field,
+					struct btf_field_info *info,
+					const char *node_type_name,
+					size_t node_type_align)
 {
 	const struct btf_type *t, *n = NULL;
 	const struct btf_member *member;
@@ -3618,13 +3632,13 @@ static int btf_parse_list_head(const struct btf *btf, struct btf_field *field,
 	n = btf_type_by_id(btf, member->type);
 	if (!__btf_type_is_struct(n))
 		return -EINVAL;
-	if (strcmp("bpf_list_node", __btf_name_by_offset(btf, n->name_off)))
+	if (strcmp(node_type_name, __btf_name_by_offset(btf, n->name_off)))
 		return -EINVAL;
 	offset = __btf_member_bit_offset(n, member);
 	if (offset % 8)
 		return -EINVAL;
 	offset /= 8;
-	if (offset % __alignof__(struct bpf_list_node))
+	if (offset % node_type_align)
 		return -EINVAL;
 
 	field->datastructure_head.btf = (struct btf *)btf;
@@ -3636,6 +3650,20 @@ static int btf_parse_datastructure_head(const struct btf *btf,
 	return 0;
 }
 
+static int btf_parse_list_head(const struct btf *btf, struct btf_field *field,
+			       struct btf_field_info *info)
+{
+	return btf_parse_datastructure_head(btf, field, info, "bpf_list_node",
+					    __alignof__(struct bpf_list_node));
+}
+
+static int btf_parse_rb_root(const struct btf *btf, struct btf_field *field,
+			     struct btf_field_info *info)
+{
+	return btf_parse_datastructure_head(btf, field, info, "bpf_rb_node",
+					    __alignof__(struct bpf_rb_node));
+}
+
 struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type *t,
 				    u32 field_mask, u32 value_size)
 {
@@ -3698,7 +3726,13 @@ struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type
 			if (ret < 0)
 				goto end;
 			break;
+		case BPF_RB_ROOT:
+			ret = btf_parse_rb_root(btf, &rec->fields[i], &info_arr[i]);
+			if (ret < 0)
+				goto end;
+			break;
 		case BPF_LIST_NODE:
+		case BPF_RB_NODE:
 			break;
 		default:
 			ret = -EFAULT;
@@ -3707,8 +3741,9 @@ struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type
 		rec->cnt++;
 	}
 
-	/* bpf_list_head requires bpf_spin_lock */
-	if (btf_record_has_field(rec, BPF_LIST_HEAD) && rec->spin_lock_off < 0) {
+	/* bpf_{list_head, rb_node} require bpf_spin_lock */
+	if ((btf_record_has_field(rec, BPF_LIST_HEAD) ||
+	     btf_record_has_field(rec, BPF_RB_ROOT)) && rec->spin_lock_off < 0) {
 		ret = -EINVAL;
 		goto end;
 	}
@@ -3719,22 +3754,28 @@ struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type
 	return ERR_PTR(ret);
 }
 
+#define OWNER_FIELD_MASK (BPF_LIST_HEAD | BPF_RB_ROOT)
+#define OWNEE_FIELD_MASK (BPF_LIST_NODE | BPF_RB_NODE)
+
 int btf_check_and_fixup_fields(const struct btf *btf, struct btf_record *rec)
 {
 	int i;
 
-	/* There are two owning types, kptr_ref and bpf_list_head. The former
-	 * only supports storing kernel types, which can never store references
-	 * to program allocated local types, atleast not yet. Hence we only need
-	 * to ensure that bpf_list_head ownership does not form cycles.
+	/* There are three types that signify ownership of some other type:
+	 * kptr_ref, bpf_list_head, bpf_rb_root.
+	 * kptr_ref only supports storing kernel types, which can't store
+	 * references to program allocated local types.
+	 *
+	 * Hence we only need to ensure that bpf_{list_head,rb_root} ownership
+	 * does not form cycles.
 	 */
-	if (IS_ERR_OR_NULL(rec) || !(rec->field_mask & BPF_LIST_HEAD))
+	if (IS_ERR_OR_NULL(rec) || !(rec->field_mask & OWNER_FIELD_MASK))
 		return 0;
 	for (i = 0; i < rec->cnt; i++) {
 		struct btf_struct_meta *meta;
 		u32 btf_id;
 
-		if (!(rec->fields[i].type & BPF_LIST_HEAD))
+		if (!(rec->fields[i].type & OWNER_FIELD_MASK))
 			continue;
 		btf_id = rec->fields[i].datastructure_head.value_btf_id;
 		meta = btf_find_struct_meta(btf, btf_id);
@@ -3742,39 +3783,47 @@ int btf_check_and_fixup_fields(const struct btf *btf, struct btf_record *rec)
 			return -EFAULT;
 		rec->fields[i].datastructure_head.value_rec = meta->record;
 
-		if (!(rec->field_mask & BPF_LIST_NODE))
+		/* We need to set value_rec for all owner types, but no need
+		 * to check ownership cycle for a type unless it's also an
+		 * ownee type.
+		 */
+		if (!(rec->field_mask & OWNEE_FIELD_MASK))
 			continue;
 
 		/* We need to ensure ownership acyclicity among all types. The
 		 * proper way to do it would be to topologically sort all BTF
 		 * IDs based on the ownership edges, since there can be multiple
-		 * bpf_list_head in a type. Instead, we use the following
-		 * reasoning:
+		 * bpf_{list_head,rb_node} in a type. Instead, we use the
+		 * following reasoning:
 		 *
 		 * - A type can only be owned by another type in user BTF if it
-		 *   has a bpf_list_node.
+		 *   has a bpf_{list,rb}_node. Let's call these ownee types.
 		 * - A type can only _own_ another type in user BTF if it has a
-		 *   bpf_list_head.
+		 *   bpf_{list_head,rb_root}. Let's call these owner types.
 		 *
-		 * We ensure that if a type has both bpf_list_head and
-		 * bpf_list_node, its element types cannot be owning types.
+		 * We ensure that if a type is both an owner and ownee, its
+		 * element types cannot be owner types.
 		 *
		 * To ensure acyclicity:
 		 *
-		 * When A only has bpf_list_head, ownership chain can be:
+		 * When A is an owner type but not an ownee, its ownership
+		 * chain can be:
 		 *	A -> B -> C
 		 * Where:
-		 * - B has both bpf_list_head and bpf_list_node.
-		 * - C only has bpf_list_node.
+		 * - A is an owner, e.g. has bpf_rb_root.
+		 * - B is both an owner and ownee, e.g. has bpf_rb_node and
+		 *   bpf_list_head.
+		 * - C is only an ownee, e.g. has bpf_list_node.
 		 *
-		 * When A has both bpf_list_head and bpf_list_node, some other
-		 * type already owns it in the BTF domain, hence it can not own
-		 * another owning type through any of the bpf_list_head edges.
+		 * When A is both an owner and ownee, some other type already
+		 * owns it in the BTF domain, hence it can not own
+		 * another owner type through any of the ownership edges.
 		 *	A -> B
 		 * Where:
-		 * - B only has bpf_list_node.
+		 * - A is both an owner and ownee.
+		 * - B is only an ownee.
		 */
-		if (meta->record->field_mask & BPF_LIST_HEAD)
+		if (meta->record->field_mask & OWNER_FIELD_MASK)
 			return -ELOOP;
 	}
 	return 0;
@@ -5236,6 +5285,8 @@ static const char *alloc_obj_fields[] = {
 	"bpf_spin_lock",
 	"bpf_list_head",
 	"bpf_list_node",
+	"bpf_rb_root",
+	"bpf_rb_node",
 };
 
 static struct btf_struct_metas *
@@ -5309,7 +5360,8 @@ btf_parse_struct_metas(struct bpf_verifier_log *log, struct btf *btf)
 
 		type = &tab->types[tab->cnt];
 		type->btf_id = i;
-		record = btf_parse_fields(btf, t, BPF_SPIN_LOCK | BPF_LIST_HEAD | BPF_LIST_NODE, t->size);
+		record = btf_parse_fields(btf, t, BPF_SPIN_LOCK | BPF_LIST_HEAD | BPF_LIST_NODE |
+						  BPF_RB_ROOT | BPF_RB_NODE, t->size);
 		/* The record cannot be unset, treat it as an error if so */
 		if (IS_ERR_OR_NULL(record)) {
 			ret = PTR_ERR_OR_ZERO(record) ?: -EFAULT;
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 6c67740222c2..4d04432b162e 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1753,6 +1753,46 @@ void bpf_list_head_free(const struct btf_field *field, void *list_head,
 	}
 }
 
+/* Like rbtree_postorder_for_each_entry_safe, but 'pos' and 'n' are
+ * 'rb_node *', so field name of rb_node within containing struct is not
+ * needed.
+ *
+ * Since bpf_rb_tree's node type has a corresponding struct btf_field with
+ * datastructure_head.node_offset, it's not necessary to know field name
+ * or type of node struct
+ */
+#define bpf_rbtree_postorder_for_each_entry_safe(pos, n, root) \
+	for (pos = rb_first_postorder(root); \
+	     pos && ({ n = rb_next_postorder(pos); 1; }); \
+	     pos = n)
+
+void bpf_rb_root_free(const struct btf_field *field, void *rb_root,
+		      struct bpf_spin_lock *spin_lock)
+{
+	struct rb_root_cached orig_root, *root = rb_root;
+	struct rb_node *pos, *n;
+	void *obj;
+
+	BUILD_BUG_ON(sizeof(struct rb_root_cached) > sizeof(struct bpf_rb_root));
+	BUILD_BUG_ON(__alignof__(struct rb_root_cached) > __alignof__(struct bpf_rb_root));
+
+	__bpf_spin_lock_irqsave(spin_lock);
+	orig_root = *root;
+	*root = RB_ROOT_CACHED;
+	__bpf_spin_unlock_irqrestore(spin_lock);
+
+	bpf_rbtree_postorder_for_each_entry_safe(pos, n, &orig_root.rb_root) {
+		obj = pos;
+		obj -= field->datastructure_head.node_offset;
+
+		bpf_obj_free_fields(field->datastructure_head.value_rec, obj);
+
+		migrate_disable();
+		bpf_mem_free(&bpf_global_ma, obj);
+		migrate_enable();
+	}
+}
+
 __diag_push();
 __diag_ignore_all("-Wmissing-prototypes",
 		  "Global functions as their definitions will be in vmlinux BTF");
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index c3599a7902f0..b6b464c15575 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -527,9 +527,6 @@ void btf_record_free(struct btf_record *rec)
 		return;
 	for (i = 0; i < rec->cnt; i++) {
 		switch (rec->fields[i].type) {
-		case BPF_SPIN_LOCK:
-		case BPF_TIMER:
-			break;
 		case BPF_KPTR_UNREF:
 		case BPF_KPTR_REF:
 			if (rec->fields[i].kptr.module)
@@ -538,7 +535,11 @@ void btf_record_free(struct btf_record *rec)
 			break;
 		case BPF_LIST_HEAD:
 		case BPF_LIST_NODE:
-			/* Nothing to release for bpf_list_head */
+		case BPF_RB_ROOT:
+		case BPF_RB_NODE:
+		case BPF_SPIN_LOCK:
+		case BPF_TIMER:
+			/* Nothing to release */
 			break;
 		default:
 			WARN_ON_ONCE(1);
@@ -571,9 +572,6 @@ struct btf_record *btf_record_dup(const struct btf_record *rec)
 	new_rec->cnt = 0;
 	for (i = 0; i < rec->cnt; i++) {
 		switch (fields[i].type) {
-		case BPF_SPIN_LOCK:
-		case BPF_TIMER:
-			break;
 		case BPF_KPTR_UNREF:
 		case BPF_KPTR_REF:
 			btf_get(fields[i].kptr.btf);
@@ -584,7 +582,11 @@ struct btf_record *btf_record_dup(const struct btf_record *rec)
 			break;
 		case BPF_LIST_HEAD:
 		case BPF_LIST_NODE:
-			/* Nothing to acquire for bpf_list_head */
+		case BPF_RB_ROOT:
+		case BPF_RB_NODE:
+		case BPF_SPIN_LOCK:
+		case BPF_TIMER:
+			/* Nothing to acquire */
 			break;
 		default:
 			ret = -EFAULT;
@@ -664,7 +666,13 @@ void bpf_obj_free_fields(const struct btf_record *rec, void *obj)
 				continue;
 			bpf_list_head_free(field, field_ptr, obj + rec->spin_lock_off);
 			break;
+		case BPF_RB_ROOT:
+			if (WARN_ON_ONCE(rec->spin_lock_off < 0))
+				continue;
+			bpf_rb_root_free(field, field_ptr, obj + rec->spin_lock_off);
+			break;
 		case BPF_LIST_NODE:
+		case BPF_RB_NODE:
 			break;
 		default:
 			WARN_ON_ONCE(1);
@@ -1005,7 +1013,8 @@ static int map_check_btf(struct bpf_map *map, const struct btf *btf,
 		return -EINVAL;
 
 	map->record = btf_parse_fields(btf, value_type,
-				       BPF_SPIN_LOCK | BPF_TIMER | BPF_KPTR | BPF_LIST_HEAD,
+				       BPF_SPIN_LOCK | BPF_TIMER | BPF_KPTR | BPF_LIST_HEAD |
+				       BPF_RB_ROOT,
 				       map->value_size);
 	if (IS_ERR(map->record))
 		return -EINVAL;
@@ -1056,6 +1065,7 @@ static int map_check_btf(struct bpf_map *map, const struct btf *btf,
 				}
 				break;
 			case BPF_LIST_HEAD:
+			case BPF_RB_ROOT:
 				if (map->map_type != BPF_MAP_TYPE_HASH &&
 				    map->map_type != BPF_MAP_TYPE_LRU_HASH &&
 				    map->map_type != BPF_MAP_TYPE_ARRAY) {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index bc80b4c4377b..9d9e00fd6dfa 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -14105,9 +14105,10 @@ static int check_map_prog_compatibility(struct bpf_verifier_env *env,
 {
 	enum bpf_prog_type prog_type = resolve_prog_type(prog);
 
-	if (btf_record_has_field(map->record, BPF_LIST_HEAD)) {
+	if (btf_record_has_field(map->record, BPF_LIST_HEAD) ||
+	    btf_record_has_field(map->record, BPF_RB_ROOT)) {
 		if (is_tracing_prog_type(prog_type)) {
-			verbose(env, "tracing progs cannot use bpf_list_head yet\n");
+			verbose(env, "tracing progs cannot use bpf_{list_head,rb_root} yet\n");
 			return -EINVAL;
 		}
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index f89de51a45db..02e68c352372 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -6901,6 +6901,17 @@ struct bpf_list_node {
 	__u64 :64;
 } __attribute__((aligned(8)));
 
+struct bpf_rb_root {
+	__u64 :64;
+	__u64 :64;
+} __attribute__((aligned(8)));
+
+struct bpf_rb_node {
+	__u64 :64;
+	__u64 :64;
+	__u64 :64;
+} __attribute__((aligned(8)));
+
 struct bpf_sysctl {
 	__u32	write;		/* Sysctl is being read (= 0) or written (= 1).
 				 * Allows 1,2,4-byte read, but no write.
diff --git a/tools/testing/selftests/bpf/prog_tests/linked_list.c b/tools/testing/selftests/bpf/prog_tests/linked_list.c
index 9a7d4c47af63..b124028ab51a 100644
--- a/tools/testing/selftests/bpf/prog_tests/linked_list.c
+++ b/tools/testing/selftests/bpf/prog_tests/linked_list.c
@@ -58,12 +58,12 @@ static struct {
 	TEST(inner_map, pop_front)
 	TEST(inner_map, pop_back)
 #undef TEST
-	{ "map_compat_kprobe", "tracing progs cannot use bpf_list_head yet" },
-	{ "map_compat_kretprobe", "tracing progs cannot use bpf_list_head yet" },
-	{ "map_compat_tp", "tracing progs cannot use bpf_list_head yet" },
-	{ "map_compat_perf", "tracing progs cannot use bpf_list_head yet" },
-	{ "map_compat_raw_tp", "tracing progs cannot use bpf_list_head yet" },
-	{ "map_compat_raw_tp_w", "tracing progs cannot use bpf_list_head yet" },
+	{ "map_compat_kprobe", "tracing progs cannot use bpf_{list_head,rb_root} yet" },
+	{ "map_compat_kretprobe", "tracing progs cannot use bpf_{list_head,rb_root} yet" },
+	{ "map_compat_tp", "tracing progs cannot use bpf_{list_head,rb_root} yet" },
+	{ "map_compat_perf", "tracing progs cannot use bpf_{list_head,rb_root} yet" },
+	{ "map_compat_raw_tp", "tracing progs cannot use bpf_{list_head,rb_root} yet" },
+	{ "map_compat_raw_tp_w", "tracing progs cannot use bpf_{list_head,rb_root} yet" },
 	{ "obj_type_id_oor", "local type ID argument must be in range [0, U32_MAX]" },
 	{ "obj_new_no_composite", "bpf_obj_new type ID argument must be of a struct" },
 	{ "obj_new_no_struct", "bpf_obj_new type ID argument must be of a struct" },
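[For orientation, here is a BPF-side sketch of what declaring one of
these new fields could look like - an editorial illustration, not part
of the patch. The private() and __contains() macros follow the
conventions of this series' linked-list selftests, and node_data,
glock and groot are made-up names.]

// SPDX-License-Identifier: GPL-2.0
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>

/* Hypothetical node type; the __contains tag below points at its
 * bpf_rb_node member.
 */
struct node_data {
	long key;
	struct bpf_rb_node node;
};

#define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8)))
#define __contains(name, node) __attribute__((btf_decl_tag("contains:" #name ":" #node)))

/* A bpf_rb_root must be paired with a bpf_spin_lock, matching the
 * requirement btf_parse_fields enforces above for bpf_list_head.
 */
private(A) struct bpf_spin_lock glock;
private(A) struct bpf_rb_root groot __contains(node_data, node);

char _license[] SEC("license") = "GPL";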
From patchwork Tue Dec 6 23:09:53 2022
From: Dave Marchevsky
Subject: [PATCH bpf-next 06/13] bpf: Add bpf_rbtree_{add,remove,first} kfuncs
Date: Tue, 6 Dec 2022 15:09:53 -0800
Message-ID: <20221206231000.3180914-7-davemarchevsky@fb.com>

This patch adds implementations of bpf_rbtree_{add,remove,first} and
teaches the verifier about their BTF_IDs as well as those of
bpf_rb_{root,node}.

All three kfuncs have some nonstandard component to their verification
that needs to be addressed in future patches before programs can
properly use them:

  * bpf_rbtree_add: Takes a 'less' callback, which needs to be
    verified.
  * bpf_rbtree_first: Returns ptr_to_node_type(off=rb_node_off)
    instead of ptr_to_rb_node(off=0). The return value ref should be
    released on unlock.
  * bpf_rbtree_remove: Returns ptr_to_node_type(off=rb_node_off)
    instead of ptr_to_rb_node(off=0). The 2nd arg (node) is a
    release_on_unlock + PTR_UNTRUSTED reg.

Signed-off-by: Dave Marchevsky
---
 kernel/bpf/helpers.c  | 31 +++++++++++++++++++++++++++++++
 kernel/bpf/verifier.c | 11 +++++++++++
 2 files changed, 42 insertions(+)

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 4d04432b162e..d216c54b65ab 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1865,6 +1865,33 @@ struct bpf_list_node *bpf_list_pop_back(struct bpf_list_head *head)
 	return __bpf_list_del(head, true);
 }
 
+struct bpf_rb_node *bpf_rbtree_remove(struct bpf_rb_root *root, struct bpf_rb_node *node)
+{
+	struct rb_root_cached *r = (struct rb_root_cached *)root;
+	struct rb_node *n = (struct rb_node *)node;
+
+	if (WARN_ON_ONCE(RB_EMPTY_NODE(n)))
+		return (struct bpf_rb_node *)NULL;
+
+	rb_erase_cached(n, r);
+	RB_CLEAR_NODE(n);
+	return (struct bpf_rb_node *)n;
+}
+
+void bpf_rbtree_add(struct bpf_rb_root *root, struct bpf_rb_node *node,
+		    bool (less)(struct bpf_rb_node *a, const struct bpf_rb_node *b))
+{
+	rb_add_cached((struct rb_node *)node, (struct rb_root_cached *)root,
+		      (bool (*)(struct rb_node *, const struct rb_node *))less);
+}
+
+struct bpf_rb_node *bpf_rbtree_first(struct bpf_rb_root *root)
+{
+	struct rb_root_cached *r = (struct rb_root_cached *)root;
+
+	return (struct bpf_rb_node *)rb_first_cached(r);
+}
+
 /**
  * bpf_task_acquire - Acquire a reference to a task. A task acquired by this
  * kfunc which is not stored in a map as a kptr, must be released by calling
@@ -2069,6 +2096,10 @@ BTF_ID_FLAGS(func, bpf_task_acquire, KF_ACQUIRE | KF_TRUSTED_ARGS)
 BTF_ID_FLAGS(func, bpf_task_acquire_not_zero, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_task_kptr_get, KF_ACQUIRE | KF_KPTR_GET | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_task_release, KF_RELEASE)
+BTF_ID_FLAGS(func, bpf_rbtree_remove, KF_ACQUIRE | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_rbtree_add)
+BTF_ID_FLAGS(func, bpf_rbtree_first, KF_ACQUIRE | KF_RET_NULL)
+
 #ifdef CONFIG_CGROUPS
 BTF_ID_FLAGS(func, bpf_cgroup_acquire, KF_ACQUIRE | KF_TRUSTED_ARGS)
 BTF_ID_FLAGS(func, bpf_cgroup_kptr_get, KF_ACQUIRE | KF_KPTR_GET | KF_RET_NULL)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9d9e00fd6dfa..e36dbde8736c 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -8135,6 +8135,8 @@ BTF_ID_LIST(kf_arg_btf_ids)
 BTF_ID(struct, bpf_dynptr_kern)
 BTF_ID(struct, bpf_list_head)
 BTF_ID(struct, bpf_list_node)
+BTF_ID(struct, bpf_rb_root)
+BTF_ID(struct, bpf_rb_node)
 
 static bool __is_kfunc_ptr_arg_type(const struct btf *btf,
 				    const struct btf_param *arg, int type)
@@ -8240,6 +8242,9 @@ enum special_kfunc_type {
 	KF_bpf_rdonly_cast,
 	KF_bpf_rcu_read_lock,
 	KF_bpf_rcu_read_unlock,
+	KF_bpf_rbtree_remove,
+	KF_bpf_rbtree_add,
+	KF_bpf_rbtree_first,
 };
 
 BTF_SET_START(special_kfunc_set)
@@ -8251,6 +8256,9 @@ BTF_ID(func, bpf_list_pop_front)
 BTF_ID(func, bpf_list_pop_back)
 BTF_ID(func, bpf_cast_to_kern_ctx)
 BTF_ID(func, bpf_rdonly_cast)
+BTF_ID(func, bpf_rbtree_remove)
+BTF_ID(func, bpf_rbtree_add)
+BTF_ID(func, bpf_rbtree_first)
 BTF_SET_END(special_kfunc_set)
 
 BTF_ID_LIST(special_kfunc_list)
@@ -8264,6 +8272,9 @@ BTF_ID(func, bpf_cast_to_kern_ctx)
 BTF_ID(func, bpf_rdonly_cast)
 BTF_ID(func, bpf_rcu_read_lock)
 BTF_ID(func, bpf_rcu_read_unlock)
+BTF_ID(func, bpf_rbtree_remove)
+BTF_ID(func, bpf_rbtree_add)
+BTF_ID(func, bpf_rbtree_first)
 
 static bool is_kfunc_bpf_rcu_read_lock(struct bpf_kfunc_call_arg_meta *meta)
 {
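[Taking the kernel-side signatures above together with the earlier
declarations sketch, intended usage could eventually look roughly like
this. Editorial illustration only: per the commit message these kfuncs
are not properly usable until later patches in the series land, and
container_of/bpf_obj_new are assumed to come from the selftests'
helper headers.]

/* Assumes the node_data/glock/groot declarations from the previous
 * sketch; the ksym declarations mirror the kernel signatures added
 * above.
 */
extern struct bpf_rb_node *bpf_rbtree_remove(struct bpf_rb_root *root,
					     struct bpf_rb_node *node) __ksym;
extern void bpf_rbtree_add(struct bpf_rb_root *root, struct bpf_rb_node *node,
			   bool (less)(struct bpf_rb_node *a, const struct bpf_rb_node *b)) __ksym;
extern struct bpf_rb_node *bpf_rbtree_first(struct bpf_rb_root *root) __ksym;

static bool less(struct bpf_rb_node *a, const struct bpf_rb_node *b)
{
	struct node_data *node_a, *node_b;

	node_a = container_of(a, struct node_data, node);
	node_b = container_of(b, struct node_data, node);
	return node_a->key < node_b->key;
}

SEC("tc")
long rbtree_add_one(void *ctx)
{
	struct node_data *n;

	n = bpf_obj_new(typeof(*n)); /* allocated obj; see patch 01 */
	if (!n)
		return 1;
	n->key = 42;

	/* rbtree ops must run under the root's associated spin_lock */
	bpf_spin_lock(&glock);
	bpf_rbtree_add(&groot, &n->node, less);
	bpf_spin_unlock(&glock);
	return 0;
}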
b=i8aWQDzLIxgNh4TqdeHakNJfmDyZHa4L390GzjIXOMnpT7R/jz5Hkco082CXWAbGnyMg eN1V4tuneGwdgCZELMXjmxMP328WeXFoFuCirt6MbSiRQg8g+v9X2sPm0vWp08Xc7CW9 2FvGqntlb1UWaMOCAIeKRzyMZ4gfzjfiT1A= Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3ma9dxtxf9-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Tue, 06 Dec 2022 15:10:27 -0800 Received: from twshared8047.05.ash9.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:82::f) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.34; Tue, 6 Dec 2022 15:10:26 -0800 Received: by devbig077.ldc1.facebook.com (Postfix, from userid 158236) id F1A54120B3782; Tue, 6 Dec 2022 15:10:05 -0800 (PST) From: Dave Marchevsky To: CC: Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Kernel Team , Kumar Kartikeya Dwivedi , Tejun Heo , Dave Marchevsky Subject: [PATCH bpf-next 07/13] bpf: Add support for bpf_rb_root and bpf_rb_node in kfunc args Date: Tue, 6 Dec 2022 15:09:54 -0800 Message-ID: <20221206231000.3180914-8-davemarchevsky@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20221206231000.3180914-1-davemarchevsky@fb.com> References: <20221206231000.3180914-1-davemarchevsky@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: Lmdu4Vev81XOT7XThlCcvmRcxwVeLtCq X-Proofpoint-ORIG-GUID: Lmdu4Vev81XOT7XThlCcvmRcxwVeLtCq X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-12-06_12,2022-12-06_01,2022-06-22_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Now that we find bpf_rb_root and bpf_rb_node in structs, let's give args that contain those types special classification and properly handle these types when checking kfunc args. "Properly handling" these types largely requires generalizing similar handling for bpf_list_{head,node}, with little new logic added in this patch. 
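As a concrete picture of what these argument types look like from the BPF program side, consider the sketch below. It is illustrative only: the struct and field names are invented, and the __contains() annotation is assumed to follow the same convention the selftests already use for tying a bpf_list_head to its node type.

/* Illustrative sketch; names invented. __contains() ties a bpf_rb_root
 * to the struct/field it holds, mirroring the bpf_list_head convention.
 */
struct node_data {
	long key;
	struct bpf_rb_node node;  /* matched as KF_ARG_PTR_TO_RB_NODE */
};

struct map_value {
	struct bpf_spin_lock lock;
	struct bpf_rb_root root __contains(node_data, node);
};

A pointer to map_value's root field passed to an rbtree kfunc is classified as KF_ARG_PTR_TO_RB_ROOT; per the checks added here it may point into a map value or an allocated object, and the bpf_spin_lock in the same btf_record must be held.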
Signed-off-by: Dave Marchevsky --- kernel/bpf/verifier.c | 237 ++++++++++++++++++++++++++++++++++++------ 1 file changed, 203 insertions(+), 34 deletions(-) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index e36dbde8736c..652112007b2c 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -8018,6 +8018,9 @@ struct bpf_kfunc_call_arg_meta { struct { struct btf_field *field; } arg_list_head; + struct { + struct btf_field *field; + } arg_rbtree_root; }; static bool is_kfunc_acquire(struct bpf_kfunc_call_arg_meta *meta) @@ -8129,6 +8132,8 @@ enum { KF_ARG_DYNPTR_ID, KF_ARG_LIST_HEAD_ID, KF_ARG_LIST_NODE_ID, + KF_ARG_RB_ROOT_ID, + KF_ARG_RB_NODE_ID, }; BTF_ID_LIST(kf_arg_btf_ids) @@ -8170,6 +8175,16 @@ static bool is_kfunc_arg_list_node(const struct btf *btf, const struct btf_param return __is_kfunc_ptr_arg_type(btf, arg, KF_ARG_LIST_NODE_ID); } +static bool is_kfunc_arg_rbtree_root(const struct btf *btf, const struct btf_param *arg) +{ + return __is_kfunc_ptr_arg_type(btf, arg, KF_ARG_RB_ROOT_ID); +} + +static bool is_kfunc_arg_rbtree_node(const struct btf *btf, const struct btf_param *arg) +{ + return __is_kfunc_ptr_arg_type(btf, arg, KF_ARG_RB_NODE_ID); +} + /* Returns true if struct is composed of scalars, 4 levels of nesting allowed */ static bool __btf_type_is_scalar_struct(struct bpf_verifier_env *env, const struct btf *btf, @@ -8229,6 +8244,8 @@ enum kfunc_ptr_arg_type { KF_ARG_PTR_TO_BTF_ID, /* Also covers reg2btf_ids conversions */ KF_ARG_PTR_TO_MEM, KF_ARG_PTR_TO_MEM_SIZE, /* Size derived from next argument, skip it */ + KF_ARG_PTR_TO_RB_ROOT, + KF_ARG_PTR_TO_RB_NODE, }; enum special_kfunc_type { @@ -8336,6 +8353,12 @@ get_kfunc_ptr_arg_type(struct bpf_verifier_env *env, if (is_kfunc_arg_list_node(meta->btf, &args[argno])) return KF_ARG_PTR_TO_LIST_NODE; + if (is_kfunc_arg_rbtree_root(meta->btf, &args[argno])) + return KF_ARG_PTR_TO_RB_ROOT; + + if (is_kfunc_arg_rbtree_node(meta->btf, &args[argno])) + return KF_ARG_PTR_TO_RB_NODE; + if ((base_type(reg->type) == PTR_TO_BTF_ID || reg2btf_ids[base_type(reg->type)])) { if (!btf_type_is_struct(ref_t)) { verbose(env, "kernel function %s args#%d pointer type %s %s is not supported\n", @@ -8550,97 +8573,196 @@ static bool is_bpf_list_api_kfunc(u32 btf_id) btf_id == special_kfunc_list[KF_bpf_list_pop_back]; } -static int process_kf_arg_ptr_to_list_head(struct bpf_verifier_env *env, +static bool is_bpf_rbtree_api_kfunc(u32 btf_id) +{ + return btf_id == special_kfunc_list[KF_bpf_rbtree_add] || + btf_id == special_kfunc_list[KF_bpf_rbtree_remove] || + btf_id == special_kfunc_list[KF_bpf_rbtree_first]; +} + +static bool is_bpf_datastructure_api_kfunc(u32 btf_id) +{ + return is_bpf_list_api_kfunc(btf_id) || is_bpf_rbtree_api_kfunc(btf_id); +} + +static bool check_kfunc_is_datastructure_head_api(struct bpf_verifier_env *env, + enum btf_field_type head_field_type, + u32 kfunc_btf_id) +{ + bool ret; + + switch (head_field_type) { + case BPF_LIST_HEAD: + ret = is_bpf_list_api_kfunc(kfunc_btf_id); + break; + case BPF_RB_ROOT: + ret = is_bpf_rbtree_api_kfunc(kfunc_btf_id); + break; + default: + verbose(env, "verifier internal error: unexpected datastructure head argument type %s\n", + btf_field_type_name(head_field_type)); + return false; + } + + if (!ret) + verbose(env, "verifier internal error: %s head arg for unknown kfunc\n", + btf_field_type_name(head_field_type)); + return ret; +} + +static bool check_kfunc_is_datastructure_node_api(struct bpf_verifier_env *env, + enum btf_field_type node_field_type, + u32 kfunc_btf_id) +{ 
+ bool ret; + + switch (node_field_type) { + case BPF_LIST_NODE: + ret = (kfunc_btf_id == special_kfunc_list[KF_bpf_list_push_front] || + kfunc_btf_id == special_kfunc_list[KF_bpf_list_push_back]); + break; + case BPF_RB_NODE: + ret = (kfunc_btf_id == special_kfunc_list[KF_bpf_rbtree_remove] || + kfunc_btf_id == special_kfunc_list[KF_bpf_rbtree_add]); + break; + default: + verbose(env, "verifier internal error: unexpected datastructure node argument type %s\n", + btf_field_type_name(node_field_type)); + return false; + } + + if (!ret) + verbose(env, "verifier internal error: %s node arg for unknown kfunc\n", + btf_field_type_name(node_field_type)); + return ret; +} + +static int +__process_kf_arg_ptr_to_datastructure_head(struct bpf_verifier_env *env, struct bpf_reg_state *reg, u32 regno, - struct bpf_kfunc_call_arg_meta *meta) + struct bpf_kfunc_call_arg_meta *meta, + enum btf_field_type head_field_type, + struct btf_field **head_field) { + const char *head_type_name; struct btf_field *field; struct btf_record *rec; - u32 list_head_off; + u32 head_off; - if (meta->btf != btf_vmlinux || !is_bpf_list_api_kfunc(meta->func_id)) { - verbose(env, "verifier internal error: bpf_list_head argument for unknown kfunc\n"); + if (meta->btf != btf_vmlinux) { + verbose(env, "verifier internal error: unexpected btf mismatch in kfunc call\n"); return -EFAULT; } + if (!check_kfunc_is_datastructure_head_api(env, head_field_type, meta->func_id)) + return -EFAULT; + + head_type_name = btf_field_type_name(head_field_type); if (!tnum_is_const(reg->var_off)) { verbose(env, - "R%d doesn't have constant offset. bpf_list_head has to be at the constant offset\n", - regno); + "R%d doesn't have constant offset. %s has to be at the constant offset\n", + regno, head_type_name); return -EINVAL; } rec = reg_btf_record(reg); - list_head_off = reg->off + reg->var_off.value; - field = btf_record_find(rec, list_head_off, BPF_LIST_HEAD); + head_off = reg->off + reg->var_off.value; + field = btf_record_find(rec, head_off, head_field_type); if (!field) { - verbose(env, "bpf_list_head not found at offset=%u\n", list_head_off); + verbose(env, "%s not found at offset=%u\n", head_type_name, head_off); return -EINVAL; } /* All functions require bpf_list_head to be protected using a bpf_spin_lock */ if (check_reg_allocation_locked(env, reg)) { - verbose(env, "bpf_spin_lock at off=%d must be held for bpf_list_head\n", - rec->spin_lock_off); + verbose(env, "bpf_spin_lock at off=%d must be held for %s\n", + rec->spin_lock_off, head_type_name); return -EINVAL; } - if (meta->arg_list_head.field) { - verbose(env, "verifier internal error: repeating bpf_list_head arg\n"); + if (*head_field) { + verbose(env, "verifier internal error: repeating %s arg\n", head_type_name); return -EFAULT; } - meta->arg_list_head.field = field; + *head_field = field; return 0; } -static int process_kf_arg_ptr_to_list_node(struct bpf_verifier_env *env, + +static int process_kf_arg_ptr_to_list_head(struct bpf_verifier_env *env, struct bpf_reg_state *reg, u32 regno, struct bpf_kfunc_call_arg_meta *meta) { + return __process_kf_arg_ptr_to_datastructure_head(env, reg, regno, meta, BPF_LIST_HEAD, + &meta->arg_list_head.field); +} + +static int process_kf_arg_ptr_to_rbtree_root(struct bpf_verifier_env *env, + struct bpf_reg_state *reg, u32 regno, + struct bpf_kfunc_call_arg_meta *meta) +{ + return __process_kf_arg_ptr_to_datastructure_head(env, reg, regno, meta, BPF_RB_ROOT, + &meta->arg_rbtree_root.field); +} + +static int 
+__process_kf_arg_ptr_to_datastructure_node(struct bpf_verifier_env *env, + struct bpf_reg_state *reg, u32 regno, + struct bpf_kfunc_call_arg_meta *meta, + enum btf_field_type head_field_type, + enum btf_field_type node_field_type, + struct btf_field **node_field) +{ + const char *node_type_name; const struct btf_type *et, *t; struct btf_field *field; struct btf_record *rec; - u32 list_node_off; + u32 node_off; - if (meta->btf != btf_vmlinux || - (meta->func_id != special_kfunc_list[KF_bpf_list_push_front] && - meta->func_id != special_kfunc_list[KF_bpf_list_push_back])) { - verbose(env, "verifier internal error: bpf_list_node argument for unknown kfunc\n"); + if (meta->btf != btf_vmlinux) { + verbose(env, "verifier internal error: unexpected btf mismatch in kfunc call\n"); return -EFAULT; } + if (!check_kfunc_is_datastructure_node_api(env, node_field_type, meta->func_id)) + return -EFAULT; + + node_type_name = btf_field_type_name(node_field_type); if (!tnum_is_const(reg->var_off)) { verbose(env, - "R%d doesn't have constant offset. bpf_list_node has to be at the constant offset\n", - regno); + "R%d doesn't have constant offset. %s has to be at the constant offset\n", + regno, node_type_name); return -EINVAL; } rec = reg_btf_record(reg); - list_node_off = reg->off + reg->var_off.value; - field = btf_record_find(rec, list_node_off, BPF_LIST_NODE); - if (!field || field->offset != list_node_off) { - verbose(env, "bpf_list_node not found at offset=%u\n", list_node_off); + node_off = reg->off + reg->var_off.value; + field = btf_record_find(rec, node_off, node_field_type); + if (!field || field->offset != node_off) { + verbose(env, "%s not found at offset=%u\n", node_type_name, node_off); return -EINVAL; } - field = meta->arg_list_head.field; + field = *node_field; et = btf_type_by_id(field->datastructure_head.btf, field->datastructure_head.value_btf_id); t = btf_type_by_id(reg->btf, reg->btf_id); if (!btf_struct_ids_match(&env->log, reg->btf, reg->btf_id, 0, field->datastructure_head.btf, field->datastructure_head.value_btf_id, true)) { - verbose(env, "operation on bpf_list_head expects arg#1 bpf_list_node at offset=%d " + verbose(env, "operation on %s expects arg#1 %s at offset=%d " "in struct %s, but arg is at offset=%d in struct %s\n", + btf_field_type_name(head_field_type), + btf_field_type_name(node_field_type), field->datastructure_head.node_offset, btf_name_by_offset(field->datastructure_head.btf, et->name_off), - list_node_off, btf_name_by_offset(reg->btf, t->name_off)); + node_off, btf_name_by_offset(reg->btf, t->name_off)); return -EINVAL; } - if (list_node_off != field->datastructure_head.node_offset) { - verbose(env, "arg#1 offset=%d, but expected bpf_list_node at offset=%d in struct %s\n", - list_node_off, field->datastructure_head.node_offset, + if (node_off != field->datastructure_head.node_offset) { + verbose(env, "arg#1 offset=%d, but expected %s at offset=%d in struct %s\n", + node_off, btf_field_type_name(node_field_type), + field->datastructure_head.node_offset, btf_name_by_offset(field->datastructure_head.btf, et->name_off)); return -EINVAL; } @@ -8648,6 +8770,24 @@ static int process_kf_arg_ptr_to_list_node(struct bpf_verifier_env *env, return ref_set_release_on_unlock(env, reg->ref_obj_id); } +static int process_kf_arg_ptr_to_list_node(struct bpf_verifier_env *env, + struct bpf_reg_state *reg, u32 regno, + struct bpf_kfunc_call_arg_meta *meta) +{ + return __process_kf_arg_ptr_to_datastructure_node(env, reg, regno, meta, + BPF_LIST_HEAD, BPF_LIST_NODE, + 
&meta->arg_list_head.field); +} + +static int process_kf_arg_ptr_to_rbtree_node(struct bpf_verifier_env *env, + struct bpf_reg_state *reg, u32 regno, + struct bpf_kfunc_call_arg_meta *meta) +{ + return __process_kf_arg_ptr_to_datastructure_node(env, reg, regno, meta, + BPF_RB_ROOT, BPF_RB_NODE, + &meta->arg_rbtree_root.field); +} + static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_arg_meta *meta) { const char *func_name = meta->func_name, *ref_tname; @@ -8776,6 +8916,8 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_ case KF_ARG_PTR_TO_DYNPTR: case KF_ARG_PTR_TO_LIST_HEAD: case KF_ARG_PTR_TO_LIST_NODE: + case KF_ARG_PTR_TO_RB_ROOT: + case KF_ARG_PTR_TO_RB_NODE: case KF_ARG_PTR_TO_MEM: case KF_ARG_PTR_TO_MEM_SIZE: /* Trusted by default */ @@ -8861,6 +9003,20 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_ if (ret < 0) return ret; break; + case KF_ARG_PTR_TO_RB_ROOT: + if (reg->type != PTR_TO_MAP_VALUE && + reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) { + verbose(env, "arg#%d expected pointer to map value or allocated object\n", i); + return -EINVAL; + } + if (reg->type == (PTR_TO_BTF_ID | MEM_ALLOC) && !reg->ref_obj_id) { + verbose(env, "allocated object must be referenced\n"); + return -EINVAL; + } + ret = process_kf_arg_ptr_to_rbtree_root(env, reg, regno, meta); + if (ret < 0) + return ret; + break; case KF_ARG_PTR_TO_LIST_NODE: if (reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) { verbose(env, "arg#%d expected pointer to allocated object\n", i); @@ -8874,6 +9030,19 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_ if (ret < 0) return ret; break; + case KF_ARG_PTR_TO_RB_NODE: + if (reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) { + verbose(env, "arg#%d expected pointer to allocated object\n", i); + return -EINVAL; + } + if (!reg->ref_obj_id) { + verbose(env, "allocated object must be referenced\n"); + return -EINVAL; + } + ret = process_kf_arg_ptr_to_rbtree_node(env, reg, regno, meta); + if (ret < 0) + return ret; + break; case KF_ARG_PTR_TO_BTF_ID: /* Only base_type is checked, further checks are done here */ if ((base_type(reg->type) != PTR_TO_BTF_ID || @@ -13818,7 +13987,7 @@ static int do_check(struct bpf_verifier_env *env) if ((insn->src_reg == BPF_REG_0 && insn->imm != BPF_FUNC_spin_unlock) || (insn->src_reg == BPF_PSEUDO_CALL) || (insn->src_reg == BPF_PSEUDO_KFUNC_CALL && - (insn->off != 0 || !is_bpf_list_api_kfunc(insn->imm)))) { + (insn->off != 0 || !is_bpf_datastructure_api_kfunc(insn->imm)))) { verbose(env, "function calls are not allowed while holding a lock\n"); return -EINVAL; } From patchwork Tue Dec 6 23:09:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Marchevsky X-Patchwork-Id: 13066349 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 97D46C63707 for ; Tue, 6 Dec 2022 23:10:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229583AbiLFXKc (ORCPT ); Tue, 6 Dec 2022 18:10:32 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55672 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229661AbiLFXKa (ORCPT ); Tue, 6 Dec 2022 18:10:30 -0500 Received: from 
mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7914F4299E for ; Tue, 6 Dec 2022 15:10:28 -0800 (PST) Received: from pps.filterd (m0109332.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2B6LhBct005482 for ; Tue, 6 Dec 2022 15:10:27 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=NPr5g30TzpiKBwCbkbTwiLeOFyeqeIkDi9luyr0QAp0=; b=nhQWbdiLjXiAotMeBbHIQdjqOPwe5GrnXh4p9u7g1ph8QB1BiN4XAuQhXWljG+FyBdkB wy037mvODqjqMUSFiz9x796164QNFSey9FJ3M01PSrVcg/RHvy+zEkWlOP2XGMuIFkcO 7WCJA6ndBC62SAkDl7yMBmO5s1lRYGetGwA= Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3ma9dxtxf8-2 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Tue, 06 Dec 2022 15:10:27 -0800 Received: from twshared8047.05.ash9.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:82::e) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.34; Tue, 6 Dec 2022 15:10:26 -0800 Received: by devbig077.ldc1.facebook.com (Postfix, from userid 158236) id 6744E120B3785; Tue, 6 Dec 2022 15:10:06 -0800 (PST) From: Dave Marchevsky To: CC: Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Kernel Team , Kumar Kartikeya Dwivedi , Tejun Heo , Dave Marchevsky Subject: [PATCH bpf-next 08/13] bpf: Add callback validation to kfunc verifier logic Date: Tue, 6 Dec 2022 15:09:55 -0800 Message-ID: <20221206231000.3180914-9-davemarchevsky@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20221206231000.3180914-1-davemarchevsky@fb.com> References: <20221206231000.3180914-1-davemarchevsky@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: RLr6fhFUPPAW5FP-wJVPfiY_DuF0JMFI X-Proofpoint-ORIG-GUID: RLr6fhFUPPAW5FP-wJVPfiY_DuF0JMFI X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-12-06_12,2022-12-06_01,2022-06-22_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Some BPF helpers take a callback function which the helper calls. For each helper that takes such a callback, there's a special call to __check_func_call with a callback-state-setting callback that sets up verifier bpf_func_state for the callback's frame. kfuncs don't have any of this infrastructure yet, so let's add it in this patch, following existing helper pattern as much as possible. To validate functionality of this added plumbing, this patch adds callback handling for the bpf_rbtree_add kfunc and hopes to lay groundwork for future next-gen datastructure callbacks. 
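For a concrete sense of what is being verified, the callback in question is the 'less' comparator a program passes to bpf_rbtree_add. A hedged sketch follows, reusing the invented node_data type from the earlier example; glock/groot are similarly invented globals, n is an allocated node_data from bpf_obj_new, and container_of is assumed available as in the selftests:

/* Sketch only. Inside the callback, a and b are marked PTR_UNTRUSTED,
 * so calling rbtree API kfuncs on them here would be rejected.
 */
static bool less(struct bpf_rb_node *a, const struct bpf_rb_node *b)
{
	struct node_data *node_a = container_of(a, struct node_data, node);
	struct node_data *node_b = container_of(b, struct node_data, node);

	return node_a->key < node_b->key;
}

/* Call site: the tree's lock must already be held, and the callback's
 * frame is allowed to exit without releasing it.
 */
bpf_spin_lock(&glock);
bpf_rbtree_add(&groot, &n->node, less);
bpf_spin_unlock(&glock);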
In the "general plumbing" category we have: * check_kfunc_call doing callback verification right before clearing CALLER_SAVED_REGS, exactly like check_helper_call * recognition of func_ptr BTF types in kfunc args as KF_ARG_PTR_TO_CALLBACK + propagation of subprogno for this arg type In the "rbtree_add / next-gen datastructure-specific plumbing" category: * Since bpf_rbtree_add must be called while the spin_lock associated with the tree is held, don't complain when callback's func_state doesn't unlock it by frame exit * Mark rbtree_add callback's args PTR_UNTRUSTED to prevent rbtree api functions from being called in the callback Signed-off-by: Dave Marchevsky --- kernel/bpf/verifier.c | 136 ++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 130 insertions(+), 6 deletions(-) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 652112007b2c..9ad8c0b264dc 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -1448,6 +1448,16 @@ static void mark_ptr_not_null_reg(struct bpf_reg_state *reg) reg->type &= ~PTR_MAYBE_NULL; } +static void mark_reg_datastructure_node(struct bpf_reg_state *regs, u32 regno, + struct btf_field_datastructure_head *ds_head) +{ + __mark_reg_known_zero(®s[regno]); + regs[regno].type = PTR_TO_BTF_ID | MEM_ALLOC; + regs[regno].btf = ds_head->btf; + regs[regno].btf_id = ds_head->value_btf_id; + regs[regno].off = ds_head->node_offset; +} + static bool reg_is_pkt_pointer(const struct bpf_reg_state *reg) { return type_is_pkt_pointer(reg->type); @@ -4771,7 +4781,8 @@ static int check_ptr_to_btf_access(struct bpf_verifier_env *env, return -EACCES; } - if (type_is_alloc(reg->type) && !reg->ref_obj_id) { + if (type_is_alloc(reg->type) && !reg->ref_obj_id && + !cur_func(env)->in_callback_fn) { verbose(env, "verifier internal error: ref_obj_id for allocated object must be non-zero\n"); return -EFAULT; } @@ -6952,6 +6963,8 @@ static int set_callee_state(struct bpf_verifier_env *env, struct bpf_func_state *caller, struct bpf_func_state *callee, int insn_idx); +static bool is_callback_calling_kfunc(u32 btf_id); + static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn, int *insn_idx, int subprog, set_callee_state_fn set_callee_state_cb) @@ -7006,10 +7019,18 @@ static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn * interested in validating only BPF helpers that can call subprogs as * callbacks */ - if (set_callee_state_cb != set_callee_state && !is_callback_calling_function(insn->imm)) { - verbose(env, "verifier bug: helper %s#%d is not marked as callback-calling\n", - func_id_name(insn->imm), insn->imm); - return -EFAULT; + if (set_callee_state_cb != set_callee_state) { + if (bpf_pseudo_kfunc_call(insn) && + !is_callback_calling_kfunc(insn->imm)) { + verbose(env, "verifier bug: kfunc %s#%d not marked as callback-calling\n", + func_id_name(insn->imm), insn->imm); + return -EFAULT; + } else if (!bpf_pseudo_kfunc_call(insn) && + !is_callback_calling_function(insn->imm)) { /* helper */ + verbose(env, "verifier bug: helper %s#%d not marked as callback-calling\n", + func_id_name(insn->imm), insn->imm); + return -EFAULT; + } } if (insn->code == (BPF_JMP | BPF_CALL) && @@ -7275,6 +7296,67 @@ static int set_user_ringbuf_callback_state(struct bpf_verifier_env *env, return 0; } +static int set_rbtree_add_callback_state(struct bpf_verifier_env *env, + struct bpf_func_state *caller, + struct bpf_func_state *callee, + int insn_idx) +{ + /* void bpf_rbtree_add(struct bpf_rb_root *root, struct bpf_rb_node *node, + * bool 
(less)(struct bpf_rb_node *a, const struct bpf_rb_node *b)); + * + * 'struct bpf_rb_node *node' arg to bpf_rbtree_add is the same PTR_TO_BTF_ID w/ offset + * that 'less' callback args will be receiving. However, 'node' arg was release_reference'd + * by this point, so look at 'root' + */ + struct btf_field *field; + struct btf_record *rec; + + rec = reg_btf_record(&caller->regs[BPF_REG_1]); + if (!rec) + return -EFAULT; + + field = btf_record_find(rec, caller->regs[BPF_REG_1].off, BPF_RB_ROOT); + if (!field || !field->datastructure_head.value_btf_id) + return -EFAULT; + + mark_reg_datastructure_node(callee->regs, BPF_REG_1, &field->datastructure_head); + callee->regs[BPF_REG_1].type |= PTR_UNTRUSTED; + mark_reg_datastructure_node(callee->regs, BPF_REG_2, &field->datastructure_head); + callee->regs[BPF_REG_2].type |= PTR_UNTRUSTED; + + __mark_reg_not_init(env, &callee->regs[BPF_REG_3]); + __mark_reg_not_init(env, &callee->regs[BPF_REG_4]); + __mark_reg_not_init(env, &callee->regs[BPF_REG_5]); + callee->in_callback_fn = true; + callee->callback_ret_range = tnum_range(0, 1); + return 0; +} + +static bool is_rbtree_lock_required_kfunc(u32 btf_id); + +/* Are we currently verifying the callback for a rbtree helper that must + * be called with lock held? If so, no need to complain about unreleased + * lock + */ +static bool in_rbtree_lock_required_cb(struct bpf_verifier_env *env) +{ + struct bpf_verifier_state *state = env->cur_state; + struct bpf_insn *insn = env->prog->insnsi; + struct bpf_func_state *callee; + int kfunc_btf_id; + + if (!state->curframe) + return false; + + callee = state->frame[state->curframe]; + + if (!callee->in_callback_fn) + return false; + + kfunc_btf_id = insn[callee->callsite].imm; + return is_rbtree_lock_required_kfunc(kfunc_btf_id); +} + static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx) { struct bpf_verifier_state *state = env->cur_state; @@ -8007,6 +8089,7 @@ struct bpf_kfunc_call_arg_meta { bool r0_rdonly; u32 ret_btf_id; u64 r0_size; + u32 subprogno; struct { u64 value; bool found; @@ -8185,6 +8268,18 @@ static bool is_kfunc_arg_rbtree_node(const struct btf *btf, const struct btf_par return __is_kfunc_ptr_arg_type(btf, arg, KF_ARG_RB_NODE_ID); } +static bool is_kfunc_arg_callback(struct bpf_verifier_env *env, const struct btf *btf, + const struct btf_param *arg) +{ + const struct btf_type *t; + + t = btf_type_resolve_func_ptr(btf, arg->type, NULL); + if (!t) + return false; + + return true; +} + /* Returns true if struct is composed of scalars, 4 levels of nesting allowed */ static bool __btf_type_is_scalar_struct(struct bpf_verifier_env *env, const struct btf *btf, @@ -8244,6 +8339,7 @@ enum kfunc_ptr_arg_type { KF_ARG_PTR_TO_BTF_ID, /* Also covers reg2btf_ids conversions */ KF_ARG_PTR_TO_MEM, KF_ARG_PTR_TO_MEM_SIZE, /* Size derived from next argument, skip it */ + KF_ARG_PTR_TO_CALLBACK, KF_ARG_PTR_TO_RB_ROOT, KF_ARG_PTR_TO_RB_NODE, }; @@ -8368,6 +8464,9 @@ get_kfunc_ptr_arg_type(struct bpf_verifier_env *env, return KF_ARG_PTR_TO_BTF_ID; } + if (is_kfunc_arg_callback(env, meta->btf, &args[argno])) + return KF_ARG_PTR_TO_CALLBACK; + if (argno + 1 < nargs && is_kfunc_arg_mem_size(meta->btf, &args[argno + 1], ®s[regno + 1])) arg_mem_size = true; @@ -8585,6 +8684,16 @@ static bool is_bpf_datastructure_api_kfunc(u32 btf_id) return is_bpf_list_api_kfunc(btf_id) || is_bpf_rbtree_api_kfunc(btf_id); } +static bool is_callback_calling_kfunc(u32 btf_id) +{ + return btf_id == special_kfunc_list[KF_bpf_rbtree_add]; +} + +static bool 
is_rbtree_lock_required_kfunc(u32 btf_id) +{ + return is_bpf_rbtree_api_kfunc(btf_id); +} + static bool check_kfunc_is_datastructure_head_api(struct bpf_verifier_env *env, enum btf_field_type head_field_type, u32 kfunc_btf_id) @@ -8920,6 +9029,7 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_ case KF_ARG_PTR_TO_RB_NODE: case KF_ARG_PTR_TO_MEM: case KF_ARG_PTR_TO_MEM_SIZE: + case KF_ARG_PTR_TO_CALLBACK: /* Trusted by default */ break; default: @@ -9078,6 +9188,9 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_ /* Skip next '__sz' argument */ i++; break; + case KF_ARG_PTR_TO_CALLBACK: + meta->subprogno = reg->subprogno; + break; } } @@ -9193,6 +9306,16 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn, } } + if (meta.func_id == special_kfunc_list[KF_bpf_rbtree_add]) { + err = __check_func_call(env, insn, insn_idx_p, meta.subprogno, + set_rbtree_add_callback_state); + if (err) { + verbose(env, "kfunc %s#%d failed callback verification\n", + func_name, func_id); + return err; + } + } + for (i = 0; i < CALLER_SAVED_REGS; i++) mark_reg_not_init(env, regs, caller_saved[i]); @@ -14023,7 +14146,8 @@ static int do_check(struct bpf_verifier_env *env) return -EINVAL; } - if (env->cur_state->active_lock.ptr) { + if (env->cur_state->active_lock.ptr && + !in_rbtree_lock_required_cb(env)) { verbose(env, "bpf_spin_unlock is missing\n"); return -EINVAL; } From patchwork Tue Dec 6 23:09:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Marchevsky X-Patchwork-Id: 13066344 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C46F7C4708D for ; Tue, 6 Dec 2022 23:10:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229523AbiLFXK1 (ORCPT ); Tue, 6 Dec 2022 18:10:27 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55598 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229621AbiLFXK0 (ORCPT ); Tue, 6 Dec 2022 18:10:26 -0500 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6B7EF4298E for ; Tue, 6 Dec 2022 15:10:25 -0800 (PST) Received: from pps.filterd (m0044012.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2B6LhJ5d032730 for ; Tue, 6 Dec 2022 15:10:25 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=DgcPGFT4Kqc9QQP4cTSX7FDCFdBcJdilQ//tNgqCojo=; b=LbyMGD//xK2dhmIDeHFrEFaqo1e6bkbqSDJYM/Ek+hrHqvZr1CUHJ5IjcOyBv+OSfyq0 3mQsKJZLaJJ6mmRZE9FPB5uIxAiB2SAJTbHbpqQGayk+MsmeTsLpzQJ5qNSS4skZzJEU 56cKJDK2+SO4+tY780gvZ4nZALPEKM0qqAQ= Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3m9x70xv4w-4 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Tue, 06 Dec 2022 15:10:25 -0800 Received: from twshared26225.38.frc1.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:82::d) with Microsoft SMTP Server (version=TLS1_2, 
cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.34; Tue, 6 Dec 2022 15:10:21 -0800 Received: by devbig077.ldc1.facebook.com (Postfix, from userid 158236) id D728D120B3788; Tue, 6 Dec 2022 15:10:06 -0800 (PST) From: Dave Marchevsky To: CC: Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Kernel Team , Kumar Kartikeya Dwivedi , Tejun Heo , Dave Marchevsky Subject: [PATCH bpf-next 09/13] bpf: Special verifier handling for bpf_rbtree_{remove, first} Date: Tue, 6 Dec 2022 15:09:56 -0800 Message-ID: <20221206231000.3180914-10-davemarchevsky@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20221206231000.3180914-1-davemarchevsky@fb.com> References: <20221206231000.3180914-1-davemarchevsky@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: ZEOe3eBegH9NTIsFrSvVS8Bri8aQynY3 X-Proofpoint-GUID: ZEOe3eBegH9NTIsFrSvVS8Bri8aQynY3 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-12-06_12,2022-12-06_01,2022-06-22_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Newly-added bpf_rbtree_{remove,first} kfuncs have some special properties that require handling in the verifier:

* both bpf_rbtree_remove and bpf_rbtree_first return the type containing the bpf_rb_node field, with the offset set to that field's offset, instead of a struct bpf_rb_node *
* Generalized existing next-gen list verifier handling for this as a mark_reg_datastructure_node helper
* Unlike other functions, which set release_on_unlock on one of their args, bpf_rbtree_first has no node argument to mark, instead setting release_on_unlock on its return value
* bpf_rbtree_remove's node input is a node that's been inserted in the tree. Only non-owning references (PTR_UNTRUSTED + release_on_unlock) refer to such nodes, but kfuncs don't take PTR_UNTRUSTED args
* Added a special carveout for bpf_rbtree_remove to take PTR_UNTRUSTED
* Since the node input already has release_on_unlock set, don't set it again

This patch, along with the previous one, completes special verifier handling for all rbtree API functions added in this series.

Signed-off-by: Dave Marchevsky --- kernel/bpf/verifier.c | 89 +++++++++++++++++++++++++++++++++++-------- 1 file changed, 73 insertions(+), 16 deletions(-) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 9ad8c0b264dc..29983e2c27df 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -6122,6 +6122,23 @@ static int check_reg_type(struct bpf_verifier_env *env, u32 regno, return 0; } +static bool +func_arg_reg_rb_node_offset(const struct bpf_reg_state *reg, s32 off) +{ + struct btf_record *rec; + struct btf_field *field; + + rec = reg_btf_record(reg); + if (!rec) + return false; + + field = btf_record_find(rec, off, BPF_RB_NODE); + if (!field) + return false; + + return true; +} + int check_func_arg_reg_off(struct bpf_verifier_env *env, const struct bpf_reg_state *reg, int regno, enum bpf_arg_type arg_type) @@ -6176,6 +6193,13 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env, */ fixed_off_ok = true; break; + case PTR_TO_BTF_ID | MEM_ALLOC | PTR_UNTRUSTED: + /* Currently only bpf_rbtree_remove accepts a PTR_UNTRUSTED + * bpf_rb_node.
Fixed off of the node type is OK + */ + if (reg->off && func_arg_reg_rb_node_offset(reg, reg->off)) + fixed_off_ok = true; + break; default: break; } @@ -8875,26 +8899,44 @@ __process_kf_arg_ptr_to_datastructure_node(struct bpf_verifier_env *env, btf_name_by_offset(field->datastructure_head.btf, et->name_off)); return -EINVAL; } - /* Set arg#1 for expiration after unlock */ - return ref_set_release_on_unlock(env, reg->ref_obj_id); + + return 0; } static int process_kf_arg_ptr_to_list_node(struct bpf_verifier_env *env, struct bpf_reg_state *reg, u32 regno, struct bpf_kfunc_call_arg_meta *meta) { - return __process_kf_arg_ptr_to_datastructure_node(env, reg, regno, meta, - BPF_LIST_HEAD, BPF_LIST_NODE, - &meta->arg_list_head.field); + int err; + + err = __process_kf_arg_ptr_to_datastructure_node(env, reg, regno, meta, + BPF_LIST_HEAD, BPF_LIST_NODE, + &meta->arg_list_head.field); + if (err) + return err; + + return ref_set_release_on_unlock(env, reg->ref_obj_id); } static int process_kf_arg_ptr_to_rbtree_node(struct bpf_verifier_env *env, struct bpf_reg_state *reg, u32 regno, struct bpf_kfunc_call_arg_meta *meta) { - return __process_kf_arg_ptr_to_datastructure_node(env, reg, regno, meta, - BPF_RB_ROOT, BPF_RB_NODE, - &meta->arg_rbtree_root.field); + int err; + + err = __process_kf_arg_ptr_to_datastructure_node(env, reg, regno, meta, + BPF_RB_ROOT, BPF_RB_NODE, + &meta->arg_rbtree_root.field); + if (err) + return err; + + /* bpf_rbtree_remove's node parameter is a non-owning reference to + * a bpf_rb_node, so release_on_unlock is already set + */ + if (meta->func_id == special_kfunc_list[KF_bpf_rbtree_remove]) + return 0; + + return ref_set_release_on_unlock(env, reg->ref_obj_id); } static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_arg_meta *meta) @@ -8902,7 +8944,7 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_ const char *func_name = meta->func_name, *ref_tname; const struct btf *btf = meta->btf; const struct btf_param *args; - u32 i, nargs; + u32 i, nargs, check_type; int ret; args = (const struct btf_param *)(meta->func_proto + 1); @@ -9141,7 +9183,13 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_ return ret; break; case KF_ARG_PTR_TO_RB_NODE: - if (reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) { + if (meta->btf == btf_vmlinux && + meta->func_id == special_kfunc_list[KF_bpf_rbtree_remove]) + check_type = (PTR_TO_BTF_ID | MEM_ALLOC | PTR_UNTRUSTED); + else + check_type = (PTR_TO_BTF_ID | MEM_ALLOC); + + if (reg->type != check_type) { verbose(env, "arg#%d expected pointer to allocated object\n", i); return -EINVAL; } @@ -9380,11 +9428,14 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn, meta.func_id == special_kfunc_list[KF_bpf_list_pop_back]) { struct btf_field *field = meta.arg_list_head.field; - mark_reg_known_zero(env, regs, BPF_REG_0); - regs[BPF_REG_0].type = PTR_TO_BTF_ID | MEM_ALLOC; - regs[BPF_REG_0].btf = field->datastructure_head.btf; - regs[BPF_REG_0].btf_id = field->datastructure_head.value_btf_id; - regs[BPF_REG_0].off = field->datastructure_head.node_offset; + mark_reg_datastructure_node(regs, BPF_REG_0, + &field->datastructure_head); + } else if (meta.func_id == special_kfunc_list[KF_bpf_rbtree_remove] || + meta.func_id == special_kfunc_list[KF_bpf_rbtree_first]) { + struct btf_field *field = meta.arg_rbtree_root.field; + + mark_reg_datastructure_node(regs, BPF_REG_0, + &field->datastructure_head); } else if (meta.func_id == 
special_kfunc_list[KF_bpf_cast_to_kern_ctx]) { mark_reg_known_zero(env, regs, BPF_REG_0); regs[BPF_REG_0].type = PTR_TO_BTF_ID | PTR_TRUSTED; @@ -9450,6 +9501,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn, if (is_kfunc_ret_null(&meta)) regs[BPF_REG_0].id = id; regs[BPF_REG_0].ref_obj_id = id; + + if (meta.func_id == special_kfunc_list[KF_bpf_rbtree_first]) + ref_set_release_on_unlock(env, regs[BPF_REG_0].ref_obj_id); } if (reg_may_point_to_spin_lock(®s[BPF_REG_0]) && !regs[BPF_REG_0].id) regs[BPF_REG_0].id = ++env->id_gen; @@ -11636,8 +11690,11 @@ static void mark_ptr_or_null_reg(struct bpf_func_state *state, */ if (WARN_ON_ONCE(reg->smin_value || reg->smax_value || !tnum_equals_const(reg->var_off, 0))) return; - if (reg->type != (PTR_TO_BTF_ID | MEM_ALLOC | PTR_MAYBE_NULL) && WARN_ON_ONCE(reg->off)) + if (reg->type != (PTR_TO_BTF_ID | MEM_ALLOC | PTR_MAYBE_NULL) && + reg->type != (PTR_TO_BTF_ID | MEM_ALLOC | PTR_MAYBE_NULL | PTR_UNTRUSTED) && + WARN_ON_ONCE(reg->off)) { return; + } if (is_null) { reg->type = SCALAR_VALUE; /* We don't need id and ref_obj_id from this point From patchwork Tue Dec 6 23:09:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Marchevsky X-Patchwork-Id: 13066351 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6827AC3A5A7 for ; Tue, 6 Dec 2022 23:10:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229661AbiLFXKf (ORCPT ); Tue, 6 Dec 2022 18:10:35 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55742 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229628AbiLFXKe (ORCPT ); Tue, 6 Dec 2022 18:10:34 -0500 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D77504298E for ; Tue, 6 Dec 2022 15:10:32 -0800 (PST) Received: from pps.filterd (m0044012.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2B6LhJ5n032730 for ; Tue, 6 Dec 2022 15:10:32 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=+Ema17lhkGx/i/301jhTliryFi0neBmgVkp0cqpGJQA=; b=fVDfLamWjHoGJ91lMKmqjeX+ORVscFWu+1/hvpQZ0OVtiOHePA+e2ZX2TT5CzYivet3D JSX/qXnptI7nf8EFUc/SmlwBdaXwk1VBQO9UEWYCHZXgj9zsic7a+Jp/Ib1QMFlmRKSy ryhz+SJb+jYy92V1+W3hQVCUj/Df2VojNm4= Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3m9x70xv4w-15 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Tue, 06 Dec 2022 15:10:32 -0800 Received: from twshared25383.14.frc2.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:82::d) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.34; Tue, 6 Dec 2022 15:10:31 -0800 Received: by devbig077.ldc1.facebook.com (Postfix, from userid 158236) id 4CD77120B378A; Tue, 6 Dec 2022 15:10:07 -0800 (PST) From: Dave Marchevsky To: CC: Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Kernel Team , Kumar Kartikeya Dwivedi , Tejun Heo , 
Dave Marchevsky Subject: [PATCH bpf-next 10/13] bpf, x86: BPF_PROBE_MEM handling for insn->off < 0 Date: Tue, 6 Dec 2022 15:09:57 -0800 Message-ID: <20221206231000.3180914-11-davemarchevsky@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20221206231000.3180914-1-davemarchevsky@fb.com> References: <20221206231000.3180914-1-davemarchevsky@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: TGr0BQd9FlNe5FDX_CGNRXdpIzuEiYTv X-Proofpoint-GUID: TGr0BQd9FlNe5FDX_CGNRXdpIzuEiYTv X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-12-06_12,2022-12-06_01,2022-06-22_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net The current comment in the BPF_PROBE_MEM jit code claims that the verifier prevents insn->off < 0, but this appears to not be true irrespective of changes in this series. Regardless, changes in this series will result in an example like:

struct example_node {
	long key;
	long val;
	struct bpf_rb_node node;
};

/* In BPF prog, assume root contains example_node nodes */
struct bpf_rb_node *res = bpf_rbtree_first(&root);
if (!res)
	return 1;

struct example_node *n = container_of(res, struct example_node, node);
long key = n->key;

Resulting in a load with off = -16, as bpf_rbtree_first's return is modified by the verifier to be PTR_TO_BTF_ID of example_node w/ offset = offsetof(struct example_node, node), instead of PTR_TO_BTF_ID of bpf_rb_node. So it's necessary to support negative insn->off when jitting BPF_PROBE_MEM.

In order to ensure that a page fault for a BPF_PROBE_MEM load of *src_reg + insn->off is safely handled, we must confirm that *src_reg + insn->off is in the kernel's memory. Two runtime checks are emitted to confirm that:

1) (*src_reg + insn->off) > boundary between user and kernel address spaces
2) (*src_reg + insn->off) does not overflow to a small positive number. This might happen if some function meant to set src_reg returns ERR_PTR(-EINVAL) or similar.

Check 1 is currently slightly off - it compares a u64

  limit = TASK_SIZE_MAX + PAGE_SIZE + abs(insn->off);

to *src_reg, aborting the load if limit is larger. Rewriting this as an inequality:

  *src_reg > TASK_SIZE_MAX + PAGE_SIZE + abs(insn->off)
  *src_reg - abs(insn->off) > TASK_SIZE_MAX + PAGE_SIZE

shows that this isn't quite right even if insn->off is positive, as we really want:

  *src_reg + insn->off > TASK_SIZE_MAX + PAGE_SIZE
  *src_reg > TASK_SIZE_MAX + PAGE_SIZE - insn->off

since *src_reg + insn->off is the address we'll be loading from, not *src_reg - insn->off or *src_reg - abs(insn->off). So change the subtraction to an addition and remove the abs(), as the comment indicates that it was only added to ignore negative insn->off.

For Check 2, currently "does not overflow to a small positive number" is confirmed by emitting an 'add insn->off, src_reg' instruction and checking the carry flag. While this works fine for a positive insn->off, a small negative insn->off like -16 is almost guaranteed to wrap over to a small positive number when added to any kernel address.

This patch addresses this by not doing Check 2 at BPF prog runtime when insn->off is negative, instead doing a stronger check at JIT-time. The logic supporting this is as follows:

1) Assume insn->off is negative, and call the largest such negative offset MAX_NEGATIVE_OFF. So insn->off >= MAX_NEGATIVE_OFF for all possible insn->off.

2) *src_reg + insn->off will not wrap over to an unexpected address by virtue of negative insn->off, but it might wrap under if -insn->off > *src_reg, as that implies *src_reg + insn->off < 0.

3) The inequality (TASK_SIZE_MAX + PAGE_SIZE - insn->off) > (TASK_SIZE_MAX + PAGE_SIZE) must be true since insn->off is negative.

4) If we've completed check 1, we know that src_reg >= (TASK_SIZE_MAX + PAGE_SIZE - insn->off).

5) Combining statements 3 and 4, we know src_reg > (TASK_SIZE_MAX + PAGE_SIZE).

6) By statements 1, 4, and 5, if we can prove (TASK_SIZE_MAX + PAGE_SIZE) > -MAX_NEGATIVE_OFF, we'll know that (TASK_SIZE_MAX + PAGE_SIZE) > -insn->off for all possible insn->off values. We can rewrite this as (TASK_SIZE_MAX + PAGE_SIZE) + MAX_NEGATIVE_OFF > 0. Since src_reg > TASK_SIZE_MAX + PAGE_SIZE and MAX_NEGATIVE_OFF is negative, if the previous inequality is true, src_reg + MAX_NEGATIVE_OFF > 0 is also true for all src_reg values. Similarly, since insn->off >= MAX_NEGATIVE_OFF for all possible negative insn->off vals, src_reg + insn->off > 0 and there can be no wrapping under.

So proving (TASK_SIZE_MAX + PAGE_SIZE) + MAX_NEGATIVE_OFF > 0 implies *src_reg + insn->off > 0 for any src_reg that's passed check 1 and any negative insn->off. Luckily the former inequality does not need to be checked at runtime, and in fact could be a static_assert if TASK_SIZE_MAX weren't determined by a function when the CONFIG_X86_5LEVEL kconfig is used. Regardless, we can just check (TASK_SIZE_MAX + PAGE_SIZE) + MAX_NEGATIVE_OFF > 0 once per do_jit call instead of emitting a runtime check. Given that insn->off is an s16 and is unlikely to grow larger, this check should always succeed on any x86 processor made in the 21st century. If it doesn't, fail all do_jit calls and complain loudly with the assumption that the BPF subsystem is misconfigured or has a bug.

A few instructions are saved for negative insn->offs as a result. Using the struct example_node / off = -16 example from before, code looks like:

BEFORE CHANGE
  72:   movabs $0x800000000010,%r11
  7c:   cmp    %r11,%rdi
  7f:   jb     0x000000000000008d        (check 1 on 7c and here)
  81:   mov    %rdi,%r11
  84:   add    $0xfffffffffffffff0,%r11  (check 2, will set carry for almost any r11, so bug for
  8b:   jae    0x0000000000000091         negative insn->off)
  8d:   xor    %edi,%edi                 (as a result long key = n->key; will be 0'd out here)
  8f:   jmp    0x0000000000000095
  91:   mov    -0x10(%rdi),%rdi
  95:

AFTER CHANGE:
  5a:   movabs $0x800000000010,%r11
  64:   cmp    %r11,%rdi
  67:   jae    0x000000000000006d        (check 1 on 64 and here, but now JNC instead of JC)
  69:   xor    %edi,%edi                 (no check 2, 0 out if %rdi - %r11 < 0)
  6b:   jmp    0x0000000000000071
  6d:   mov    -0x10(%rdi),%rdi
  71:

We could do the same for insn->off == 0, but for now keep code generation unchanged for previously working nonnegative insn->offs.
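The once-per-do_jit condition lends itself to a quick sanity check with plain userspace arithmetic. A minimal sketch, with an x86-64 4-level-paging value standing in for TASK_SIZE_MAX (an assumption for illustration, not the kernel macro):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* Illustrative stand-ins, not the kernel macros */
	const uint64_t task_size_max = 0x00007ffffffff000ull;
	const uint64_t page_size = 4096;
	const int64_t max_negative_off = INT16_MIN; /* insn->off is s16 */
	const int16_t insn_off = -16; /* e.g. container_of() back from rb_node */

	/* JIT-time check: (TASK_SIZE_MAX + PAGE_SIZE) + MAX_NEGATIVE_OFF > 0,
	 * so any src_reg that passes check 1 cannot wrap under when a
	 * negative insn->off is added to it
	 */
	assert((int64_t)(task_size_max + page_size) + max_negative_off > 0);

	/* The runtime compare then needs only one spare register:
	 * abort the load unless src_reg >= limit
	 */
	uint64_t limit = task_size_max + page_size - insn_off;
	printf("limit = %#llx\n", (unsigned long long)limit);
	return 0;
}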
Signed-off-by: Dave Marchevsky --- arch/x86/net/bpf_jit_comp.c | 123 +++++++++++++++++++++++++++--------- 1 file changed, 92 insertions(+), 31 deletions(-) diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 36ffe67ad6e5..843f619d0d35 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include #include @@ -94,6 +95,7 @@ static int bpf_size_to_x86_bytes(int bpf_size) */ #define X86_JB 0x72 #define X86_JAE 0x73 +#define X86_JNC 0x73 #define X86_JE 0x74 #define X86_JNE 0x75 #define X86_JBE 0x76 @@ -950,6 +952,36 @@ static void emit_shiftx(u8 **pprog, u32 dst_reg, u8 src_reg, bool is64, u8 op) *pprog = prog; } +/* Check that condition necessary for PROBE_MEM handling for insn->off < 0 + * holds. + * + * This could be a static_assert((TASK_SIZE_MAX + PAGE_SIZE) > -S16_MIN), + * but TASK_SIZE_MAX can't always be evaluated at compile time, so let's not + * assume insn->off size either + */ +static int check_probe_mem_task_size_overflow(void) +{ + struct bpf_insn insn; + s64 max_negative; + + switch (sizeof(insn.off)) { + case 2: + max_negative = S16_MIN; + break; + default: + pr_err("bpf_jit_error: unexpected bpf_insn->off size\n"); + return -EFAULT; + } + + if (!((TASK_SIZE_MAX + PAGE_SIZE) > -max_negative)) { + pr_err("bpf jit error: assumption does not hold:\n"); + pr_err("\t(TASK_SIZE_MAX + PAGE_SIZE) + (max negative insn->off) > 0\n"); + return -EFAULT; + } + + return 0; +} + #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp))) static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image, @@ -967,6 +999,10 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image u8 *prog = temp; int err; + err = check_probe_mem_task_size_overflow(); + if (err) + return err; + detect_reg_usage(insn, insn_cnt, callee_regs_used, &tail_call_seen); @@ -1359,20 +1395,30 @@ st: if (is_imm8(insn->off)) case BPF_LDX | BPF_MEM | BPF_DW: case BPF_LDX | BPF_PROBE_MEM | BPF_DW: if (BPF_MODE(insn->code) == BPF_PROBE_MEM) { - /* Though the verifier prevents negative insn->off in BPF_PROBE_MEM - * add abs(insn->off) to the limit to make sure that negative - * offset won't be an issue. - * insn->off is s16, so it won't affect valid pointers. - */ - u64 limit = TASK_SIZE_MAX + PAGE_SIZE + abs(insn->off); - u8 *end_of_jmp1, *end_of_jmp2; - /* Conservatively check that src_reg + insn->off is a kernel address: - * 1. src_reg + insn->off >= limit - * 2. src_reg + insn->off doesn't become small positive. - * Cannot do src_reg + insn->off >= limit in one branch, - * since it needs two spare registers, but JIT has only one. + * 1. src_reg + insn->off >= TASK_SIZE_MAX + PAGE_SIZE + * 2. src_reg + insn->off doesn't overflow and become small positive + * + * For check 1, to save regs, do + * src_reg >= (TASK_SIZE_MAX + PAGE_SIZE - insn->off) call rhs + * of inequality 'limit' + * + * For check 2: + * If insn->off is positive, add src_reg + insn->off and check + * overflow directly + * If insn->off is negative, we know that + * (TASK_SIZE_MAX + PAGE_SIZE - insn->off) > (TASK_SIZE_MAX + PAGE_SIZE) + * and from check 1 we know + * src_reg >= (TASK_SIZE_MAX + PAGE_SIZE - insn->off) + * So if (TASK_SIZE_MAX + PAGE_SIZE) + MAX_NEGATIVE_OFF > 0 we can + * be sure that src_reg + insn->off won't overflow in either + * direction and avoid runtime check entirely. 
+ * + * check_probe_mem_task_size_overflow confirms the above assumption + * at the beginning of this function */ + u64 limit = TASK_SIZE_MAX + PAGE_SIZE - insn->off; + u8 *end_of_jmp1, *end_of_jmp2; /* movabsq r11, limit */ EMIT2(add_1mod(0x48, AUX_REG), add_1reg(0xB8, AUX_REG)); @@ -1381,32 +1427,47 @@ st: if (is_imm8(insn->off)) /* cmp src_reg, r11 */ maybe_emit_mod(&prog, src_reg, AUX_REG, true); EMIT2(0x39, add_2reg(0xC0, src_reg, AUX_REG)); - /* if unsigned '<' goto end_of_jmp2 */ - EMIT2(X86_JB, 0); - end_of_jmp1 = prog; - - /* mov r11, src_reg */ - emit_mov_reg(&prog, true, AUX_REG, src_reg); - /* add r11, insn->off */ - maybe_emit_1mod(&prog, AUX_REG, true); - EMIT2_off32(0x81, add_1reg(0xC0, AUX_REG), insn->off); - /* jmp if not carry to start_of_ldx - * Otherwise ERR_PTR(-EINVAL) + 128 will be the user addr - * that has to be rejected. - */ - EMIT2(0x73 /* JNC */, 0); - end_of_jmp2 = prog; + if (insn->off >= 0) { + /* cmp src_reg, r11 */ + /* if unsigned '<' goto end_of_jmp2 */ + EMIT2(X86_JB, 0); + end_of_jmp1 = prog; + + /* mov r11, src_reg */ + emit_mov_reg(&prog, true, AUX_REG, src_reg); + /* add r11, insn->off */ + maybe_emit_1mod(&prog, AUX_REG, true); + EMIT2_off32(0x81, add_1reg(0xC0, AUX_REG), insn->off); + /* jmp if not carry to start_of_ldx + * Otherwise ERR_PTR(-EINVAL) + 128 will be the user addr + * that has to be rejected. + */ + EMIT2(X86_JNC, 0); + end_of_jmp2 = prog; + } else { + /* cmp src_reg, r11 */ + /* if unsigned '>=' goto start_of_ldx + * w/o needing to do check 2 + */ + EMIT2(X86_JAE, 0); + end_of_jmp1 = prog; + } /* xor dst_reg, dst_reg */ emit_mov_imm32(&prog, false, dst_reg, 0); /* jmp byte_after_ldx */ EMIT2(0xEB, 0); - /* populate jmp_offset for JB above to jump to xor dst_reg */ - end_of_jmp1[-1] = end_of_jmp2 - end_of_jmp1; - /* populate jmp_offset for JNC above to jump to start_of_ldx */ start_of_ldx = prog; - end_of_jmp2[-1] = start_of_ldx - end_of_jmp2; + if (insn->off >= 0) { + /* populate jmp_offset for JB above to jump to xor dst_reg */ + end_of_jmp1[-1] = end_of_jmp2 - end_of_jmp1; + /* populate jmp_offset for JNC above to jump to start_of_ldx */ + end_of_jmp2[-1] = start_of_ldx - end_of_jmp2; + } else { + /* populate jmp_offset for JAE above to jump to start_of_ldx */ + end_of_jmp1[-1] = start_of_ldx - end_of_jmp1; + } } emit_ldx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off); if (BPF_MODE(insn->code) == BPF_PROBE_MEM) { From patchwork Tue Dec 6 23:09:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Marchevsky X-Patchwork-Id: 13066347 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9BBCDC63706 for ; Tue, 6 Dec 2022 23:10:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229617AbiLFXKc (ORCPT ); Tue, 6 Dec 2022 18:10:32 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55682 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229665AbiLFXKa (ORCPT ); Tue, 6 Dec 2022 18:10:30 -0500 Received: from mx0a-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E2C3F429AE for ; Tue, 6 Dec 2022 15:10:28 -0800 (PST) Received: from pps.filterd (m0001303.ppops.net [127.0.0.1]) by 
m0001303.ppops.net (8.17.1.19/8.17.1.19) with ESMTP id 2B6LhKLX020825 for ; Tue, 6 Dec 2022 15:10:28 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=YNK9Y1Lrn83gq4Dlrhl4SQoz5DZE4jYE8+PAFCc56xg=; b=j57MH0WXkjb+QlsFHLcnXYC83qQqdu+4XZUe7XIR6iOXLtmrRKWiKJaOMtmWMPzCRGo4 XLjNOcQM6YwiayfH2NhYXIYcNu/Qn/4OQEzVw2a1Djo+K+/6NcTi0OqE2YY1o2Vqtjap cqlVNaf60248zri/BQ0OOuKum8sfys3tR24= Received: from maileast.thefacebook.com ([163.114.130.16]) by m0001303.ppops.net (PPS) with ESMTPS id 3m9g8cdfa0-3 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Tue, 06 Dec 2022 15:10:28 -0800 Received: from twshared8047.05.ash9.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:83::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.34; Tue, 6 Dec 2022 15:10:26 -0800 Received: by devbig077.ldc1.facebook.com (Postfix, from userid 158236) id C0C96120B3795; Tue, 6 Dec 2022 15:10:07 -0800 (PST) From: Dave Marchevsky To: CC: Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Kernel Team , Kumar Kartikeya Dwivedi , Tejun Heo , Dave Marchevsky Subject: [PATCH bpf-next 11/13] bpf: Add bpf_rbtree_{add,remove,first} decls to bpf_experimental.h Date: Tue, 6 Dec 2022 15:09:58 -0800 Message-ID: <20221206231000.3180914-12-davemarchevsky@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20221206231000.3180914-1-davemarchevsky@fb.com> References: <20221206231000.3180914-1-davemarchevsky@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: YlFIlbmHPYIlPCrstybK3gCdJUx6CBiz X-Proofpoint-ORIG-GUID: YlFIlbmHPYIlPCrstybK3gCdJUx6CBiz X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-12-06_12,2022-12-06_01,2022-06-22_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Signed-off-by: Dave Marchevsky --- .../testing/selftests/bpf/bpf_experimental.h | 24 +++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h index 424f7bbbfe9b..dbd2c729781a 100644 --- a/tools/testing/selftests/bpf/bpf_experimental.h +++ b/tools/testing/selftests/bpf/bpf_experimental.h @@ -65,4 +65,28 @@ extern struct bpf_list_node *bpf_list_pop_front(struct bpf_list_head *head) __ks */ extern struct bpf_list_node *bpf_list_pop_back(struct bpf_list_head *head) __ksym; +/* Description + * Remove 'node' from rbtree with root 'root' + * Returns + * Pointer to the removed node, or NULL if 'root' didn't contain 'node' + */ +extern struct bpf_rb_node *bpf_rbtree_remove(struct bpf_rb_root *root, + struct bpf_rb_node *node) __ksym; + +/* Description + * Add 'node' to rbtree with root 'root' using comparator 'less' + * Returns + * Nothing + */ +extern void bpf_rbtree_add(struct bpf_rb_root *root, struct bpf_rb_node *node, + bool (less)(struct bpf_rb_node *a, const struct bpf_rb_node *b)) __ksym; + +/* Description + * Return the first (leftmost) node in input tree + * Returns + * Pointer to the node, which is _not_ removed from the tree. If the tree + * contains no nodes, returns NULL. 
+ */ +extern struct bpf_rb_node *bpf_rbtree_first(struct bpf_rb_root *root) __ksym; + #endif From patchwork Tue Dec 6 23:09:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Marchevsky X-Patchwork-Id: 13066350 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A3DC9C63708 for ; Tue, 6 Dec 2022 23:10:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229548AbiLFXKd (ORCPT ); Tue, 6 Dec 2022 18:10:33 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55688 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229675AbiLFXKa (ORCPT ); Tue, 6 Dec 2022 18:10:30 -0500 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B4EF4429B5 for ; Tue, 6 Dec 2022 15:10:29 -0800 (PST) Received: from pps.filterd (m0044012.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2B6LhOjh000527 for ; Tue, 6 Dec 2022 15:10:29 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=cKaWZ4JoDPjHkfjAHfVl7L33sgS19nwfFkO0BOGDSyY=; b=hN8hDfPVo3mWbDyEecOyggyVtYYxAh6AmqJ5nd0aHDvaaUyCxUe2oLFwmCPcsMfWE8BP yFvu9GJfOG6sfdGY0OYprHKxVkrbgZVGMZvYkOg5+XuzG7BAthnO0VsL5aoDRUovEYuS hlUw5oxw6HY5c/Aa31lYwBHNDKpD4+QOM6Q= Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3m9x70xv50-7 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Tue, 06 Dec 2022 15:10:29 -0800 Received: from twshared8047.05.ash9.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:83::5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.34; Tue, 6 Dec 2022 15:10:26 -0800 Received: by devbig077.ldc1.facebook.com (Postfix, from userid 158236) id 38784120B3798; Tue, 6 Dec 2022 15:10:08 -0800 (PST) From: Dave Marchevsky To: CC: Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Kernel Team , Kumar Kartikeya Dwivedi , Tejun Heo , Dave Marchevsky Subject: [PATCH bpf-next 12/13] libbpf: Make BTF mandatory if program BTF has spin_lock or alloc_obj type Date: Tue, 6 Dec 2022 15:09:59 -0800 Message-ID: <20221206231000.3180914-13-davemarchevsky@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20221206231000.3180914-1-davemarchevsky@fb.com> References: <20221206231000.3180914-1-davemarchevsky@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: ftqs6UxD-FlDD7k9MuTAPhJkb1PcMvhv X-Proofpoint-GUID: ftqs6UxD-FlDD7k9MuTAPhJkb1PcMvhv X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-12-06_12,2022-12-06_01,2022-06-22_01 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net If a BPF program defines a struct or union type which has a field type that the verifier considers special - spin_lock, next-gen datastructure heads and nodes - the verifier needs to be able to find fields of that type using BTF. 
From patchwork Tue Dec 6 23:09:59 2022
From: Dave Marchevsky
CC: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Kernel Team,
    Kumar Kartikeya Dwivedi, Tejun Heo, Dave Marchevsky
Subject: [PATCH bpf-next 12/13] libbpf: Make BTF mandatory if program BTF has spin_lock or alloc_obj type
Date: Tue, 6 Dec 2022 15:09:59 -0800
Message-ID: <20221206231000.3180914-13-davemarchevsky@fb.com>
In-Reply-To: <20221206231000.3180914-1-davemarchevsky@fb.com>

If a BPF program defines a struct or union type which has a field type
that the verifier considers special - spin_lock, next-gen datastructure
heads and nodes - the verifier needs to be able to find fields of that
type using BTF. For such a program, BTF is required, so modify the
kernel_needs_btf helper to ensure that the correct "BTF is mandatory"
error message is emitted.

The newly-added btf_has_alloc_obj_type looks for BTF_KIND_STRUCTs with a
name corresponding to a special type. If any such struct is found, it is
assumed that some variable is using it, and therefore that a successful
BTF load is necessary.

Also add a kernel_needs_btf check to bpf_object__create_map where it was
previously missing. When this function calls bpf_map_create, the kernel
may reject map creation due to mismatched datastructure owner and ownee
types (e.g. a struct bpf_list_head with a __contains tag pointing to a
bpf_rb_node field). In such a scenario - or any other where BTF is
necessary for verification - bpf_map_create should not be retried
without BTF.

Signed-off-by: Dave Marchevsky
---
 tools/lib/bpf/libbpf.c | 50 ++++++++++++++++++++++++++++++++----------
 1 file changed, 39 insertions(+), 11 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 2a82f49ce16f..56a905b502c9 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -998,6 +998,31 @@ find_struct_ops_kern_types(const struct btf *btf, const char *tname,
 	return 0;
 }
 
+/* Should match alloc_obj_fields in kernel/bpf/btf.c
+ */
+static const char *alloc_obj_fields[] = {
+	"bpf_spin_lock",
+	"bpf_list_head",
+	"bpf_list_node",
+	"bpf_rb_root",
+	"bpf_rb_node",
+};
+
+static bool
+btf_has_alloc_obj_type(const struct btf *btf)
+{
+	const char *tname;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(alloc_obj_fields); i++) {
+		tname = alloc_obj_fields[i];
+		if (btf__find_by_name_kind(btf, tname, BTF_KIND_STRUCT) > 0)
+			return true;
+	}
+
+	return false;
+}
+
 static bool bpf_map__is_struct_ops(const struct bpf_map *map)
 {
 	return map->def.type == BPF_MAP_TYPE_STRUCT_OPS;
@@ -2794,7 +2819,8 @@ static bool libbpf_needs_btf(const struct bpf_object *obj)
 
 static bool kernel_needs_btf(const struct bpf_object *obj)
 {
-	return obj->efile.st_ops_shndx >= 0;
+	return obj->efile.st_ops_shndx >= 0 ||
+	       (obj->btf && btf_has_alloc_obj_type(obj->btf));
 }
 
 static int bpf_object__init_btf(struct bpf_object *obj,
@@ -5103,16 +5129,18 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b
 		err = -errno;
 		cp = libbpf_strerror_r(err, errmsg, sizeof(errmsg));
-		pr_warn("Error in bpf_create_map_xattr(%s):%s(%d). Retrying without BTF.\n",
-			map->name, cp, err);
-		create_attr.btf_fd = 0;
-		create_attr.btf_key_type_id = 0;
-		create_attr.btf_value_type_id = 0;
-		map->btf_key_type_id = 0;
-		map->btf_value_type_id = 0;
-		map->fd = bpf_map_create(def->type, map_name,
-					 def->key_size, def->value_size,
-					 def->max_entries, &create_attr);
+		pr_warn("Error in bpf_create_map_xattr(%s):%s(%d).\n", map->name, cp, err);
+		if (!kernel_needs_btf(obj)) {
+			pr_warn("Retrying bpf_map_create_xattr(%s) without BTF.\n", map->name);
+			create_attr.btf_fd = 0;
+			create_attr.btf_key_type_id = 0;
+			create_attr.btf_value_type_id = 0;
+			map->btf_key_type_id = 0;
+			map->btf_value_type_id = 0;
+			map->fd = bpf_map_create(def->type, map_name,
+						 def->key_size, def->value_size,
+						 def->max_entries, &create_attr);
+		}
 	}
 	err = map->fd < 0 ? -errno : 0;
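[To make the "should not retry without BTF" reasoning concrete, a sketch of
the owner/ownee mismatch the commit message mentions; a hypothetical example,
not taken from the patch. The kernel rejects the map backing this global data
at creation time, and retrying without BTF could never make it verifiable.]

	struct bad_node {
		int key;
		struct bpf_rb_node node;	/* an rbtree node... */
	};

	#define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8)))
	/* ...but the owner claims to be a *list* head containing it. Owner and
	 * ownee types disagree, so bpf_map_create() fails regardless of retries.
	 */
	private(A) struct bpf_list_head ghead __contains(bad_node, node);
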
From patchwork Tue Dec 6 23:10:00 2022
From: Dave Marchevsky
CC: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Kernel Team,
    Kumar Kartikeya Dwivedi, Tejun Heo, Dave Marchevsky
Subject: [PATCH bpf-next 13/13] selftests/bpf: Add rbtree selftests
Date: Tue, 6 Dec 2022 15:10:00 -0800
Message-ID: <20221206231000.3180914-14-davemarchevsky@fb.com>
In-Reply-To: <20221206231000.3180914-1-davemarchevsky@fb.com>

This patch adds selftests exercising the logic changed/added in the
previous patches in the series. A variety of successful and unsuccessful
rbtree usages are validated:

Success:
  * Add some nodes, let the map_value bpf_rb_root destructor clean them up
  * Add some nodes, remove one using the release_on_unlock ref left over
    by a successful rbtree_add() call
  * Add some nodes, remove one using the release_on_unlock ref returned
    from a rbtree_first() call

Failure:
  * BTF where bpf_rb_root owns bpf_list_node should fail to load
  * BTF where node of type X is added to tree containing nodes of type Y
    should fail to load
  * No calling rbtree api functions in 'less' callback for rbtree_add
  * No releasing lock in 'less' callback for rbtree_add
  * No removing a node which hasn't been added to any tree
  * No adding a node which has already been added to a tree
  * No escaping of release_on_unlock references past their lock's
    critical section

These tests mostly focus on rbtree-specific additions, but some of the
Failure cases revalidate scenarios common to both linked_list and rbtree
which are covered in the former's tests. Better to be a bit redundant in
case linked_list and rbtree semantics deviate over time.

Signed-off-by: Dave Marchevsky
---
 .../testing/selftests/bpf/prog_tests/rbtree.c | 184 ++++++++++++
 tools/testing/selftests/bpf/progs/rbtree.c    | 180 ++++++++++++
 .../progs/rbtree_btf_fail__add_wrong_type.c   |  48 ++++
 .../progs/rbtree_btf_fail__wrong_node_type.c  |  21 ++
 .../testing/selftests/bpf/progs/rbtree_fail.c | 263 ++++++++++++++++++
 5 files changed, 696 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/rbtree.c
 create mode 100644 tools/testing/selftests/bpf/progs/rbtree.c
 create mode 100644 tools/testing/selftests/bpf/progs/rbtree_btf_fail__add_wrong_type.c
 create mode 100644 tools/testing/selftests/bpf/progs/rbtree_btf_fail__wrong_node_type.c
 create mode 100644 tools/testing/selftests/bpf/progs/rbtree_fail.c

diff --git a/tools/testing/selftests/bpf/prog_tests/rbtree.c b/tools/testing/selftests/bpf/prog_tests/rbtree.c
new file mode 100644
index 000000000000..688ce56d8b92
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/rbtree.c
@@ -0,0 +1,184 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+
+#include <test_progs.h>
+#include <network_helpers.h>
+
+#include "rbtree.skel.h"
+#include "rbtree_fail.skel.h"
+#include "rbtree_btf_fail__wrong_node_type.skel.h"
+#include "rbtree_btf_fail__add_wrong_type.skel.h"
+
+static char log_buf[1024 * 1024];
+
+static struct {
+	const char *prog_name;
+	const char *err_msg;
+} rbtree_fail_tests[] = {
+	{"rbtree_api_nolock_add", "bpf_spin_lock at off=16 must be held for bpf_rb_root"},
+	{"rbtree_api_nolock_remove", "bpf_spin_lock at off=16 must be held for bpf_rb_root"},
+	{"rbtree_api_nolock_first", "bpf_spin_lock at off=16 must be held for bpf_rb_root"},
+
+	/* Specific failure string for these three isn't very important, but it shouldn't be
+	 * possible to call rbtree api func from within add() callback
+	 */
+	{"rbtree_api_add_bad_cb_bad_fn_call_add", "arg#1 expected pointer to allocated object"},
+	{"rbtree_api_add_bad_cb_bad_fn_call_remove", "allocated object must be referenced"},
+	{"rbtree_api_add_bad_cb_bad_fn_call_first", "Unreleased reference id=4 alloc_insn=26"},
+	{"rbtree_api_add_bad_cb_bad_fn_call_first_unlock_after",
+	 "failed to release release_on_unlock reference"},
+
+	{"rbtree_api_remove_unadded_node", "arg#1 expected pointer to allocated object"},
+	{"rbtree_api_add_to_multiple_trees", "arg#1 expected pointer to allocated object"},
+	{"rbtree_api_add_release_unlock_escape", "arg#1 expected pointer to allocated object"},
+	{"rbtree_api_first_release_unlock_escape", "arg#1 expected pointer to allocated object"},
+	{"rbtree_api_remove_no_drop", "Unreleased reference id=4 alloc_insn=10"},
+};
+
+static void test_rbtree_fail_prog(const char *prog_name, const char *err_msg)
+{
+	LIBBPF_OPTS(bpf_object_open_opts, opts,
+		    .kernel_log_buf = log_buf,
+		    .kernel_log_size = sizeof(log_buf),
+		    .kernel_log_level = 1
+	);
+	struct rbtree_fail *skel;
+	struct bpf_program *prog;
+	int ret;
+
+	skel = rbtree_fail__open_opts(&opts);
+	if (!ASSERT_OK_PTR(skel, "rbtree_fail__open_opts"))
+		return;
+
+	prog = bpf_object__find_program_by_name(skel->obj, prog_name);
+	if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name"))
+		goto end;
+
+	bpf_program__set_autoload(prog, true);
+
+	ret = rbtree_fail__load(skel);
+	if (!ASSERT_ERR(ret, "rbtree_fail__load must fail"))
+		goto end;
+
+	if (!ASSERT_OK_PTR(strstr(log_buf, err_msg), "expected error message")) {
+		fprintf(stderr, "Expected: %s\n", err_msg);
+		fprintf(stderr, "Verifier: %s\n", log_buf);
+	}
+
+end:
+	rbtree_fail__destroy(skel);
+}
+
+static void test_rbtree_add_nodes(void)
+{
+	LIBBPF_OPTS(bpf_test_run_opts, opts,
+		    .data_in = &pkt_v4,
+		    .data_size_in = sizeof(pkt_v4),
+		    .repeat = 1,
+	);
+	struct rbtree *skel;
+	int ret;
+
+	skel = rbtree__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "rbtree__open_and_load"))
+		return;
+
+	ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.rbtree_add_nodes), &opts);
+	ASSERT_OK(ret, "rbtree_add_nodes run");
+	ASSERT_OK(opts.retval, "rbtree_add_nodes retval");
+	ASSERT_EQ(skel->data->less_callback_ran, 1, "rbtree_add_nodes less_callback_ran");
+
+	rbtree__destroy(skel);
+}
+
+static void test_rbtree_add_and_remove(void)
+{
+	LIBBPF_OPTS(bpf_test_run_opts, opts,
+		    .data_in = &pkt_v4,
+		    .data_size_in = sizeof(pkt_v4),
+		    .repeat = 1,
+	);
+	struct rbtree *skel;
+	int ret;
+
+	skel = rbtree__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "rbtree__open_and_load"))
+		return;
+
+	ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.rbtree_add_and_remove), &opts);
+	ASSERT_OK(ret, "rbtree_add_and_remove");
+	ASSERT_OK(opts.retval, "rbtree_add_and_remove retval");
+	ASSERT_EQ(skel->data->removed_key, 5, "rbtree_add_and_remove first removed key");
+
+	rbtree__destroy(skel);
+}
+
+static void test_rbtree_first_and_remove(void)
+{
+	LIBBPF_OPTS(bpf_test_run_opts, opts,
+		    .data_in = &pkt_v4,
+		    .data_size_in = sizeof(pkt_v4),
+		    .repeat = 1,
+	);
+	struct rbtree *skel;
+	int ret;
+
+	skel = rbtree__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "rbtree__open_and_load"))
+		return;
+
+	ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.rbtree_first_and_remove), &opts);
+	ASSERT_OK(ret, "rbtree_first_and_remove");
+	ASSERT_OK(opts.retval, "rbtree_first_and_remove retval");
+	ASSERT_EQ(skel->data->first_data[0], 2, "rbtree_first_and_remove first rbtree_first()");
+	ASSERT_EQ(skel->data->removed_key, 1, "rbtree_first_and_remove first removed key");
+	ASSERT_EQ(skel->data->first_data[1], 4, "rbtree_first_and_remove second rbtree_first()");
+
+	rbtree__destroy(skel);
+}
+
+void test_rbtree_success(void)
+{
+	if (test__start_subtest("rbtree_add_nodes"))
+		test_rbtree_add_nodes();
+	if (test__start_subtest("rbtree_add_and_remove"))
+		test_rbtree_add_and_remove();
+	if (test__start_subtest("rbtree_first_and_remove"))
+		test_rbtree_first_and_remove();
+}
+
+#define BTF_FAIL_TEST(suffix)							\
+void test_rbtree_btf_fail__##suffix(void)					\
+{										\
+	struct rbtree_btf_fail__##suffix *skel;					\
+										\
+	skel = rbtree_btf_fail__##suffix##__open_and_load();			\
+	if (!ASSERT_ERR_PTR(skel,						\
+			    "rbtree_btf_fail__" #suffix "__open_and_load unexpected success")) \
+		rbtree_btf_fail__##suffix##__destroy(skel);			\
+}
+
+#define RUN_BTF_FAIL_TEST(suffix)				\
+	if (test__start_subtest("rbtree_btf_fail__" #suffix))	\
+		test_rbtree_btf_fail__##suffix();
+
+BTF_FAIL_TEST(wrong_node_type);
+BTF_FAIL_TEST(add_wrong_type);
+
+void test_rbtree_btf_fail(void)
+{
+	RUN_BTF_FAIL_TEST(wrong_node_type);
+	RUN_BTF_FAIL_TEST(add_wrong_type);
+}
+
+void test_rbtree_fail(void)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(rbtree_fail_tests); i++) {
+		if (!test__start_subtest(rbtree_fail_tests[i].prog_name))
+			continue;
+		test_rbtree_fail_prog(rbtree_fail_tests[i].prog_name,
+				      rbtree_fail_tests[i].err_msg);
+	}
+}
diff --git a/tools/testing/selftests/bpf/progs/rbtree.c b/tools/testing/selftests/bpf/progs/rbtree.c
new file mode 100644
index 000000000000..96a9d732e3fe
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/rbtree.c
@@ -0,0 +1,180 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_core_read.h>
+#include "bpf_experimental.h"
+
+struct node_data {
+	long key;
+	long data;
+	struct bpf_rb_node node;
+};
+
+long less_callback_ran = -1;
+long removed_key = -1;
+long first_data[2] = {-1, -1};
+
+#define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8)))
+private(A) struct bpf_spin_lock glock;
+private(A) struct bpf_rb_root groot __contains(node_data, node);
+
+static bool less(struct bpf_rb_node *a, const struct bpf_rb_node *b)
+{
+	struct node_data *node_a;
+	struct node_data *node_b;
+
+	node_a = container_of(a, struct node_data, node);
+	node_b = container_of(b, struct node_data, node);
+	less_callback_ran = 1;
+
+	return node_a->key < node_b->key;
+}
+
+static long __add_three(struct bpf_rb_root *root, struct bpf_spin_lock *lock)
+{
+	struct node_data *n, *m;
+
+	n = bpf_obj_new(typeof(*n));
+	if (!n)
+		return 1;
+	n->key = 5;
+
+	m = bpf_obj_new(typeof(*m));
+	if (!m) {
+		bpf_obj_drop(n);
+		return 2;
+	}
+	m->key = 1;
+
+	bpf_spin_lock(&glock);
+	bpf_rbtree_add(&groot, &n->node, less);
+	bpf_rbtree_add(&groot, &m->node, less);
+	bpf_spin_unlock(&glock);
+
+	n = bpf_obj_new(typeof(*n));
+	if (!n)
+		return 3;
+	n->key = 3;
+
+	bpf_spin_lock(&glock);
+	bpf_rbtree_add(&groot, &n->node, less);
+	bpf_spin_unlock(&glock);
+	return 0;
+}
+
+SEC("tc")
+long rbtree_add_nodes(void *ctx)
+{
+	return __add_three(&groot, &glock);
+}
+
+SEC("tc")
+long rbtree_add_and_remove(void *ctx)
+{
+	struct bpf_rb_node *res = NULL;
+	struct node_data *n, *m = NULL;	/* NULL init so err_out can check m safely */
+
+	n = bpf_obj_new(typeof(*n));
+	if (!n)
+		goto err_out;
+	n->key = 5;
+
+	m = bpf_obj_new(typeof(*m));
+	if (!m)
+		goto err_out;
+	m->key = 3;
+
+	bpf_spin_lock(&glock);
+	bpf_rbtree_add(&groot, &n->node, less);
+	bpf_rbtree_add(&groot, &m->node, less);
+	res = bpf_rbtree_remove(&groot, &n->node);
+	bpf_spin_unlock(&glock);
+
+	if (!res)
+		return 1;
+	n = container_of(res, struct node_data, node);
+	removed_key = n->key;
+
+	bpf_obj_drop(n);
+
+	return 0;
+err_out:
+	if (n)
+		bpf_obj_drop(n);
+	if (m)
+		bpf_obj_drop(m);
+	return 1;
+}
+
+SEC("tc")
+long rbtree_first_and_remove(void *ctx)
+{
+	struct bpf_rb_node *res = NULL;
+	struct node_data *n, *m, *o;
+
+	n = bpf_obj_new(typeof(*n));
+	if (!n)
+		return 1;
+	n->key = 3;
+	n->data = 4;
+
+	m = bpf_obj_new(typeof(*m));
+	if (!m)
+		goto err_out;
+	m->key = 5;
+	m->data = 6;
+
+	o = bpf_obj_new(typeof(*o));
+	if (!o)
+		goto err_out;
+	o->key = 1;
+	o->data = 2;
+
+	bpf_spin_lock(&glock);
+	bpf_rbtree_add(&groot, &n->node, less);
+	bpf_rbtree_add(&groot, &m->node, less);
+	bpf_rbtree_add(&groot, &o->node, less);
+
+	res = bpf_rbtree_first(&groot);
+	if (!res) {
+		bpf_spin_unlock(&glock);
+		return 2;
+	}
+
+	o = container_of(res, struct node_data, node);
+	first_data[0] = o->data;
+
+	res = bpf_rbtree_remove(&groot, &o->node);
+	bpf_spin_unlock(&glock);
+
+	if (!res)
+		return 1;
+	o = container_of(res, struct node_data, node);
+	removed_key = o->key;
+
+	bpf_obj_drop(o);
+
+	bpf_spin_lock(&glock);
+	res = bpf_rbtree_first(&groot);
+	if (!res) {
+		bpf_spin_unlock(&glock);
+		return 3;
+	}
+
+	o = container_of(res, struct node_data, node);
+	first_data[1] = o->data;
+	bpf_spin_unlock(&glock);
+
+	return 0;
+err_out:
+	if (n)
+		bpf_obj_drop(n);
+	if (m)
+		bpf_obj_drop(m);
+	return 1;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/rbtree_btf_fail__add_wrong_type.c b/tools/testing/selftests/bpf/progs/rbtree_btf_fail__add_wrong_type.c
new file mode 100644
index 000000000000..1729712722ec
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/rbtree_btf_fail__add_wrong_type.c
@@ -0,0 +1,48 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_core_read.h>
+#include "bpf_experimental.h"
+
+struct node_data {
+	int key;
+	int data;
+	struct bpf_rb_node node;
+};
+
+struct node_data2 {
+	int key;
+	struct bpf_rb_node node;
+	int data;
+};
+
+static bool less2(struct bpf_rb_node *a, const struct bpf_rb_node *b)
+{
+	struct node_data2 *node_a;
+	struct node_data2 *node_b;
+
+	node_a = container_of(a, struct node_data2, node);
+	node_b = container_of(b, struct node_data2, node);
+
+	return node_a->key < node_b->key;
+}
+
+#define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8)))
+private(A) struct bpf_spin_lock glock;
+private(A) struct bpf_rb_root groot __contains(node_data, node);
+
+SEC("tc")
+long rbtree_api_nolock_add(void *ctx)
+{
+	struct node_data2 *n;
+
+	n = bpf_obj_new(typeof(*n));
+	if (!n)
+		return 1;
+
+	bpf_rbtree_add(&groot, &n->node, less2);
+	return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/rbtree_btf_fail__wrong_node_type.c b/tools/testing/selftests/bpf/progs/rbtree_btf_fail__wrong_node_type.c
new file mode 100644
index 000000000000..df0efb46177c
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/rbtree_btf_fail__wrong_node_type.c
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_core_read.h>
+#include "bpf_experimental.h"
+
+/* BTF load should fail as bpf_rb_root __contains this type and points to
+ * 'node', but 'node' is not a bpf_rb_node
+ */
+struct node_data {
+	int key;
+	int data;
+	struct bpf_list_node node;
+};
+
+#define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8)))
+private(A) struct bpf_spin_lock glock;
+private(A) struct bpf_rb_root groot __contains(node_data, node);
diff --git a/tools/testing/selftests/bpf/progs/rbtree_fail.c b/tools/testing/selftests/bpf/progs/rbtree_fail.c
new file mode 100644
index 000000000000..96caa7f33805
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/rbtree_fail.c
@@ -0,0 +1,263 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <vmlinux.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_core_read.h>
+#include "bpf_experimental.h"
+
+struct node_data {
+	long key;
+	long data;
+	struct bpf_rb_node node;
+};
+
+#define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8)))
+private(A) struct bpf_spin_lock glock;
+private(A) struct bpf_rb_root groot __contains(node_data, node);
+private(A) struct bpf_rb_root groot2 __contains(node_data, node);
+
+static bool less(struct bpf_rb_node *a, const struct bpf_rb_node *b)
+{
+	struct node_data *node_a;
+	struct node_data *node_b;
+
+	node_a = container_of(a, struct node_data, node);
+	node_b = container_of(b, struct node_data, node);
+
+	return node_a->key < node_b->key;
+}
+
+SEC("?tc")
+long rbtree_api_nolock_add(void *ctx)
+{
+	struct node_data *n;
+
+	n = bpf_obj_new(typeof(*n));
+	if (!n)
+		return 1;
+
+	bpf_rbtree_add(&groot, &n->node, less);
+	return 0;
+}
+
+SEC("?tc")
+long rbtree_api_nolock_remove(void *ctx)
+{
+	struct node_data *n;
+
+	n = bpf_obj_new(typeof(*n));
+	if (!n)
+		return 1;
+
+	bpf_spin_lock(&glock);
+	bpf_rbtree_add(&groot, &n->node, less);
+	bpf_spin_unlock(&glock);
+
+	bpf_rbtree_remove(&groot, &n->node);
+	return 0;
+}
+
+SEC("?tc")
+long rbtree_api_nolock_first(void *ctx)
+{
+	bpf_rbtree_first(&groot);
+	return 0;
+}
+
+SEC("?tc")
+long rbtree_api_remove_unadded_node(void *ctx)
+{
+	struct node_data *n, *m;
+	struct bpf_rb_node *res;
+
+	n = bpf_obj_new(typeof(*n));
+	if (!n)
+		return 1;
+
+	m = bpf_obj_new(typeof(*m));
+	if (!m) {
+		bpf_obj_drop(n);
+		return 1;
+	}
+
+	bpf_spin_lock(&glock);
+	bpf_rbtree_add(&groot, &n->node, less);
+
+	/* This remove should pass verifier */
+	res = bpf_rbtree_remove(&groot, &n->node);
+	if (res)
+		n = container_of(res, struct node_data, node);
+
+	/* This remove shouldn't, m isn't in an rbtree */
+	res = bpf_rbtree_remove(&groot, &m->node);
+	if (res)
+		m = container_of(res, struct node_data, node);
+	bpf_spin_unlock(&glock);
+
+	if (n)
+		bpf_obj_drop(n);
+	if (m)
+		bpf_obj_drop(m);
+	return 0;
+}
+
+SEC("?tc")
+long rbtree_api_remove_no_drop(void *ctx)
+{
+	struct bpf_rb_node *res;
+	struct node_data *n;
+
+	bpf_spin_lock(&glock);
+	res = bpf_rbtree_first(&groot);
+	if (!res)
+		goto unlock_err;
+
+	res = bpf_rbtree_remove(&groot, res);
+	if (!res)
+		goto unlock_err;
+
+	n = container_of(res, struct node_data, node);
+	bpf_spin_unlock(&glock);
+
+	/* bpf_obj_drop(n) is missing here */
+	return 0;
+
+unlock_err:
+	bpf_spin_unlock(&glock);
+	return 1;
+}
+
+SEC("?tc")
+long rbtree_api_add_to_multiple_trees(void *ctx)
+{
+	struct node_data *n;
+
+	n = bpf_obj_new(typeof(*n));
+	if (!n)
+		return 1;
+
+	bpf_spin_lock(&glock);
+	bpf_rbtree_add(&groot, &n->node, less);
+
+	/* This add should fail since n already in groot's tree */
+	bpf_rbtree_add(&groot2, &n->node, less);
+	bpf_spin_unlock(&glock);
+	return 0;
+}
+
+SEC("?tc")
+long rbtree_api_add_release_unlock_escape(void *ctx)
+{
+	struct node_data *n;
+
+	n = bpf_obj_new(typeof(*n));
+	if (!n)
+		return 1;
+
+	bpf_spin_lock(&glock);
+	bpf_rbtree_add(&groot, &n->node, less);
+	bpf_spin_unlock(&glock);
+
+	bpf_spin_lock(&glock);
+	/* After add() in previous critical section, n should be
+	 * release_on_unlock and released after previous spin_unlock,
+	 * so should not be possible to use it here
+	 */
+	bpf_rbtree_remove(&groot, &n->node);
+	bpf_spin_unlock(&glock);
+	return 0;
+}
+
+SEC("?tc")
+long rbtree_api_first_release_unlock_escape(void *ctx)
+{
+	struct bpf_rb_node *res;
+	struct node_data *n;
+
+	bpf_spin_lock(&glock);
+	res = bpf_rbtree_first(&groot);
+	if (res)
+		n = container_of(res, struct node_data, node);
+	bpf_spin_unlock(&glock);
+
+	bpf_spin_lock(&glock);
+	/* After first() in previous critical section, n should be
+	 * release_on_unlock and released after previous spin_unlock,
+	 * so should not be possible to use it here
+	 */
+	bpf_rbtree_remove(&groot, &n->node);
+	bpf_spin_unlock(&glock);
+	return 0;
+}
+
+static bool less__bad_fn_call_add(struct bpf_rb_node *a, const struct bpf_rb_node *b)
+{
+	struct node_data *node_a;
+	struct node_data *node_b;
+
+	node_a = container_of(a, struct node_data, node);
+	node_b = container_of(b, struct node_data, node);
+	bpf_rbtree_add(&groot, &node_a->node, less);
+
+	return node_a->key < node_b->key;
+}
+
+static bool less__bad_fn_call_remove(struct bpf_rb_node *a, const struct bpf_rb_node *b)
+{
+	struct node_data *node_a;
+	struct node_data *node_b;
+
+	node_a = container_of(a, struct node_data, node);
+	node_b = container_of(b, struct node_data, node);
+	bpf_rbtree_remove(&groot, &node_a->node);
+
+	return node_a->key < node_b->key;
+}
+
+static bool less__bad_fn_call_first(struct bpf_rb_node *a, const struct bpf_rb_node *b)
+{
+	struct node_data *node_a;
+	struct node_data *node_b;
+
+	node_a = container_of(a, struct node_data, node);
+	node_b = container_of(b, struct node_data, node);
+	bpf_rbtree_first(&groot);
+
+	return node_a->key < node_b->key;
+}
+
+static bool less__bad_fn_call_first_unlock_after(struct bpf_rb_node *a, const struct bpf_rb_node *b)
+{
+	struct node_data *node_a;
+	struct node_data *node_b;
+
+	node_a = container_of(a, struct node_data, node);
+	node_b = container_of(b, struct node_data, node);
+	bpf_rbtree_first(&groot);
+	bpf_spin_unlock(&glock);
+
+	return node_a->key < node_b->key;
+}
+
+#define RBTREE_API_ADD_BAD_CB(cb_suffix)				\
+SEC("?tc")								\
+long rbtree_api_add_bad_cb_##cb_suffix(void *ctx)			\
+{									\
+	struct node_data *n;						\
+									\
+	n = bpf_obj_new(typeof(*n));					\
+	if (!n)								\
+		return 1;						\
+									\
+	bpf_spin_lock(&glock);						\
+	bpf_rbtree_add(&groot, &n->node, less__##cb_suffix);		\
+	bpf_spin_unlock(&glock);					\
+	return 0;							\
+}
+
+RBTREE_API_ADD_BAD_CB(bad_fn_call_add);
+RBTREE_API_ADD_BAD_CB(bad_fn_call_remove);
+RBTREE_API_ADD_BAD_CB(bad_fn_call_first);
+RBTREE_API_ADD_BAD_CB(bad_fn_call_first_unlock_after);
+
+char _license[] SEC("license") = "GPL";
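
[A closing usage note, assuming the standard BPF selftests workflow rather
than anything stated in the patch: the subtests above are grouped under the
rbtree_success, rbtree_btf_fail and rbtree_fail entry points, so they can be
run from tools/testing/selftests/bpf with, e.g.:

	./test_progs -t rbtree
]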