From patchwork Tue Oct 8 09:14:46 2024
X-Patchwork-Submitter: Hou Tao
X-Patchwork-Id: 13826004
X-Patchwork-Delegate: bpf@iogearbox.net
From: Hou Tao
To: bpf@vger.kernel.org
Cc: Martin KaFai Lau, Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman,
 Song Liu, Hao Luo, Yonghong Song, Daniel Borkmann, KP Singh,
 Stanislav Fomichev, Jiri Olsa, John Fastabend, houtao1@huawei.com,
 xukuohai@huawei.com
Subject: [PATCH bpf-next 01/16] bpf: Introduce map flag BPF_F_DYNPTR_IN_KEY
Date: Tue, 8 Oct 2024 17:14:46 +0800
Message-ID: <20241008091501.8302-2-houtao@huaweicloud.com>
In-Reply-To: <20241008091501.8302-1-houtao@huaweicloud.com>
References: <20241008091501.8302-1-houtao@huaweicloud.com>
From: Hou Tao

Introduce the map flag BPF_F_DYNPTR_IN_KEY to support bpf_dynptr in map
keys, and add the corresponding helper bpf_map_has_dynptr_key() to check
whether a map supports dynptr keys.

A map with dynptr-key support must use map_extra to specify the maximum
length of these dynptrs. During map creation, the map implementation
will check that map_extra does not exceed the limit imposed by its
memory allocation, and it may also use map_extra to optimize the memory
allocation for the dynptrs.
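As a rough illustration of how the flag and map_extra are meant to be
used together from userspace, here is a sketch built on libbpf's
bpf_map_create(); the map name, the sizes, and the map_extra value of
256 are made up for the example, and a hash map only accepts the flag
once the later patches in this series are applied:

	#include <bpf/bpf.h>
	#include <linux/bpf.h>

	/* Create a hash map whose key is a single bpf_dynptr; the BTF id
	 * of the key type must be supplied (see patch 3).
	 */
	int create_dynptr_key_map(int btf_fd, __u32 key_type_id, __u32 value_type_id)
	{
		LIBBPF_OPTS(bpf_map_create_opts, opts,
			.map_flags = BPF_F_DYNPTR_IN_KEY,
			.map_extra = 256, /* max length of each dynptr in the key */
			.btf_fd = btf_fd,
			.btf_key_type_id = key_type_id,
			.btf_value_type_id = value_type_id,
		);

		return bpf_map_create(BPF_MAP_TYPE_HASH, "dynptr_key_map",
				      sizeof(struct bpf_dynptr), sizeof(__u64),
				      1024, &opts);
	}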
Signed-off-by: Hou Tao
---
 include/linux/bpf.h            | 5 +++++
 include/uapi/linux/bpf.h       | 3 +++
 kernel/bpf/syscall.c           | 1 +
 tools/include/uapi/linux/bpf.h | 3 +++
 4 files changed, 12 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 19d8ca8ac960..f61bf427e14e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -308,6 +308,11 @@ struct bpf_map {
 	s64 __percpu *elem_count;
 };
 
+static inline bool bpf_map_has_dynptr_key(const struct bpf_map *map)
+{
+	return map->map_flags & BPF_F_DYNPTR_IN_KEY;
+}
+
 static inline const char *btf_field_type_name(enum btf_field_type type)
 {
 	switch (type) {
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c6cd7c7aeeee..07f7df308a01 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1409,6 +1409,9 @@ enum {
 
 /* Do not translate kernel bpf_arena pointers to user pointers */
 	BPF_F_NO_USER_CONV = (1U << 18),
+
+/* Create a map with bpf_dynptr in key */
+	BPF_F_DYNPTR_IN_KEY = (1U << 19),
 };
 
 /* Flags for BPF_PROG_QUERY. */
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index a8f1808a1ca5..bffd803c5977 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1232,6 +1232,7 @@ static int map_create(union bpf_attr *attr)
 
 	if (attr->map_type != BPF_MAP_TYPE_BLOOM_FILTER &&
 	    attr->map_type != BPF_MAP_TYPE_ARENA &&
+	    !(attr->map_flags & BPF_F_DYNPTR_IN_KEY) &&
 	    attr->map_extra != 0)
 		return -EINVAL;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 1fb3cb2636e6..14f223282bfa 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1409,6 +1409,9 @@ enum {
 
 /* Do not translate kernel bpf_arena pointers to user pointers */
 	BPF_F_NO_USER_CONV = (1U << 18),
+
+/* Create a map with bpf_dynptr in key */
+	BPF_F_DYNPTR_IN_KEY = (1U << 19),
 };
 
 /* Flags for BPF_PROG_QUERY. */

From patchwork Tue Oct 8 09:14:47 2024
X-Patchwork-Submitter: Hou Tao
X-Patchwork-Id: 13826006
X-Patchwork-Delegate: bpf@iogearbox.net
From: Hou Tao
To: bpf@vger.kernel.org
Cc: Martin KaFai Lau, Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman,
 Song Liu, Hao Luo, Yonghong Song, Daniel Borkmann, KP Singh,
 Stanislav Fomichev, Jiri Olsa, John Fastabend, houtao1@huawei.com,
 xukuohai@huawei.com
Subject: [PATCH bpf-next 02/16] bpf: Add two helpers to facilitate the btf parsing of bpf_dynptr
Date: Tue, 8 Oct 2024 17:14:47 +0800
Message-ID: <20241008091501.8302-3-houtao@huaweicloud.com>
In-Reply-To: <20241008091501.8302-1-houtao@huaweicloud.com>
References: <20241008091501.8302-1-houtao@huaweicloud.com>
From: Hou Tao

Add the helper btf_type_is_dynptr() to check whether a btf_type is a
bpf_dynptr, and the helper btf_new_bpf_dynptr_record() to create a btf
record which includes only a bpf_dynptr. These two helpers will be used
when a bpf_dynptr is used directly as the map key.
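For context, patch 3 of this series wires the two helpers into the
parsing of the map key in roughly the following way (quoted ahead from
the map_check_btf() change there):

	/* The direct case (the whole key is a bpf_dynptr) gets a
	 * synthesized single-field record; any other key type is
	 * scanned for embedded bpf_dynptr fields.
	 */
	if (btf_type_is_dynptr(btf, key_type))
		map->key_record = btf_new_bpf_dynptr_record();
	else
		map->key_record = btf_parse_fields(btf, key_type, BPF_DYNPTR,
						   map->key_size);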
Signed-off-by: Hou Tao
---
 include/linux/btf.h |  2 ++
 kernel/bpf/btf.c    | 42 +++++++++++++++++++++++++++++++++++++-----
 2 files changed, 39 insertions(+), 5 deletions(-)

diff --git a/include/linux/btf.h b/include/linux/btf.h
index b8a583194c4a..ed89b83e8c19 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -222,8 +222,10 @@ bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s,
 			   u32 expected_offset, u32 expected_size);
 struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type *t,
 				    u32 field_mask, u32 value_size);
+struct btf_record *btf_new_bpf_dynptr_record(void);
 int btf_check_and_fixup_fields(const struct btf *btf, struct btf_record *rec);
 bool btf_type_is_void(const struct btf_type *t);
+bool btf_type_is_dynptr(const struct btf *btf, const struct btf_type *t);
 s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind);
 s32 bpf_find_btf_id(const char *name, u32 kind, struct btf **btf_p);
 const struct btf_type *btf_type_skip_modifiers(const struct btf *btf,
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 5254fc9c1b24..321356710924 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -3923,6 +3923,16 @@ static int btf_field_cmp(const void *_a, const void *_b, const void *priv)
 	return 0;
 }
 
+static void btf_init_btf_record(struct btf_record *record)
+{
+	record->cnt = 0;
+	record->field_mask = 0;
+	record->spin_lock_off = -EINVAL;
+	record->timer_off = -EINVAL;
+	record->wq_off = -EINVAL;
+	record->refcount_off = -EINVAL;
+}
+
 struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type *t,
 				    u32 field_mask, u32 value_size)
 {
@@ -3941,14 +3951,11 @@ struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type
 	/* This needs to be kzalloc to zero out padding and unused fields, see
 	 * comment in btf_record_equal.
 	 */
-	rec = kzalloc(offsetof(struct btf_record, fields[cnt]), GFP_KERNEL | __GFP_NOWARN);
+	rec = kzalloc(struct_size(rec, fields, cnt), GFP_KERNEL | __GFP_NOWARN);
 	if (!rec)
 		return ERR_PTR(-ENOMEM);
 
-	rec->spin_lock_off = -EINVAL;
-	rec->timer_off = -EINVAL;
-	rec->wq_off = -EINVAL;
-	rec->refcount_off = -EINVAL;
+	btf_init_btf_record(rec);
 	for (i = 0; i < cnt; i++) {
 		field_type_size = btf_field_type_size(info_arr[i].type);
 		if (info_arr[i].off + field_type_size > value_size) {
@@ -4038,6 +4045,25 @@ struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type
 	return ERR_PTR(ret);
 }
 
+struct btf_record *btf_new_bpf_dynptr_record(void)
+{
+	struct btf_record *record;
+
+	record = kzalloc(struct_size(record, fields, 1), GFP_KERNEL | __GFP_NOWARN);
+	if (!record)
+		return ERR_PTR(-ENOMEM);
+
+	btf_init_btf_record(record);
+
+	record->cnt = 1;
+	record->field_mask = BPF_DYNPTR;
+	record->fields[0].offset = 0;
+	record->fields[0].size = sizeof(struct bpf_dynptr);
+	record->fields[0].type = BPF_DYNPTR;
+
+	return record;
+}
+
 int btf_check_and_fixup_fields(const struct btf *btf, struct btf_record *rec)
 {
 	int i;
@@ -7276,6 +7302,12 @@ static bool btf_is_dynptr_ptr(const struct btf *btf, const struct btf_type *t)
 	return false;
 }
 
+bool btf_type_is_dynptr(const struct btf *btf, const struct btf_type *t)
+{
+	return __btf_type_is_struct(t) && t->size == sizeof(struct bpf_dynptr) &&
+	       !strcmp(__btf_name_by_offset(btf, t->name_off), "bpf_dynptr");
+}
+
 struct bpf_cand_cache {
 	const char *name;
 	u32 name_len;
From patchwork Tue Oct 8 09:14:48 2024
X-Patchwork-Submitter: Hou Tao
X-Patchwork-Id: 13826008
X-Patchwork-Delegate: bpf@iogearbox.net
From: Hou Tao
To: bpf@vger.kernel.org
Cc: Martin KaFai Lau, Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman,
 Song Liu, Hao Luo, Yonghong Song, Daniel Borkmann, KP Singh,
 Stanislav Fomichev, Jiri Olsa, John Fastabend, houtao1@huawei.com,
 xukuohai@huawei.com
Subject: [PATCH bpf-next 03/16] bpf: Parse bpf_dynptr in map key
Date: Tue, 8 Oct 2024 17:14:48 +0800
Message-ID: <20241008091501.8302-4-houtao@huaweicloud.com>
In-Reply-To: <20241008091501.8302-1-houtao@huaweicloud.com>
References: <20241008091501.8302-1-houtao@huaweicloud.com>

From: Hou Tao

To support variable-length keys or strings in a map key, use bpf_dynptr
to represent these variable-length objects and save the bpf_dynptr
fields in the map key. As shown in the example below, a map key with an
integer and a string is defined:

	struct pid_name {
		int pid;
		struct bpf_dynptr name;
	};

The bpf_dynptr in the map key could also be contained indirectly in a
struct as shown below:

	struct pid_name_time {
		struct pid_name process;
		unsigned long long time;
	};

If the whole map key is a bpf_dynptr, the map key could be defined as a
struct, or bpf_dynptr could be used directly as the map key:

	struct map_key {
		struct bpf_dynptr name;
	};

The bpf program could use bpf_dynptr_init() to initialize the dynptr
part of the map key, and the userspace application will use
bpf_dynptr_user_init() or a similar API to initialize its dynptr (see
the sketch after this message).

Just like kptrs in a map value, the bpf_dynptr in the map key could
also be defined in a nested struct which is contained in the map key
struct.

The patch updates map_create() accordingly to parse these bpf_dynptr
fields in the map key, just like it does for the other special fields
in the map value. To enable bpf_dynptr support in a map key, the
map_type should be BPF_MAP_TYPE_HASH and BPF_F_DYNPTR_IN_KEY should
also be enabled in map_flags. For now, the maximum number of
bpf_dynptrs in a map key is arbitrarily chosen as 4 and it may be
changed later.
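A minimal sketch of the bpf-program side, using the pid_name key from
above. bpf_dynptr_from_mem() stands in here for the bpf_dynptr_init()
mentioned above, the lookup only works once the rest of the series is
applied, and the tracepoint, buffer, and sizes are placeholders:

	#include <linux/bpf.h>
	#include <bpf/bpf_helpers.h>

	struct pid_name {
		int pid;
		struct bpf_dynptr name;
	};

	struct {
		__uint(type, BPF_MAP_TYPE_HASH);
		__uint(map_flags, BPF_F_DYNPTR_IN_KEY);
		__uint(map_extra, 64); /* max length of the dynptr in the key */
		__uint(max_entries, 1024);
		__type(key, struct pid_name);
		__type(value, __u64);
	} process_map SEC(".maps");

	char name_buf[16]; /* stand-in source for the variable-length part */

	SEC("tracepoint/syscalls/sys_enter_execve")
	int count_proc(void *ctx)
	{
		struct pid_name key;
		__u64 *value;

		/* zero the padding so the non-dynptr part is initialized */
		__builtin_memset(&key, 0, sizeof(key));
		key.pid = bpf_get_current_pid_tgid() >> 32;
		bpf_dynptr_from_mem(name_buf, sizeof(name_buf), 0, &key.name);

		value = bpf_map_lookup_elem(&process_map, &key);
		if (value)
			__sync_fetch_and_add(value, 1);
		return 0;
	}

	char _license[] SEC("license") = "GPL";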
Signed-off-by: Hou Tao
---
 include/linux/bpf.h     | 13 ++++++++++--
 kernel/bpf/btf.c        |  4 ++++
 kernel/bpf/map_in_map.c | 19 ++++++++++++-----
 kernel/bpf/syscall.c    | 47 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 76 insertions(+), 7 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index f61bf427e14e..3e25e94b7457 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -184,8 +184,8 @@ struct bpf_map_ops {
 };
 
 enum {
-	/* Support at most 11 fields in a BTF type */
-	BTF_FIELDS_MAX = 11,
+	/* Support at most 12 fields in a BTF type */
+	BTF_FIELDS_MAX = 12,
};
 
 enum btf_field_type {
@@ -203,6 +203,7 @@ enum btf_field_type {
 	BPF_GRAPH_ROOT = BPF_RB_ROOT | BPF_LIST_HEAD,
 	BPF_REFCOUNT = (1 << 9),
 	BPF_WORKQUEUE = (1 << 10),
+	BPF_DYNPTR = (1 << 11),
 };
 
 typedef void (*btf_dtor_kfunc_t)(void *);
@@ -270,6 +271,7 @@ struct bpf_map {
 	u32 map_flags;
 	u32 id;
 	struct btf_record *record;
+	struct btf_record *key_record;
 	int numa_node;
 	u32 btf_key_type_id;
 	u32 btf_value_type_id;
@@ -337,6 +339,8 @@ static inline const char *btf_field_type_name(enum btf_field_type type)
 		return "bpf_rb_node";
 	case BPF_REFCOUNT:
 		return "bpf_refcount";
+	case BPF_DYNPTR:
+		return "bpf_dynptr";
 	default:
 		WARN_ON_ONCE(1);
 		return "unknown";
@@ -366,6 +370,8 @@ static inline u32 btf_field_type_size(enum btf_field_type type)
 		return sizeof(struct bpf_rb_node);
 	case BPF_REFCOUNT:
 		return sizeof(struct bpf_refcount);
+	case BPF_DYNPTR:
+		return sizeof(struct bpf_dynptr);
 	default:
 		WARN_ON_ONCE(1);
 		return 0;
@@ -395,6 +401,8 @@ static inline u32 btf_field_type_align(enum btf_field_type type)
 		return __alignof__(struct bpf_rb_node);
 	case BPF_REFCOUNT:
 		return __alignof__(struct bpf_refcount);
+	case BPF_DYNPTR:
+		return __alignof__(struct bpf_dynptr);
 	default:
 		WARN_ON_ONCE(1);
 		return 0;
@@ -424,6 +432,7 @@ static inline void bpf_obj_init_field(const struct btf_field *field, void *addr)
 	case BPF_KPTR_UNREF:
 	case BPF_KPTR_REF:
 	case BPF_KPTR_PERCPU:
+	case BPF_DYNPTR:
 		break;
 	default:
 		WARN_ON_ONCE(1);
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 321356710924..2604cef53915 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -3500,6 +3500,7 @@ static int btf_get_field_type(const struct btf *btf, const struct btf_type *var_type,
 	field_mask_test_name(BPF_RB_ROOT, "bpf_rb_root");
 	field_mask_test_name(BPF_RB_NODE, "bpf_rb_node");
 	field_mask_test_name(BPF_REFCOUNT, "bpf_refcount");
+	field_mask_test_name(BPF_DYNPTR, "bpf_dynptr");
 
 	/* Only return BPF_KPTR when all other types with matchable names fail */
 	if (field_mask & BPF_KPTR && !__btf_type_is_struct(var_type)) {
@@ -3537,6 +3538,7 @@ static int btf_repeat_fields(struct btf_field_info *info,
 		case BPF_KPTR_PERCPU:
 		case BPF_LIST_HEAD:
 		case BPF_RB_ROOT:
+		case BPF_DYNPTR:
 			break;
 		default:
 			return -EINVAL;
@@ -3659,6 +3661,7 @@ static int btf_find_field_one(const struct btf *btf,
 	case BPF_LIST_NODE:
 	case BPF_RB_NODE:
 	case BPF_REFCOUNT:
+	case BPF_DYNPTR:
 		ret = btf_find_struct(btf, var_type, off, sz, field_type,
				      info_cnt ? &info[0] : &tmp);
 		if (ret < 0)
@@ -4014,6 +4017,7 @@ struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type
 		break;
 	case BPF_LIST_NODE:
 	case BPF_RB_NODE:
+	case BPF_DYNPTR:
 		break;
 	default:
 		ret = -EFAULT;
diff --git a/kernel/bpf/map_in_map.c b/kernel/bpf/map_in_map.c
index 645bd30bc9a9..a072835dc645 100644
--- a/kernel/bpf/map_in_map.c
+++ b/kernel/bpf/map_in_map.c
@@ -9,7 +9,7 @@
 
 struct bpf_map *bpf_map_meta_alloc(int inner_map_ufd)
 {
-	struct bpf_map *inner_map, *inner_map_meta;
+	struct bpf_map *inner_map, *inner_map_meta, *ret;
 	u32 inner_map_meta_size;
 	CLASS(fd, f)(inner_map_ufd);
 
@@ -45,9 +45,13 @@ struct bpf_map *bpf_map_meta_alloc(int inner_map_ufd)
 		 * invalid/empty/valid, but ERR_PTR in case of errors. During
 		 * equality NULL or IS_ERR is equivalent.
 		 */
-		struct bpf_map *ret = ERR_CAST(inner_map_meta->record);
-		kfree(inner_map_meta);
-		return ret;
+		ret = ERR_CAST(inner_map_meta->record);
+		goto free;
+	}
+	inner_map_meta->key_record = btf_record_dup(inner_map->key_record);
+	if (IS_ERR(inner_map_meta->key_record)) {
+		ret = ERR_CAST(inner_map_meta->key_record);
+		goto free;
 	}
 	/* Note: We must use the same BTF, as we also used btf_record_dup above
 	 * which relies on BTF being same for both maps, as some members like
@@ -71,6 +75,10 @@ struct bpf_map *bpf_map_meta_alloc(int inner_map_ufd)
 		inner_map_meta->bypass_spec_v1 = inner_map->bypass_spec_v1;
 	}
 	return inner_map_meta;
+
+free:
+	bpf_map_meta_free(inner_map_meta);
+	return ret;
 }
 
 void bpf_map_meta_free(struct bpf_map *map_meta)
@@ -88,7 +96,8 @@ bool bpf_map_meta_equal(const struct bpf_map *meta0,
 	       meta0->key_size == meta1->key_size &&
 	       meta0->value_size == meta1->value_size &&
 	       meta0->map_flags == meta1->map_flags &&
-	       btf_record_equal(meta0->record, meta1->record);
+	       btf_record_equal(meta0->record, meta1->record) &&
+	       btf_record_equal(meta0->key_record, meta1->key_record);
 }
 
 void *bpf_map_fd_get_ptr(struct bpf_map *map,
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index bffd803c5977..aa0a500f8fad 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -561,6 +561,7 @@ void btf_record_free(struct btf_record *rec)
 		case BPF_TIMER:
 		case BPF_REFCOUNT:
 		case BPF_WORKQUEUE:
+		case BPF_DYNPTR:
 			/* Nothing to release */
 			break;
 		default:
@@ -574,7 +575,9 @@ void btf_record_free(struct btf_record *rec)
 void bpf_map_free_record(struct bpf_map *map)
 {
 	btf_record_free(map->record);
+	btf_record_free(map->key_record);
 	map->record = NULL;
+	map->key_record = NULL;
 }
 
 struct btf_record *btf_record_dup(const struct btf_record *rec)
@@ -612,6 +615,7 @@ struct btf_record *btf_record_dup(const struct btf_record *rec)
 		case BPF_TIMER:
 		case BPF_REFCOUNT:
 		case BPF_WORKQUEUE:
+		case BPF_DYNPTR:
 			/* Nothing to acquire */
 			break;
 		default:
@@ -728,6 +732,8 @@ void bpf_obj_free_fields(const struct btf_record *rec, void *obj)
 		case BPF_RB_NODE:
 		case BPF_REFCOUNT:
 			break;
+		case BPF_DYNPTR:
+			break;
 		default:
 			WARN_ON_ONCE(1);
 			continue;
@@ -737,6 +743,7 @@ void bpf_obj_free_fields(const struct btf_record *rec, void *obj)
 
 static void bpf_map_free(struct bpf_map *map)
 {
+	struct btf_record *key_rec = map->key_record;
 	struct btf_record *rec = map->record;
 	struct btf *btf = map->btf;
 
@@ -751,6 +758,7 @@ static void bpf_map_free(struct bpf_map *map)
 	 * eventually calls bpf_map_free_meta, since inner_map_meta is only a
 	 * template bpf_map struct used during verification.
	 */
+	btf_record_free(key_rec);
 	btf_record_free(rec);
 	/* Delay freeing of btf for maps, as map_free callback may need
 	 * struct_meta info which will be freed with btf_put().
 	 */
@@ -1081,6 +1089,8 @@ int map_check_no_btf(const struct bpf_map *map,
 	return -ENOTSUPP;
 }
 
+#define MAX_DYNPTR_CNT_IN_MAP_KEY 4
+
 static int map_check_btf(struct bpf_map *map, struct bpf_token *token,
 			 const struct btf *btf, u32 btf_key_id, u32 btf_value_id)
 {
@@ -1103,6 +1113,40 @@ static int map_check_btf(struct bpf_map *map, struct bpf_token *token,
 	if (!value_type || value_size != map->value_size)
 		return -EINVAL;
 
+	if (btf_type_is_dynptr(btf, key_type))
+		map->key_record = btf_new_bpf_dynptr_record();
+	else
+		map->key_record = btf_parse_fields(btf, key_type, BPF_DYNPTR, map->key_size);
+	if (!IS_ERR_OR_NULL(map->key_record)) {
+		if (map->key_record->cnt > MAX_DYNPTR_CNT_IN_MAP_KEY) {
+			ret = -E2BIG;
+			goto free_map_tab;
+		}
+		if (!bpf_map_has_dynptr_key(map)) {
+			ret = -EINVAL;
+			goto free_map_tab;
+		}
+		if (map->map_type != BPF_MAP_TYPE_HASH) {
+			ret = -EOPNOTSUPP;
+			goto free_map_tab;
+		}
+		if (!bpf_token_capable(token, CAP_BPF)) {
+			ret = -EPERM;
+			goto free_map_tab;
+		}
+		/* Disallow key with dynptr for special map */
+		if (map->map_flags & (BPF_F_RDONLY_PROG | BPF_F_WRONLY_PROG)) {
+			ret = -EACCES;
+			goto free_map_tab;
+		}
+	} else if (bpf_map_has_dynptr_key(map)) {
+		ret = -EINVAL;
+		goto free_map_tab;
+	} else {
+		/* Ensure key_record is either a valid btf_record or NULL */
+		map->key_record = NULL;
+	}
+
 	map->record = btf_parse_fields(btf, value_type,
 				       BPF_SPIN_LOCK | BPF_TIMER | BPF_KPTR | BPF_LIST_HEAD |
 				       BPF_RB_ROOT | BPF_REFCOUNT | BPF_WORKQUEUE,
@@ -1230,6 +1274,9 @@ static int map_create(union bpf_attr *attr)
 		return -EINVAL;
 	}
 
+	if ((attr->map_flags & BPF_F_DYNPTR_IN_KEY) && !attr->btf_key_type_id)
+		return -EINVAL;
+
 	if (attr->map_type != BPF_MAP_TYPE_BLOOM_FILTER &&
 	    attr->map_type != BPF_MAP_TYPE_ARENA &&
 	    !(attr->map_flags & BPF_F_DYNPTR_IN_KEY) &&
 	    attr->map_extra != 0)
 		return -EINVAL;
From patchwork Tue Oct 8 09:14:49 2024
X-Patchwork-Submitter: Hou Tao
X-Patchwork-Id: 13826007
X-Patchwork-Delegate: bpf@iogearbox.net
From: Hou Tao
To: bpf@vger.kernel.org
Cc: Martin KaFai Lau, Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman,
 Song Liu, Hao Luo, Yonghong Song, Daniel Borkmann, KP Singh,
 Stanislav Fomichev, Jiri Olsa, John Fastabend, houtao1@huawei.com,
 xukuohai@huawei.com
Subject: [PATCH bpf-next 04/16] bpf: Pass flags instead of bool to check_helper_mem_access()
Date: Tue, 8 Oct 2024 17:14:49 +0800
Message-ID: <20241008091501.8302-5-houtao@huaweicloud.com>
In-Reply-To: <20241008091501.8302-1-houtao@huaweicloud.com>
References: <20241008091501.8302-1-houtao@huaweicloud.com>

From: Hou Tao

Extend the "bool zero_size_allowed" argument into "unsigned int flags",
so that more flags can be passed to check_helper_mem_access().
Signed-off-by: Hou Tao
---
 kernel/bpf/verifier.c | 42 ++++++++++++++++++++++++------------------
 1 file changed, 24 insertions(+), 18 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9a7ed527e47e..2090d2472d7c 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5041,9 +5041,11 @@ enum bpf_access_src {
 	ACCESS_HELPER = 2,  /* the access is performed by a helper */
 };
 
+#define ACCESS_F_ZERO_SIZE_ALLOWED BIT(0)
+
 static int check_stack_range_initialized(struct bpf_verifier_env *env,
 					 int regno, int off, int access_size,
-					 bool zero_size_allowed,
+					 unsigned int access_flags,
 					 enum bpf_access_src type,
 					 struct bpf_call_arg_meta *meta);
 
@@ -5077,7 +5079,7 @@ static int check_stack_read_var_off(struct bpf_verifier_env *env,
 	/* Note that we pass a NULL meta, so raw access will not be permitted.
 	 */
 	err = check_stack_range_initialized(env, ptr_regno, off, size,
-					    false, ACCESS_DIRECT, NULL);
+					    0, ACCESS_DIRECT, NULL);
 	if (err)
 		return err;
 
@@ -7277,7 +7279,7 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_insn *insn)
  */
 static int check_stack_range_initialized(
 		struct bpf_verifier_env *env, int regno, int off,
-		int access_size, bool zero_size_allowed,
+		int access_size, unsigned int access_flags,
 		enum bpf_access_src type, struct bpf_call_arg_meta *meta)
 {
 	struct bpf_reg_state *reg = reg_state(env, regno);
@@ -7290,7 +7292,7 @@ static int check_stack_range_initialized(
 	 */
 	bool clobber = false;
 
-	if (access_size == 0 && !zero_size_allowed) {
+	if (access_size == 0 && !(access_flags & ACCESS_F_ZERO_SIZE_ALLOWED)) {
 		verbose(env, "invalid zero-sized read\n");
 		return -EACCES;
 	}
@@ -7432,9 +7434,10 @@ static int check_stack_range_initialized(
 }
 
 static int check_helper_mem_access(struct bpf_verifier_env *env, int regno,
-				   int access_size, bool zero_size_allowed,
+				   int access_size, unsigned int access_flags,
 				   struct bpf_call_arg_meta *meta)
 {
+	bool zero_size_allowed = access_flags & ACCESS_F_ZERO_SIZE_ALLOWED;
 	struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
 	u32 *max_access;
 
@@ -7488,7 +7491,7 @@ static int check_helper_mem_access(struct bpf_verifier_env *env, int regno,
 
 		return check_stack_range_initialized(
 				env,
 				regno, reg->off, access_size,
-				zero_size_allowed, ACCESS_HELPER, meta);
+				access_flags, ACCESS_HELPER, meta);
 	case PTR_TO_BTF_ID:
 		return check_ptr_to_btf_access(env, regs, regno, reg->off,
 					       access_size, BPF_READ, -1);
@@ -7532,9 +7535,10 @@ static int check_helper_mem_access(struct bpf_verifier_env *env, int regno,
  */
 static int check_mem_size_reg(struct bpf_verifier_env *env,
 			      struct bpf_reg_state *reg, u32 regno,
-			      bool zero_size_allowed,
+			      unsigned int access_flags,
 			      struct bpf_call_arg_meta *meta)
 {
+	bool zero_size_allowed = access_flags & ACCESS_F_ZERO_SIZE_ALLOWED;
 	int err;
 
 	/* This is used to refine r0 return value bounds for helpers
@@ -7577,7 +7581,7 @@ static int check_mem_size_reg(struct bpf_verifier_env *env,
 	}
 	err = check_helper_mem_access(env, regno - 1, reg->umax_value,
-				      zero_size_allowed, meta);
+				      access_flags, meta);
 	if (!err)
 		err = mark_chain_precision(env, regno);
 	return err;
@@ -7604,11 +7608,11 @@ static int check_mem_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
 		mark_ptr_not_null_reg(reg);
 	}
 
-	err = check_helper_mem_access(env, regno, mem_size, true, &meta);
+	err = check_helper_mem_access(env, regno, mem_size, ACCESS_F_ZERO_SIZE_ALLOWED, &meta);
 	/* Check access for BPF_WRITE */
 	meta.raw_mode = true;
-	err = err ?: check_helper_mem_access(env, regno, mem_size, true, &meta);
-
+	err = err ?: check_helper_mem_access(env, regno, mem_size, ACCESS_F_ZERO_SIZE_ALLOWED,
+					     &meta);
 	if (may_be_null)
 		*reg = saved_reg;
@@ -7633,10 +7637,10 @@ static int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
 		mark_ptr_not_null_reg(mem_reg);
 	}
 
-	err = check_mem_size_reg(env, reg, regno, true, &meta);
+	err = check_mem_size_reg(env, reg, regno, ACCESS_F_ZERO_SIZE_ALLOWED, &meta);
 	/* Check access for BPF_WRITE */
 	meta.raw_mode = true;
-	err = err ?: check_mem_size_reg(env, reg, regno, true, &meta);
+	err = err ?: check_mem_size_reg(env, reg, regno, ACCESS_F_ZERO_SIZE_ALLOWED, &meta);
 
 	if (may_be_null)
 		*mem_reg = saved_reg;
@@ -8943,7 +8947,7 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 			return -EACCES;
 		}
 		err = check_helper_mem_access(env, regno,
-					      meta->map_ptr->key_size, false,
+					      meta->map_ptr->key_size, 0,
 					      NULL);
 		break;
 	case ARG_PTR_TO_MAP_VALUE:
@@ -8960,7 +8964,7 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 		}
 		meta->raw_mode = arg_type & MEM_UNINIT;
 		err = check_helper_mem_access(env, regno,
-					      meta->map_ptr->value_size, false,
+					      meta->map_ptr->value_size, 0,
 					      meta);
 		break;
 	case ARG_PTR_TO_PERCPU_BTF_ID:
@@ -9003,7 +9007,9 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 		 */
 		meta->raw_mode = arg_type & MEM_UNINIT;
 		if (arg_type & MEM_FIXED_SIZE) {
-			err = check_helper_mem_access(env, regno, fn->arg_size[arg], false, meta);
+			err = check_helper_mem_access(env, regno,
+						      fn->arg_size[arg], 0,
+						      meta);
 			if (err)
 				return err;
 			if (arg_type & MEM_ALIGNED)
@@ -9011,10 +9017,10 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 		}
 		break;
 	case ARG_CONST_SIZE:
-		err = check_mem_size_reg(env, reg, regno, false, meta);
+		err = check_mem_size_reg(env, reg, regno, 0, meta);
 		break;
 	case ARG_CONST_SIZE_OR_ZERO:
-		err = check_mem_size_reg(env, reg, regno, true, meta);
+		err = check_mem_size_reg(env, reg, regno, ACCESS_F_ZERO_SIZE_ALLOWED, meta);
 		break;
 	case ARG_PTR_TO_DYNPTR:
 		err = process_dynptr_func(env, regno, insn_idx, arg_type, 0);
From patchwork Tue Oct 8 09:14:50 2024
X-Patchwork-Submitter: Hou Tao
X-Patchwork-Id: 13826012
X-Patchwork-Delegate: bpf@iogearbox.net
From: Hou Tao
To: bpf@vger.kernel.org
Cc: Martin KaFai Lau, Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman,
 Song Liu, Hao Luo, Yonghong Song, Daniel Borkmann, KP Singh,
 Stanislav Fomichev, Jiri Olsa, John Fastabend, houtao1@huawei.com,
 xukuohai@huawei.com
Subject: [PATCH bpf-next 05/16] bpf: Support map key with dynptr in verifier
Date: Tue, 8 Oct 2024 17:14:50 +0800
Message-ID: <20241008091501.8302-6-houtao@huaweicloud.com>
In-Reply-To: <20241008091501.8302-1-houtao@huaweicloud.com>
References: <20241008091501.8302-1-houtao@huaweicloud.com>

From: Hou Tao

The patch does the following three things to enable dynptr keys for bpf
maps:

1) Only allow a PTR_TO_STACK typed register for the dynptr key.
The main reason is that a bpf_dynptr can only be defined on the stack,
so for a dynptr key only PTR_TO_STACK typed registers are allowed. A
bpf_dynptr could also be represented by a CONST_PTR_TO_DYNPTR typed
register (e.g., in a callback func or subprog), but that is not
supported yet.

2) Only allow a fixed offset for the PTR_TO_STACK register.
A variable offset for the PTR_TO_STACK typed register is disallowed,
because it would be impossible to check whether the stack access is
aligned with BPF_REG_SIZE and matches the location of the dynptr and
non-dynptr parts in the map key.

3) Check that the layout of the stack content matches the btf_record
of the map key. First check that the start offset of the stack access
is aligned with BPF_REG_SIZE, then check that the offsets and sizes of
the dynptr and non-dynptr parts in the stack content are consistent
with the btf_record of the map key.
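To make the three rules concrete, here is a sketch of a lookup the
verifier would accept (it reuses struct pid_name, name_buf and
process_map from the sketch in patch 3, and bpf_dynptr_from_mem()
again stands in for the dynptr initializer):

	SEC("tc")
	int dynptr_key_lookup(struct __sk_buff *skb)
	{
		/* Rule 1: the key must sit on the stack (PTR_TO_STACK) */
		struct pid_name key;
		__u64 *value;

		__builtin_memset(&key, 0, sizeof(key));
		key.pid = 100;
		/* marks the 16 bytes of key.name as STACK_DYNPTR slots */
		bpf_dynptr_from_mem(name_buf, sizeof(name_buf), 0, &key.name);

		/* Rule 2: &key is a fixed, 8-byte aligned stack offset; a
		 * variable offset would be rejected.
		 * Rule 3: the verifier walks [key, key + key_size) and checks
		 * that the STACK_DYNPTR slots line up exactly with the
		 * BPF_DYNPTR fields in the map's key_record.
		 */
		value = bpf_map_lookup_elem(&process_map, &key);
		return value ? 0 : 1;
	}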
Signed-off-by: Hou Tao
---
 kernel/bpf/verifier.c | 143 +++++++++++++++++++++++++++++++++++++++---
 1 file changed, 134 insertions(+), 9 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 2090d2472d7c..345b45edf2a7 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5042,6 +5042,7 @@ enum bpf_access_src {
 };
 
 #define ACCESS_F_ZERO_SIZE_ALLOWED BIT(0)
+#define ACCESS_F_DYNPTR_READ_ALLOWED BIT(1)
 
 static int check_stack_range_initialized(struct bpf_verifier_env *env,
 					 int regno, int off, int access_size,
@@ -7267,6 +7268,86 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_insn *insn)
 	return 0;
 }
 
+struct dynptr_key_state {
+	const struct btf_record *rec;
+	const struct btf_field *cur_dynptr;
+	bool valid_dynptr_id;
+	int cur_dynptr_id;
+};
+
+static int init_dynptr_key_state(struct bpf_verifier_env *env, const struct btf_record *rec,
+				 struct dynptr_key_state *state)
+{
+	unsigned int i;
+
+	/* Find the first dynptr in the dynptr-key */
+	for (i = 0; i < rec->cnt; i++) {
+		if (rec->fields[i].type == BPF_DYNPTR)
+			break;
+	}
+	if (i >= rec->cnt) {
+		verbose(env, "verifier bug: dynptr not found\n");
+		return -EFAULT;
+	}
+
+	state->rec = rec;
+	state->cur_dynptr = &rec->fields[i];
+	state->valid_dynptr_id = false;
+
+	return 0;
+}
+
+static int check_dynptr_key_access(struct bpf_verifier_env *env, struct dynptr_key_state *state,
+				   struct bpf_reg_state *reg, u8 stype, int offset)
+{
+	const struct btf_field *dynptr = state->cur_dynptr;
+
+	/* Non-dynptr part before a dynptr or non-dynptr part after
+	 * the last dynptr.
+	 */
+	if (offset < dynptr->offset || offset >= dynptr->offset + dynptr->size) {
+		if (stype == STACK_DYNPTR) {
+			verbose(env,
+				"dynptr-key expects non-dynptr at offset %d cur_dynptr_offset %u\n",
+				offset, dynptr->offset);
+			return -EACCES;
+		}
+	} else {
+		if (stype != STACK_DYNPTR) {
+			verbose(env,
+				"dynptr-key expects dynptr at offset %d cur_dynptr_offset %u\n",
+				offset, dynptr->offset);
+			return -EACCES;
+		}
+
+		/* A dynptr is composed of parts from two dynptrs */
+		if (state->valid_dynptr_id && reg->id != state->cur_dynptr_id) {
+			verbose(env, "malformed dynptr-key at offset %d cur_dynptr_offset %u\n",
+				offset, dynptr->offset);
+			return -EACCES;
+		}
+		if (!state->valid_dynptr_id) {
+			state->valid_dynptr_id = true;
+			state->cur_dynptr_id = reg->id;
+		}
+
+		if (offset == dynptr->offset + dynptr->size - 1) {
+			const struct btf_record *rec = state->rec;
+			unsigned int i;
+
+			for (i = dynptr - rec->fields + 1; i < rec->cnt; i++) {
+				if (rec->fields[i].type == BPF_DYNPTR) {
+					state->cur_dynptr = &rec->fields[i];
+					state->valid_dynptr_id = false;
+					break;
+				}
+			}
+		}
+	}
+
+	return 0;
+}
+
 /* When register 'regno' is used to read the stack (either directly or through
  * a helper function) make sure that it's within stack boundary and, depending
  * on the access type and privileges, that all elements of the stack are
@@ -7287,6 +7368,8 @@ static int check_stack_range_initialized(
 	int err, min_off, max_off, i, j, slot, spi;
" indirect" : ""; enum bpf_access_type bounds_check_type; + struct dynptr_key_state dynptr_key; + bool dynptr_read_allowed; /* Some accesses can write anything into the stack, others are * read-only. */ @@ -7312,9 +7395,14 @@ static int check_stack_range_initialized( if (err) return err; - + dynptr_read_allowed = access_flags & ACCESS_F_DYNPTR_READ_ALLOWED; if (tnum_is_const(reg->var_off)) { min_off = max_off = reg->var_off.value + off; + + if (dynptr_read_allowed && (min_off % BPF_REG_SIZE)) { + verbose(env, "R%d misaligned offset %d for dynptr-key\n", regno, min_off); + return -EACCES; + } } else { /* Variable offset is prohibited for unprivileged mode for * simplicity since it requires corresponding support in @@ -7329,6 +7417,12 @@ static int check_stack_range_initialized( regno, err_extra, tn_buf); return -EACCES; } + + if (dynptr_read_allowed) { + verbose(env, "R%d variable offset prohibited for dynptr-key\n", regno); + return -EACCES; + } + /* Only initialized buffer on stack is allowed to be accessed * with variable offset. With uninitialized buffer it's hard to * guarantee that whole memory is marked as initialized on @@ -7373,19 +7467,26 @@ static int check_stack_range_initialized( return 0; } + if (dynptr_read_allowed) { + err = init_dynptr_key_state(env, meta->map_ptr->key_record, &dynptr_key); + if (err) + return err; + } for (i = min_off; i < max_off + access_size; i++) { u8 *stype; slot = -i - 1; spi = slot / BPF_REG_SIZE; if (state->allocated_stack <= slot) { - verbose(env, "verifier bug: allocated_stack too small"); + verbose(env, "verifier bug: allocated_stack too small\n"); return -EFAULT; } stype = &state->stack[spi].slot_type[slot % BPF_REG_SIZE]; if (*stype == STACK_MISC) goto mark; + if (dynptr_read_allowed && *stype == STACK_DYNPTR) + goto mark; if ((*stype == STACK_ZERO) || (*stype == STACK_INVALID && env->allow_uninit_stack)) { if (clobber) { @@ -7418,18 +7519,28 @@ static int check_stack_range_initialized( } return -EACCES; mark: + if (dynptr_read_allowed) { + err = check_dynptr_key_access(env, &dynptr_key, + &state->stack[spi].spilled_ptr, *stype, + i - min_off); + if (err) + return err; + } + /* reading any byte out of 8-byte 'spill_slot' will cause * the whole slot to be marked as 'read' - */ - mark_reg_read(env, &state->stack[spi].spilled_ptr, - state->stack[spi].spilled_ptr.parent, - REG_LIVE_READ64); - /* We do not set REG_LIVE_WRITTEN for stack slot, as we can not + * + * We do not set REG_LIVE_WRITTEN for stack slot, as we can not * be sure that whether stack slot is written to or not. Hence, * we must still conservatively propagate reads upwards even if * helper may write to the entire memory range. 
		 */
+		mark_reg_read(env, &state->stack[spi].spilled_ptr,
+			      state->stack[spi].spilled_ptr.parent,
+			      REG_LIVE_READ64);
 	}
 
 	return 0;
 }
@@ -8933,6 +9044,9 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 		meta->map_uid = reg->map_uid;
 		break;
 	case ARG_PTR_TO_MAP_KEY:
+	{
+		u32 access_flags = 0;
+
 		/* bpf_map_xxx(..., map_ptr, ..., key) call:
 		 * check that [key, key + map->key_size) are within
 		 * stack limits and initialized
@@ -8946,10 +9060,21 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 			verbose(env, "invalid map_ptr to access map->key\n");
 			return -EACCES;
 		}
+		/* Only allow PTR_TO_STACK for dynptr-key */
+		if (bpf_map_has_dynptr_key(meta->map_ptr)) {
+			if (base_type(reg->type) != PTR_TO_STACK) {
+				verbose(env, "map dynptr-key requires stack ptr but got %s\n",
+					reg_type_str(env, reg->type));
+				return -EACCES;
+			}
+			access_flags |= ACCESS_F_DYNPTR_READ_ALLOWED;
+		}
+		meta->raw_mode = false;
 		err = check_helper_mem_access(env, regno,
-					      meta->map_ptr->key_size, 0,
-					      NULL);
+					      meta->map_ptr->key_size, access_flags,
+					      meta);
 		break;
+	}
 	case ARG_PTR_TO_MAP_VALUE:
 		if (type_may_be_null(arg_type) && register_is_null(reg))
 			return 0;
From patchwork Tue Oct 8 09:14:51 2024
X-Patchwork-Submitter: Hou Tao
X-Patchwork-Id: 13826009
X-Patchwork-Delegate: bpf@iogearbox.net
From: Hou Tao
To: bpf@vger.kernel.org
Cc: Martin KaFai Lau, Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman,
 Song Liu, Hao Luo, Yonghong Song, Daniel Borkmann, KP Singh,
 Stanislav Fomichev, Jiri Olsa, John Fastabend, houtao1@huawei.com,
 xukuohai@huawei.com
Subject: [PATCH bpf-next 06/16] bpf: Introduce bpf_dynptr_user
Date: Tue, 8 Oct 2024 17:14:51 +0800
Message-ID: <20241008091501.8302-7-houtao@huaweicloud.com>
In-Reply-To: <20241008091501.8302-1-houtao@huaweicloud.com>
References: <20241008091501.8302-1-houtao@huaweicloud.com>

From: Hou Tao

For bpf maps with dynptr-key support, the userspace application will
use bpf_dynptr_user to represent the bpf_dynptr in the map key and pass
it to the bpf syscall. The bpf syscall copies from bpf_dynptr_user to
construct a corresponding bpf_dynptr_kern object when the map key is an
input argument, and copies to bpf_dynptr_user from a bpf_dynptr_kern
object when the map key is an output argument.

For now, the size of bpf_dynptr_user must be the same as that of
bpf_dynptr. The last u32 field is not used, so make it a reserved
field.
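Since the two structs must stay layout-compatible, the constraint can
be checked at build time; a sketch, not part of the patch:

	#include <linux/bpf.h>

	/* Both structs are 16 bytes with 8-byte alignment, so a key struct
	 * has the same layout whether it embeds bpf_dynptr (bpf program
	 * side) or bpf_dynptr_user (userspace side).
	 */
	_Static_assert(sizeof(struct bpf_dynptr_user) == sizeof(struct bpf_dynptr),
		       "bpf_dynptr_user must have the same size as bpf_dynptr");

	/* Userspace view of the pid_name key from patch 3 */
	struct pid_name_user {
		int pid;
		struct bpf_dynptr_user name;
	};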
Signed-off-by: Hou Tao
---
 include/uapi/linux/bpf.h       | 6 ++++++
 tools/include/uapi/linux/bpf.h | 6 ++++++
 2 files changed, 12 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 07f7df308a01..72fe6a96b54c 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -7329,6 +7329,12 @@ struct bpf_dynptr {
 	__u64 __opaque[2];
 } __attribute__((aligned(8)));
 
+struct bpf_dynptr_user {
+	__u64 data;
+	__u32 size;
+	__u32 rsvd;
+} __attribute__((aligned(8)));
+
 struct bpf_list_head {
 	__u64 __opaque[2];
 } __attribute__((aligned(8)));
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 14f223282bfa..f12ce268e6be 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -7328,6 +7328,12 @@ struct bpf_dynptr {
 	__u64 __opaque[2];
 } __attribute__((aligned(8)));
 
+struct bpf_dynptr_user {
+	__u64 data;
+	__u32 size;
+	__u32 rsvd;
+} __attribute__((aligned(8)));
+
 struct bpf_list_head {
 	__u64 __opaque[2];
 } __attribute__((aligned(8)));
Date: Tue, 8 Oct 2024 17:14:52 +0800
Message-ID: <20241008091501.8302-8-houtao@huaweicloud.com>

From: Hou Tao

Add bpf_dynptr_user_init() to initialize a bpf_dynptr_user object, bpf_dynptr_user_{data,size}() to get the address and length of the dynptr object, and bpf_dynptr_user_set_size() to set its size. Instead of exporting these symbols, simply add these helpers as inline functions in bpf.h.
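A minimal usage sketch for these helpers; the map layout (a key that is a single bpf_dynptr) and the already-opened map_fd are assumptions, not something this patch prescribes:

#include <bpf/bpf.h>
#include <stdio.h>

void lookup_by_name(int map_fd)
{
        char name[] = "example-key";
        struct bpf_dynptr_user key;
        __u64 value;

        /* point the dynptr at the variable-length key data */
        bpf_dynptr_user_init(name, sizeof(name), &key);
        if (!bpf_map_lookup_elem(map_fd, &key, &value))
                printf("%s -> %llu\n",
                       (char *)bpf_dynptr_user_data(&key), value);
}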
Signed-off-by: Hou Tao
---
 tools/lib/bpf/bpf.h | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index a4a7b1ad1b63..92b4afac5f5f 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -700,6 +700,33 @@ struct bpf_token_create_opts {
 LIBBPF_API int bpf_token_create(int bpffs_fd,
 				struct bpf_token_create_opts *opts);
 
+/* sys_bpf() will check the validity of data and size */
+static inline void bpf_dynptr_user_init(void *data, __u32 size,
+					struct bpf_dynptr_user *dynptr)
+{
+	dynptr->data = (__u64)(unsigned long)data;
+	dynptr->size = size;
+	dynptr->rsvd = 0;
+}
+
+static inline void bpf_dynptr_user_set_size(struct bpf_dynptr_user *dynptr,
+					    __u32 new_size)
+{
+	dynptr->size = new_size;
+}
+
+static inline __u32
+bpf_dynptr_user_size(const struct bpf_dynptr_user *dynptr)
+{
+	return dynptr->size;
+}
+
+static inline void *
+bpf_dynptr_user_data(const struct bpf_dynptr_user *dynptr)
+{
+	return (void *)(unsigned long)dynptr->data;
+}
+
 #ifdef __cplusplus
 } /* extern "C" */
 #endif

From patchwork Tue Oct 8 09:14:53 2024
Subject: [PATCH bpf-next 08/16] bpf: Handle bpf_dynptr_user in bpf syscall when it is used as input
Date: Tue, 8 Oct 2024 17:14:53 +0800
Message-ID: <20241008091501.8302-9-houtao@huaweicloud.com>

From: Hou Tao

Introduce the bpf_copy_from_dynptr_ukey() helper to handle map keys containing bpf_dynptr when the key is used in map lookup, update, delete and get_next_key operations. The helper places all variable-length data of the bpf_dynptr_user objects at the end of the map key to simplify the allocation and freeing of dynptr-bearing map keys.
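To make the resulting layout concrete, here is a sketch (illustrative numbers, not kernel code) for a map whose key_size is 32 and whose key contains two dynptrs with user-supplied sizes of 5 and 9 bytes; the key is reallocated to 32 + 5 + 9 = 46 bytes and each bpf_dynptr_user is rewritten in place as a bpf_dynptr_kern pointing into the trailing area:

/*
 *  offset: 0               16              32      37           46
 *          +---------------+---------------+-------+------------+
 *          | bpf_dynptr #1 | bpf_dynptr #2 | data1 | data2      |
 *          | data -> 32    | data -> 37    | 5 B   | 9 B        |
 *          +---------------+---------------+-------+------------+
 */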
Signed-off-by: Hou Tao
---
 kernel/bpf/syscall.c | 98 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 87 insertions(+), 11 deletions(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index aa0a500f8fad..5bd75db9b12f 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1540,10 +1540,83 @@ int __weak bpf_stackmap_copy(struct bpf_map *map, void *key, void *value)
 	return -ENOTSUPP;
 }
 
-static void *__bpf_copy_key(void __user *ukey, u64 key_size)
+static void *bpf_copy_from_dynptr_ukey(const struct bpf_map *map, bpfptr_t ukey)
+{
+	const struct btf_record *record;
+	const struct btf_field *field;
+	struct bpf_dynptr_user *uptr;
+	struct bpf_dynptr_kern *kptr;
+	void *key, *new_key, *kdata;
+	unsigned int key_size, size;
+	bpfptr_t udata;
+	unsigned int i;
+	int err;
+
+	key_size = map->key_size;
+	key = kvmemdup_bpfptr(ukey, key_size);
+	if (!key)
+		return ERR_PTR(-ENOMEM);
+
+	size = key_size;
+	record = map->key_record;
+	for (i = 0; i < record->cnt; i++) {
+		field = &record->fields[i];
+		if (field->type != BPF_DYNPTR)
+			continue;
+		uptr = key + field->offset;
+		if (!uptr->size || uptr->size > map->map_extra || uptr->rsvd) {
+			err = -EINVAL;
+			goto free_key;
+		}
+
+		size += uptr->size;
+		/* Overflow ? */
+		if (size < uptr->size) {
+			err = -E2BIG;
+			goto free_key;
+		}
+	}
+
+	/* Place all dynptrs' data in the end of the key */
+	new_key = kvrealloc(key, size, GFP_USER | __GFP_NOWARN);
+	if (!new_key) {
+		err = -ENOMEM;
+		goto free_key;
+	}
+
+	key = new_key;
+	kdata = key + key_size;
+	for (i = 0; i < record->cnt; i++) {
+		field = &record->fields[i];
+		if (field->type != BPF_DYNPTR)
+			continue;
+
+		uptr = key + field->offset;
+		size = uptr->size;
+		udata = make_bpfptr(uptr->data, bpfptr_is_kernel(ukey));
+		if (copy_from_bpfptr(kdata, udata, size)) {
+			err = -EFAULT;
+			goto free_key;
+		}
+		kptr = (struct bpf_dynptr_kern *)uptr;
+		bpf_dynptr_init(kptr, kdata, BPF_DYNPTR_TYPE_LOCAL, 0, size);
+		kdata += size;
+	}
+
+	return key;
+
+free_key:
+	kvfree(key);
+	return ERR_PTR(err);
+}
+
+static void *__bpf_copy_key(const struct bpf_map *map, void __user *ukey)
 {
-	if (key_size)
-		return vmemdup_user(ukey, key_size);
+	if (bpf_map_has_dynptr_key(map))
+		return bpf_copy_from_dynptr_ukey(map, USER_BPFPTR(ukey));
+
+	if (map->key_size)
+		return vmemdup_user(ukey, map->key_size);
 
 	if (ukey)
 		return ERR_PTR(-EINVAL);
@@ -1551,10 +1624,13 @@ static void *__bpf_copy_key(void __user *ukey, u64 key_size)
 	return NULL;
 }
 
-static void *___bpf_copy_key(bpfptr_t ukey, u64 key_size)
+static void *___bpf_copy_key(const struct bpf_map *map, bpfptr_t ukey)
 {
-	if (key_size)
-		return kvmemdup_bpfptr(ukey, key_size);
+	if (bpf_map_has_dynptr_key(map))
+		return bpf_copy_from_dynptr_ukey(map, ukey);
+
+	if (map->key_size)
+		return kvmemdup_bpfptr(ukey, map->key_size);
 
 	if (!bpfptr_is_null(ukey))
 		return ERR_PTR(-EINVAL);
@@ -1591,7 +1667,7 @@ static int map_lookup_elem(union bpf_attr *attr)
 	    !btf_record_has_field(map->record, BPF_SPIN_LOCK))
 		return -EINVAL;
 
-	key = __bpf_copy_key(ukey, map->key_size);
+	key = __bpf_copy_key(map, ukey);
 	if (IS_ERR(key))
 		return PTR_ERR(key);
 
@@ -1658,7 +1734,7 @@ static int map_update_elem(union bpf_attr *attr, bpfptr_t uattr)
 		goto err_put;
 	}
 
-	key = ___bpf_copy_key(ukey, map->key_size);
+	key = ___bpf_copy_key(map, ukey);
 	if (IS_ERR(key)) {
 		err = PTR_ERR(key);
 		goto err_put;
@@ -1705,7 +1781,7 @@ static int map_delete_elem(union bpf_attr *attr, bpfptr_t uattr)
 		goto err_put;
 	}
 
-	key = ___bpf_copy_key(ukey, map->key_size);
+	key = ___bpf_copy_key(map, ukey);
 	if (IS_ERR(key)) {
 		err = PTR_ERR(key);
 		goto err_put;
@@ -1757,7 +1833,7 @@ static int map_get_next_key(union bpf_attr *attr)
 		return -EPERM;
 
 	if (ukey) {
-		key = __bpf_copy_key(ukey, map->key_size);
+		key = __bpf_copy_key(map, ukey);
 		if (IS_ERR(key))
 			return PTR_ERR(key);
 	} else {
@@ -2054,7 +2130,7 @@ static int map_lookup_and_delete_elem(union bpf_attr *attr)
 		goto err_put;
 	}
 
-	key = __bpf_copy_key(ukey, map->key_size);
+	key = __bpf_copy_key(map, ukey);
 	if (IS_ERR(key)) {
 		err = PTR_ERR(key);
 		goto err_put;
From patchwork Tue Oct 8 09:14:54 2024
Subject: [PATCH bpf-next 09/16] bpf: Handle bpf_dynptr_user in bpf syscall when it is used as output
Date: Tue, 8 Oct 2024 17:14:54 +0800
Message-ID: <20241008091501.8302-10-houtao@huaweicloud.com>

From: Hou Tao
For the get_next_key operation, unext_key is used as an output argument. When there is a dynptr in the map key, unext_key will also be used as an input argument, because the userspace application needs to pre-allocate a buffer for each variable-length part of the map key and save the length and address of these buffers in bpf_dynptr_user objects.

To support the get_next_key op for maps with dynptr keys, map_get_next_key() first calls bpf_copy_from_dynptr_ukey() to construct a map key in which each bpf_dynptr_kern object has the same size as the corresponding bpf_dynptr_user object. It then calls ->map_get_next_key() to find the next key, and finally calls bpf_copy_to_dynptr_ukey() to copy both the non-dynptr and dynptr parts of the map key to unext_key.
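A userspace iteration sketch built on the libbpf helpers from patch 07. The single-dynptr key layout, the reuse of one buffer as both the input key and the output key, and the helper use_key() are all assumptions of this example, not guarantees of the API:

void dump_keys(int map_fd)
{
        char buf[256];                  /* capacity must cover map_extra */
        struct bpf_dynptr_user next_key;
        int err;

        bpf_dynptr_user_init(buf, sizeof(buf), &next_key);
        err = bpf_map_get_next_key(map_fd, NULL, &next_key); /* first key */
        while (!err) {
                /* the kernel shrank next_key.size to the real key length */
                use_key(bpf_dynptr_user_data(&next_key),
                        bpf_dynptr_user_size(&next_key));
                /* restore the full buffer capacity before the next call */
                bpf_dynptr_user_set_size(&next_key, sizeof(buf));
                err = bpf_map_get_next_key(map_fd, &next_key, &next_key);
        }
}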
Signed-off-by: Hou Tao
---
 kernel/bpf/syscall.c | 88 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 73 insertions(+), 15 deletions(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 5bd75db9b12f..338f17530068 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1540,7 +1540,7 @@ int __weak bpf_stackmap_copy(struct bpf_map *map, void *key, void *value)
 	return -ENOTSUPP;
 }
 
-static void *bpf_copy_from_dynptr_ukey(const struct bpf_map *map, bpfptr_t ukey)
+static void *bpf_copy_from_dynptr_ukey(const struct bpf_map *map, bpfptr_t ukey, bool copy_data)
 {
 	const struct btf_record *record;
 	const struct btf_field *field;
@@ -1548,7 +1548,6 @@ static void *bpf_copy_from_dynptr_ukey(const struct bpf_map *map, bpfptr_t ukey)
 	struct bpf_dynptr_kern *kptr;
 	void *key, *new_key, *kdata;
 	unsigned int key_size, size;
-	bpfptr_t udata;
 	unsigned int i;
 	int err;
 
@@ -1563,6 +1562,7 @@ static void *bpf_copy_from_dynptr_ukey(const struct bpf_map *map, bpfptr_t ukey)
 		field = &record->fields[i];
 		if (field->type != BPF_DYNPTR)
 			continue;
+
 		uptr = key + field->offset;
 		if (!uptr->size || uptr->size > map->map_extra || uptr->rsvd) {
 			err = -EINVAL;
@@ -1593,10 +1593,13 @@ static void *bpf_copy_from_dynptr_ukey(const struct bpf_map *map, bpfptr_t ukey)
 
 		uptr = key + field->offset;
 		size = uptr->size;
-		udata = make_bpfptr(uptr->data, bpfptr_is_kernel(ukey));
-		if (copy_from_bpfptr(kdata, udata, size)) {
-			err = -EFAULT;
-			goto free_key;
+		if (copy_data) {
+			bpfptr_t udata = make_bpfptr(uptr->data, bpfptr_is_kernel(ukey));
+
+			if (copy_from_bpfptr(kdata, udata, size)) {
+				err = -EFAULT;
+				goto free_key;
+			}
 		}
 		kptr = (struct bpf_dynptr_kern *)uptr;
 		bpf_dynptr_init(kptr, kdata, BPF_DYNPTR_TYPE_LOCAL, 0, size);
@@ -1613,7 +1616,7 @@ static void *bpf_copy_from_dynptr_ukey(const struct bpf_map *map, bpfptr_t ukey)
 static void *__bpf_copy_key(const struct bpf_map *map, void __user *ukey)
 {
 	if (bpf_map_has_dynptr_key(map))
-		return bpf_copy_from_dynptr_ukey(map, USER_BPFPTR(ukey));
+		return bpf_copy_from_dynptr_ukey(map, USER_BPFPTR(ukey), true);
 
 	if (map->key_size)
 		return vmemdup_user(ukey, map->key_size);
@@ -1627,7 +1630,7 @@ static void *__bpf_copy_key(const struct bpf_map *map, void __user *ukey)
 static void *___bpf_copy_key(const struct bpf_map *map, bpfptr_t ukey)
 {
 	if (bpf_map_has_dynptr_key(map))
-		return bpf_copy_from_dynptr_ukey(map, ukey);
+		return bpf_copy_from_dynptr_ukey(map, ukey, true);
 
 	if (map->key_size)
 		return kvmemdup_bpfptr(ukey, map->key_size);
@@ -1638,6 +1641,51 @@ static void *___bpf_copy_key(const struct bpf_map *map, bpfptr_t ukey)
 	return NULL;
 }
 
+static int bpf_copy_to_dynptr_ukey(const struct bpf_map *map,
+				   void __user *ukey, void *key)
+{
+	struct bpf_dynptr_user __user *uptr;
+	struct bpf_dynptr_kern *kptr;
+	struct btf_record *record;
+	unsigned int i, offset;
+
+	offset = 0;
+	record = map->key_record;
+	for (i = 0; i < record->cnt; i++) {
+		struct btf_field *field;
+		unsigned int size;
+		u64 udata;
+
+		field = &record->fields[i];
+		if (field->type != BPF_DYNPTR)
+			continue;
+
+		/* Any no-dynptr part before the dynptr ? */
+		if (offset < field->offset &&
+		    copy_to_user(ukey + offset, key + offset, field->offset - offset))
+			return -EFAULT;
+
+		/* dynptr part */
+		uptr = ukey + field->offset;
+		if (copy_from_user(&udata, &uptr->data, sizeof(udata)))
+			return -EFAULT;
+
+		kptr = key + field->offset;
+		size = __bpf_dynptr_size(kptr);
+		if (copy_to_user(u64_to_user_ptr(udata), __bpf_dynptr_data(kptr, size), size) ||
+		    put_user(size, &uptr->size) || put_user(0, &uptr->rsvd))
+			return -EFAULT;
+
+		offset = field->offset + field->size;
+	}
+
+	if (offset < map->key_size &&
+	    copy_to_user(ukey + offset, key + offset, map->key_size - offset))
+		return -EFAULT;
+
+	return 0;
+}
+
 /* last field in 'union bpf_attr' used by this command */
 #define BPF_MAP_LOOKUP_ELEM_LAST_FIELD flags
 
@@ -1840,10 +1888,19 @@ static int map_get_next_key(union bpf_attr *attr)
 		key = NULL;
 	}
 
-	err = -ENOMEM;
-	next_key = kvmalloc(map->key_size, GFP_USER);
-	if (!next_key)
+	if (bpf_map_has_dynptr_key(map))
+		next_key = bpf_copy_from_dynptr_ukey(map, USER_BPFPTR(unext_key), false);
+	else
+		next_key = kvmalloc(map->key_size, GFP_USER);
+	if (IS_ERR_OR_NULL(next_key)) {
+		if (!next_key) {
+			err = -ENOMEM;
+		} else {
+			err = PTR_ERR(next_key);
+			next_key = NULL;
+		}
 		goto free_key;
+	}
 
 	if (bpf_map_is_offloaded(map)) {
 		err = bpf_map_offload_get_next_key(map, key, next_key);
@@ -1857,12 +1914,13 @@
 	if (err)
 		goto free_next_key;
 
-	err = -EFAULT;
-	if (copy_to_user(unext_key, next_key, map->key_size) != 0)
+	if (bpf_map_has_dynptr_key(map))
+		err = bpf_copy_to_dynptr_ukey(map, unext_key, next_key);
+	else
+		err = copy_to_user(unext_key, next_key, map->key_size) ? -EFAULT : 0;
+	if (err)
 		goto free_next_key;
 
-	err = 0;
-
 free_next_key:
 	kvfree(next_key);
 free_key:
From patchwork Tue Oct 8 09:14:55 2024
Subject: [PATCH bpf-next 10/16] bpf: Disable unsupported functionalities for map with dynptr key
Date: Tue, 8 Oct 2024 17:14:55 +0800
Message-ID: <20241008091501.8302-11-houtao@huaweicloud.com>

From: Hou Tao
For maps with dynptr keys, batched map operations and dumping all elements through a bpffs file are not yet supported. Therefore, disable these functionalities for now.

Signed-off-by: Hou Tao
---
 include/linux/bpf.h  | 3 ++-
 kernel/bpf/syscall.c | 4 ++++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 3e25e94b7457..127952377025 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -592,7 +592,8 @@ static inline bool bpf_map_offload_neutral(const struct bpf_map *map)
 static inline bool bpf_map_support_seq_show(const struct bpf_map *map)
 {
 	return (map->btf_value_type_id || map->btf_vmlinux_value_type_id) &&
-		map->ops->map_seq_show_elem;
+		map->ops->map_seq_show_elem &&
+		!bpf_map_has_dynptr_key(map);
 }
 
 int map_check_no_btf(const struct bpf_map *map,
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 338f17530068..262f8a5541b6 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -5316,6 +5316,10 @@ static int bpf_map_do_batch(const union bpf_attr *attr,
 		err = -EPERM;
 		goto err_put;
 	}
+	if (bpf_map_has_dynptr_key(map)) {
+		err = -EOPNOTSUPP;
+		goto err_put;
+	}
 
 	if (cmd == BPF_MAP_LOOKUP_BATCH)
 		BPF_DO_BATCH(map->ops->map_lookup_batch, map, attr, uattr);
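For example, a batched lookup on a dynptr-keyed map now fails up front. This is only a sketch; the keys/values buffers and their sizing are assumed to be handled by the caller:

/* keys/values must be buffers sized for 'count' entries (assumed) */
int try_batch_lookup(int map_fd, void *keys, void *values)
{
        __u64 out_batch;
        __u32 count = 16;       /* arbitrary batch size */
        LIBBPF_OPTS(bpf_map_batch_opts, batch_opts);

        /* returns -EOPNOTSUPP once the map was created with
         * BPF_F_DYNPTR_IN_KEY
         */
        return bpf_map_lookup_batch(map_fd, NULL, &out_batch,
                                    keys, values, &count, &batch_opts);
}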
From patchwork Tue Oct 8 09:14:56 2024
Subject: [PATCH bpf-next 11/16] bpf: Add bpf_mem_alloc_check_size() helper
Date: Tue, 8 Oct 2024 17:14:56 +0800
Message-ID: <20241008091501.8302-12-houtao@huaweicloud.com>

From: Hou Tao

Introduce bpf_mem_alloc_check_size() to check whether the allocation size exceeds the limit of the kmalloc-equivalent allocator. Because the upper limit for percpu allocations is LLIST_NODE_SZ bytes larger than for non-percpu allocations, an extra percpu argument is added to the helper. It will be used by a following patch to check whether the maximum size of a variable-length key is valid.
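Concretely, on a 64-bit kernel LLIST_NODE_SZ is sizeof(struct llist_node) == 8, which together with BPF_MEM_ALLOC_SIZE_MAX == 4096 (added below) gives the 4088-byte limit quoted for dynptr keys in the later hashtab patch:

bpf_mem_alloc_check_size(true, 4096);   /* 0: percpu may use the full 4096 */
bpf_mem_alloc_check_size(false, 4088);  /* 0: largest valid non-percpu size */
bpf_mem_alloc_check_size(false, 4089);  /* -E2BIG: exceeds 4096 - 8 */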
Signed-off-by: Hou Tao
---
 include/linux/bpf_mem_alloc.h |  3 +++
 kernel/bpf/memalloc.c         | 14 +++++++++++++-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/include/linux/bpf_mem_alloc.h b/include/linux/bpf_mem_alloc.h
index aaf004d94322..e45162ef59bb 100644
--- a/include/linux/bpf_mem_alloc.h
+++ b/include/linux/bpf_mem_alloc.h
@@ -33,6 +33,9 @@ int bpf_mem_alloc_percpu_init(struct bpf_mem_alloc *ma, struct obj_cgroup *objcg
 int bpf_mem_alloc_percpu_unit_init(struct bpf_mem_alloc *ma, int size);
 void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma);
 
+/* Check the allocation size for kmalloc equivalent allocator */
+int bpf_mem_alloc_check_size(bool percpu, size_t size);
+
 /* kmalloc/kfree equivalent: */
 void *bpf_mem_alloc(struct bpf_mem_alloc *ma, size_t size);
 void bpf_mem_free(struct bpf_mem_alloc *ma, void *ptr);
diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index b3858a76e0b3..146f5b57cfb1 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -35,6 +35,8 @@
  */
 #define LLIST_NODE_SZ sizeof(struct llist_node)
 
+#define BPF_MEM_ALLOC_SIZE_MAX 4096
+
 /* similar to kmalloc, but sizeof == 8 bucket is gone */
 static u8 size_index[24] __ro_after_init = {
 	3,	/* 8 */
@@ -65,7 +67,7 @@ static u8 size_index[24] __ro_after_init = {
 
 static int bpf_mem_cache_idx(size_t size)
 {
-	if (!size || size > 4096)
+	if (!size || size > BPF_MEM_ALLOC_SIZE_MAX)
 		return -1;
 
 	if (size <= 192)
@@ -1005,3 +1007,13 @@ void notrace *bpf_mem_cache_alloc_flags(struct bpf_mem_alloc *ma, gfp_t flags)
 
 	return !ret ? NULL : ret + LLIST_NODE_SZ;
 }
+
+int bpf_mem_alloc_check_size(bool percpu, size_t size)
+{
+	/* The size of percpu allocation doesn't have LLIST_NODE_SZ overhead */
+	if ((percpu && size > BPF_MEM_ALLOC_SIZE_MAX) ||
+	    (!percpu && size > BPF_MEM_ALLOC_SIZE_MAX - LLIST_NODE_SZ))
+		return -E2BIG;
+
+	return 0;
+}

From patchwork Tue Oct 8 09:14:57 2024
Subject: [PATCH bpf-next 12/16] bpf: Support basic operations for dynptr key in hash map
Date: Tue, 8 Oct 2024 17:14:57 +0800
Message-ID: <20241008091501.8302-13-houtao@huaweicloud.com>

From: Hou Tao

The patch supports lookup, update, delete and lookup_delete operations for hash maps with dynptr keys. There are two major differences between the implementation of a normal hash map and a dynptr-keyed hash map:

1) a dynptr-keyed hash map doesn't support pre-allocation. The reason is that the dynptrs in the map key are allocated dynamically through the bpf memory allocator. The length limit for these dynptrs is 4088 bytes for now. Because the dynptrs are allocated dynamically, memory consumption will be lower than for a normal hash map when the lengths of the keys vary widely.

2) a freed element in a dynptr-keyed map will not be reused immediately. For a normal hash map, a freed element may be reused immediately by a newly-added element, so a lookup may return an incorrect result due to the combination of element deletion and element reuse. A dynptr-keyed map cannot allow that: there are pointers (dynptrs) in the map key, and the update of these dynptrs is not atomic because both the address and the length of the dynptr are updated. If the element were reused immediately, an access to the dynptr in the freed element could incur an invalid memory access due to a mismatch between the address and the size of the dynptr, so the freed element is only reused after one RCU grace period.

Besides the differences above, a dynptr-keyed hash map also needs to handle maybe-nullified dynptrs in the map key.
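A creation sketch from userspace, pulling together the constraints this patch checks in htab_map_alloc_check(); the key type (the id_key_user sketch shown earlier) and the exact BTF wiring of the key record are assumptions here, not something this patch spells out:

LIBBPF_OPTS(bpf_map_create_opts, opts,
        /* dynptr keys reject pre-allocation, so BPF_F_NO_PREALLOC is needed */
        .map_flags = BPF_F_DYNPTR_IN_KEY | BPF_F_NO_PREALLOC,
        .map_extra = 256,    /* max length of each dynptr; must be 1..4088 */
        /* .btf_fd / .btf_key_type_id would also have to describe the key
         * so the kernel can build key_record; elided in this sketch
         */
);
int map_fd = bpf_map_create(BPF_MAP_TYPE_HASH, "dynptr_htab",
                            sizeof(struct id_key_user), sizeof(__u64),
                            128, &opts);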
Signed-off-by: Hou Tao --- kernel/bpf/hashtab.c | 283 +++++++++++++++++++++++++++++++++++++++---- 1 file changed, 257 insertions(+), 26 deletions(-) diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c index b14b87463ee0..edf19d36a413 100644 --- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@ -88,6 +88,7 @@ struct bpf_htab { struct bpf_map map; struct bpf_mem_alloc ma; struct bpf_mem_alloc pcpu_ma; + struct bpf_mem_alloc dynptr_ma; struct bucket *buckets; void *elems; union { @@ -425,6 +426,7 @@ static int htab_map_alloc_check(union bpf_attr *attr) bool percpu_lru = (attr->map_flags & BPF_F_NO_COMMON_LRU); bool prealloc = !(attr->map_flags & BPF_F_NO_PREALLOC); bool zero_seed = (attr->map_flags & BPF_F_ZERO_SEED); + bool dynptr_in_key = (attr->map_flags & BPF_F_DYNPTR_IN_KEY); int numa_node = bpf_map_attr_numa_node(attr); BUILD_BUG_ON(offsetof(struct htab_elem, fnode.next) != @@ -438,6 +440,14 @@ static int htab_map_alloc_check(union bpf_attr *attr) !bpf_map_flags_access_ok(attr->map_flags)) return -EINVAL; + if (dynptr_in_key) { + if (percpu || lru || prealloc || !attr->map_extra) + return -EINVAL; + if ((attr->map_extra >> 32) || bpf_dynptr_check_size(attr->map_extra) || + bpf_mem_alloc_check_size(percpu, attr->map_extra)) + return -E2BIG; + } + if (!lru && percpu_lru) return -EINVAL; @@ -482,6 +492,7 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr) */ bool percpu_lru = (attr->map_flags & BPF_F_NO_COMMON_LRU); bool prealloc = !(attr->map_flags & BPF_F_NO_PREALLOC); + bool dynptr_in_key = (attr->map_flags & BPF_F_DYNPTR_IN_KEY); struct bpf_htab *htab; int err, i; @@ -598,6 +609,11 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr) if (err) goto free_map_locked; } + if (dynptr_in_key) { + err = bpf_mem_alloc_init(&htab->dynptr_ma, 0, false); + if (err) + goto free_map_locked; + } } return &htab->map; @@ -610,6 +626,7 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr) for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++) free_percpu(htab->map_locked[i]); bpf_map_area_free(htab->buckets); + bpf_mem_alloc_destroy(&htab->dynptr_ma); bpf_mem_alloc_destroy(&htab->pcpu_ma); bpf_mem_alloc_destroy(&htab->ma); free_elem_count: @@ -620,13 +637,55 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr) return ERR_PTR(err); } -static inline u32 htab_map_hash(const void *key, u32 key_len, u32 hashrnd) +static inline u32 __htab_map_hash(const void *key, u32 key_len, u32 hashrnd) { if (likely(key_len % 4 == 0)) return jhash2(key, key_len / 4, hashrnd); return jhash(key, key_len, hashrnd); } +static u32 htab_map_dynptr_hash(const void *key, u32 key_len, u32 hashrnd, + const struct btf_record *rec) +{ + unsigned int i, cnt = rec->cnt; + unsigned int hash = hashrnd; + unsigned int offset = 0; + + for (i = 0; i < cnt; i++) { + const struct btf_field *field = &rec->fields[i]; + const struct bpf_dynptr_kern *kptr; + unsigned int len; + + if (field->type != BPF_DYNPTR) + continue; + + /* non-dynptr part ?
*/ + if (offset < field->offset) + hash = jhash(key + offset, field->offset - offset, hash); + + /* Skip nullified dynptr */ + kptr = key + field->offset; + if (kptr->data) { + len = __bpf_dynptr_size(kptr); + hash = jhash(__bpf_dynptr_data(kptr, len), len, hash); + } + offset = field->offset + field->size; + } + + if (offset < key_len) + hash = jhash(key + offset, key_len - offset, hash); + + return hash; +} + +static inline u32 htab_map_hash(const void *key, u32 key_len, u32 hashrnd, + const struct btf_record *rec) +{ + if (!rec) + return __htab_map_hash(key, key_len, hashrnd); + return htab_map_dynptr_hash(key, key_len, hashrnd, rec); +} + static inline struct bucket *__select_bucket(struct bpf_htab *htab, u32 hash) { return &htab->buckets[hash & (htab->n_buckets - 1)]; @@ -637,15 +696,68 @@ static inline struct hlist_nulls_head *select_bucket(struct bpf_htab *htab, u32 return &__select_bucket(htab, hash)->head; } +static bool is_same_dynptr_key(const void *key, const void *tgt, unsigned int key_size, + const struct btf_record *rec) +{ + unsigned int i, cnt = rec->cnt; + unsigned int offset = 0; + + for (i = 0; i < cnt; i++) { + const struct btf_field *field = &rec->fields[i]; + const struct bpf_dynptr_kern *kptr, *tgt_kptr; + const void *data, *tgt_data; + unsigned int len; + + if (field->type != BPF_DYNPTR) + continue; + + if (offset < field->offset && + memcmp(key + offset, tgt + offset, field->offset - offset)) + return false; + + /* + * For a nullified dynptr in the target key, __bpf_dynptr_size() + * will return 0, and there will be no match for the target key. + */ + kptr = key + field->offset; + tgt_kptr = tgt + field->offset; + len = __bpf_dynptr_size(kptr); + if (len != __bpf_dynptr_size(tgt_kptr)) + return false; + + data = __bpf_dynptr_data(kptr, len); + tgt_data = __bpf_dynptr_data(tgt_kptr, len); + if (memcmp(data, tgt_data, len)) + return false; + + offset = field->offset + field->size; + } + + if (offset < key_size && + memcmp(key + offset, tgt + offset, key_size - offset)) + return false; + + return true; +} + +static inline bool htab_is_same_key(const void *key, const void *tgt, unsigned int key_size, + const struct btf_record *rec) +{ + if (!rec) + return !memcmp(key, tgt, key_size); + return is_same_dynptr_key(key, tgt, key_size, rec); +} + /* this lookup function can only be called with bucket lock taken */ static struct htab_elem *lookup_elem_raw(struct hlist_nulls_head *head, u32 hash, - void *key, u32 key_size) + void *key, u32 key_size, + const struct btf_record *record) { struct hlist_nulls_node *n; struct htab_elem *l; hlist_nulls_for_each_entry_rcu(l, n, head, hash_node) - if (l->hash == hash && !memcmp(&l->key, key, key_size)) + if (l->hash == hash && htab_is_same_key(l->key, key, key_size, record)) return l; return NULL; @@ -657,14 +769,15 @@ static struct htab_elem *lookup_elem_raw(struct hlist_nulls_head *head, u32 hash */ static struct htab_elem *lookup_nulls_elem_raw(struct hlist_nulls_head *head, u32 hash, void *key, - u32 key_size, u32 n_buckets) + u32 key_size, u32 n_buckets, + const struct btf_record *record) { struct hlist_nulls_node *n; struct htab_elem *l; again: hlist_nulls_for_each_entry_rcu(l, n, head, hash_node) - if (l->hash == hash && !memcmp(&l->key, key, key_size)) + if (l->hash == hash && htab_is_same_key(l->key, key, key_size, record)) return l; if (unlikely(get_nulls_value(n) != (hash & (n_buckets - 1)))) @@ -681,6 +794,7 @@ static struct htab_elem *lookup_nulls_elem_raw(struct hlist_nulls_head *head, static void 
*__htab_map_lookup_elem(struct bpf_map *map, void *key) { struct bpf_htab *htab = container_of(map, struct bpf_htab, map); + const struct btf_record *record; struct hlist_nulls_head *head; struct htab_elem *l; u32 hash, key_size; @@ -689,12 +803,13 @@ static void *__htab_map_lookup_elem(struct bpf_map *map, void *key) !rcu_read_lock_bh_held()); key_size = map->key_size; + record = map->key_record; - hash = htab_map_hash(key, key_size, htab->hashrnd); + hash = htab_map_hash(key, key_size, htab->hashrnd, record); head = select_bucket(htab, hash); - l = lookup_nulls_elem_raw(head, hash, key, key_size, htab->n_buckets); + l = lookup_nulls_elem_raw(head, hash, key, key_size, htab->n_buckets, record); return l; } @@ -784,6 +899,26 @@ static int htab_lru_map_gen_lookup(struct bpf_map *map, return insn - insn_buf; } +static void htab_free_dynptr_key(struct bpf_htab *htab, void *key) +{ + const struct btf_record *record = htab->map.key_record; + unsigned int i, cnt = record->cnt; + + for (i = 0; i < cnt; i++) { + const struct btf_field *field = &record->fields[i]; + struct bpf_dynptr_kern *kptr; + + if (field->type != BPF_DYNPTR) + continue; + + /* It may be accessed concurrently, so don't overwrite + * the kptr. + */ + kptr = key + field->offset; + bpf_mem_free_rcu(&htab->dynptr_ma, kptr->data); + } +} + static void check_and_free_fields(struct bpf_htab *htab, struct htab_elem *elem) { @@ -834,6 +969,68 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node) return l == tgt_l; } +static int htab_copy_dynptr_key(struct bpf_htab *htab, void *dst_key, const void *key, u32 key_size) +{ + const struct btf_record *rec = htab->map.key_record; + struct bpf_dynptr_kern *dst_kptr; + const struct btf_field *field; + unsigned int i, cnt, offset; + int err; + + offset = 0; + cnt = rec->cnt; + for (i = 0; i < cnt; i++) { + const struct bpf_dynptr_kern *kptr; + unsigned int len; + const void *data; + void *dst_data; + + field = &rec->fields[i]; + if (field->type != BPF_DYNPTR) + continue; + + if (offset < field->offset) + memcpy(dst_key + offset, key + offset, field->offset - offset); + + /* Doesn't support nullified dynptr in map key */ + kptr = key + field->offset; + if (!kptr->data) { + err = -EINVAL; + goto out; + } + len = __bpf_dynptr_size(kptr); + data = __bpf_dynptr_data(kptr, len); + + dst_data = bpf_mem_alloc(&htab->dynptr_ma, len); + if (!dst_data) { + err = -ENOMEM; + goto out; + } + + memcpy(dst_data, data, len); + dst_kptr = dst_key + field->offset; + bpf_dynptr_init(dst_kptr, dst_data, BPF_DYNPTR_TYPE_LOCAL, 0, len); + + offset = field->offset + field->size; + } + + if (offset < key_size) + memcpy(dst_key + offset, key + offset, key_size - offset); + + return 0; + +out: + for (; i > 0; i--) { + field = &rec->fields[i - 1]; + if (field->type != BPF_DYNPTR) + continue; + + dst_kptr = dst_key + field->offset; + bpf_mem_free(&htab->dynptr_ma, dst_kptr->data); + } + return err; +} + /* Called from syscall */ static int htab_map_get_next_key(struct bpf_map *map, void *key, void *next_key) { @@ -850,12 +1047,12 @@ static int htab_map_get_next_key(struct bpf_map *map, void *key, void *next_key) if (!key) goto find_first_elem; - hash = htab_map_hash(key, key_size, htab->hashrnd); + hash = htab_map_hash(key, key_size, htab->hashrnd, NULL); head = select_bucket(htab, hash); /* lookup the key */ - l = lookup_nulls_elem_raw(head, hash, key, key_size, htab->n_buckets); + l = lookup_nulls_elem_raw(head, hash, key, key_size, htab->n_buckets, NULL); if (!l) goto find_first_elem; @@ -895,10 
+1092,27 @@ static int htab_map_get_next_key(struct bpf_map *map, void *key, void *next_key) static void htab_elem_free(struct bpf_htab *htab, struct htab_elem *l) { + bool dynptr_in_key = bpf_map_has_dynptr_key(&htab->map); + + if (dynptr_in_key) + htab_free_dynptr_key(htab, l->key); + check_and_free_fields(htab, l); + if (htab->map.map_type == BPF_MAP_TYPE_PERCPU_HASH) bpf_mem_cache_free(&htab->pcpu_ma, l->ptr_to_pptr); - bpf_mem_cache_free(&htab->ma, l); + + /* + * For dynptr key, the update of dynptr in the key is not atomic: + * both the pointer and the size are updated. If the element is reused + * immediately, the access of the dynptr key during lookup procedure may + * incur invalid memory access due to mismatch between the size and the + * data pointer, so reuse the element after one RCU GP. + */ + if (dynptr_in_key) + bpf_mem_cache_free_rcu(&htab->ma, l); + else + bpf_mem_cache_free(&htab->ma, l); } static void htab_put_fd_value(struct bpf_htab *htab, struct htab_elem *l) @@ -1046,7 +1260,19 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key, } } - memcpy(l_new->key, key, key_size); + if (bpf_map_has_dynptr_key(&htab->map)) { + int copy_err; + + copy_err = htab_copy_dynptr_key(htab, l_new->key, key, key_size); + if (copy_err) { + bpf_mem_cache_free(&htab->ma, l_new); + l_new = ERR_PTR(copy_err); + goto dec_count; + } + } else { + memcpy(l_new->key, key, key_size); + } + if (percpu) { if (prealloc) { pptr = htab_elem_get_ptr(l_new, key_size); @@ -1102,6 +1328,7 @@ static long htab_map_update_elem(struct bpf_map *map, void *key, void *value, u64 map_flags) { struct bpf_htab *htab = container_of(map, struct bpf_htab, map); + const struct btf_record *key_record = map->key_record; struct htab_elem *l_new = NULL, *l_old; struct hlist_nulls_head *head; unsigned long flags; @@ -1118,7 +1345,7 @@ static long htab_map_update_elem(struct bpf_map *map, void *key, void *value, key_size = map->key_size; - hash = htab_map_hash(key, key_size, htab->hashrnd); + hash = htab_map_hash(key, key_size, htab->hashrnd, key_record); b = __select_bucket(htab, hash); head = &b->head; @@ -1128,7 +1355,7 @@ static long htab_map_update_elem(struct bpf_map *map, void *key, void *value, return -EINVAL; /* find an element without taking the bucket lock */ l_old = lookup_nulls_elem_raw(head, hash, key, key_size, - htab->n_buckets); + htab->n_buckets, key_record); ret = check_flags(htab, l_old, map_flags); if (ret) return ret; @@ -1149,7 +1376,7 @@ static long htab_map_update_elem(struct bpf_map *map, void *key, void *value, if (ret) return ret; - l_old = lookup_elem_raw(head, hash, key, key_size); + l_old = lookup_elem_raw(head, hash, key, key_size, key_record); ret = check_flags(htab, l_old, map_flags); if (ret) @@ -1221,7 +1448,7 @@ static long htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value key_size = map->key_size; - hash = htab_map_hash(key, key_size, htab->hashrnd); + hash = __htab_map_hash(key, key_size, htab->hashrnd); b = __select_bucket(htab, hash); head = &b->head; @@ -1241,7 +1468,7 @@ static long htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value if (ret) goto err_lock_bucket; - l_old = lookup_elem_raw(head, hash, key, key_size); + l_old = lookup_elem_raw(head, hash, key, key_size, NULL); ret = check_flags(htab, l_old, map_flags); if (ret) @@ -1290,7 +1517,7 @@ static long __htab_percpu_map_update_elem(struct bpf_map *map, void *key, key_size = map->key_size; - hash = htab_map_hash(key, key_size, htab->hashrnd); + hash = 
__htab_map_hash(key, key_size, htab->hashrnd); b = __select_bucket(htab, hash); head = &b->head; @@ -1299,7 +1526,7 @@ static long __htab_percpu_map_update_elem(struct bpf_map *map, void *key, if (ret) return ret; - l_old = lookup_elem_raw(head, hash, key, key_size); + l_old = lookup_elem_raw(head, hash, key, key_size, NULL); ret = check_flags(htab, l_old, map_flags); if (ret) @@ -1345,7 +1572,7 @@ static long __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key, key_size = map->key_size; - hash = htab_map_hash(key, key_size, htab->hashrnd); + hash = htab_map_hash(key, key_size, htab->hashrnd, NULL); b = __select_bucket(htab, hash); head = &b->head; @@ -1365,7 +1592,7 @@ static long __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key, if (ret) goto err_lock_bucket; - l_old = lookup_elem_raw(head, hash, key, key_size); + l_old = lookup_elem_raw(head, hash, key, key_size, NULL); ret = check_flags(htab, l_old, map_flags); if (ret) @@ -1411,6 +1638,7 @@ static long htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key, static long htab_map_delete_elem(struct bpf_map *map, void *key) { struct bpf_htab *htab = container_of(map, struct bpf_htab, map); + const struct btf_record *key_record = map->key_record; struct hlist_nulls_head *head; struct bucket *b; struct htab_elem *l; @@ -1423,7 +1651,7 @@ static long htab_map_delete_elem(struct bpf_map *map, void *key) key_size = map->key_size; - hash = htab_map_hash(key, key_size, htab->hashrnd); + hash = htab_map_hash(key, key_size, htab->hashrnd, key_record); b = __select_bucket(htab, hash); head = &b->head; @@ -1431,7 +1659,7 @@ static long htab_map_delete_elem(struct bpf_map *map, void *key) if (ret) return ret; - l = lookup_elem_raw(head, hash, key, key_size); + l = lookup_elem_raw(head, hash, key, key_size, key_record); if (l) { hlist_nulls_del_rcu(&l->hash_node); @@ -1459,7 +1687,7 @@ static long htab_lru_map_delete_elem(struct bpf_map *map, void *key) key_size = map->key_size; - hash = htab_map_hash(key, key_size, htab->hashrnd); + hash = __htab_map_hash(key, key_size, htab->hashrnd); b = __select_bucket(htab, hash); head = &b->head; @@ -1467,7 +1695,7 @@ static long htab_lru_map_delete_elem(struct bpf_map *map, void *key) if (ret) return ret; - l = lookup_elem_raw(head, hash, key, key_size); + l = lookup_elem_raw(head, hash, key, key_size, NULL); if (l) hlist_nulls_del_rcu(&l->hash_node); @@ -1564,6 +1792,7 @@ static void htab_map_free(struct bpf_map *map) bpf_map_free_elem_count(map); free_percpu(htab->extra_elems); bpf_map_area_free(htab->buckets); + bpf_mem_alloc_destroy(&htab->dynptr_ma); bpf_mem_alloc_destroy(&htab->pcpu_ma); bpf_mem_alloc_destroy(&htab->ma); if (htab->use_percpu_counter) @@ -1600,6 +1829,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key, bool is_percpu, u64 flags) { struct bpf_htab *htab = container_of(map, struct bpf_htab, map); + const struct btf_record *key_record; struct hlist_nulls_head *head; unsigned long bflags; struct htab_elem *l; @@ -1608,8 +1838,9 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key, int ret; key_size = map->key_size; + key_record = map->key_record; - hash = htab_map_hash(key, key_size, htab->hashrnd); + hash = htab_map_hash(key, key_size, htab->hashrnd, key_record); b = __select_bucket(htab, hash); head = &b->head; @@ -1617,7 +1848,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key, if (ret) return ret; - l = lookup_elem_raw(head, hash, key, key_size); + l = 
lookup_elem_raw(head, hash, key, key_size, key_record); if (!l) { ret = -ENOENT; } else {

From patchwork Tue Oct 8 09:14:58 2024
From patchwork Tue Oct 8 09:14:58 2024
X-Patchwork-Id: 13826016
From: Hou Tao
To: bpf@vger.kernel.org
Cc: Martin KaFai Lau, Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman, Song Liu, Hao Luo, Yonghong Song, Daniel Borkmann, KP Singh, Stanislav Fomichev, Jiri Olsa, John Fastabend, houtao1@huawei.com, xukuohai@huawei.com
Subject: [PATCH bpf-next 13/16] bpf: Export bpf_dynptr_set_size
Date: Tue, 8 Oct 2024 17:14:58 +0800
Message-ID: <20241008091501.8302-14-houtao@huaweicloud.com>
In-Reply-To: <20241008091501.8302-1-houtao@huaweicloud.com>

From: Hou Tao

It will be used by the following patch to shrink the size of a dynptr.

Signed-off-by: Hou Tao
---
 include/linux/bpf.h  | 1 +
 kernel/bpf/helpers.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 127952377025..23db20e6402f 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1301,6 +1301,7 @@ enum bpf_dynptr_type {
 };
 
 int bpf_dynptr_check_size(u32 size);
+void bpf_dynptr_set_size(struct bpf_dynptr_kern *ptr, u32 new_size);
 u32 __bpf_dynptr_size(const struct bpf_dynptr_kern *ptr);
 const void *__bpf_dynptr_data(const struct bpf_dynptr_kern *ptr, u32 len);
 void *__bpf_dynptr_data_rw(const struct bpf_dynptr_kern *ptr, u32 len);
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index f39521b53a4e..f6fd996e30c7 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1674,7 +1674,7 @@ u32 __bpf_dynptr_size(const struct bpf_dynptr_kern *ptr)
 	return ptr->size & DYNPTR_SIZE_MASK;
 }
 
-static void bpf_dynptr_set_size(struct bpf_dynptr_kern *ptr, u32 new_size)
+void bpf_dynptr_set_size(struct bpf_dynptr_kern *ptr, u32 new_size)
 {
 	u32 metadata = ptr->size & ~DYNPTR_SIZE_MASK;
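Exporting bpf_dynptr_set_size() lets code outside kernel/bpf/helpers.c shrink a dynptr in place; the ~DYNPTR_SIZE_MASK bits keep the type/read-only metadata intact while only the size changes. A minimal sketch of the copy-out pattern the next patch builds on (the helper name here is illustrative; the three kernel APIs it calls are the ones declared above):

static int dynptr_copy_out(struct bpf_dynptr_kern *dst, const void *src, u32 len)
{
	void *data = __bpf_dynptr_data_rw(dst, len);

	/* The destination buffer is too small for the stored payload. */
	if (!data)
		return -ENOSPC;
	/* Shrink only, never grow: the backing buffer stays untouched
	 * while the size metadata is updated in place.
	 */
	if (__bpf_dynptr_size(dst) > len)
		bpf_dynptr_set_size(dst, len);
	memcpy(data, src, len);
	return 0;
}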
From patchwork Tue Oct 8 09:14:59 2024
X-Patchwork-Id: 13826018
From: Hou Tao
To: bpf@vger.kernel.org
Cc: Martin KaFai Lau, Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman, Song Liu, Hao Luo, Yonghong Song, Daniel Borkmann, KP Singh, Stanislav Fomichev, Jiri Olsa, John Fastabend, houtao1@huawei.com, xukuohai@huawei.com
Subject: [PATCH bpf-next 14/16] bpf: Support get_next_key operation for dynptr key in hash map
Date: Tue, 8 Oct 2024 17:14:59 +0800
Message-ID: <20241008091501.8302-15-houtao@huaweicloud.com>
In-Reply-To: <20241008091501.8302-1-houtao@huaweicloud.com>

From: Hou Tao

The get_next_key operation first passes the key_record to htab_map_hash() and lookup_nulls_elem_raw() to find the target element, then uses the htab_copy_dynptr_key() helper to copy from the found key into the next key used for output.
Signed-off-by: Hou Tao --- kernel/bpf/hashtab.c | 55 ++++++++++++++++++++++++++++++-------------- 1 file changed, 38 insertions(+), 17 deletions(-) diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c index edf19d36a413..b647fe9f8f9f 100644 --- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@ -969,7 +969,8 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node) return l == tgt_l; } -static int htab_copy_dynptr_key(struct bpf_htab *htab, void *dst_key, const void *key, u32 key_size) +static int htab_copy_dynptr_key(struct bpf_htab *htab, void *dst_key, const void *key, u32 key_size, + bool copy_in) { const struct btf_record *rec = htab->map.key_record; struct bpf_dynptr_kern *dst_kptr; @@ -994,22 +995,32 @@ static int htab_copy_dynptr_key(struct bpf_htab *htab, void *dst_key, const void /* Doesn't support nullified dynptr in map key */ kptr = key + field->offset; - if (!kptr->data) { + if (copy_in && !kptr->data) { err = -EINVAL; goto out; } len = __bpf_dynptr_size(kptr); data = __bpf_dynptr_data(kptr, len); - dst_data = bpf_mem_alloc(&htab->dynptr_ma, len); - if (!dst_data) { - err = -ENOMEM; - goto out; - } + dst_kptr = dst_key + field->offset; + if (copy_in) { + dst_data = bpf_mem_alloc(&htab->dynptr_ma, len); + if (!dst_data) { + err = -ENOMEM; + goto out; + } + bpf_dynptr_init(dst_kptr, dst_data, BPF_DYNPTR_TYPE_LOCAL, 0, len); + } else { + dst_data = __bpf_dynptr_data_rw(dst_kptr, len); + if (!dst_data) { + err = -ENOSPC; + goto out; + } + if (__bpf_dynptr_size(dst_kptr) > len) + bpf_dynptr_set_size(dst_kptr, len); + } memcpy(dst_data, data, len); - dst_kptr = dst_key + field->offset; - bpf_dynptr_init(dst_kptr, dst_data, BPF_DYNPTR_TYPE_LOCAL, 0, len); offset = field->offset + field->size; } @@ -1020,7 +1031,7 @@ static int htab_copy_dynptr_key(struct bpf_htab *htab, void *dst_key, const void return 0; out: - for (; i > 0; i--) { + for (; i > 0 && copy_in; i--) { field = &rec->fields[i - 1]; if (field->type != BPF_DYNPTR) continue; @@ -1031,10 +1042,22 @@ static int htab_copy_dynptr_key(struct bpf_htab *htab, void *dst_key, const void return err; } +static inline int htab_copy_next_key(struct bpf_htab *htab, void *next_key, const void *key, + u32 key_size) +{ + if (!bpf_map_has_dynptr_key(&htab->map)) { + memcpy(next_key, key, key_size); + return 0; + } + + return htab_copy_dynptr_key(htab, next_key, key, key_size, false); +} + /* Called from syscall */ static int htab_map_get_next_key(struct bpf_map *map, void *key, void *next_key) { struct bpf_htab *htab = container_of(map, struct bpf_htab, map); + const struct btf_record *key_record = map->key_record; struct hlist_nulls_head *head; struct htab_elem *l, *next_l; u32 hash, key_size; @@ -1047,12 +1070,12 @@ static int htab_map_get_next_key(struct bpf_map *map, void *key, void *next_key) if (!key) goto find_first_elem; - hash = htab_map_hash(key, key_size, htab->hashrnd, NULL); + hash = htab_map_hash(key, key_size, htab->hashrnd, key_record); head = select_bucket(htab, hash); /* lookup the key */ - l = lookup_nulls_elem_raw(head, hash, key, key_size, htab->n_buckets, NULL); + l = lookup_nulls_elem_raw(head, hash, key, key_size, htab->n_buckets, key_record); if (!l) goto find_first_elem; @@ -1063,8 +1086,7 @@ static int htab_map_get_next_key(struct bpf_map *map, void *key, void *next_key) if (next_l) { /* if next elem in this hash list is non-zero, just return it */ - memcpy(next_key, next_l->key, key_size); - return 0; + return htab_copy_next_key(htab, next_key, next_l->key, key_size); } /* no more 
elements in this hash list, go to the next bucket */ @@ -1081,8 +1103,7 @@ static int htab_map_get_next_key(struct bpf_map *map, void *key, void *next_key) struct htab_elem, hash_node); if (next_l) { /* if it's not empty, just return it */ - memcpy(next_key, next_l->key, key_size); - return 0; + return htab_copy_next_key(htab, next_key, next_l->key, key_size); } } @@ -1263,7 +1284,7 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key, if (bpf_map_has_dynptr_key(&htab->map)) { int copy_err; - copy_err = htab_copy_dynptr_key(htab, l_new->key, key, key_size); + copy_err = htab_copy_dynptr_key(htab, l_new->key, key, key_size, true); if (copy_err) { bpf_mem_cache_free(&htab->ma, l_new); l_new = ERR_PTR(copy_err);
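On the syscall side, get_next_key on a dynptr-key map copies the stored key contents out through the caller-supplied bpf_dynptr_user buffers and shrinks each dynptr to the stored length. A sketch of the resulting iteration pattern, mirroring the selftests added later in this series (map_fd is assumed to be a pure-dynptr-key hash map whose keys fit in 64 bytes; bpf_dynptr_user_init()/_size()/_data() are the userspace helpers the selftests use):

static void dump_keys(int map_fd)
{
	struct bpf_dynptr_user cur, next;
	char buf[2][64];
	unsigned int idx = 0;
	int err;

	bpf_dynptr_user_init(buf[idx], sizeof(buf[idx]), &next);
	err = bpf_map_get_next_key(map_fd, NULL, &next);
	while (!err) {
		printf("key (%u bytes): %s\n",
		       bpf_dynptr_user_size(&next),
		       (char *)bpf_dynptr_user_data(&next));
		cur = next;
		/* Alternate buffers: the kernel reads cur's payload while
		 * writing next's, so the two must not overlap.
		 */
		idx ^= 1;
		bpf_dynptr_user_init(buf[idx], sizeof(buf[idx]), &next);
		err = bpf_map_get_next_key(map_fd, &cur, &next);
	}
}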
From patchwork Tue Oct 8 09:15:00 2024
X-Patchwork-Id: 13826019
From: Hou Tao
To: bpf@vger.kernel.org
Cc: Martin KaFai Lau, Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman, Song Liu, Hao Luo, Yonghong Song, Daniel Borkmann, KP Singh, Stanislav Fomichev, Jiri Olsa, John Fastabend, houtao1@huawei.com, xukuohai@huawei.com
Subject: [PATCH bpf-next 15/16] bpf: Enable BPF_F_DYNPTR_IN_KEY for hash map
Date: Tue, 8 Oct 2024 17:15:00 +0800
Message-ID: <20241008091501.8302-16-houtao@huaweicloud.com>
In-Reply-To: <20241008091501.8302-1-houtao@huaweicloud.com>

From: Hou Tao

Enable BPF_F_DYNPTR_IN_KEY in HTAB_CREATE_FLAG_MASK to support the
creation of hash maps with dynptr keys.

Signed-off-by: Hou Tao
---
 kernel/bpf/hashtab.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index b647fe9f8f9f..b34693a7f35c 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -19,7 +19,7 @@
 
 #define HTAB_CREATE_FLAG_MASK						\
 	(BPF_F_NO_PREALLOC | BPF_F_NO_COMMON_LRU | BPF_F_NUMA_NODE |	\
-	 BPF_F_ACCESS_MASK | BPF_F_ZERO_SEED)
+	 BPF_F_ACCESS_MASK | BPF_F_ZERO_SEED | BPF_F_DYNPTR_IN_KEY)
 
 #define BATCH_OPS(_name)			\
 	.map_lookup_batch =			\
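With BPF_F_DYNPTR_IN_KEY accepted by the htab flag mask, such maps can also be created from userspace. A hedged sketch using libbpf's bpf_map_create() (the function name, map name, and sizes are illustrative; the BTF ids must describe a key struct containing a struct bpf_dynptr, and the selftests below pair the flag with BPF_F_NO_PREALLOC and use map_extra = 1024 as the maximum dynptr length):

static int create_dynkey_htab(int btf_fd, __u32 key_type_id,
			      __u32 value_type_id, __u32 key_size)
{
	LIBBPF_OPTS(bpf_map_create_opts, opts,
		    .map_flags = BPF_F_NO_PREALLOC | BPF_F_DYNPTR_IN_KEY,
		    .map_extra = 1024,	/* maximum dynptr payload length */
		    .btf_fd = btf_fd,
		    .btf_key_type_id = key_type_id,
		    .btf_value_type_id = value_type_id);

	return bpf_map_create(BPF_MAP_TYPE_HASH, "htab_dynkey", key_size,
			      sizeof(__u64), 10, &opts);
}

In practice the declarative SEC(".maps") definitions in the selftests below are the more convenient interface, since libbpf wires up the BTF automatically.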
From patchwork Tue Oct 8 09:15:01 2024
X-Patchwork-Id: 13826020
From: Hou Tao
To: bpf@vger.kernel.org
Cc: Martin KaFai Lau, Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman, Song Liu, Hao Luo, Yonghong Song, Daniel Borkmann, KP Singh, Stanislav Fomichev, Jiri Olsa, John Fastabend, houtao1@huawei.com, xukuohai@huawei.com
Subject: [PATCH bpf-next 16/16] selftests/bpf: Add test cases for hash map with dynptr key
Date: Tue, 8 Oct 2024 17:15:01 +0800
Message-ID: <20241008091501.8302-17-houtao@huaweicloud.com>
In-Reply-To: <20241008091501.8302-1-houtao@huaweicloud.com>

From: Hou Tao

Add three positive test cases to exercise the basic operations on dynptr-keyed hash maps: lookup, update, delete, lookup_and_delete, and get_next_key. These operations are exercised through both the bpf syscall and bpf programs.

The three test cases use different map keys. The first uses both a bare bpf_dynptr and a struct containing only a bpf_dynptr as the map key, the second uses a struct with an integer and a bpf_dynptr, and the last uses a struct with two bpf_dynptrs.

Also add multiple negative test cases for dynptr-keyed hash maps. They check whether the type of the register used as the map key is the expected one, whether the offset of the access is aligned, and whether the layout of the dynptr and non-dynptr parts on the stack matches the definition in map->key_record.
Signed-off-by: Hou Tao --- .../bpf/prog_tests/htab_dynkey_test.c | 451 ++++++++++++++++++ .../bpf/progs/htab_dynkey_test_failure.c | 270 +++++++++++ .../bpf/progs/htab_dynkey_test_success.c | 399 ++++++++++++++++ 3 files changed, 1120 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/htab_dynkey_test.c create mode 100644 tools/testing/selftests/bpf/progs/htab_dynkey_test_failure.c create mode 100644 tools/testing/selftests/bpf/progs/htab_dynkey_test_success.c diff --git a/tools/testing/selftests/bpf/prog_tests/htab_dynkey_test.c b/tools/testing/selftests/bpf/prog_tests/htab_dynkey_test.c new file mode 100644 index 000000000000..30fc085cfc4c --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/htab_dynkey_test.c @@ -0,0 +1,451 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (C) 2024. Huawei Technologies Co., Ltd */ +#include +#include +#include +#include +#include + +#include "htab_dynkey_test_success.skel.h" +#include "htab_dynkey_test_failure.skel.h" + +struct id_dname_key { + int id; + struct bpf_dynptr_user name; +}; + +struct dname_key { + struct bpf_dynptr_user name; +}; + +struct multiple_dynptr_key { + struct dname_key f_1; + unsigned long f_2; + struct id_dname_key f_3; + unsigned long f_4; +}; + +static char *name_list[] = { + "systemd", + "[rcu_sched]", + "[kworker/42:0H-events_highpri]", + "[ksoftirqd/58]", + "[rcu_tasks_trace]", +}; + +#define INIT_VALUE 100 +#define INIT_ID 1000 + +static void setup_pure_dynptr_key_map(int fd) +{ + struct bpf_dynptr_user key, _cur_key, _next_key; + struct bpf_dynptr_user *cur_key, *next_key; + bool marked[ARRAY_SIZE(name_list)]; + unsigned int i, next_idx, size; + unsigned long value, got; + char name[2][64]; + char msg[64]; + void *data; + int err; + + /* lookup non-existent keys */ + for (i = 0; i < ARRAY_SIZE(name_list); i++) { + snprintf(msg, sizeof(msg), "#%u bad lookup", i); + /* Use strdup() to ensure that the content pointed by dynptr is + * used for lookup instead of the pointer in dynptr. sys_bpf() + * will handle the NULL case properly. 
+ */ + data = strdup(name_list[i]); + bpf_dynptr_user_init(data, strlen(name_list[i]) + 1, &key); + err = bpf_map_lookup_elem(fd, &key, &value); + ASSERT_EQ(err, -ENOENT, msg); + free(data); + } + + /* update keys */ + for (i = 0; i < ARRAY_SIZE(name_list); i++) { + snprintf(msg, sizeof(msg), "#%u insert", i); + data = strdup(name_list[i]); + bpf_dynptr_user_init(data, strlen(name_list[i]) + 1, &key); + value = INIT_VALUE + i; + err = bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST); + ASSERT_OK(err, msg); + free(data); + } + + /* lookup existent keys */ + for (i = 0; i < ARRAY_SIZE(name_list); i++) { + snprintf(msg, sizeof(msg), "#%u lookup", i); + data = strdup(name_list[i]); + bpf_dynptr_user_init(data, strlen(name_list[i]) + 1, &key); + got = 0; + err = bpf_map_lookup_elem(fd, &key, &got); + ASSERT_OK(err, msg); + free(data); + + value = INIT_VALUE + i; + ASSERT_EQ(got, value, msg); + } + + /* delete keys */ + for (i = 0; i < ARRAY_SIZE(name_list); i++) { + snprintf(msg, sizeof(msg), "#%u delete", i); + data = strdup(name_list[i]); + bpf_dynptr_user_init(data, strlen(name_list[i]) + 1, &key); + err = bpf_map_delete_elem(fd, &key); + ASSERT_OK(err, msg); + free(data); + } + + /* re-insert keys */ + for (i = 0; i < ARRAY_SIZE(name_list); i++) { + snprintf(msg, sizeof(msg), "#%u re-insert", i); + data = strdup(name_list[i]); + bpf_dynptr_user_init(data, strlen(name_list[i]) + 1, &key); + value = 0; + err = bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST); + ASSERT_OK(err, msg); + free(data); + } + + /* overwrite keys */ + for (i = 0; i < ARRAY_SIZE(name_list); i++) { + snprintf(msg, sizeof(msg), "#%u overwrite", i); + data = strdup(name_list[i]); + bpf_dynptr_user_init(data, strlen(name_list[i]) + 1, &key); + value = INIT_VALUE + i; + err = bpf_map_update_elem(fd, &key, &value, BPF_EXIST); + ASSERT_OK(err, msg); + free(data); + } + + /* get_next keys */ + next_idx = 0; + cur_key = NULL; + next_key = &_next_key; + memset(&marked, 0, sizeof(marked)); + while (true) { + bpf_dynptr_user_init(name[next_idx], sizeof(name[next_idx]), next_key); + err = bpf_map_get_next_key(fd, cur_key, next_key); + if (err) { + ASSERT_EQ(err, -ENOENT, "get_next_key"); + break; + } + + size = bpf_dynptr_user_size(next_key); + data = bpf_dynptr_user_data(next_key); + for (i = 0; i < ARRAY_SIZE(name_list); i++) { + if (size == strlen(name_list[i]) + 1 && + !memcmp(name_list[i], data, size)) { + ASSERT_FALSE(marked[i], name_list[i]); + marked[i] = true; + break; + } + } + ASSERT_EQ(next_key->rsvd, 0, "rsvd"); + + if (!cur_key) + cur_key = &_cur_key; + *cur_key = *next_key; + next_idx ^= 1; + } + + for (i = 0; i < ARRAY_SIZE(marked); i++) + ASSERT_TRUE(marked[i], name_list[i]); + + /* lookup_and_delete all elements except the first one */ + for (i = 1; i < ARRAY_SIZE(name_list); i++) { + snprintf(msg, sizeof(msg), "#%u lookup_delete", i); + data = strdup(name_list[i]); + bpf_dynptr_user_init(data, strlen(name_list[i]) + 1, &key); + got = 0; + err = bpf_map_lookup_and_delete_elem(fd, &key, &got); + ASSERT_OK(err, msg); + free(data); + + value = INIT_VALUE + i; + ASSERT_EQ(got, value, msg); + } + + /* get the key after the first element */ + cur_key = &_cur_key; + strncpy(name[0], name_list[0], sizeof(name[0]) - 1); + name[0][sizeof(name[0]) - 1] = 0; + bpf_dynptr_user_init(name[0], strlen(name[0]) + 1, cur_key); + + next_key = &_next_key; + bpf_dynptr_user_init(name[1], sizeof(name[1]), next_key); + err = bpf_map_get_next_key(fd, cur_key, next_key); + ASSERT_EQ(err, -ENOENT, "get_last"); +} + +static void 
setup_mixed_dynptr_key_map(int fd) +{ + struct id_dname_key key, _cur_key, _next_key; + struct id_dname_key *cur_key, *next_key; + bool marked[ARRAY_SIZE(name_list)]; + unsigned int i, next_idx, size; + unsigned long value; + char name[2][64]; + char msg[64]; + void *data; + int err; + + /* Zero the hole */ + memset(&key, 0, sizeof(key)); + + /* lookup non-existent keys */ + for (i = 0; i < ARRAY_SIZE(name_list); i++) { + snprintf(msg, sizeof(msg), "#%u bad lookup", i); + key.id = INIT_ID + i; + data = strdup(name_list[i]); + bpf_dynptr_user_init(data, strlen(name_list[i]) + 1, &key.name); + err = bpf_map_lookup_elem(fd, &key, &value); + ASSERT_EQ(err, -ENOENT, msg); + free(data); + } + + /* update keys */ + for (i = 0; i < ARRAY_SIZE(name_list); i++) { + snprintf(msg, sizeof(msg), "#%u insert", i); + key.id = INIT_ID + i; + data = strdup(name_list[i]); + bpf_dynptr_user_init(data, strlen(name_list[i]) + 1, &key.name); + value = INIT_VALUE + i; + err = bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST); + ASSERT_OK(err, msg); + free(data); + } + + /* lookup existent keys */ + for (i = 0; i < ARRAY_SIZE(name_list); i++) { + unsigned long got = 0; + + snprintf(msg, sizeof(msg), "#%u lookup", i); + key.id = INIT_ID + i; + data = strdup(name_list[i]); + bpf_dynptr_user_init(data, strlen(name_list[i]) + 1, &key.name); + err = bpf_map_lookup_elem(fd, &key, &got); + ASSERT_OK(err, msg); + free(data); + + value = INIT_VALUE + i; + ASSERT_EQ(got, value, msg); + } + + /* delete keys */ + for (i = 0; i < ARRAY_SIZE(name_list); i++) { + snprintf(msg, sizeof(msg), "#%u delete", i); + key.id = INIT_ID + i; + data = strdup(name_list[i]); + bpf_dynptr_user_init(data, strlen(name_list[i]) + 1, &key.name); + err = bpf_map_delete_elem(fd, &key); + ASSERT_OK(err, msg); + free(data); + } + + /* re-insert keys */ + for (i = 0; i < ARRAY_SIZE(name_list); i++) { + snprintf(msg, sizeof(msg), "#%u re-insert", i); + key.id = INIT_ID + i; + data = strdup(name_list[i]); + bpf_dynptr_user_init(data, strlen(name_list[i]) + 1, &key.name); + value = 0; + err = bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST); + ASSERT_OK(err, msg); + free(data); + } + + /* overwrite keys */ + for (i = 0; i < ARRAY_SIZE(name_list); i++) { + snprintf(msg, sizeof(msg), "#%u overwrite", i); + key.id = INIT_ID + i; + data = strdup(name_list[i]); + bpf_dynptr_user_init(data, strlen(name_list[i]) + 1, &key.name); + value = INIT_VALUE + i; + err = bpf_map_update_elem(fd, &key, &value, BPF_EXIST); + ASSERT_OK(err, msg); + free(data); + } + + /* get_next keys */ + next_idx = 0; + cur_key = NULL; + next_key = &_next_key; + memset(&marked, 0, sizeof(marked)); + while (true) { + bpf_dynptr_user_init(name[next_idx], sizeof(name[next_idx]), &next_key->name); + err = bpf_map_get_next_key(fd, cur_key, next_key); + if (err) { + ASSERT_EQ(err, -ENOENT, "last get_next"); + break; + } + + size = bpf_dynptr_user_size(&next_key->name); + data = bpf_dynptr_user_data(&next_key->name); + for (i = 0; i < ARRAY_SIZE(name_list); i++) { + if (size == strlen(name_list[i]) + 1 && + !memcmp(name_list[i], data, size)) { + ASSERT_FALSE(marked[i], name_list[i]); + ASSERT_EQ(next_key->id, INIT_ID + i, name_list[i]); + marked[i] = true; + break; + } + } + ASSERT_EQ(next_key->name.rsvd, 0, "rsvd"); + + if (!cur_key) + cur_key = &_cur_key; + *cur_key = *next_key; + next_idx ^= 1; + } + + for (i = 0; i < ARRAY_SIZE(marked); i++) + ASSERT_TRUE(marked[i], name_list[i]); +} + +static void setup_multiple_dynptr_key_map(int fd) +{ + struct multiple_dynptr_key key, cur_key, 
next_key; + unsigned long value; + unsigned int size; + char name[4][64]; + void *data[2]; + int err; + + /* Zero the hole */ + memset(&key, 0, sizeof(key)); + + key.f_2 = 2; + key.f_3.id = 3; + key.f_4 = 4; + + /* lookup a non-existent key */ + data[0] = strdup(name_list[0]); + data[1] = strdup(name_list[1]); + bpf_dynptr_user_init(data[0], strlen(name_list[0]) + 1, &key.f_1.name); + bpf_dynptr_user_init(data[1], strlen(name_list[1]) + 1, &key.f_3.name); + err = bpf_map_lookup_elem(fd, &key, &value); + ASSERT_EQ(err, -ENOENT, "lookup"); + + /* update key */ + value = INIT_VALUE; + err = bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST); + ASSERT_OK(err, "update"); + free(data[0]); + free(data[1]); + + /* lookup key */ + data[0] = strdup(name_list[0]); + data[1] = strdup(name_list[1]); + bpf_dynptr_user_init(data[0], strlen(name_list[0]) + 1, &key.f_1.name); + bpf_dynptr_user_init(data[1], strlen(name_list[1]) + 1, &key.f_3.name); + err = bpf_map_lookup_elem(fd, &key, &value); + ASSERT_OK(err, "lookup"); + ASSERT_EQ(value, INIT_VALUE, "lookup"); + + /* delete key */ + err = bpf_map_delete_elem(fd, &key); + ASSERT_OK(err, "delete"); + free(data[0]); + free(data[1]); + + /* re-insert keys */ + bpf_dynptr_user_init(name_list[0], strlen(name_list[0]) + 1, &key.f_1.name); + bpf_dynptr_user_init(name_list[1], strlen(name_list[1]) + 1, &key.f_3.name); + value = 0; + err = bpf_map_update_elem(fd, &key, &value, BPF_NOEXIST); + ASSERT_OK(err, "re-insert"); + + /* overwrite keys */ + data[0] = strdup(name_list[0]); + data[1] = strdup(name_list[1]); + bpf_dynptr_user_init(data[0], strlen(name_list[0]) + 1, &key.f_1.name); + bpf_dynptr_user_init(data[1], strlen(name_list[1]) + 1, &key.f_3.name); + value = INIT_VALUE; + err = bpf_map_update_elem(fd, &key, &value, BPF_EXIST); + ASSERT_OK(err, "overwrite"); + free(data[0]); + free(data[1]); + + /* get_next_key */ + bpf_dynptr_user_init(name[0], sizeof(name[0]), &next_key.f_1.name); + bpf_dynptr_user_init(name[1], sizeof(name[1]), &next_key.f_3.name); + err = bpf_map_get_next_key(fd, NULL, &next_key); + ASSERT_OK(err, "first get_next"); + + size = bpf_dynptr_user_size(&next_key.f_1.name); + data[0] = bpf_dynptr_user_data(&next_key.f_1.name); + if (ASSERT_EQ(size, strlen(name_list[0]) + 1, "f_1 size")) + ASSERT_TRUE(!memcmp(name_list[0], data[0], size), "f_1 data"); + ASSERT_EQ(next_key.f_1.name.rsvd, 0, "f_1 rsvd"); + + ASSERT_EQ(next_key.f_2, 2, "f_2"); + + ASSERT_EQ(next_key.f_3.id, 3, "f_3 id"); + size = bpf_dynptr_user_size(&next_key.f_3.name); + data[0] = bpf_dynptr_user_data(&next_key.f_3.name); + if (ASSERT_EQ(size, strlen(name_list[1]) + 1, "f_3 size")) + ASSERT_TRUE(!memcmp(name_list[1], data[0], size), "f_3 data"); + ASSERT_EQ(next_key.f_3.name.rsvd, 0, "f_3 rsvd"); + + ASSERT_EQ(next_key.f_4, 4, "f_4"); + + cur_key = next_key; + bpf_dynptr_user_init(name[2], sizeof(name[2]), &next_key.f_1.name); + bpf_dynptr_user_init(name[3], sizeof(name[3]), &next_key.f_3.name); + err = bpf_map_get_next_key(fd, &cur_key, &next_key); + ASSERT_EQ(err, -ENOENT, "last get_next_key"); +} + +static void test_htab_dynptr_key(bool pure, bool multiple) +{ + struct htab_dynkey_test_success *skel; + struct bpf_program *prog; + int err; + + skel = htab_dynkey_test_success__open(); + if (!ASSERT_OK_PTR(skel, "open()")) + return; + + prog = pure ? skel->progs.pure_dynptr_key : + (multiple ? 
skel->progs.multiple_dynptr_key : skel->progs.mixed_dynptr_key); + bpf_program__set_autoload(prog, true); + + err = htab_dynkey_test_success__load(skel); + if (!ASSERT_OK(err, "load()")) + goto out; + + if (pure) { + setup_pure_dynptr_key_map(bpf_map__fd(skel->maps.htab_1)); + setup_pure_dynptr_key_map(bpf_map__fd(skel->maps.htab_2)); + } else if (multiple) { + setup_multiple_dynptr_key_map(bpf_map__fd(skel->maps.htab_4)); + } else { + setup_mixed_dynptr_key_map(bpf_map__fd(skel->maps.htab_3)); + } + + skel->bss->pid = getpid(); + + err = htab_dynkey_test_success__attach(skel); + if (!ASSERT_OK(err, "attach()")) + goto out; + + usleep(1); + + ASSERT_EQ(skel->bss->test_err, 0, "test"); +out: + htab_dynkey_test_success__destroy(skel); +} + +void test_htab_dynkey_test(void) +{ + if (test__start_subtest("pure_dynptr_key")) + test_htab_dynptr_key(true, false); + if (test__start_subtest("mixed_dynptr_key")) + test_htab_dynptr_key(false, false); + if (test__start_subtest("multiple_dynptr_key")) + test_htab_dynptr_key(false, true); + + RUN_TESTS(htab_dynkey_test_failure); +} diff --git a/tools/testing/selftests/bpf/progs/htab_dynkey_test_failure.c b/tools/testing/selftests/bpf/progs/htab_dynkey_test_failure.c new file mode 100644 index 000000000000..c391e4fc5320 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/htab_dynkey_test_failure.c @@ -0,0 +1,270 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (C) 2024. Huawei Technologies Co., Ltd */ +#include +#include +#include +#include +#include + +#include "bpf_misc.h" + +char _license[] SEC("license") = "GPL"; + +struct bpf_map; + +struct id_dname_key { + int id; + struct bpf_dynptr name; +}; + +struct dname_id_key { + struct bpf_dynptr name; + int id; +}; + +struct id_name_key { + int id; + char name[20]; +}; + +struct dname_key { + struct bpf_dynptr name; +}; + +struct dname_dname_key { + struct bpf_dynptr name_1; + struct bpf_dynptr name_2; +}; + +struct dname_dname_id_key { + struct dname_dname_key names; + __u64 id; +}; + +struct dname_id_id_id_key { + struct bpf_dynptr name; + __u64 id[3]; +}; + +struct dname_dname_dname_key { + struct bpf_dynptr name_1; + struct bpf_dynptr name_2; + struct bpf_dynptr name_3; +}; + +struct { + __uint(type, BPF_MAP_TYPE_HASH); + __uint(max_entries, 10); + __uint(map_flags, BPF_F_NO_PREALLOC | BPF_F_DYNPTR_IN_KEY); + __type(key, struct id_dname_key); + __type(value, unsigned long); + __uint(map_extra, 1024); +} htab_1 SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_HASH); + __uint(max_entries, 10); + __uint(map_flags, BPF_F_NO_PREALLOC | BPF_F_DYNPTR_IN_KEY); + __type(key, struct dname_key); + __type(value, unsigned long); + __uint(map_extra, 1024); +} htab_2 SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_HASH); + __uint(max_entries, 10); + __uint(map_flags, BPF_F_NO_PREALLOC | BPF_F_DYNPTR_IN_KEY); + __type(key, struct dname_dname_id_key); + __type(value, unsigned long); + __uint(map_extra, 1024); +} htab_3 SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_HASH); + __uint(max_entries, 10); + __uint(map_flags, BPF_F_NO_PREALLOC | BPF_F_DYNPTR_IN_KEY); + __type(key, struct bpf_dynptr); + __type(value, unsigned long); + __uint(map_extra, 1024); +} htab_4 SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_RINGBUF); + __uint(max_entries, 4096); +} ringbuf SEC(".maps"); + +char dynptr_buf[32] = {}; + +/* uninitialized dynptr */ +SEC("fentry/" SYS_PREFIX "sys_nanosleep") +__failure __msg("dynptr-key expects dynptr at offset 8") +int BPF_PROG(uninit_dynptr) +{ + struct id_dname_key key; + + 
key.id = 100; + bpf_map_lookup_elem(&htab_1, &key); + + return 0; +} + +/* invalid dynptr */ +SEC("fentry/" SYS_PREFIX "sys_nanosleep") +__failure __msg("dynptr-key expects dynptr at offset 8") +int BPF_PROG(invalid_dynptr) +{ + struct id_dname_key key; + + key.id = 100; + bpf_ringbuf_reserve_dynptr(&ringbuf, 10, 0, &key.name); + bpf_ringbuf_discard_dynptr(&key.name, 0); + bpf_map_lookup_elem(&htab_1, &key); + + return 0; +} + +/* expect no-dynptr got dynptr */ +SEC("fentry/" SYS_PREFIX "sys_nanosleep") +__failure __msg("dynptr-key expects non-dynptr at offset 0") +int BPF_PROG(invalid_non_dynptr) +{ + struct dname_id_key key; + + __builtin_memcpy(dynptr_buf, "test", 4); + bpf_dynptr_from_mem(dynptr_buf, 4, 0, &key.name); + key.id = 100; + bpf_map_lookup_elem(&htab_1, &key); + + return 0; +} + +/* expect dynptr get non-dynptr */ +SEC("fentry/" SYS_PREFIX "sys_nanosleep") +__failure __msg("dynptr-key expects dynptr at offset 8") +int BPF_PROG(no_dynptr) +{ + struct id_name_key key; + + key.id = 100; + __builtin_memset(key.name, 0, sizeof(key.name)); + __builtin_memcpy(key.name, "test", 4); + bpf_map_lookup_elem(&htab_1, &key); + + return 0; +} + +/* malformed */ +SEC("fentry/" SYS_PREFIX "sys_nanosleep") +__failure __msg("malformed dynptr-key at offset 8") +int BPF_PROG(malformed_dynptr) +{ + struct dname_dname_key key; + + bpf_dynptr_from_mem(dynptr_buf, 4, 0, &key.name_1); + bpf_dynptr_from_mem(dynptr_buf, 4, 0, &key.name_2); + + bpf_map_lookup_elem(&htab_2, (void *)&key + 8); + + return 0; +} + +/* expect no-dynptr got dynptr */ +SEC("fentry/" SYS_PREFIX "sys_nanosleep") +__failure __msg("dynptr-key expects non-dynptr at offset 32") +int BPF_PROG(invalid_non_dynptr_2) +{ + struct dname_dname_dname_key key; + + bpf_dynptr_from_mem(dynptr_buf, 4, 0, &key.name_1); + bpf_dynptr_from_mem(dynptr_buf, 4, 0, &key.name_2); + bpf_dynptr_from_mem(dynptr_buf, 4, 0, &key.name_3); + + bpf_map_lookup_elem(&htab_3, &key); + + return 0; +} + +/* expect dynptr get non-dynptr */ +SEC("fentry/" SYS_PREFIX "sys_nanosleep") +__failure __msg("dynptr-key expects dynptr at offset 16") +int BPF_PROG(no_dynptr_2) +{ + struct dname_id_id_id_key key; + + bpf_dynptr_from_mem(dynptr_buf, 4, 0, &key.name); + bpf_map_lookup_elem(&htab_3, &key); + + return 0; +} + +/* misaligned */ +SEC("fentry/" SYS_PREFIX "sys_nanosleep") +__failure __msg("R2 misaligned offset -28 for dynptr-key") +int BPF_PROG(misaligned_dynptr) +{ + struct dname_dname_key key; + + bpf_map_lookup_elem(&htab_1, (char *)&key + 4); + + return 0; +} + +/* variable offset */ +SEC("fentry/" SYS_PREFIX "sys_nanosleep") +__failure __msg("R2 variable offset prohibited for dynptr-key") +int BPF_PROG(variable_offset_dynptr) +{ + struct bpf_dynptr dynptr_1; + struct bpf_dynptr dynptr_2; + char *key; + + bpf_dynptr_from_mem(dynptr_buf, 4, 0, &dynptr_1); + bpf_dynptr_from_mem(dynptr_buf, 4, 0, &dynptr_2); + + key = (char *)&dynptr_2; + key = key + (bpf_get_prandom_u32() & 1) * 16; + + bpf_map_lookup_elem(&htab_2, key); + + return 0; +} + +SEC("fentry/" SYS_PREFIX "sys_nanosleep") +__failure __msg("map dynptr-key requires stack ptr but got map_value") +int BPF_PROG(map_value_as_key) +{ + bpf_map_lookup_elem(&htab_1, dynptr_buf); + + return 0; +} + +static int lookup_htab(struct bpf_map *map, struct id_dname_key *key, void *value, void *data) +{ + bpf_map_lookup_elem(&htab_1, key); + return 0; +} + +SEC("fentry/" SYS_PREFIX "sys_nanosleep") +__failure __msg("map dynptr-key requires stack ptr but got map_key") +int BPF_PROG(map_key_as_key) +{ + 
bpf_for_each_map_elem(&htab_1, lookup_htab, NULL, 0); + return 0; +} + +__noinline __weak int subprog_lookup_htab(struct bpf_dynptr *dynptr) +{ + bpf_map_lookup_elem(&htab_4, dynptr); + return 0; +} + +SEC("fentry/" SYS_PREFIX "sys_nanosleep") +__failure __msg("R2 type=dynptr_ptr expected=") +int BPF_PROG(subprog_dynptr) +{ + struct bpf_dynptr dynptr; + + bpf_dynptr_from_mem(dynptr_buf, 4, 0, &dynptr); + subprog_lookup_htab(&dynptr); + return 0; +} diff --git a/tools/testing/selftests/bpf/progs/htab_dynkey_test_success.c b/tools/testing/selftests/bpf/progs/htab_dynkey_test_success.c new file mode 100644 index 000000000000..52736b3519fb --- /dev/null +++ b/tools/testing/selftests/bpf/progs/htab_dynkey_test_success.c @@ -0,0 +1,399 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (C) 2024. Huawei Technologies Co., Ltd */ +#include +#include +#include +#include +#include + +#include "bpf_misc.h" + +char _license[] SEC("license") = "GPL"; + +struct pure_dynptr_key { + struct bpf_dynptr name; +}; + +struct mixed_dynptr_key { + int id; + struct bpf_dynptr name; +}; + +struct multiple_dynptr_key { + struct pure_dynptr_key f_1; + unsigned long f_2; + struct mixed_dynptr_key f_3; + unsigned long f_4; +}; + +struct { + __uint(type, BPF_MAP_TYPE_HASH); + __uint(max_entries, 10); + __uint(map_flags, BPF_F_NO_PREALLOC | BPF_F_DYNPTR_IN_KEY); + __type(key, struct bpf_dynptr); + __type(value, unsigned long); + __uint(map_extra, 1024); +} htab_1 SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_HASH); + __uint(max_entries, 10); + __uint(map_flags, BPF_F_NO_PREALLOC | BPF_F_DYNPTR_IN_KEY); + __type(key, struct pure_dynptr_key); + __type(value, unsigned long); + __uint(map_extra, 1024); +} htab_2 SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_HASH); + __uint(max_entries, 10); + __uint(map_flags, BPF_F_NO_PREALLOC | BPF_F_DYNPTR_IN_KEY); + __type(key, struct mixed_dynptr_key); + __type(value, unsigned long); + __uint(map_extra, 1024); +} htab_3 SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_HASH); + __uint(max_entries, 10); + __uint(map_flags, BPF_F_NO_PREALLOC | BPF_F_DYNPTR_IN_KEY); + __type(key, struct multiple_dynptr_key); + __type(value, unsigned long); + __uint(map_extra, 1024); +} htab_4 SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_RINGBUF); + __uint(max_entries, 4096); +} ringbuf SEC(".maps"); + +int pid = 0; +int test_err = 0; +char dynptr_buf[2][32] = {{}, {}}; + +static const char systemd_name[] = "systemd"; +static const char udevd_name[] = "udevd"; +static const char rcu_sched_name[] = "[rcu_sched]"; + +struct bpf_map; + +static int test_pure_dynptr_key_htab(struct bpf_map *htab) +{ + unsigned long new_value, *value; + struct bpf_dynptr key; + int err = 0; + + /* Lookup a existent key */ + __builtin_memcpy(dynptr_buf[0], systemd_name, sizeof(systemd_name)); + bpf_dynptr_from_mem(dynptr_buf[0], sizeof(systemd_name), 0, &key); + value = bpf_map_lookup_elem(htab, &key); + if (!value) { + err = 1; + goto out; + } + if (*value != 100) { + err = 2; + goto out; + } + + /* Look up a non-existent key */ + __builtin_memcpy(dynptr_buf[0], udevd_name, sizeof(udevd_name)); + bpf_dynptr_from_mem(dynptr_buf[0], sizeof(udevd_name), 0, &key); + value = bpf_map_lookup_elem(htab, &key); + if (value) { + err = 3; + goto out; + } + + /* Insert a new key */ + new_value = 42; + err = bpf_map_update_elem(htab, &key, &new_value, BPF_NOEXIST); + if (err) { + err = 4; + goto out; + } + + /* Insert an existent key */ + bpf_ringbuf_reserve_dynptr(&ringbuf, sizeof(udevd_name), 0, &key); + err 
= bpf_dynptr_write(&key, 0, (void *)udevd_name, sizeof(udevd_name), 0); + if (err) { + bpf_ringbuf_discard_dynptr(&key, 0); + err = 5; + goto out; + } + + err = bpf_map_update_elem(htab, &key, &new_value, BPF_NOEXIST); + bpf_ringbuf_discard_dynptr(&key, 0); + if (err != -EEXIST) { + err = 6; + goto out; + } + + /* Lookup it again */ + bpf_dynptr_from_mem(dynptr_buf[0], sizeof(udevd_name), 0, &key); + value = bpf_map_lookup_elem(htab, &key); + if (!value) { + err = 7; + goto out; + } + if (*value != 42) { + err = 8; + goto out; + } + + /* Delete then lookup it */ + bpf_ringbuf_reserve_dynptr(&ringbuf, sizeof(udevd_name), 0, &key); + err = bpf_dynptr_write(&key, 0, (void *)udevd_name, sizeof(udevd_name), 0); + if (err) { + bpf_ringbuf_discard_dynptr(&key, 0); + err = 9; + goto out; + } + err = bpf_map_delete_elem(htab, &key); + bpf_ringbuf_discard_dynptr(&key, 0); + if (err) { + err = 10; + goto out; + } + + bpf_dynptr_from_mem(dynptr_buf[0], sizeof(udevd_name), 0, &key); + value = bpf_map_lookup_elem(htab, &key); + if (value) { + err = 10; + goto out; + } +out: + return err; +} + +static int test_mixed_dynptr_key_htab(struct bpf_map *htab) +{ + unsigned long new_value, *value; + char udevd_name[] = "udevd"; + struct mixed_dynptr_key key; + int err = 0; + + __builtin_memset(&key, 0, sizeof(key)); + key.id = 1000; + + /* Lookup a existent key */ + __builtin_memcpy(dynptr_buf[0], systemd_name, sizeof(systemd_name)); + bpf_dynptr_from_mem(dynptr_buf[0], sizeof(systemd_name), 0, &key.name); + value = bpf_map_lookup_elem(htab, &key); + if (!value) { + err = 1; + goto out; + } + if (*value != 100) { + err = 2; + goto out; + } + + /* Look up a non-existent key */ + __builtin_memcpy(dynptr_buf[0], udevd_name, sizeof(udevd_name)); + bpf_dynptr_from_mem(dynptr_buf[0], sizeof(udevd_name), 0, &key.name); + value = bpf_map_lookup_elem(htab, &key); + if (value) { + err = 3; + goto out; + } + + /* Insert a new key */ + new_value = 42; + err = bpf_map_update_elem(htab, &key, &new_value, BPF_NOEXIST); + if (err) { + err = 4; + goto out; + } + + /* Insert an existent key */ + bpf_ringbuf_reserve_dynptr(&ringbuf, sizeof(udevd_name), 0, &key.name); + err = bpf_dynptr_write(&key.name, 0, (void *)udevd_name, sizeof(udevd_name), 0); + if (err) { + bpf_ringbuf_discard_dynptr(&key.name, 0); + err = 5; + goto out; + } + + err = bpf_map_update_elem(htab, &key, &new_value, BPF_NOEXIST); + bpf_ringbuf_discard_dynptr(&key.name, 0); + if (err != -EEXIST) { + err = 6; + goto out; + } + + /* Lookup it again */ + bpf_dynptr_from_mem(dynptr_buf[0], sizeof(udevd_name), 0, &key.name); + value = bpf_map_lookup_elem(htab, &key); + if (!value) { + err = 7; + goto out; + } + if (*value != 42) { + err = 8; + goto out; + } + + /* Delete then lookup it */ + bpf_ringbuf_reserve_dynptr(&ringbuf, sizeof(udevd_name), 0, &key.name); + err = bpf_dynptr_write(&key.name, 0, (void *)udevd_name, sizeof(udevd_name), 0); + if (err) { + bpf_ringbuf_discard_dynptr(&key.name, 0); + err = 9; + goto out; + } + err = bpf_map_delete_elem(htab, &key); + bpf_ringbuf_discard_dynptr(&key.name, 0); + if (err) { + err = 10; + goto out; + } + + bpf_dynptr_from_mem(dynptr_buf[0], sizeof(udevd_name), 0, &key.name); + value = bpf_map_lookup_elem(htab, &key); + if (value) { + err = 10; + goto out; + } +out: + return err; +} + +static int test_multiple_dynptr_key_htab(struct bpf_map *htab) +{ + unsigned long new_value, *value; + struct multiple_dynptr_key key; + int err = 0; + + __builtin_memset(&key, 0, sizeof(key)); + key.f_2 = 2; + key.f_3.id = 3; + key.f_4 = 4; 
+ + /* Lookup a existent key */ + __builtin_memcpy(dynptr_buf[0], systemd_name, sizeof(systemd_name)); + bpf_dynptr_from_mem(dynptr_buf[0], sizeof(systemd_name), 0, &key.f_1.name); + __builtin_memcpy(dynptr_buf[1], rcu_sched_name, sizeof(rcu_sched_name)); + bpf_dynptr_from_mem(dynptr_buf[1], sizeof(rcu_sched_name), 0, &key.f_3.name); + value = bpf_map_lookup_elem(htab, &key); + if (!value) { + err = 1; + goto out; + } + if (*value != 100) { + err = 2; + goto out; + } + + /* Look up a non-existent key */ + bpf_dynptr_from_mem(dynptr_buf[1], sizeof(rcu_sched_name), 0, &key.f_1.name); + bpf_dynptr_from_mem(dynptr_buf[0], sizeof(systemd_name), 0, &key.f_3.name); + value = bpf_map_lookup_elem(htab, &key); + if (value) { + err = 3; + goto out; + } + + /* Insert a new key */ + new_value = 42; + err = bpf_map_update_elem(htab, &key, &new_value, BPF_NOEXIST); + if (err) { + err = 4; + goto out; + } + + /* Insert an existent key */ + bpf_ringbuf_reserve_dynptr(&ringbuf, sizeof(rcu_sched_name), 0, &key.f_1.name); + err = bpf_dynptr_write(&key.f_1.name, 0, (void *)rcu_sched_name, sizeof(rcu_sched_name), 0); + if (err) { + bpf_ringbuf_discard_dynptr(&key.f_1.name, 0); + err = 5; + goto out; + } + err = bpf_map_update_elem(htab, &key, &new_value, BPF_NOEXIST); + bpf_ringbuf_discard_dynptr(&key.f_1.name, 0); + if (err != -EEXIST) { + err = 6; + goto out; + } + + /* Lookup a non-existent key */ + bpf_dynptr_from_mem(dynptr_buf[1], sizeof(rcu_sched_name), 0, &key.f_1.name); + key.f_4 = 0; + value = bpf_map_lookup_elem(htab, &key); + if (value) { + err = 7; + goto out; + } + + /* Lookup an existent key */ + key.f_4 = 4; + value = bpf_map_lookup_elem(htab, &key); + if (!value) { + err = 8; + goto out; + } + if (*value != 42) { + err = 9; + goto out; + } + + /* Delete the newly-inserted key */ + bpf_ringbuf_reserve_dynptr(&ringbuf, sizeof(systemd_name), 0, &key.f_3.name); + err = bpf_dynptr_write(&key.f_3.name, 0, (void *)systemd_name, sizeof(systemd_name), 0); + if (err) { + bpf_ringbuf_discard_dynptr(&key.f_3.name, 0); + err = 10; + goto out; + } + err = bpf_map_delete_elem(htab, &key); + if (err) { + bpf_ringbuf_discard_dynptr(&key.f_3.name, 0); + err = 11; + goto out; + } + + /* Lookup it again */ + value = bpf_map_lookup_elem(htab, &key); + bpf_ringbuf_discard_dynptr(&key.f_3.name, 0); + if (value) { + err = 12; + goto out; + } +out: + return err; +} + +SEC("?fentry/" SYS_PREFIX "sys_nanosleep") +int BPF_PROG(pure_dynptr_key) +{ + if (bpf_get_current_pid_tgid() >> 32 != pid) + return 0; + + test_err = test_pure_dynptr_key_htab((struct bpf_map *)&htab_1); + test_err |= test_pure_dynptr_key_htab((struct bpf_map *)&htab_2) << 8; + + return 0; +} + +SEC("?fentry/" SYS_PREFIX "sys_nanosleep") +int BPF_PROG(mixed_dynptr_key) +{ + if (bpf_get_current_pid_tgid() >> 32 != pid) + return 0; + + test_err = test_mixed_dynptr_key_htab((struct bpf_map *)&htab_3); + + return 0; +} + +SEC("?fentry/" SYS_PREFIX "sys_nanosleep") +int BPF_PROG(multiple_dynptr_key) +{ + if (bpf_get_current_pid_tgid() >> 32 != pid) + return 0; + + test_err = test_multiple_dynptr_key_htab((struct bpf_map *)&htab_4); + + return 0; +}