From patchwork Tue Oct 15 00:49:51 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Martin KaFai Lau X-Patchwork-Id: 13835566 X-Patchwork-Delegate: bpf@iogearbox.net Received: from out-170.mta1.migadu.com (out-170.mta1.migadu.com [95.215.58.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 417CC33C9 for ; Tue, 15 Oct 2024 00:50:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953424; cv=none; b=l8lQLcFigulMU3D1aTXCuhTHWfvQyUcspYrIy42FwUYE0d6yazdd1QHJWHSE8jgh1OzFgpbvaR8R+gdW+PoLQzW+lghuOT8ji6ZpJ66suk79ABCLSMHYU+/qzOr1D/SxwJUS2xr1WxdDc0pUBqrjAK20pK4nvOReufzCWAwZoyw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953424; c=relaxed/simple; bh=kQnYtQ7y941VN96pgnq7W/zNsGFgyRpWsyM068kURNY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=NFPQvkUszB3ujJsoYNBeyeSfftLc012miyJtKguK0v7CVK55wWbIgZ5CUldCbMKMAiy5TnP95ul8NKR6hUahcrayQeW17owQwVXTQKV+l90MNy79zSP3ncJtlCQwlFgEJNfGNe3JrfKYC7lrI49C0IMpfZ40w2I3xbnS6i2hwHw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=a/zFA9ue; arc=none smtp.client-ip=95.215.58.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="a/zFA9ue" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1728953420; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=IoJSzDKt3Sa1zekPWVmTYBH4olnC0i2ep9oSD3Rch18=; b=a/zFA9uekZ26V/Qo/Whthf5c6vRzn1GJqd1FdTxGcpxYWesGmy2LZ2+b0M92z9GqQOOXPJ +qaz0o3oFDurIhzEHI2+DPHtFvm1ws9uNggvpKCeV0dvIk2yEE8sVbfgY63zXjJ4NXYY5K TxLrzeDAnSyARGZBT/0UBSf5UdS+N2c= From: Martin KaFai Lau To: bpf@vger.kernel.org Cc: Alexei Starovoitov , Andrii Nakryiko , Daniel Borkmann , Kui-Feng Lee , kernel-team@meta.com Subject: [PATCH v5 bpf-next 01/12] bpf: Support __uptr type tag in BTF Date: Mon, 14 Oct 2024 17:49:51 -0700 Message-ID: <20241015005008.767267-2-martin.lau@linux.dev> In-Reply-To: <20241015005008.767267-1-martin.lau@linux.dev> References: <20241015005008.767267-1-martin.lau@linux.dev> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Patchwork-Delegate: bpf@iogearbox.net From: Kui-Feng Lee This patch introduces the "__uptr" type tag to BTF. It is to define a pointer pointing to the user space memory. This patch adds BTF logic to pass the "__uptr" type tag. btf_find_kptr() is reused for the "__uptr" tag. The "__uptr" will only be supported in the map_value of the task storage map. However, btf_parse_struct_meta() also uses btf_find_kptr() but it is not interested in "__uptr". This patch adds a "field_mask" argument to btf_find_kptr() which will return BTF_FIELD_IGNORE if the caller is not interested in a “__uptr” field. btf_parse_kptr() is also reused to parse the uptr. The btf_check_and_fixup_fields() is changed to do extra checks on the uptr to ensure that its struct size is not larger than PAGE_SIZE. It is not clear how a uptr pointing to a CO-RE supported kernel struct will be used, so it is also not allowed now. Signed-off-by: Kui-Feng Lee Signed-off-by: Martin KaFai Lau --- Changes in v5: - A field_mask arg is added to btf_find_kptr - Some uptr enforcement is added to btf_check_and_fixup_fields() - The "case MAP_UPTR:" addition to bpf_obj_init_field() is moved to the later patch together with other bpf_obj_*() changes when BPF_UPTR is finally enabled in task storage map. include/linux/bpf.h | 5 +++++ kernel/bpf/btf.c | 32 +++++++++++++++++++++++++++----- kernel/bpf/syscall.c | 2 ++ 3 files changed, 34 insertions(+), 5 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 19d8ca8ac960..cdd0a891ce55 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -203,6 +203,7 @@ enum btf_field_type { BPF_GRAPH_ROOT = BPF_RB_ROOT | BPF_LIST_HEAD, BPF_REFCOUNT = (1 << 9), BPF_WORKQUEUE = (1 << 10), + BPF_UPTR = (1 << 11), }; typedef void (*btf_dtor_kfunc_t)(void *); @@ -322,6 +323,8 @@ static inline const char *btf_field_type_name(enum btf_field_type type) return "kptr"; case BPF_KPTR_PERCPU: return "percpu_kptr"; + case BPF_UPTR: + return "uptr"; case BPF_LIST_HEAD: return "bpf_list_head"; case BPF_LIST_NODE: @@ -350,6 +353,7 @@ static inline u32 btf_field_type_size(enum btf_field_type type) case BPF_KPTR_UNREF: case BPF_KPTR_REF: case BPF_KPTR_PERCPU: + case BPF_UPTR: return sizeof(u64); case BPF_LIST_HEAD: return sizeof(struct bpf_list_head); @@ -379,6 +383,7 @@ static inline u32 btf_field_type_align(enum btf_field_type type) case BPF_KPTR_UNREF: case BPF_KPTR_REF: case BPF_KPTR_PERCPU: + case BPF_UPTR: return __alignof__(u64); case BPF_LIST_HEAD: return __alignof__(struct bpf_list_head); diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 13dd1fa1d1b9..e15f41175f13 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -3334,7 +3334,7 @@ static int btf_find_struct(const struct btf *btf, const struct btf_type *t, } static int btf_find_kptr(const struct btf *btf, const struct btf_type *t, - u32 off, int sz, struct btf_field_info *info) + u32 off, int sz, struct btf_field_info *info, u32 field_mask) { enum btf_field_type type; u32 res_id; @@ -3358,9 +3358,14 @@ static int btf_find_kptr(const struct btf *btf, const struct btf_type *t, type = BPF_KPTR_REF; else if (!strcmp("percpu_kptr", __btf_name_by_offset(btf, t->name_off))) type = BPF_KPTR_PERCPU; + else if (!strcmp("uptr", __btf_name_by_offset(btf, t->name_off))) + type = BPF_UPTR; else return -EINVAL; + if (!(type & field_mask)) + return BTF_FIELD_IGNORE; + /* Get the base type */ t = btf_type_skip_modifiers(btf, t->type, &res_id); /* Only pointer to struct is allowed */ @@ -3502,7 +3507,7 @@ static int btf_get_field_type(const struct btf *btf, const struct btf_type *var_ field_mask_test_name(BPF_REFCOUNT, "bpf_refcount"); /* Only return BPF_KPTR when all other types with matchable names fail */ - if (field_mask & BPF_KPTR && !__btf_type_is_struct(var_type)) { + if (field_mask & (BPF_KPTR | BPF_UPTR) && !__btf_type_is_struct(var_type)) { type = BPF_KPTR_REF; goto end; } @@ -3535,6 +3540,7 @@ static int btf_repeat_fields(struct btf_field_info *info, case BPF_KPTR_UNREF: case BPF_KPTR_REF: case BPF_KPTR_PERCPU: + case BPF_UPTR: case BPF_LIST_HEAD: case BPF_RB_ROOT: break; @@ -3661,8 +3667,9 @@ static int btf_find_field_one(const struct btf *btf, case BPF_KPTR_UNREF: case BPF_KPTR_REF: case BPF_KPTR_PERCPU: + case BPF_UPTR: ret = btf_find_kptr(btf, var_type, off, sz, - info_cnt ? &info[0] : &tmp); + info_cnt ? &info[0] : &tmp, field_mask); if (ret < 0) return ret; break; @@ -3985,6 +3992,7 @@ struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type case BPF_KPTR_UNREF: case BPF_KPTR_REF: case BPF_KPTR_PERCPU: + case BPF_UPTR: ret = btf_parse_kptr(btf, &rec->fields[i], &info_arr[i]); if (ret < 0) goto end; @@ -4044,12 +4052,26 @@ int btf_check_and_fixup_fields(const struct btf *btf, struct btf_record *rec) * Hence we only need to ensure that bpf_{list_head,rb_root} ownership * does not form cycles. */ - if (IS_ERR_OR_NULL(rec) || !(rec->field_mask & BPF_GRAPH_ROOT)) + if (IS_ERR_OR_NULL(rec) || !(rec->field_mask & (BPF_GRAPH_ROOT | BPF_UPTR))) return 0; for (i = 0; i < rec->cnt; i++) { struct btf_struct_meta *meta; + const struct btf_type *t; u32 btf_id; + if (rec->fields[i].type & BPF_UPTR) { + /* The uptr only supports pinning one page and cannot + * point to a kernel struct + */ + if (btf_is_kernel(rec->fields[i].kptr.btf)) + return -EINVAL; + t = btf_type_by_id(rec->fields[i].kptr.btf, + rec->fields[i].kptr.btf_id); + if (t->size > PAGE_SIZE) + return -E2BIG; + continue; + } + if (!(rec->fields[i].type & BPF_GRAPH_ROOT)) continue; btf_id = rec->fields[i].graph_root.value_btf_id; @@ -5560,7 +5582,7 @@ btf_parse_struct_metas(struct bpf_verifier_log *log, struct btf *btf) goto free_aof; } - ret = btf_find_kptr(btf, t, 0, 0, &tmp); + ret = btf_find_kptr(btf, t, 0, 0, &tmp, BPF_KPTR); if (ret != BTF_FIELD_FOUND) continue; diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index a8f1808a1ca5..694dbbeb0eb5 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -548,6 +548,7 @@ void btf_record_free(struct btf_record *rec) case BPF_KPTR_UNREF: case BPF_KPTR_REF: case BPF_KPTR_PERCPU: + case BPF_UPTR: if (rec->fields[i].kptr.module) module_put(rec->fields[i].kptr.module); if (btf_is_kernel(rec->fields[i].kptr.btf)) @@ -597,6 +598,7 @@ struct btf_record *btf_record_dup(const struct btf_record *rec) case BPF_KPTR_UNREF: case BPF_KPTR_REF: case BPF_KPTR_PERCPU: + case BPF_UPTR: if (btf_is_kernel(fields[i].kptr.btf)) btf_get(fields[i].kptr.btf); if (fields[i].kptr.module && !try_module_get(fields[i].kptr.module)) { From patchwork Tue Oct 15 00:49:52 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Martin KaFai Lau X-Patchwork-Id: 13835567 X-Patchwork-Delegate: bpf@iogearbox.net Received: from out-189.mta1.migadu.com (out-189.mta1.migadu.com [95.215.58.189]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7BB8629D0D for ; Tue, 15 Oct 2024 00:50:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.189 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953426; cv=none; b=UTgkup5DMPN7aFU08Fsps9sMCvMhtxjvehok6lj0Ocj3ZnSntJ2UmRfxEjcTNAweTzqzpiJuvnZDDGjgurfgmR3EsQTMB/yH66+1Wv+0r/qLdzy2rpGtIJ1wC+kOpXj2CczaPFZo5xx7H+ZWkKuGh5y9eQa/LaiCebYpUSB71d8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953426; c=relaxed/simple; bh=9KMzxJz6MjRuZ5hiaWaCgDCerLbGWHQXA22TuS7/ukA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=FXbLgLCQxtmG/1tZfMDV/l8bKFXXJWe1sablvf2ZRTmqlAYzKHCwGMXpOPHp6n8dFOfFM6OhYLOxeMambSVgJnyWmD1qHUBmjn1mWu2b+jdenlRol4iDeMT3r5hz2xwylJ1yADJLtxbdXA9fTAoOBsrV5EAdZGfVXR9eCuz3pzQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=fNDurMsw; arc=none smtp.client-ip=95.215.58.189 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="fNDurMsw" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1728953422; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=HdP3r6sZ9L7aRNwA3Hsw/mUKD3EHCjg+C1klWISp7z0=; b=fNDurMswZz1D8a2n+zcCX2R/O0LfU7T9Ii4+GBIaxmz0nKTRpkXFFmlx0e5yRivCPOPAQ5 JOmTgTJCK774JEiN9inYRvdSJXzXzN7HkClMtntkfAhKpNgxpSq1bttVzAATl9asqNRK4K EbwBWklRVYD78TXGm0gX4QScHA7x9/w= From: Martin KaFai Lau To: bpf@vger.kernel.org Cc: Alexei Starovoitov , Andrii Nakryiko , Daniel Borkmann , Kui-Feng Lee , kernel-team@meta.com Subject: [PATCH v5 bpf-next 02/12] bpf: Handle BPF_UPTR in verifier Date: Mon, 14 Oct 2024 17:49:52 -0700 Message-ID: <20241015005008.767267-3-martin.lau@linux.dev> In-Reply-To: <20241015005008.767267-1-martin.lau@linux.dev> References: <20241015005008.767267-1-martin.lau@linux.dev> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Patchwork-Delegate: bpf@iogearbox.net From: Kui-Feng Lee This patch adds BPF_UPTR support to the verifier. Not that only the map_value will support the "__uptr" type tag. This patch enforces only BPF_LDX is allowed to the value of an uptr. After BPF_LDX, it will mark the dst_reg as PTR_TO_MEM | PTR_MAYBE_NULL with size deduced from the field.kptr.btf_id. This will make the dst_reg pointed memory to be readable and writable as scalar. There is a redundant "val_reg = reg_state(env, value_regno);" statement in the check_map_kptr_access(). This patch takes this chance to remove it also. Signed-off-by: Kui-Feng Lee Signed-off-by: Martin KaFai Lau --- Changes in v5: - The "if (kptr_field->type == BPF_UPTR)" addition in v4 in map_kptr_match_type() is removed. It will not be reached. meta->kptr_field is for kptr only. - Use btf_field_type_name() for the verbose() log in check_map_access(). - Directly use t->size in mark_uptr_ld_reg. "t" must be a struct. kernel/bpf/verifier.c | 39 +++++++++++++++++++++++++++++++++------ 1 file changed, 33 insertions(+), 6 deletions(-) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index cfc62e0776bf..792154ee25cc 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -5485,6 +5485,22 @@ static u32 btf_ld_kptr_type(struct bpf_verifier_env *env, struct btf_field *kptr return ret; } +static int mark_uptr_ld_reg(struct bpf_verifier_env *env, u32 regno, + struct btf_field *field) +{ + struct bpf_reg_state *reg; + const struct btf_type *t; + + t = btf_type_by_id(field->kptr.btf, field->kptr.btf_id); + mark_reg_known_zero(env, cur_regs(env), regno); + reg = reg_state(env, regno); + reg->type = PTR_TO_MEM | PTR_MAYBE_NULL; + reg->mem_size = t->size; + reg->id = ++env->id_gen; + + return 0; +} + static int check_map_kptr_access(struct bpf_verifier_env *env, u32 regno, int value_regno, int insn_idx, struct btf_field *kptr_field) @@ -5513,9 +5529,15 @@ static int check_map_kptr_access(struct bpf_verifier_env *env, u32 regno, verbose(env, "store to referenced kptr disallowed\n"); return -EACCES; } + if (class != BPF_LDX && kptr_field->type == BPF_UPTR) { + verbose(env, "store to uptr disallowed\n"); + return -EACCES; + } if (class == BPF_LDX) { - val_reg = reg_state(env, value_regno); + if (kptr_field->type == BPF_UPTR) + return mark_uptr_ld_reg(env, value_regno, kptr_field); + /* We can simply mark the value_regno receiving the pointer * value from map as PTR_TO_BTF_ID, with the correct type. */ @@ -5573,21 +5595,26 @@ static int check_map_access(struct bpf_verifier_env *env, u32 regno, case BPF_KPTR_UNREF: case BPF_KPTR_REF: case BPF_KPTR_PERCPU: + case BPF_UPTR: if (src != ACCESS_DIRECT) { - verbose(env, "kptr cannot be accessed indirectly by helper\n"); + verbose(env, "%s cannot be accessed indirectly by helper\n", + btf_field_type_name(field->type)); return -EACCES; } if (!tnum_is_const(reg->var_off)) { - verbose(env, "kptr access cannot have variable offset\n"); + verbose(env, "%s access cannot have variable offset\n", + btf_field_type_name(field->type)); return -EACCES; } if (p != off + reg->var_off.value) { - verbose(env, "kptr access misaligned expected=%u off=%llu\n", + verbose(env, "%s access misaligned expected=%u off=%llu\n", + btf_field_type_name(field->type), p, off + reg->var_off.value); return -EACCES; } if (size != bpf_size_to_bytes(BPF_DW)) { - verbose(env, "kptr access size must be BPF_DW\n"); + verbose(env, "%s access size must be BPF_DW\n", + btf_field_type_name(field->type)); return -EACCES; } break; @@ -6953,7 +6980,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn return err; if (tnum_is_const(reg->var_off)) kptr_field = btf_record_find(reg->map_ptr->record, - off + reg->var_off.value, BPF_KPTR); + off + reg->var_off.value, BPF_KPTR | BPF_UPTR); if (kptr_field) { err = check_map_kptr_access(env, regno, value_regno, insn_idx, kptr_field); } else if (t == BPF_READ && value_regno >= 0) { From patchwork Tue Oct 15 00:49:53 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Martin KaFai Lau X-Patchwork-Id: 13835568 X-Patchwork-Delegate: bpf@iogearbox.net Received: from out-188.mta1.migadu.com (out-188.mta1.migadu.com [95.215.58.188]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1FD6529D0D for ; Tue, 15 Oct 2024 00:50:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.188 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953429; cv=none; b=TK8do4B1ecnjrLn8uCR7+Hn872lWqysKOT2AE/9m2PcpW+Nli3U+Y5S2jpJMBDs12pBnOQONF/5N12k8SuI/XXf2ALwKZyHNqgj3/1fsKshE1jxuJ+BZaKN/U3ztvv7b5KaeDbmWRkqu3WktYBVrOnEmpRzI8w6MVWQ6RTXhWaI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953429; c=relaxed/simple; bh=PdMA6Bv9vgVIzqIYZGMyp9fWewEVtw857nn3KXPH8YA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=RGyh3iRySuIH8fk1oTsS76K7mKmVlzHME2AKStGUUznGiSetlt2fYo6c/PDv0BJl1BheHquxJS84IbxrR6nPUw7r5hQQk/eUKXz2rQjHqyJRXQAnT8LvdGMOisew5/h7sTeKy6p3gOQih/hMQqzNT+FrrZ3O40PapvB8a9lzk/8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=UU6mLcTw; arc=none smtp.client-ip=95.215.58.188 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="UU6mLcTw" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1728953425; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=bvYhJde40xawy1qxOhEF1sN00RFpRoe/EuUjB1+KGfM=; b=UU6mLcTwHsQBnb2MJYBTjv5jtl71oRZh0SNehNsJAkw0/isLz+LsD/GaTG8QFwMwFfnvjT 0BrE6CnhEVxO0j4LuDzOzO+VF2DemyEgE6ApWN7Tgnynire4I4cZrsNm9f5r3qOykp1X5E 3XPv606heMEqXP+3RJiiVbDVIOM6qe8= From: Martin KaFai Lau To: bpf@vger.kernel.org Cc: Alexei Starovoitov , Andrii Nakryiko , Daniel Borkmann , Kui-Feng Lee , kernel-team@meta.com Subject: [PATCH v5 bpf-next 03/12] bpf: Add "bool swap_uptrs" arg to bpf_local_storage_update() and bpf_selem_alloc() Date: Mon, 14 Oct 2024 17:49:53 -0700 Message-ID: <20241015005008.767267-4-martin.lau@linux.dev> In-Reply-To: <20241015005008.767267-1-martin.lau@linux.dev> References: <20241015005008.767267-1-martin.lau@linux.dev> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Patchwork-Delegate: bpf@iogearbox.net From: Martin KaFai Lau In a later patch, the task local storage will only accept uptr from the syscall update_elem and will not accept uptr from the bpf prog. The reason is the bpf prog does not have a way to provide a valid user space address. bpf_local_storage_update() and bpf_selem_alloc() are used by both bpf prog bpf_task_storage_get(BPF_LOCAL_STORAGE_GET_F_CREATE) and bpf syscall update_elem. "bool swap_uptrs" arg is added to bpf_local_storage_update() and bpf_selem_alloc() to tell if it is called by the bpf prog or by the bpf syscall. When swap_uptrs==true, it is called by the syscall. The arg is named (swap_)uptrs because the later patch will swap the uptrs between the newly allocated selem and the user space provided map_value. It will make error handling easier in case map->ops->map_update_elem() fails and the caller can decide if it needs to unpin the uptr in the user space provided map_value or the bpf_local_storage_update() has already taken the uptr ownership and will take care of unpinning it also. Only swap_uptrs==false is passed now. The logic to handle the true case will be added in a later patch. Signed-off-by: Martin KaFai Lau --- include/linux/bpf_local_storage.h | 4 ++-- kernel/bpf/bpf_cgrp_storage.c | 4 ++-- kernel/bpf/bpf_inode_storage.c | 4 ++-- kernel/bpf/bpf_local_storage.c | 8 ++++---- kernel/bpf/bpf_task_storage.c | 4 ++-- net/core/bpf_sk_storage.c | 6 +++--- 6 files changed, 15 insertions(+), 15 deletions(-) diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h index dcddb0aef7d8..0c7216c065d5 100644 --- a/include/linux/bpf_local_storage.h +++ b/include/linux/bpf_local_storage.h @@ -181,7 +181,7 @@ void bpf_selem_link_map(struct bpf_local_storage_map *smap, struct bpf_local_storage_elem * bpf_selem_alloc(struct bpf_local_storage_map *smap, void *owner, void *value, - bool charge_mem, gfp_t gfp_flags); + bool charge_mem, bool swap_uptrs, gfp_t gfp_flags); void bpf_selem_free(struct bpf_local_storage_elem *selem, struct bpf_local_storage_map *smap, @@ -195,7 +195,7 @@ bpf_local_storage_alloc(void *owner, struct bpf_local_storage_data * bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, - void *value, u64 map_flags, gfp_t gfp_flags); + void *value, u64 map_flags, bool swap_uptrs, gfp_t gfp_flags); u64 bpf_local_storage_map_mem_usage(const struct bpf_map *map); diff --git a/kernel/bpf/bpf_cgrp_storage.c b/kernel/bpf/bpf_cgrp_storage.c index 28efd0a3f220..20f05de92e9c 100644 --- a/kernel/bpf/bpf_cgrp_storage.c +++ b/kernel/bpf/bpf_cgrp_storage.c @@ -107,7 +107,7 @@ static long bpf_cgrp_storage_update_elem(struct bpf_map *map, void *key, bpf_cgrp_storage_lock(); sdata = bpf_local_storage_update(cgroup, (struct bpf_local_storage_map *)map, - value, map_flags, GFP_ATOMIC); + value, map_flags, false, GFP_ATOMIC); bpf_cgrp_storage_unlock(); cgroup_put(cgroup); return PTR_ERR_OR_ZERO(sdata); @@ -181,7 +181,7 @@ BPF_CALL_5(bpf_cgrp_storage_get, struct bpf_map *, map, struct cgroup *, cgroup, if (!percpu_ref_is_dying(&cgroup->self.refcnt) && (flags & BPF_LOCAL_STORAGE_GET_F_CREATE)) sdata = bpf_local_storage_update(cgroup, (struct bpf_local_storage_map *)map, - value, BPF_NOEXIST, gfp_flags); + value, BPF_NOEXIST, false, gfp_flags); unlock: bpf_cgrp_storage_unlock(); diff --git a/kernel/bpf/bpf_inode_storage.c b/kernel/bpf/bpf_inode_storage.c index 29da6d3838f6..44ccebc745e5 100644 --- a/kernel/bpf/bpf_inode_storage.c +++ b/kernel/bpf/bpf_inode_storage.c @@ -100,7 +100,7 @@ static long bpf_fd_inode_storage_update_elem(struct bpf_map *map, void *key, sdata = bpf_local_storage_update(file_inode(fd_file(f)), (struct bpf_local_storage_map *)map, - value, map_flags, GFP_ATOMIC); + value, map_flags, false, GFP_ATOMIC); return PTR_ERR_OR_ZERO(sdata); } @@ -154,7 +154,7 @@ BPF_CALL_5(bpf_inode_storage_get, struct bpf_map *, map, struct inode *, inode, if (flags & BPF_LOCAL_STORAGE_GET_F_CREATE) { sdata = bpf_local_storage_update( inode, (struct bpf_local_storage_map *)map, value, - BPF_NOEXIST, gfp_flags); + BPF_NOEXIST, false, gfp_flags); return IS_ERR(sdata) ? (unsigned long)NULL : (unsigned long)sdata->data; } diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c index c938dea5ddbf..1cf772cb26eb 100644 --- a/kernel/bpf/bpf_local_storage.c +++ b/kernel/bpf/bpf_local_storage.c @@ -73,7 +73,7 @@ static bool selem_linked_to_map(const struct bpf_local_storage_elem *selem) struct bpf_local_storage_elem * bpf_selem_alloc(struct bpf_local_storage_map *smap, void *owner, - void *value, bool charge_mem, gfp_t gfp_flags) + void *value, bool charge_mem, bool swap_uptrs, gfp_t gfp_flags) { struct bpf_local_storage_elem *selem; @@ -524,7 +524,7 @@ int bpf_local_storage_alloc(void *owner, */ struct bpf_local_storage_data * bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, - void *value, u64 map_flags, gfp_t gfp_flags) + void *value, u64 map_flags, bool swap_uptrs, gfp_t gfp_flags) { struct bpf_local_storage_data *old_sdata = NULL; struct bpf_local_storage_elem *alloc_selem, *selem = NULL; @@ -550,7 +550,7 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, if (err) return ERR_PTR(err); - selem = bpf_selem_alloc(smap, owner, value, true, gfp_flags); + selem = bpf_selem_alloc(smap, owner, value, true, swap_uptrs, gfp_flags); if (!selem) return ERR_PTR(-ENOMEM); @@ -584,7 +584,7 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, /* A lookup has just been done before and concluded a new selem is * needed. The chance of an unnecessary alloc is unlikely. */ - alloc_selem = selem = bpf_selem_alloc(smap, owner, value, true, gfp_flags); + alloc_selem = selem = bpf_selem_alloc(smap, owner, value, true, swap_uptrs, gfp_flags); if (!alloc_selem) return ERR_PTR(-ENOMEM); diff --git a/kernel/bpf/bpf_task_storage.c b/kernel/bpf/bpf_task_storage.c index adf6dfe0ba68..45dc3ca334d3 100644 --- a/kernel/bpf/bpf_task_storage.c +++ b/kernel/bpf/bpf_task_storage.c @@ -147,7 +147,7 @@ static long bpf_pid_task_storage_update_elem(struct bpf_map *map, void *key, bpf_task_storage_lock(); sdata = bpf_local_storage_update( task, (struct bpf_local_storage_map *)map, value, map_flags, - GFP_ATOMIC); + false, GFP_ATOMIC); bpf_task_storage_unlock(); err = PTR_ERR_OR_ZERO(sdata); @@ -219,7 +219,7 @@ static void *__bpf_task_storage_get(struct bpf_map *map, (flags & BPF_LOCAL_STORAGE_GET_F_CREATE) && nobusy) { sdata = bpf_local_storage_update( task, (struct bpf_local_storage_map *)map, value, - BPF_NOEXIST, gfp_flags); + BPF_NOEXIST, false, gfp_flags); return IS_ERR(sdata) ? NULL : sdata->data; } diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c index bc01b3aa6b0f..2f4ed83a75ae 100644 --- a/net/core/bpf_sk_storage.c +++ b/net/core/bpf_sk_storage.c @@ -106,7 +106,7 @@ static long bpf_fd_sk_storage_update_elem(struct bpf_map *map, void *key, if (sock) { sdata = bpf_local_storage_update( sock->sk, (struct bpf_local_storage_map *)map, value, - map_flags, GFP_ATOMIC); + map_flags, false, GFP_ATOMIC); sockfd_put(sock); return PTR_ERR_OR_ZERO(sdata); } @@ -137,7 +137,7 @@ bpf_sk_storage_clone_elem(struct sock *newsk, { struct bpf_local_storage_elem *copy_selem; - copy_selem = bpf_selem_alloc(smap, newsk, NULL, true, GFP_ATOMIC); + copy_selem = bpf_selem_alloc(smap, newsk, NULL, true, false, GFP_ATOMIC); if (!copy_selem) return NULL; @@ -243,7 +243,7 @@ BPF_CALL_5(bpf_sk_storage_get, struct bpf_map *, map, struct sock *, sk, refcount_inc_not_zero(&sk->sk_refcnt)) { sdata = bpf_local_storage_update( sk, (struct bpf_local_storage_map *)map, value, - BPF_NOEXIST, gfp_flags); + BPF_NOEXIST, false, gfp_flags); /* sk must be a fullsock (guaranteed by verifier), * so sock_gen_put() is unnecessary. */ From patchwork Tue Oct 15 00:49:54 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Martin KaFai Lau X-Patchwork-Id: 13835569 X-Patchwork-Delegate: bpf@iogearbox.net Received: from out-181.mta1.migadu.com (out-181.mta1.migadu.com [95.215.58.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 935061CAAC for ; Tue, 15 Oct 2024 00:50:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.181 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953431; cv=none; b=VNeB5Be9ASfx+O0qrt0yet1hv9XKtYSsjt4NuQVwVzu5Qfmhnp8bCr+PZ4pING0oqjiMuna2lL+VEZxJDyJbWuOuoYGyVjzDVvg1fX1UWsPo0N5JpjDjkEhr0qXhWdqux616OFngXIGdudhlQUkHZHT36f1+kFeoY//e6bWrHHY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953431; c=relaxed/simple; bh=uvnsDwT3q4ClZooaE2lNmYRjYcNNU6gKDzqG4ASmDKo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=My6F2+WJO3GZyTKb0XordKexQm2/kNggj5cCD5MHbEsRFFOzAKAUj/++7v13KwJicIo15Yx8BrRdCAQjRwB29JTWPAekIgkgaKNvwbGQIjEJKvM3FEJmE6ESe9ZUvDj2SKNeZA3JAQd55TRsDzG+EjTU9m3icKLUTOlv3gPgFXA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=VJReO4CC; arc=none smtp.client-ip=95.215.58.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="VJReO4CC" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1728953427; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WP4DDQubYzzZl7wnh1q7E8loRaUCjPHg66/vzn/GFH8=; b=VJReO4CCVwZo+3hosSHBcR+RAv+tUgGl81CAFWAJj0xJ5HCSomVU3hgWbklBx4jhOZ5Kbt COm1Hit6VSyPJe3/O0RAnljvVYAHvJk3PPavEzchRZS/umRPgQU2uVTlyF0U1q9ZdXRdLK ntQocJV849eiAi8h5yVjDUztUKjPk+I= From: Martin KaFai Lau To: bpf@vger.kernel.org Cc: Alexei Starovoitov , Andrii Nakryiko , Daniel Borkmann , Kui-Feng Lee , kernel-team@meta.com Subject: [PATCH v5 bpf-next 04/12] bpf: Postpone bpf_selem_free() in bpf_selem_unlink_storage_nolock() Date: Mon, 14 Oct 2024 17:49:54 -0700 Message-ID: <20241015005008.767267-5-martin.lau@linux.dev> In-Reply-To: <20241015005008.767267-1-martin.lau@linux.dev> References: <20241015005008.767267-1-martin.lau@linux.dev> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Patchwork-Delegate: bpf@iogearbox.net From: Martin KaFai Lau In a later patch, bpf_selem_free() will call unpin_user_page() through bpf_obj_free_fields(). unpin_user_page() may take spin_lock. However, some bpf_selem_free() call paths have held a raw_spin_lock. Like this: raw_spin_lock_irqsave() bpf_selem_unlink_storage_nolock() bpf_selem_free() unpin_user_page() spin_lock() To avoid spinlock nested in raw_spinlock, bpf_selem_free() should be done after releasing the raw_spinlock. The "bool reuse_now" arg is replaced with "struct hlist_head *free_selem_list" in bpf_selem_unlink_storage_nolock(). The bpf_selem_unlink_storage_nolock() will append the to-be-free selem at the free_selem_list. The caller of bpf_selem_unlink_storage_nolock() will need to call the new bpf_selem_free_list(free_selem_list, reuse_now) to free the selem after releasing the raw_spinlock. Note that the selem->snode cannot be reused for linking to the free_selem_list because the selem->snode is protected by the raw_spinlock that we want to avoid holding. A new "struct hlist_node free_node;" is union-ized with the rcu_head. Only the first one successfully hlist_del_init_rcu(&selem->snode) will be able to use the free_node. After succeeding hlist_del_init_rcu(&selem->snode), the free_node and rcu_head usage is serialized such that they can share the 16 bytes in a union. Signed-off-by: Martin KaFai Lau --- include/linux/bpf_local_storage.h | 8 ++++++- kernel/bpf/bpf_local_storage.c | 35 ++++++++++++++++++++++++++----- 2 files changed, 37 insertions(+), 6 deletions(-) diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h index 0c7216c065d5..ab7244d8108f 100644 --- a/include/linux/bpf_local_storage.h +++ b/include/linux/bpf_local_storage.h @@ -77,7 +77,13 @@ struct bpf_local_storage_elem { struct hlist_node map_node; /* Linked to bpf_local_storage_map */ struct hlist_node snode; /* Linked to bpf_local_storage */ struct bpf_local_storage __rcu *local_storage; - struct rcu_head rcu; + union { + struct rcu_head rcu; + struct hlist_node free_node; /* used to postpone + * bpf_selem_free + * after raw_spin_unlock + */ + }; /* 8 bytes hole */ /* The data is stored in another cacheline to minimize * the number of cachelines access during a cache hit. diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c index 1cf772cb26eb..09a67dff2336 100644 --- a/kernel/bpf/bpf_local_storage.c +++ b/kernel/bpf/bpf_local_storage.c @@ -246,13 +246,30 @@ void bpf_selem_free(struct bpf_local_storage_elem *selem, } } +static void bpf_selem_free_list(struct hlist_head *list, bool reuse_now) +{ + struct bpf_local_storage_elem *selem; + struct bpf_local_storage_map *smap; + struct hlist_node *n; + + /* The "_safe" iteration is needed. + * The loop is not removing the selem from the list + * but bpf_selem_free will use the selem->rcu_head + * which is union-ized with the selem->free_node. + */ + hlist_for_each_entry_safe(selem, n, list, free_node) { + smap = rcu_dereference_check(SDATA(selem)->smap, bpf_rcu_lock_held()); + bpf_selem_free(selem, smap, reuse_now); + } +} + /* local_storage->lock must be held and selem->local_storage == local_storage. * The caller must ensure selem->smap is still valid to be * dereferenced for its smap->elem_size and smap->cache_idx. */ static bool bpf_selem_unlink_storage_nolock(struct bpf_local_storage *local_storage, struct bpf_local_storage_elem *selem, - bool uncharge_mem, bool reuse_now) + bool uncharge_mem, struct hlist_head *free_selem_list) { struct bpf_local_storage_map *smap; bool free_local_storage; @@ -296,7 +313,7 @@ static bool bpf_selem_unlink_storage_nolock(struct bpf_local_storage *local_stor SDATA(selem)) RCU_INIT_POINTER(local_storage->cache[smap->cache_idx], NULL); - bpf_selem_free(selem, smap, reuse_now); + hlist_add_head(&selem->free_node, free_selem_list); if (rcu_access_pointer(local_storage->smap) == smap) RCU_INIT_POINTER(local_storage->smap, NULL); @@ -345,6 +362,7 @@ static void bpf_selem_unlink_storage(struct bpf_local_storage_elem *selem, struct bpf_local_storage_map *storage_smap; struct bpf_local_storage *local_storage; bool bpf_ma, free_local_storage = false; + HLIST_HEAD(selem_free_list); unsigned long flags; if (unlikely(!selem_linked_to_storage_lockless(selem))) @@ -360,9 +378,11 @@ static void bpf_selem_unlink_storage(struct bpf_local_storage_elem *selem, raw_spin_lock_irqsave(&local_storage->lock, flags); if (likely(selem_linked_to_storage(selem))) free_local_storage = bpf_selem_unlink_storage_nolock( - local_storage, selem, true, reuse_now); + local_storage, selem, true, &selem_free_list); raw_spin_unlock_irqrestore(&local_storage->lock, flags); + bpf_selem_free_list(&selem_free_list, reuse_now); + if (free_local_storage) bpf_local_storage_free(local_storage, storage_smap, bpf_ma, reuse_now); } @@ -529,6 +549,7 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, struct bpf_local_storage_data *old_sdata = NULL; struct bpf_local_storage_elem *alloc_selem, *selem = NULL; struct bpf_local_storage *local_storage; + HLIST_HEAD(old_selem_free_list); unsigned long flags; int err; @@ -624,11 +645,12 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, if (old_sdata) { bpf_selem_unlink_map(SELEM(old_sdata)); bpf_selem_unlink_storage_nolock(local_storage, SELEM(old_sdata), - true, false); + true, &old_selem_free_list); } unlock: raw_spin_unlock_irqrestore(&local_storage->lock, flags); + bpf_selem_free_list(&old_selem_free_list, false); if (alloc_selem) { mem_uncharge(smap, owner, smap->elem_size); bpf_selem_free(alloc_selem, smap, true); @@ -706,6 +728,7 @@ void bpf_local_storage_destroy(struct bpf_local_storage *local_storage) struct bpf_local_storage_map *storage_smap; struct bpf_local_storage_elem *selem; bool bpf_ma, free_storage = false; + HLIST_HEAD(free_selem_list); struct hlist_node *n; unsigned long flags; @@ -734,10 +757,12 @@ void bpf_local_storage_destroy(struct bpf_local_storage *local_storage) * of the loop will set the free_cgroup_storage to true. */ free_storage = bpf_selem_unlink_storage_nolock( - local_storage, selem, true, true); + local_storage, selem, true, &free_selem_list); } raw_spin_unlock_irqrestore(&local_storage->lock, flags); + bpf_selem_free_list(&free_selem_list, true); + if (free_storage) bpf_local_storage_free(local_storage, storage_smap, bpf_ma, true); } From patchwork Tue Oct 15 00:49:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Martin KaFai Lau X-Patchwork-Id: 13835570 X-Patchwork-Delegate: bpf@iogearbox.net Received: from out-171.mta1.migadu.com (out-171.mta1.migadu.com [95.215.58.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 30C7D482EB for ; Tue, 15 Oct 2024 00:50:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953434; cv=none; b=ix30GPuhJcCIKTQk0F9Qhu/N/otfQlnv0HZB0w8dGLT6d0PBzx5IqgVrmg8b1Xx8SdNAVIUDryW+E2Ffp/2j3hsEIHxlbj9bpE5v07HEpfJ13YT0E8LgKa4TcLFdoLl10JUjRa46uuXErCqQoce1YeZGKRjPhn/xgKZiizCmnEQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953434; c=relaxed/simple; bh=erruaj1mwgbTrLLH3N+xpCxIX0K1w+jQvUSbfxZ5jLU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=FsKyfutR4LF8Z58/mdJpu4swSxH/vPWK8BRDZEm/sxUHSw/NMkS7HthqDtOUi7PHC98D7oLNX1mneDHQygpTSxZaJovng0riu37Vg3l0aB1mxDxe+VimSHZFXY/wVdoLKColeP8RVE6EKCdKTM97AtkbQFQhSWBmR6eebe+rPhc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=pW9ItJ8u; arc=none smtp.client-ip=95.215.58.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="pW9ItJ8u" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1728953430; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nfPAZv3g3p1YW+XfbWwyzXFoGBD+werzQrmzFBuZb9g=; b=pW9ItJ8u5OjDGJYpJqR7oMbCH2kO+qlpmv2voaiTvlJZXavBDhdT8sjz3AdGc+F9DP+6bk 7iCnjLZu44z06/aGSpdt4YVCHTDhIq+Ys0bKsi6R80Vg3NpZp8o0h4HMyrTt3hL5ddzyty 8X5FC/lyW3uLS+fqYtI+zh9JvnWaGvQ= From: Martin KaFai Lau To: bpf@vger.kernel.org Cc: Alexei Starovoitov , Andrii Nakryiko , Daniel Borkmann , Kui-Feng Lee , kernel-team@meta.com Subject: [PATCH v5 bpf-next 05/12] bpf: Postpone bpf_obj_free_fields to the rcu callback Date: Mon, 14 Oct 2024 17:49:55 -0700 Message-ID: <20241015005008.767267-6-martin.lau@linux.dev> In-Reply-To: <20241015005008.767267-1-martin.lau@linux.dev> References: <20241015005008.767267-1-martin.lau@linux.dev> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Patchwork-Delegate: bpf@iogearbox.net From: Martin KaFai Lau A later patch will enable the uptr usage in the task_local_storage map. This will require the unpin_user_page() to be done after the rcu task trace gp for the cases that the uptr may still be used by a bpf prog. The bpf_obj_free_fields() will be the one doing unpin_user_page(), so this patch is to postpone calling bpf_obj_free_fields() to the rcu callback. The bpf_obj_free_fields() is only required to be done in the rcu callback when bpf->bpf_ma==true and reuse_now==false. bpf->bpf_ma==true case is because uptr will only be enabled in task storage which has already been moved to bpf_mem_alloc. The bpf->bpf_ma==false case can be supported in the future also if there is a need. reuse_now==false when the selem (aka storage) is deleted by bpf prog (bpf_task_storage_delete) or by syscall delete_elem(). In both cases, bpf_obj_free_fields() needs to wait for rcu gp. A few words on reuse_now==true. reuse_now==true when the storage's owner (i.e. the task_struct) is destructing or the map itself is doing map_free(). In both cases, no bpf prog should have a hold on the selem and its uptrs, so there is no need to postpone bpf_obj_free_fields(). reuse_now==true should be the common case for local storage usage where the storage exists throughout the lifetime of its owner (task_struct). The bpf_obj_free_fields() needs to use the map->record. Doing bpf_obj_free_fields() in a rcu callback will require the bpf_local_storage_map_free() to wait for rcu_barrier. An optimization could be only waiting for rcu_barrier when the map has uptr in its map_value. This will require either yet another rcu callback function or adding a bool in the selem to flag if the SDATA(selem)->smap is still valid. This patch chooses to keep it simple and wait for rcu_barrier for maps that use bpf_mem_alloc. Signed-off-by: Martin KaFai Lau --- kernel/bpf/bpf_local_storage.c | 29 ++++++++++++++++++++++++----- 1 file changed, 24 insertions(+), 5 deletions(-) diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c index 09a67dff2336..ca871be1c42d 100644 --- a/kernel/bpf/bpf_local_storage.c +++ b/kernel/bpf/bpf_local_storage.c @@ -209,8 +209,12 @@ static void __bpf_selem_free(struct bpf_local_storage_elem *selem, static void bpf_selem_free_rcu(struct rcu_head *rcu) { struct bpf_local_storage_elem *selem; + struct bpf_local_storage_map *smap; selem = container_of(rcu, struct bpf_local_storage_elem, rcu); + /* The bpf_local_storage_map_free will wait for rcu_barrier */ + smap = rcu_dereference_check(SDATA(selem)->smap, 1); + bpf_obj_free_fields(smap->map.record, SDATA(selem)->data); bpf_mem_cache_raw_free(selem); } @@ -226,16 +230,25 @@ void bpf_selem_free(struct bpf_local_storage_elem *selem, struct bpf_local_storage_map *smap, bool reuse_now) { - bpf_obj_free_fields(smap->map.record, SDATA(selem)->data); - if (!smap->bpf_ma) { + /* Only task storage has uptrs and task storage + * has moved to bpf_mem_alloc. Meaning smap->bpf_ma == true + * for task storage, so this bpf_obj_free_fields() won't unpin + * any uptr. + */ + bpf_obj_free_fields(smap->map.record, SDATA(selem)->data); __bpf_selem_free(selem, reuse_now); return; } - if (!reuse_now) { - call_rcu_tasks_trace(&selem->rcu, bpf_selem_free_trace_rcu); - } else { + if (reuse_now) { + /* reuse_now == true only happens when the storage owner + * (e.g. task_struct) is being destructed or the map itself + * is being destructed (ie map_free). In both cases, + * no bpf prog can have a hold on the selem. It is + * safe to unpin the uptrs and free the selem now. + */ + bpf_obj_free_fields(smap->map.record, SDATA(selem)->data); /* Instead of using the vanilla call_rcu(), * bpf_mem_cache_free will be able to reuse selem * immediately. @@ -243,7 +256,10 @@ void bpf_selem_free(struct bpf_local_storage_elem *selem, migrate_disable(); bpf_mem_cache_free(&smap->selem_ma, selem); migrate_enable(); + return; } + + call_rcu_tasks_trace(&selem->rcu, bpf_selem_free_trace_rcu); } static void bpf_selem_free_list(struct hlist_head *list, bool reuse_now) @@ -908,6 +924,9 @@ void bpf_local_storage_map_free(struct bpf_map *map, synchronize_rcu(); if (smap->bpf_ma) { + rcu_barrier_tasks_trace(); + if (!rcu_trace_implies_rcu_gp()) + rcu_barrier(); bpf_mem_alloc_destroy(&smap->selem_ma); bpf_mem_alloc_destroy(&smap->storage_ma); } From patchwork Tue Oct 15 00:49:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Martin KaFai Lau X-Patchwork-Id: 13835571 X-Patchwork-Delegate: bpf@iogearbox.net Received: from out-188.mta1.migadu.com (out-188.mta1.migadu.com [95.215.58.188]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BB09D482EF for ; Tue, 15 Oct 2024 00:50:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.188 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953436; cv=none; b=QrsgJj8QSTkSP2DuaYyHgZrlCa1z7GOMl7Y/xqeDBn3XzLr83uE93w5IHEBccslaCv5wIqvEC1BGuNykH2P+sI2k144z6luPUPzM/Wqh1oFyGXhb3cQ2XCK9Z8MYk7d46pIjcbcBaKFJ+/7CY5T2y0nOiI7o8TWIUoX05zznskY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953436; c=relaxed/simple; bh=/upoL42Z6ZNyGB1cgtGNjFJx2EdBVtlNuLrkFUUZNQo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=tdnj7pCj2lcmvYt9AT5CZ3uZVc2Lyx04kJQdJTd+bdSeQELE2IXCO3RKN7pnuDmCfpE42W98lrQMEYpnnxMyyOjE+U/sQA1qRTbj1vXw6Kt3X3SkqZUN3PXwUCl5Czx5adOY96ETD3HgO6O6d51897ZUxn6nfxAm6Oh6HWyo13Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=rxtfHGoO; arc=none smtp.client-ip=95.215.58.188 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="rxtfHGoO" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1728953433; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nupGYauT1Cbc6CF7JyzhKjHkH9w0JESf2hVHtU1nwts=; b=rxtfHGoOCQJGA2TjrDDA9NFIcLROKLYarT/RW0+0vACuQ5Mz6hkOnMTjv8diuRGzSoqfXf vbR5OJ6cZrKM3OIrtdBRkq3ZKH+mtUVjxf9+EmXtkY42QGY1puah6GoxsJp/jkbB/+eHab U3ZWULEKgwn9zq9mRpj3Rk6ftD9/aqE= From: Martin KaFai Lau To: bpf@vger.kernel.org Cc: Alexei Starovoitov , Andrii Nakryiko , Daniel Borkmann , Kui-Feng Lee , kernel-team@meta.com, linux-mm@kvack.org Subject: [PATCH v5 bpf-next 06/12] bpf: Add uptr support in the map_value of the task local storage. Date: Mon, 14 Oct 2024 17:49:56 -0700 Message-ID: <20241015005008.767267-7-martin.lau@linux.dev> In-Reply-To: <20241015005008.767267-1-martin.lau@linux.dev> References: <20241015005008.767267-1-martin.lau@linux.dev> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Patchwork-Delegate: bpf@iogearbox.net From: Martin KaFai Lau This patch adds uptr support in the map_value of the task local storage. struct map_value { struct user_data __uptr *uptr; }; struct { __uint(type, BPF_MAP_TYPE_TASK_STORAGE); __uint(map_flags, BPF_F_NO_PREALLOC); __type(key, int); __type(value, struct value_type); } datamap SEC(".maps"); A new bpf_obj_pin_uptrs() is added to pin the user page and also stores the kernel address back to the uptr for the bpf prog to use later. It currently does not support the uptr pointing to a user struct across two pages. The uptr can only be stored to the task local storage by the syscall update_elem. Meaning the uptr will not be considered if it is provided by the bpf prog through bpf_task_storage_get(BPF_LOCAL_STORAGE_GET_F_CREATE). This is enforced by only calling bpf_local_storage_update(swap_uptrs==true) in bpf_pid_task_storage_update_elem. Everywhere else will have swap_uptrs==false. This will pump down to bpf_selem_alloc(swap_uptrs==true). It is the only case that bpf_selem_alloc() will take the uptr value when updating the newly allocated selem. bpf_obj_swap_uptrs() is added to swap the uptr between the SDATA(selem)->data and the user provided map_value in "void *value". bpf_obj_swap_uptrs() makes the SDATA(selem)->data takes the ownership of the uptr and the user space provided map_value will have NULL in the uptr. The bpf_obj_unpin_uptrs() is called after map->ops->map_update_elem() returning error. If the map->ops->map_update_elem has reached a state that the local storage has taken the uptr ownership, the bpf_obj_unpin_uptrs() will be a no op because the uptr is NULL. A "__"bpf_obj_unpin_uptrs is added to make this error path unpin easier such that it does not have to check the map->record is NULL or not. BPF_F_LOCK is not supported when the map_value has uptr. This can be revisited later if there is a use case. A similar swap_uptrs idea can be considered. The final bit is to do unpin_user_page in the bpf_obj_free_fields(). The earlier patch has ensured that the bpf_obj_free_fields() has gone through the rcu gp when needed. Cc: linux-mm@kvack.org Signed-off-by: Martin KaFai Lau --- include/linux/bpf.h | 20 +++++++ kernel/bpf/bpf_local_storage.c | 7 ++- kernel/bpf/bpf_task_storage.c | 5 +- kernel/bpf/syscall.c | 99 ++++++++++++++++++++++++++++++++-- 4 files changed, 124 insertions(+), 7 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index cdd0a891ce55..7bf1627019fc 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -424,6 +424,7 @@ static inline void bpf_obj_init_field(const struct btf_field *field, void *addr) case BPF_KPTR_UNREF: case BPF_KPTR_REF: case BPF_KPTR_PERCPU: + case BPF_UPTR: break; default: WARN_ON_ONCE(1); @@ -512,6 +513,25 @@ static inline void copy_map_value_long(struct bpf_map *map, void *dst, void *src bpf_obj_memcpy(map->record, dst, src, map->value_size, true); } +static inline void bpf_obj_swap_uptrs(const struct btf_record *rec, void *dst, void *src) +{ + unsigned long *src_uptr, *dst_uptr; + const struct btf_field *field; + int i; + + if (!btf_record_has_field(rec, BPF_UPTR)) + return; + + for (i = 0, field = rec->fields; i < rec->cnt; i++, field++) { + if (field->type != BPF_UPTR) + continue; + + src_uptr = src + field->offset; + dst_uptr = dst + field->offset; + swap(*src_uptr, *dst_uptr); + } +} + static inline void bpf_obj_memzero(struct btf_record *rec, void *dst, u32 size) { u32 curr_off = 0; diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c index ca871be1c42d..7e6a0af0afc1 100644 --- a/kernel/bpf/bpf_local_storage.c +++ b/kernel/bpf/bpf_local_storage.c @@ -99,9 +99,12 @@ bpf_selem_alloc(struct bpf_local_storage_map *smap, void *owner, } if (selem) { - if (value) + if (value) { + /* No need to call check_and_init_map_value as memory is zero init */ copy_map_value(&smap->map, SDATA(selem)->data, value); - /* No need to call check_and_init_map_value as memory is zero init */ + if (swap_uptrs) + bpf_obj_swap_uptrs(smap->map.record, SDATA(selem)->data, value); + } return selem; } diff --git a/kernel/bpf/bpf_task_storage.c b/kernel/bpf/bpf_task_storage.c index 45dc3ca334d3..09705f9988f3 100644 --- a/kernel/bpf/bpf_task_storage.c +++ b/kernel/bpf/bpf_task_storage.c @@ -129,6 +129,9 @@ static long bpf_pid_task_storage_update_elem(struct bpf_map *map, void *key, struct pid *pid; int fd, err; + if ((map_flags & BPF_F_LOCK) && btf_record_has_field(map->record, BPF_UPTR)) + return -EOPNOTSUPP; + fd = *(int *)key; pid = pidfd_get_pid(fd, &f_flags); if (IS_ERR(pid)) @@ -147,7 +150,7 @@ static long bpf_pid_task_storage_update_elem(struct bpf_map *map, void *key, bpf_task_storage_lock(); sdata = bpf_local_storage_update( task, (struct bpf_local_storage_map *)map, value, map_flags, - false, GFP_ATOMIC); + true, GFP_ATOMIC); bpf_task_storage_unlock(); err = PTR_ERR_OR_ZERO(sdata); diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 694dbbeb0eb5..eb2b25fc1fa6 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -155,6 +155,82 @@ static void maybe_wait_bpf_programs(struct bpf_map *map) synchronize_rcu(); } +static void unpin_uptr_kaddr(void *kaddr) +{ + if (kaddr) + unpin_user_page(virt_to_page(kaddr)); +} + +static void __bpf_obj_unpin_uptrs(struct btf_record *rec, u32 cnt, void *obj) +{ + const struct btf_field *field; + void **uptr_addr; + int i; + + for (i = 0, field = rec->fields; i < cnt; i++, field++) { + if (field->type != BPF_UPTR) + continue; + + uptr_addr = obj + field->offset; + unpin_uptr_kaddr(*uptr_addr); + } +} + +static void bpf_obj_unpin_uptrs(struct btf_record *rec, void *obj) +{ + if (!btf_record_has_field(rec, BPF_UPTR)) + return; + + __bpf_obj_unpin_uptrs(rec, rec->cnt, obj); +} + +static int bpf_obj_pin_uptrs(struct btf_record *rec, void *obj) +{ + const struct btf_field *field; + const struct btf_type *t; + unsigned long start, end; + struct page *page; + void **uptr_addr; + int i, err; + + if (!btf_record_has_field(rec, BPF_UPTR)) + return 0; + + for (i = 0, field = rec->fields; i < rec->cnt; i++, field++) { + if (field->type != BPF_UPTR) + continue; + + uptr_addr = obj + field->offset; + start = *(unsigned long *)uptr_addr; + if (!start) + continue; + + t = btf_type_by_id(field->kptr.btf, field->kptr.btf_id); + if (check_add_overflow(start, t->size, &end)) { + err = -EFAULT; + goto unpin_all; + } + + /* The uptr's struct cannot span across two pages */ + if ((start & PAGE_MASK) != (end & PAGE_MASK)) { + err = -EOPNOTSUPP; + goto unpin_all; + } + + err = pin_user_pages_fast(start, 1, FOLL_LONGTERM | FOLL_WRITE, &page); + if (err != 1) + goto unpin_all; + + *uptr_addr = page_address(page) + offset_in_page(start); + } + + return 0; + +unpin_all: + __bpf_obj_unpin_uptrs(rec, i, obj); + return err; +} + static int bpf_map_update_value(struct bpf_map *map, struct file *map_file, void *key, void *value, __u64 flags) { @@ -199,9 +275,14 @@ static int bpf_map_update_value(struct bpf_map *map, struct file *map_file, map->map_type == BPF_MAP_TYPE_BLOOM_FILTER) { err = map->ops->map_push_elem(map, value, flags); } else { - rcu_read_lock(); - err = map->ops->map_update_elem(map, key, value, flags); - rcu_read_unlock(); + err = bpf_obj_pin_uptrs(map->record, value); + if (!err) { + rcu_read_lock(); + err = map->ops->map_update_elem(map, key, value, flags); + rcu_read_unlock(); + if (err) + bpf_obj_unpin_uptrs(map->record, value); + } } bpf_enable_instrumentation(); @@ -716,6 +797,10 @@ void bpf_obj_free_fields(const struct btf_record *rec, void *obj) field->kptr.dtor(xchgd_field); } break; + case BPF_UPTR: + /* The caller ensured that no one is using the uptr */ + unpin_uptr_kaddr(*(void **)field_ptr); + break; case BPF_LIST_HEAD: if (WARN_ON_ONCE(rec->spin_lock_off < 0)) continue; @@ -1107,7 +1192,7 @@ static int map_check_btf(struct bpf_map *map, struct bpf_token *token, map->record = btf_parse_fields(btf, value_type, BPF_SPIN_LOCK | BPF_TIMER | BPF_KPTR | BPF_LIST_HEAD | - BPF_RB_ROOT | BPF_REFCOUNT | BPF_WORKQUEUE, + BPF_RB_ROOT | BPF_REFCOUNT | BPF_WORKQUEUE | BPF_UPTR, map->value_size); if (!IS_ERR_OR_NULL(map->record)) { int i; @@ -1163,6 +1248,12 @@ static int map_check_btf(struct bpf_map *map, struct bpf_token *token, goto free_map_tab; } break; + case BPF_UPTR: + if (map->map_type != BPF_MAP_TYPE_TASK_STORAGE) { + ret = -EOPNOTSUPP; + goto free_map_tab; + } + break; case BPF_LIST_HEAD: case BPF_RB_ROOT: if (map->map_type != BPF_MAP_TYPE_HASH && From patchwork Tue Oct 15 00:49:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Martin KaFai Lau X-Patchwork-Id: 13835572 X-Patchwork-Delegate: bpf@iogearbox.net Received: from out-170.mta1.migadu.com (out-170.mta1.migadu.com [95.215.58.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 33AB341C79 for ; Tue, 15 Oct 2024 00:50:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953438; cv=none; b=fA7IY5uJKMJ0dCNkFWQak/tOgLMKficRy3JNIa0D/ah4K2iXh/KfztRIZE1CAD5N5r8WG4m4HB4aNwoAI1DAgykzfp9/VdKrxb3RLZDDDC6Q3enIfkjJ7nASckbD6s9hTdWgR8023cYaX270DPF0EiRUNwWKnOQ9ak8PWTv+nDM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953438; c=relaxed/simple; bh=7HDICpPHsebJv3BzIzIi1/AHx2X/2Qv1uWN4chBGU+o=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=THBZteljPqeBBhGqpMjLtEGAAFej7kR2ksPw7D9rEWFs3483dbbPypjqC9mpRXIDdBfB39X9XlcomyophlnIvpdzYdYCxGqhN0XixafrvnXt1H2HGQMv6/Sz8PmNCsPrgDPHe3L6BtIBgqOSpEtifO4pxJQxVCFTJibCN1Z+H2E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=fm7eB3sE; arc=none smtp.client-ip=95.215.58.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="fm7eB3sE" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1728953435; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=HC3+xARsUKd/viABjpsb1scja9lwPh9MOMLQdWqqcS0=; b=fm7eB3sELlNZoy+t90YpMz9SFWdY/8AxwXHmrCnYF3ZV/kjVPqGEKsxVAHBgMHAOG3W5q6 uNdab7pyQGL4DNUS0i9d6PaeZLnCGEZR920ggFkAZn3vu/lQ8W9haDSRQbRWeC+uf1phs8 X9uKIW8LDyIhIIzBWoiFyeNOg0nf2EM= From: Martin KaFai Lau To: bpf@vger.kernel.org Cc: Alexei Starovoitov , Andrii Nakryiko , Daniel Borkmann , Kui-Feng Lee , kernel-team@meta.com Subject: [PATCH v5 bpf-next 07/12] libbpf: define __uptr. Date: Mon, 14 Oct 2024 17:49:57 -0700 Message-ID: <20241015005008.767267-8-martin.lau@linux.dev> In-Reply-To: <20241015005008.767267-1-martin.lau@linux.dev> References: <20241015005008.767267-1-martin.lau@linux.dev> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Patchwork-Delegate: bpf@iogearbox.net From: Kui-Feng Lee Make __uptr available to BPF programs to enable them to define uptrs. Acked-by: Andrii Nakryiko Signed-off-by: Kui-Feng Lee Signed-off-by: Martin KaFai Lau --- tools/lib/bpf/bpf_helpers.h | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h index 80bc0242e8dc..686824b8b413 100644 --- a/tools/lib/bpf/bpf_helpers.h +++ b/tools/lib/bpf/bpf_helpers.h @@ -185,6 +185,7 @@ enum libbpf_tristate { #define __kptr_untrusted __attribute__((btf_type_tag("kptr_untrusted"))) #define __kptr __attribute__((btf_type_tag("kptr"))) #define __percpu_kptr __attribute__((btf_type_tag("percpu_kptr"))) +#define __uptr __attribute__((btf_type_tag("uptr"))) #if defined (__clang__) #define bpf_ksym_exists(sym) ({ \ From patchwork Tue Oct 15 00:49:58 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Martin KaFai Lau X-Patchwork-Id: 13835573 X-Patchwork-Delegate: bpf@iogearbox.net Received: from out-181.mta1.migadu.com (out-181.mta1.migadu.com [95.215.58.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C068E41C79 for ; Tue, 15 Oct 2024 00:50:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.181 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953441; cv=none; b=GdUybdZ296CFm6UESGbVtsn6kCTFQvOqKHxpFHglUCTjYkEQE7EEnYXOf7a/uTPlZeeV8OcclRLl5IwP50vzFKgl0f8NVvxrklXjS1vSYkJPmkp133zfZBIfMs7B7KRx+0m8iF4DoePs2qiBq5QVuagJpyJdOI/o37C1kdUD8Vs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953441; c=relaxed/simple; bh=Q30+Ce83D33M0pWz+zz8KzCI0ehUBoJ3kP5AqtsMDDY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=k3G40onLqhCTiI02QWDfsiA1bFsOX3z2Vk3qGTBpmAqFD6EPTRurDLgHBvsg40uoQ6TpsTE/nXR4aRp+5REI/yq5zTsI0Iebxx9nLrcUleJ819Hyx/KLHx53lTzjQ5S/8rNhalaYYwYVOp10XmDrzx1cpdThCBWqN3KcCc3hO1M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=Xe8SL7ue; arc=none smtp.client-ip=95.215.58.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="Xe8SL7ue" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1728953438; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=otwiwCdm2oyjYypEuz5bsI2z4lmLvV6DISsgtZBo4G4=; b=Xe8SL7uelZZ2zUydAu+pTVra2MmCyFCSVK5r27/wDxBRC/XtTD1gj9evc2ndngVQKuBm3O NqAUdkm8KKdNvZjSUhLbjApNPorHD8IiEEvoXArUAUML6iwQHNxgycyhLUSEc5n3AIlPuF IM8NQWYvNxQSLv02uGTEAb84HibKeXI= From: Martin KaFai Lau To: bpf@vger.kernel.org Cc: Alexei Starovoitov , Andrii Nakryiko , Daniel Borkmann , Kui-Feng Lee , kernel-team@meta.com Subject: [PATCH v5 bpf-next 08/12] selftests/bpf: Some basic __uptr tests Date: Mon, 14 Oct 2024 17:49:58 -0700 Message-ID: <20241015005008.767267-9-martin.lau@linux.dev> In-Reply-To: <20241015005008.767267-1-martin.lau@linux.dev> References: <20241015005008.767267-1-martin.lau@linux.dev> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Patchwork-Delegate: bpf@iogearbox.net From: Kui-Feng Lee Make sure the memory of uptrs have been mapped to the kernel properly. Also ensure the values of uptrs in the kernel haven't been copied to userspace. It also has the syscall update_elem/delete_elem test to test the pin/unpin code paths. Signed-off-by: Kui-Feng Lee Signed-off-by: Martin KaFai Lau --- .../bpf/prog_tests/task_local_storage.c | 142 ++++++++++++++++++ .../selftests/bpf/progs/task_ls_uptr.c | 69 +++++++++ .../testing/selftests/bpf/uptr_test_common.h | 34 +++++ 3 files changed, 245 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/task_ls_uptr.c create mode 100644 tools/testing/selftests/bpf/uptr_test_common.h diff --git a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c index c33c05161a9e..4c8eadd1f083 100644 --- a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c +++ b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c @@ -7,12 +7,15 @@ #include #include /* For SYS_xxx definitions */ #include +#include #include #include "task_local_storage_helpers.h" #include "task_local_storage.skel.h" #include "task_local_storage_exit_creds.skel.h" #include "task_ls_recursion.skel.h" #include "task_storage_nodeadlock.skel.h" +#include "uptr_test_common.h" +#include "task_ls_uptr.skel.h" static void test_sys_enter_exit(void) { @@ -227,6 +230,143 @@ static void test_nodeadlock(void) sched_setaffinity(getpid(), sizeof(old), &old); } +static struct user_data udata __attribute__((aligned(16))) = { + .a = 1, + .b = 2, +}; + +static struct user_data udata2 __attribute__((aligned(16))) = { + .a = 3, + .b = 4, +}; + +static void check_udata2(int expected) +{ + udata2.result = udata2.nested_result = 0; + usleep(1); + ASSERT_EQ(udata2.result, expected, "udata2.result"); + ASSERT_EQ(udata2.nested_result, expected, "udata2.nested_result"); +} + +static void test_uptr_basic(void) +{ + int map_fd, parent_task_fd, ev_fd; + struct value_type value = {}; + struct task_ls_uptr *skel; + pid_t child_pid, my_tid; + __u64 ev_dummy_data = 1; + int err; + + my_tid = syscall(SYS_gettid); + parent_task_fd = sys_pidfd_open(my_tid, 0); + if (!ASSERT_OK_FD(parent_task_fd, "parent_task_fd")) + return; + + ev_fd = eventfd(0, 0); + if (!ASSERT_OK_FD(ev_fd, "ev_fd")) { + close(parent_task_fd); + return; + } + + skel = task_ls_uptr__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_open_and_load")) + goto out; + + map_fd = bpf_map__fd(skel->maps.datamap); + value.udata = &udata; + value.nested.udata = &udata; + err = bpf_map_update_elem(map_fd, &parent_task_fd, &value, BPF_NOEXIST); + if (!ASSERT_OK(err, "update_elem(udata)")) + goto out; + + err = task_ls_uptr__attach(skel); + if (!ASSERT_OK(err, "skel_attach")) + goto out; + + child_pid = fork(); + if (!ASSERT_NEQ(child_pid, -1, "fork")) + goto out; + + /* Call syscall in the child process, but access the map value of + * the parent process in the BPF program to check if the user kptr + * is translated/mapped correctly. + */ + if (child_pid == 0) { + /* child */ + + /* Overwrite the user_data in the child process to check if + * the BPF program accesses the user_data of the parent. + */ + udata.a = 0; + udata.b = 0; + + /* Wait for the parent to set child_pid */ + read(ev_fd, &ev_dummy_data, sizeof(ev_dummy_data)); + exit(0); + } + + skel->bss->parent_pid = my_tid; + skel->bss->target_pid = child_pid; + + write(ev_fd, &ev_dummy_data, sizeof(ev_dummy_data)); + + err = waitpid(child_pid, NULL, 0); + ASSERT_EQ(err, child_pid, "waitpid"); + ASSERT_EQ(udata.result, MAGIC_VALUE + udata.a + udata.b, "udata.result"); + ASSERT_EQ(udata.nested_result, MAGIC_VALUE + udata.a + udata.b, "udata.nested_result"); + + skel->bss->target_pid = my_tid; + + /* update_elem: uptr changes from udata1 to udata2 */ + value.udata = &udata2; + value.nested.udata = &udata2; + err = bpf_map_update_elem(map_fd, &parent_task_fd, &value, BPF_EXIST); + if (!ASSERT_OK(err, "update_elem(udata2)")) + goto out; + check_udata2(MAGIC_VALUE + udata2.a + udata2.b); + + /* update_elem: uptr changes from udata2 uptr to NULL */ + memset(&value, 0, sizeof(value)); + err = bpf_map_update_elem(map_fd, &parent_task_fd, &value, BPF_EXIST); + if (!ASSERT_OK(err, "update_elem(udata2)")) + goto out; + check_udata2(0); + + /* update_elem: uptr changes from NULL to udata2 */ + value.udata = &udata2; + value.nested.udata = &udata2; + err = bpf_map_update_elem(map_fd, &parent_task_fd, &value, BPF_EXIST); + if (!ASSERT_OK(err, "update_elem(udata2)")) + goto out; + check_udata2(MAGIC_VALUE + udata2.a + udata2.b); + + /* Check if user programs can access the value of user kptrs + * through bpf_map_lookup_elem(). Make sure the kernel value is not + * leaked. + */ + err = bpf_map_lookup_elem(map_fd, &parent_task_fd, &value); + if (!ASSERT_OK(err, "bpf_map_lookup_elem")) + goto out; + ASSERT_EQ(value.udata, NULL, "value.udata"); + ASSERT_EQ(value.nested.udata, NULL, "value.nested.udata"); + + /* delete_elem */ + err = bpf_map_delete_elem(map_fd, &parent_task_fd); + ASSERT_OK(err, "delete_elem(udata2)"); + check_udata2(0); + + /* update_elem: add uptr back to test map_free */ + value.udata = &udata2; + value.nested.udata = &udata2; + err = bpf_map_update_elem(map_fd, &parent_task_fd, &value, BPF_NOEXIST); + ASSERT_OK(err, "update_elem(udata2)"); + +out: + task_ls_uptr__destroy(skel); + close(ev_fd); + close(parent_task_fd); +} + void test_task_local_storage(void) { if (test__start_subtest("sys_enter_exit")) @@ -237,4 +377,6 @@ void test_task_local_storage(void) test_recursion(); if (test__start_subtest("nodeadlock")) test_nodeadlock(); + if (test__start_subtest("uptr_basic")) + test_uptr_basic(); } diff --git a/tools/testing/selftests/bpf/progs/task_ls_uptr.c b/tools/testing/selftests/bpf/progs/task_ls_uptr.c new file mode 100644 index 000000000000..3abb71806efa --- /dev/null +++ b/tools/testing/selftests/bpf/progs/task_ls_uptr.c @@ -0,0 +1,69 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ + +#include +#include +#include "uptr_test_common.h" + +struct task_struct *bpf_task_from_pid(s32 pid) __ksym; +void bpf_task_release(struct task_struct *p) __ksym; +void bpf_cgroup_release(struct cgroup *cgrp) __ksym; + +struct { + __uint(type, BPF_MAP_TYPE_TASK_STORAGE); + __uint(map_flags, BPF_F_NO_PREALLOC); + __type(key, int); + __type(value, struct value_type); +} datamap SEC(".maps"); + +/* This is a workaround to avoid clang generating a forward reference for + * struct user_data. This is a known issue and will be fixed in the future. + */ +struct user_data __dummy; + +pid_t target_pid = 0; +pid_t parent_pid = 0; + +SEC("tp_btf/sys_enter") +int on_enter(__u64 *ctx) +{ + struct task_struct *task, *data_task; + struct value_type *ptr; + struct user_data *udata; + struct cgroup *cgrp; + + task = bpf_get_current_task_btf(); + if (task->pid != target_pid) + return 0; + + data_task = bpf_task_from_pid(parent_pid); + if (!data_task) + return 0; + + ptr = bpf_task_storage_get(&datamap, data_task, 0, 0); + bpf_task_release(data_task); + if (!ptr) + return 0; + + cgrp = bpf_kptr_xchg(&ptr->cgrp, NULL); + if (cgrp) { + /* Avoid fwd cgroup type in btf */ + int lvl = cgrp->level; + + bpf_cgroup_release(cgrp); + return lvl; + } + + udata = ptr->udata; + if (!udata || udata->result) + return 0; + udata->result = MAGIC_VALUE + udata->a + udata->b; + + udata = ptr->nested.udata; + if (udata && !udata->nested_result) + udata->nested_result = udata->result; + + return 0; +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/uptr_test_common.h b/tools/testing/selftests/bpf/uptr_test_common.h new file mode 100644 index 000000000000..eb81bdd31949 --- /dev/null +++ b/tools/testing/selftests/bpf/uptr_test_common.h @@ -0,0 +1,34 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ + +#ifndef _UPTR_TEST_COMMON_H +#define _UPTR_TEST_COMMON_H + +#define MAGIC_VALUE 0xabcd1234 + +#ifndef __BPF__ +struct cgroup { + int level; +}; +#define __uptr +#define __kptr +#endif + +struct user_data { + int a; + int b; + int result; + int nested_result; +}; + +struct nested_udata { + struct user_data __uptr *udata; +}; + +struct value_type { + struct user_data __uptr *udata; + struct cgroup __kptr *cgrp; + struct nested_udata nested; +}; + +#endif From patchwork Tue Oct 15 00:49:59 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Martin KaFai Lau X-Patchwork-Id: 13835574 X-Patchwork-Delegate: bpf@iogearbox.net Received: from out-174.mta1.migadu.com (out-174.mta1.migadu.com [95.215.58.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 54C7D433B9 for ; Tue, 15 Oct 2024 00:50:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953444; cv=none; b=dmfA6IVre1shkvzns79Nb2fkG7xaHuQ+8MeTGZx5FloirLAMa1v9ZpTn0OCWN79NR37CkVZjpwlIAiG6G9mTqD57XfAfxE0+JQ5gOZTMttOBY48Os9G4dj2/jLtwa1eH/qz1ot9EJbtQ2a2MI2jJNU+iItfGXtp1f5XQY3M8IXk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953444; c=relaxed/simple; bh=IqG8elDJ1M3DcxAKy/Mq4tIICfQQeQ/RkryWOrCZDTU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=WjuomcuZiOArHXdPgE+jYV7DtiJD6ESHhtp1yxfRx8oJPjESvwR5EUITEDgzszoFUIjUbLig1ys7UVZ+0XbOaHAZBKjQyN3GdjPpvI1mPmy/kf341aRetGVatYrkajb+8Arvb5VWKC0k5SPcfxfJW12ZjvSVnK+xRv1JuT3HvfE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=aknKwej1; arc=none smtp.client-ip=95.215.58.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="aknKwej1" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1728953440; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=JAGWdl0BorP87+LSwxIlzGu/6VkaEQ4WgDnn6hoAGic=; b=aknKwej1ccjBRDQZ3D6OPAwSCE1Qfi/ZurBLv3ajEPJwMtbcNRbJ/0nXZ5kfHUmpgPRq0Q zEsRVzKKAA75k/xWgJGIEjBaTcjntfL3W1ji9dqRVhwZKhUh6dK4FDFLfWmKJ93A4y7HX4 w2TL0vw1qoUVI9qDL8uz+RZPLyZPgU0= From: Martin KaFai Lau To: bpf@vger.kernel.org Cc: Alexei Starovoitov , Andrii Nakryiko , Daniel Borkmann , Kui-Feng Lee , kernel-team@meta.com Subject: [PATCH v5 bpf-next 09/12] selftests/bpf: Test a uptr struct spanning across pages. Date: Mon, 14 Oct 2024 17:49:59 -0700 Message-ID: <20241015005008.767267-10-martin.lau@linux.dev> In-Reply-To: <20241015005008.767267-1-martin.lau@linux.dev> References: <20241015005008.767267-1-martin.lau@linux.dev> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Patchwork-Delegate: bpf@iogearbox.net From: Martin KaFai Lau This patch tests the case when uptr has a struct spanning across two pages. It is not supported now and EOPNOTSUPP is expected from the syscall update_elem. Signed-off-by: Martin KaFai Lau --- .../bpf/prog_tests/task_local_storage.c | 39 +++++++++++++++++++ 1 file changed, 39 insertions(+) diff --git a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c index 4c8eadd1f083..9a081aaa26a3 100644 --- a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c +++ b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c @@ -8,6 +8,7 @@ #include /* For SYS_xxx definitions */ #include #include +#include #include #include "task_local_storage_helpers.h" #include "task_local_storage.skel.h" @@ -367,6 +368,42 @@ static void test_uptr_basic(void) close(parent_task_fd); } +static void test_uptr_across_pages(void) +{ + int page_size = getpagesize(); + struct value_type value = {}; + struct task_ls_uptr *skel; + int err, task_fd, map_fd; + void *mem; + + task_fd = sys_pidfd_open(getpid(), 0); + if (!ASSERT_OK_FD(task_fd, "task_fd")) + return; + + mem = mmap(NULL, page_size * 2, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + if (!ASSERT_OK_PTR(mem, "mmap(page_size * 2)")) { + close(task_fd); + return; + } + + skel = task_ls_uptr__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_open_and_load")) + goto out; + + map_fd = bpf_map__fd(skel->maps.datamap); + value.udata = mem + page_size - offsetof(struct user_data, b); + err = bpf_map_update_elem(map_fd, &task_fd, &value, 0); + if (!ASSERT_ERR(err, "update_elem(udata)")) + goto out; + ASSERT_EQ(errno, EOPNOTSUPP, "errno"); + +out: + task_ls_uptr__destroy(skel); + close(task_fd); + munmap(mem, page_size * 2); +} + void test_task_local_storage(void) { if (test__start_subtest("sys_enter_exit")) @@ -379,4 +416,6 @@ void test_task_local_storage(void) test_nodeadlock(); if (test__start_subtest("uptr_basic")) test_uptr_basic(); + if (test__start_subtest("uptr_across_pages")) + test_uptr_across_pages(); } From patchwork Tue Oct 15 00:50:00 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Martin KaFai Lau X-Patchwork-Id: 13835575 X-Patchwork-Delegate: bpf@iogearbox.net Received: from out-173.mta1.migadu.com (out-173.mta1.migadu.com [95.215.58.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ABC0150285 for ; Tue, 15 Oct 2024 00:50:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953446; cv=none; b=SNbf6WzcaNhh90OM6llkmDlsmRZ5a4SeOw07Ij625/9olaTmduczReYGkj7R++XSbwDomoXIDjzmZ2QpeO0VDC4vgJJJGtPcdbnJI1vp02VU2YwGZi8S2b6Q0ZP6vwKyn5CgTe7MVUpRRmG0WNOcC9dpyt39dMRtMbRTbCrGb+0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953446; c=relaxed/simple; bh=iGMeIqK+kOVUnJCR+JTQY82a2XAZgQe03k/TT6CfqMw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Y2MoDmdM5syXHGNP/v8tFt4Xg95hzQ+0VHNSTG6inm4ktmJNGmKJBLPq1R/3LWEizJgP0WMeOoeaO8fUK4aJ2JovEHRCy5UE7GSR4zepyNYpQJl+KDMkWY4QfmOZVlaBlMXCAZX0yyK4NlPfI3YpNjb0qFaxUmTVSeYu8d4o7/M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=aSgqGV48; arc=none smtp.client-ip=95.215.58.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="aSgqGV48" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1728953443; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/6WYrtn3u3mhY3z9OtDumK7nyd5MYlnMvisoK3b50PY=; b=aSgqGV48m8J2kTXbofVrSh2EPyBF5LGwxiPPJ6Q4aq7Kfm3MlVu+BCBSIR2mIfZVNsH4e1 oPrHahdvQXNwHUR/i6a13Y0zzvLbFegrU1vx81uuOU3mMfnexqKpqW0Cji3p+/PBW18BOK 2HMdkzH/ezVKwH2alLmjsk99Wg7UBI8= From: Martin KaFai Lau To: bpf@vger.kernel.org Cc: Alexei Starovoitov , Andrii Nakryiko , Daniel Borkmann , Kui-Feng Lee , kernel-team@meta.com Subject: [PATCH v5 bpf-next 10/12] selftests/bpf: Add update_elem failure test for task storage uptr Date: Mon, 14 Oct 2024 17:50:00 -0700 Message-ID: <20241015005008.767267-11-martin.lau@linux.dev> In-Reply-To: <20241015005008.767267-1-martin.lau@linux.dev> References: <20241015005008.767267-1-martin.lau@linux.dev> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Patchwork-Delegate: bpf@iogearbox.net From: Martin KaFai Lau This patch test the following failures in syscall update_elem 1. The first update_elem(BPF_F_LOCK) should be EOPNOTSUPP. syscall.c takes care of unpinning the uptr. 2. The second update_elem(BPF_EXIST) fails. syscall.c takes care of unpinning the uptr. 3. The forth update_elem(BPF_NOEXIST) fails. bpf_local_storage_update takes care of unpinning the uptr. Signed-off-by: Martin KaFai Lau --- .../bpf/prog_tests/task_local_storage.c | 45 +++++++++++++++++++ .../selftests/bpf/progs/uptr_update_failure.c | 44 ++++++++++++++++++ .../testing/selftests/bpf/uptr_test_common.h | 5 +++ 3 files changed, 94 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/uptr_update_failure.c diff --git a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c index 9a081aaa26a3..4abe1621c5c0 100644 --- a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c +++ b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c @@ -17,6 +17,7 @@ #include "task_storage_nodeadlock.skel.h" #include "uptr_test_common.h" #include "task_ls_uptr.skel.h" +#include "uptr_update_failure.skel.h" static void test_sys_enter_exit(void) { @@ -404,6 +405,48 @@ static void test_uptr_across_pages(void) munmap(mem, page_size * 2); } +static void test_uptr_update_failure(void) +{ + struct value_lock_type value = {}; + struct uptr_update_failure *skel; + int err, task_fd, map_fd; + + task_fd = sys_pidfd_open(getpid(), 0); + if (!ASSERT_OK_FD(task_fd, "task_fd")) + return; + + skel = uptr_update_failure__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_open_and_load")) + goto out; + + map_fd = bpf_map__fd(skel->maps.datamap); + + value.udata = &udata; + err = bpf_map_update_elem(map_fd, &task_fd, &value, BPF_F_LOCK); + if (!ASSERT_ERR(err, "update_elem(udata, BPF_F_LOCK)")) + goto out; + ASSERT_EQ(errno, EOPNOTSUPP, "errno"); + + err = bpf_map_update_elem(map_fd, &task_fd, &value, BPF_EXIST); + if (!ASSERT_ERR(err, "update_elem(udata, BPF_EXIST)")) + goto out; + ASSERT_EQ(errno, ENOENT, "errno"); + + err = bpf_map_update_elem(map_fd, &task_fd, &value, BPF_NOEXIST); + if (!ASSERT_OK(err, "update_elem(udata, BPF_NOEXIST)")) + goto out; + + value.udata = &udata2; + err = bpf_map_update_elem(map_fd, &task_fd, &value, BPF_NOEXIST); + if (!ASSERT_ERR(err, "update_elem(udata2, BPF_NOEXIST)")) + goto out; + ASSERT_EQ(errno, EEXIST, "errno"); + +out: + uptr_update_failure__destroy(skel); + close(task_fd); +} + void test_task_local_storage(void) { if (test__start_subtest("sys_enter_exit")) @@ -418,4 +461,6 @@ void test_task_local_storage(void) test_uptr_basic(); if (test__start_subtest("uptr_across_pages")) test_uptr_across_pages(); + if (test__start_subtest("uptr_update_failure")) + test_uptr_update_failure(); } diff --git a/tools/testing/selftests/bpf/progs/uptr_update_failure.c b/tools/testing/selftests/bpf/progs/uptr_update_failure.c new file mode 100644 index 000000000000..f986e521a053 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/uptr_update_failure.c @@ -0,0 +1,44 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ + +#include +#include +#include "uptr_test_common.h" + +struct { + __uint(type, BPF_MAP_TYPE_TASK_STORAGE); + __uint(map_flags, BPF_F_NO_PREALLOC); + __type(key, int); + __type(value, struct value_lock_type); +} datamap SEC(".maps"); + +struct user_data __dummy; + +/* load test only. not used */ +SEC("syscall") +int not_used(void *ctx) +{ + struct value_lock_type *ptr; + struct task_struct *task; + struct user_data *udata; + + task = bpf_get_current_task_btf(); + ptr = bpf_task_storage_get(&datamap, task, 0, 0); + if (!ptr) + return 0; + + bpf_spin_lock(&ptr->lock); + + udata = ptr->udata; + if (!udata) { + bpf_spin_unlock(&ptr->lock); + return 0; + } + udata->result = MAGIC_VALUE + udata->a + udata->b; + + bpf_spin_unlock(&ptr->lock); + + return 0; +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/uptr_test_common.h b/tools/testing/selftests/bpf/uptr_test_common.h index eb81bdd31949..1356a61bce36 100644 --- a/tools/testing/selftests/bpf/uptr_test_common.h +++ b/tools/testing/selftests/bpf/uptr_test_common.h @@ -31,4 +31,9 @@ struct value_type { struct nested_udata nested; }; +struct value_lock_type { + struct user_data __uptr *udata; + struct bpf_spin_lock lock; +}; + #endif From patchwork Tue Oct 15 00:50:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Martin KaFai Lau X-Patchwork-Id: 13835576 X-Patchwork-Delegate: bpf@iogearbox.net Received: from out-172.mta1.migadu.com (out-172.mta1.migadu.com [95.215.58.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 38EF83DBB6 for ; Tue, 15 Oct 2024 00:50:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953448; cv=none; b=DHAQ01GZoa647J5yfOfaMUKCyKXkfcwMMH2rtiwCoUy//tKXhWcy4RJLTEvULzuRdhO52xVisfkRnU3UgwXQe393DUtxqYrb2b+mfQLp50g+p0Wike55NN8pVg4gcucAYi8B7RKHRbAw5GUVZuvBpXtKNdnriprVpXGBNVdNzaU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953448; c=relaxed/simple; bh=uclqs8seZwV0E0uCZoN5FbeeH2dyuIw5/J4tU5EjYvY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Jb9VktLCqamrgL5osZ2Tehmkwd7bl1rNSi54HkL7rmKSlbExn10Asnyy99+D15IMd2bdiuS3auIVMzklBRpRnfwJBZYJEXGbTLeBlsS0vyUS+mLeFHvIbgHQA+93ztINajdlBLCNyyXssvOzX7ocke5Qzn1Iq7vGL02rcLnPXJU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=L84N4x2l; arc=none smtp.client-ip=95.215.58.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="L84N4x2l" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1728953445; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=8B2iH2nbqDUZ0YCT+c3Jyxa9bvxK6PhoXqTLFm/tZgI=; b=L84N4x2lgICg2V+djw32/n1oprSsXK49cAI4ZGr/bZQcMQ8QQfBpMy9gHlUa1IlMoZiawh 14MQBqaA5O1Gsg1NJMvjmaSgncClyvzl0DY7KXCP1l5RFCM1U8tXbJtS9JBoqXrk2f3Un1 18BFeNgHc19RJEn/2k6kEEvN2tCX6fM= From: Martin KaFai Lau To: bpf@vger.kernel.org Cc: Alexei Starovoitov , Andrii Nakryiko , Daniel Borkmann , Kui-Feng Lee , kernel-team@meta.com Subject: [PATCH v5 bpf-next 11/12] selftests/bpf: Add uptr failure verifier tests Date: Mon, 14 Oct 2024 17:50:01 -0700 Message-ID: <20241015005008.767267-12-martin.lau@linux.dev> In-Reply-To: <20241015005008.767267-1-martin.lau@linux.dev> References: <20241015005008.767267-1-martin.lau@linux.dev> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Patchwork-Delegate: bpf@iogearbox.net From: Martin KaFai Lau Add verifier tests to ensure invalid uptr usages are rejected. Signed-off-by: Martin KaFai Lau --- .../bpf/prog_tests/task_local_storage.c | 2 + .../selftests/bpf/progs/uptr_failure.c | 121 ++++++++++++++++++ 2 files changed, 123 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/uptr_failure.c diff --git a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c index 4abe1621c5c0..55e36956bf52 100644 --- a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c +++ b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c @@ -18,6 +18,7 @@ #include "uptr_test_common.h" #include "task_ls_uptr.skel.h" #include "uptr_update_failure.skel.h" +#include "uptr_failure.skel.h" static void test_sys_enter_exit(void) { @@ -463,4 +464,5 @@ void test_task_local_storage(void) test_uptr_across_pages(); if (test__start_subtest("uptr_update_failure")) test_uptr_update_failure(); + RUN_TESTS(uptr_failure); } diff --git a/tools/testing/selftests/bpf/progs/uptr_failure.c b/tools/testing/selftests/bpf/progs/uptr_failure.c new file mode 100644 index 000000000000..e035130c2959 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/uptr_failure.c @@ -0,0 +1,121 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ + +#include +#include +#include "bpf_experimental.h" +#include "bpf_misc.h" +#include "uptr_test_common.h" + +void bpf_cgroup_release(struct cgroup *cgrp) __ksym; + +struct { + __uint(type, BPF_MAP_TYPE_TASK_STORAGE); + __uint(map_flags, BPF_F_NO_PREALLOC); + __type(key, int); + __type(value, struct value_type); +} datamap SEC(".maps"); + +/* Avoid fwd user_data type in btf */ +struct user_data __dummy; + +SEC("?syscall") +__failure __msg("store to uptr disallowed") +int uptr_write(const void *ctx) +{ + struct task_struct *task; + struct value_type *v; + + task = bpf_get_current_task_btf(); + v = bpf_task_storage_get(&datamap, task, 0, + BPF_LOCAL_STORAGE_GET_F_CREATE); + if (!v) + return 0; + + v->udata = NULL; + return 0; +} + +SEC("?syscall") +__failure __msg("store to uptr disallowed") +int uptr_write_nested(const void *ctx) +{ + struct task_struct *task; + struct value_type *v; + + task = bpf_get_current_task_btf(); + v = bpf_task_storage_get(&datamap, task, 0, + BPF_LOCAL_STORAGE_GET_F_CREATE); + if (!v) + return 0; + + v->nested.udata = NULL; + return 0; +} + +SEC("?syscall") +__failure __msg("R1 invalid mem access 'mem_or_null'") +int uptr_no_null_check(const void *ctx) +{ + struct task_struct *task; + struct value_type *v; + + task = bpf_get_current_task_btf(); + v = bpf_task_storage_get(&datamap, task, 0, + BPF_LOCAL_STORAGE_GET_F_CREATE); + if (!v) + return 0; + + v->udata->result = 0; + + return 0; +} + +SEC("?syscall") +__failure __msg("doesn't point to kptr") +int uptr_kptr_xchg(const void *ctx) +{ + struct task_struct *task; + struct value_type *v; + + task = bpf_get_current_task_btf(); + v = bpf_task_storage_get(&datamap, task, 0, + BPF_LOCAL_STORAGE_GET_F_CREATE); + if (!v) + return 0; + + bpf_kptr_xchg(&v->udata, NULL); + + return 0; +} + +SEC("?syscall") +__failure __msg("invalid mem access 'scalar'") +int uptr_obj_new(const void *ctx) +{ + struct value_type *v; + struct cgroup *cgrp; + + v = bpf_obj_new(typeof(*v)); + if (!v) + return 0; + + cgrp = bpf_kptr_xchg(&v->cgrp, NULL); + if (cgrp) { + /* Avoid fwd cgroup type in btf */ + int lvl = cgrp->level; + + bpf_cgroup_release(cgrp); + bpf_obj_drop(v); + return lvl; + } + + if (v->udata) + v->udata->result = 0; + + bpf_obj_drop(v); + + return 0; +} + +char _license[] SEC("license") = "GPL"; From patchwork Tue Oct 15 00:50:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Martin KaFai Lau X-Patchwork-Id: 13835577 X-Patchwork-Delegate: bpf@iogearbox.net Received: from out-183.mta1.migadu.com (out-183.mta1.migadu.com [95.215.58.183]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 11E2850285 for ; Tue, 15 Oct 2024 00:50:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.183 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953451; cv=none; b=PCRSB7fZ17PhJgXxzvFJsT5rN1Rp67pk8jo/dGmqJbyu6eX/GG8dCl1NvMGjG57CmFTF1PjwfifnjJLuwAonwehQwo9V7UWWO7Byi2BR1RLJl2f6RUjGNrTabrzNuZDhkD+lnaHanuPi9yE01RqIVtMn2G9rvNSXyYSK0sMSvuE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728953451; c=relaxed/simple; bh=cADOfNss6Vt6FY6dU3qbAnK89LJqUEJ0IutnShQyyYU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Vc7rYf68GTUOY6otuDPu7O5cgDDLrRGL96SR06sGekojwtzkYxwKu1KlUS0TVgVQbgCrzWpAve8dzqkk6FNdZaJHlUWzzi4AZPyXMbt2ZpVmhysV4uQv3xatxHxtaINJ3khIHijA1134QDkjuqNUzl1Ri9cxBSNXEHNf+Bq0E6E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=HmqiYBfX; arc=none smtp.client-ip=95.215.58.183 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="HmqiYBfX" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1728953448; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Ki6bqCuubJOtljTc48ZkxQvdEezGObz9kUoNCNQHYCM=; b=HmqiYBfX7CXq79x1rAfxd4ot1wMnJf2cIo+o3aWcQQKKEWV4C3AyFotPtR4aHt+1/AoafN No5x8WGXB66G7mzLEx6uJQD4nIst6leAUo7un6rebIjMOO8gXxcBlVQeJ++1ZlGcOz3mZ+ dkaPY0iyfxzmFYAW7cCRQZRTba744sE= From: Martin KaFai Lau To: bpf@vger.kernel.org Cc: Alexei Starovoitov , Andrii Nakryiko , Daniel Borkmann , Kui-Feng Lee , kernel-team@meta.com Subject: [PATCH v5 bpf-next 12/12] selftests/bpf: Create task_local_storage map with invalid uptr's struct Date: Mon, 14 Oct 2024 17:50:02 -0700 Message-ID: <20241015005008.767267-13-martin.lau@linux.dev> In-Reply-To: <20241015005008.767267-1-martin.lau@linux.dev> References: <20241015005008.767267-1-martin.lau@linux.dev> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Patchwork-Delegate: bpf@iogearbox.net From: Martin KaFai Lau This patch tests the map creation failure when the map_value has unsupported uptr. The two caes are the struct is larger than one page or the struct is a kernel struct. Signed-off-by: Martin KaFai Lau --- .../bpf/prog_tests/task_local_storage.c | 44 +++++++++++++++++ .../selftests/bpf/progs/uptr_map_failure.c | 49 +++++++++++++++++++ tools/testing/selftests/bpf/test_progs.h | 8 +++ .../testing/selftests/bpf/uptr_test_common.h | 14 ++++++ 4 files changed, 115 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/uptr_map_failure.c diff --git a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c index 55e36956bf52..8076545e064e 100644 --- a/tools/testing/selftests/bpf/prog_tests/task_local_storage.c +++ b/tools/testing/selftests/bpf/prog_tests/task_local_storage.c @@ -10,6 +10,7 @@ #include #include #include +#include #include "task_local_storage_helpers.h" #include "task_local_storage.skel.h" #include "task_local_storage_exit_creds.skel.h" @@ -19,6 +20,7 @@ #include "task_ls_uptr.skel.h" #include "uptr_update_failure.skel.h" #include "uptr_failure.skel.h" +#include "uptr_map_failure.skel.h" static void test_sys_enter_exit(void) { @@ -448,6 +450,40 @@ static void test_uptr_update_failure(void) close(task_fd); } +static void test_uptr_map_failure(const char *map_name, int expected_errno) +{ + LIBBPF_OPTS(bpf_map_create_opts, create_attr); + struct uptr_map_failure *skel; + struct bpf_map *map; + struct btf *btf; + int map_fd, err; + + skel = uptr_map_failure__open(); + if (!ASSERT_OK_PTR(skel, "uptr_map_failure__open")) + return; + + map = bpf_object__find_map_by_name(skel->obj, map_name); + btf = bpf_object__btf(skel->obj); + err = btf__load_into_kernel(btf); + if (!ASSERT_OK(err, "btf__load_into_kernel")) + goto done; + + create_attr.map_flags = bpf_map__map_flags(map); + create_attr.btf_fd = btf__fd(btf); + create_attr.btf_key_type_id = bpf_map__btf_key_type_id(map); + create_attr.btf_value_type_id = bpf_map__btf_value_type_id(map); + map_fd = bpf_map_create(bpf_map__type(map), map_name, + bpf_map__key_size(map), bpf_map__value_size(map), + 0, &create_attr); + if (ASSERT_ERR_FD(map_fd, "map_create")) + ASSERT_EQ(errno, expected_errno, "errno"); + else + close(map_fd); + +done: + uptr_map_failure__destroy(skel); +} + void test_task_local_storage(void) { if (test__start_subtest("sys_enter_exit")) @@ -464,5 +500,13 @@ void test_task_local_storage(void) test_uptr_across_pages(); if (test__start_subtest("uptr_update_failure")) test_uptr_update_failure(); + if (test__start_subtest("uptr_map_failure_e2big")) { + if (getpagesize() == PAGE_SIZE) + test_uptr_map_failure("large_uptr_map", E2BIG); + else + test__skip(); + } + if (test__start_subtest("uptr_map_failure_kstruct")) + test_uptr_map_failure("kstruct_uptr_map", EINVAL); RUN_TESTS(uptr_failure); } diff --git a/tools/testing/selftests/bpf/progs/uptr_map_failure.c b/tools/testing/selftests/bpf/progs/uptr_map_failure.c new file mode 100644 index 000000000000..f3afc2e5aff6 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/uptr_map_failure.c @@ -0,0 +1,49 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ + +#include +#include +#include "uptr_test_common.h" + +/* Avoid fwd btf type */ +struct large_data dummy_large_data; + +struct { + __uint(type, BPF_MAP_TYPE_TASK_STORAGE); + __uint(map_flags, BPF_F_NO_PREALLOC); + __type(key, int); + __type(value, struct large_uptr); +} large_uptr_map SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_TASK_STORAGE); + __uint(map_flags, BPF_F_NO_PREALLOC); + __type(key, int); + __type(value, struct kstruct_uptr); +} kstruct_uptr_map SEC(".maps"); + +/* compile only. not loaded */ +SEC("?syscall") +int not_loaded(const void *ctx) +{ + struct kstruct_uptr *kstruct_uptr; + struct large_uptr *large_uptr; + struct task_struct *task; + + task = bpf_get_current_task_btf(); + + kstruct_uptr = bpf_task_storage_get(&kstruct_uptr_map, task, NULL, + BPF_LOCAL_STORAGE_GET_F_CREATE); + if (!kstruct_uptr) + return 0; + + if (kstruct_uptr->cgrp) + return kstruct_uptr->cgrp->level; + + large_uptr = bpf_task_storage_get(&large_uptr_map, task, NULL, + BPF_LOCAL_STORAGE_GET_F_CREATE); + if (large_uptr && large_uptr->udata) + large_uptr->udata->a = 0; + + return 0; +} diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h index 7767d9a825ae..7a58895867c3 100644 --- a/tools/testing/selftests/bpf/test_progs.h +++ b/tools/testing/selftests/bpf/test_progs.h @@ -390,6 +390,14 @@ int test__join_cgroup(const char *path); ___ok; \ }) +#define ASSERT_ERR_FD(fd, name) ({ \ + static int duration = 0; \ + int ___fd = (fd); \ + bool ___ok = ___fd < 0; \ + CHECK(!___ok, (name), "unexpected fd: %d\n", ___fd); \ + ___ok; \ +}) + #define SYS(goto_label, fmt, ...) \ ({ \ char cmd[1024]; \ diff --git a/tools/testing/selftests/bpf/uptr_test_common.h b/tools/testing/selftests/bpf/uptr_test_common.h index 1356a61bce36..c332d7be6631 100644 --- a/tools/testing/selftests/bpf/uptr_test_common.h +++ b/tools/testing/selftests/bpf/uptr_test_common.h @@ -5,6 +5,7 @@ #define _UPTR_TEST_COMMON_H #define MAGIC_VALUE 0xabcd1234 +#define PAGE_SIZE 4096 #ifndef __BPF__ struct cgroup { @@ -36,4 +37,17 @@ struct value_lock_type { struct bpf_spin_lock lock; }; +struct large_data { + __u8 one_page[PAGE_SIZE]; + int a; +}; + +struct large_uptr { + struct large_data __uptr *udata; +}; + +struct kstruct_uptr { + struct cgroup __uptr *cgrp; +}; + #endif