diff mbox series

[bpf-next,v2,19/25] bpf: Introduce bpf_kptr_new

Message ID 20221013062303.896469-20-memxor@gmail.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Headers show
Series Local kptrs, BPF linked lists | expand

Checks

Context Check Description
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-1 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-7 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-10 fail Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-11 fail Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-13 fail Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-14 fail Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-6 success Logs for test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_progs_no_alu32 on s390x with gcc
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count fail Series longer than 15 patches (and no cover letter)
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit fail Errors and warnings before: 1373 this patch: 1355
netdev/cc_maintainers warning 11 maintainers not CCed: sdf@google.com john.fastabend@gmail.com yhs@fb.com haoluo@google.com linux-kselftest@vger.kernel.org jolsa@kernel.org kpsingh@kernel.org song@kernel.org shuah@kernel.org mykolal@fb.com martin.lau@linux.dev
netdev/build_clang success Errors and warnings before: 159 this patch: 159
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn fail Errors and warnings before: 1367 this patch: 1368
netdev/checkpatch warning CHECK: extern prototypes should be avoided in .h files CHECK: multiple assignments should be avoided WARNING: 'unitialized' may be misspelled - perhaps 'uninitialized'? WARNING: line length of 101 exceeds 80 columns WARNING: line length of 103 exceeds 80 columns WARNING: line length of 105 exceeds 80 columns WARNING: line length of 106 exceeds 80 columns WARNING: line length of 108 exceeds 80 columns WARNING: line length of 111 exceeds 80 columns WARNING: line length of 82 exceeds 80 columns WARNING: line length of 83 exceeds 80 columns WARNING: line length of 84 exceeds 80 columns WARNING: line length of 85 exceeds 80 columns WARNING: line length of 86 exceeds 80 columns WARNING: line length of 87 exceeds 80 columns WARNING: line length of 89 exceeds 80 columns WARNING: line length of 94 exceeds 80 columns WARNING: line length of 95 exceeds 80 columns WARNING: line length of 96 exceeds 80 columns WARNING: line length of 97 exceeds 80 columns WARNING: line length of 98 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-4 success Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-5 success Logs for set-matrix

Commit Message

Kumar Kartikeya Dwivedi Oct. 13, 2022, 6:22 a.m. UTC
Introduce a type-safe memory allocator bpf_kptr_new for BPF programs. The
kernel side kfunc is named bpf_kptr_new_impl, as passing hidden
arguments to kfuncs still requires having them in the prototype, unlike
BPF helpers, which always take 5 arguments and have them checked using
bpf_func_proto in the verifier, ignoring unset argument types.

Introduce the __ign suffix to ignore a specific kfunc argument during
type checks, then use it to introduce support for passing type metadata
to the bpf_kptr_new_impl kfunc.

The user passes the BTF ID of the type it wants to allocate in program
BTF; the verifier then rewrites the first argument as the size of this
type, after performing some sanity checks (to ensure the type exists and
is a struct type).

The second argument, flags, is reserved and must be 0 for now.

The third argument is also fixed up and passed by the verifier. It is
the btf_struct_meta for the type being allocated. It is needed mostly
for the offset array, which is required for zero-initializing special
fields while leaving the rest of the storage in an uninitialized state.

It will also be needed in the next patch to perform proper destruction
of the object's special fields.

A convenience macro is included in the bpf_experimental.h header to hide
the ugly details of the implementation, making user code look similar to
a language-level extension that allocates and constructs fields of a
user type.

struct bar {
	struct bpf_list_node node;
};

struct foo {
	struct bpf_spin_lock lock;
	struct bpf_list_head head __contains(bar, node);
};

void prog(void) {
	struct foo *f;

	f = bpf_kptr_new(typeof(*f));
	if (!f)
		return;
	...
}

A key piece of this story is still missing, i.e. the free function,
which will come in the next patch.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/linux/bpf.h                           |  21 ++--
 include/linux/bpf_verifier.h                  |   2 +
 kernel/bpf/core.c                             |  14 +++
 kernel/bpf/helpers.c                          |  41 +++++--
 kernel/bpf/verifier.c                         | 107 ++++++++++++++++--
 .../testing/selftests/bpf/bpf_experimental.h  |  19 ++++
 6 files changed, 181 insertions(+), 23 deletions(-)

Comments

Alexei Starovoitov Oct. 19, 2022, 2:31 a.m. UTC | #1
On Thu, Oct 13, 2022 at 11:52:57AM +0530, Kumar Kartikeya Dwivedi wrote:
> +void *bpf_kptr_new_impl(u64 local_type_id__k, u64 flags, void *meta__ign)
> +{
> +	struct btf_struct_meta *meta = meta__ign;
> +	u64 size = local_type_id__k;
> +	void *p;
> +
> +	if (unlikely(flags || !bpf_global_ma_set))
> +		return NULL;

Unused 'flags' looks weird in unstable api. Just drop it?
And keep it as:
void *bpf_kptr_new(u64 local_type_id__k, struct btf_struct_meta *meta__ign);

and in bpf_experimental.h:

extern void *bpf_kptr_new(__u64 local_type_id) __ksym;

since __ign args are ignored during kfunc type match
the bpf progs can use it without #define.

> +	p = bpf_mem_alloc(&bpf_global_ma, size);
> +	if (!p)
> +		return NULL;
> +	if (meta)
> +		bpf_obj_init(meta->off_arr, p);

I'm starting to dislike all that _arr and _tab suffixes in the verifier code base.
It reminds me of programming style where people tried to add types into
variable names. imo dropping _arr would be just fine.
Kumar Kartikeya Dwivedi Oct. 19, 2022, 5:58 a.m. UTC | #2
On Wed, Oct 19, 2022 at 08:01:24AM IST, Alexei Starovoitov wrote:
> On Thu, Oct 13, 2022 at 11:52:57AM +0530, Kumar Kartikeya Dwivedi wrote:
> > +void *bpf_kptr_new_impl(u64 local_type_id__k, u64 flags, void *meta__ign)
> > +{
> > +	struct btf_struct_meta *meta = meta__ign;
> > +	u64 size = local_type_id__k;
> > +	void *p;
> > +
> > +	if (unlikely(flags || !bpf_global_ma_set))
> > +		return NULL;
>
> Unused 'flags' looks weird in unstable api. Just drop it?
> And keep it as:
> void *bpf_kptr_new(u64 local_type_id__k, struct btf_struct_meta *meta__ign);
>
> and in bpf_experimental.h:
>
> extern void *bpf_kptr_new(__u64 local_type_id) __ksym;
>
> since __ign args are ignored during kfunc type match
> the bpf progs can use it without #define.
>

It's ignored during check_kfunc_call, but libbpf doesn't ignore that. The
prototypes will not be the same. I guess I'll have to teach it to do that during
type match, but IDK how you feel about that.

Otherwise unless you want people to manually pass something to the ignored
argument, we have to hide it behind a macro.

I actually like the macro on top, then I don't even pass the type ID but the
type. But that's a personal preference, and I don't feel strongly about it.

So in C one does malloc(sizeof(*p)), here we'll just write
bpf_kptr_new(typeof(*p)). YMMV.

> > +	p = bpf_mem_alloc(&bpf_global_ma, size);
> > +	if (!p)
> > +		return NULL;
> > +	if (meta)
> > +		bpf_obj_init(meta->off_arr, p);
>
> I'm starting to dislike all that _arr and _tab suffixes in the verifier code base.
> It reminds me of programming style where people tried to add types into
> variable names. imo dropping _arr would be just fine.

Ack, I'll do it in v3.

Also, I'd like to invite people to please bikeshed a bit over the naming of the
APIs, e.g. whether it should be bpf_kptr_drop vs bpf_kptr_delete.

In the BPF list API, it's named bpf_list_del but it's actually distinct from how
list_del in the kernel works. So does it make sense to give them a different
name (like pop_front/pop_back and push_front/push_back)?

Because even bpf_list_add takes a bpf_list_head; in the kernel there's no
distinction between node and head, so you can do list_add on a node as well,
but that won't be possible with the kfunc (unless we overload the head
argument to also work with nodes).

Later we'll probably have to add bpf_list_node_add etc. that add before or after
a node to make that work.

The main question is whether it should closely resemble the linked list API in
the kernel, or can it steer away considerably from that?
Alexei Starovoitov Oct. 19, 2022, 4:31 p.m. UTC | #3
On Tue, Oct 18, 2022 at 10:58 PM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> On Wed, Oct 19, 2022 at 08:01:24AM IST, Alexei Starovoitov wrote:
> > On Thu, Oct 13, 2022 at 11:52:57AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > +void *bpf_kptr_new_impl(u64 local_type_id__k, u64 flags, void *meta__ign)
> > > +{
> > > +   struct btf_struct_meta *meta = meta__ign;
> > > +   u64 size = local_type_id__k;
> > > +   void *p;
> > > +
> > > +   if (unlikely(flags || !bpf_global_ma_set))
> > > +           return NULL;
> >
> > Unused 'flags' looks weird in unstable api. Just drop it?
> > And keep it as:
> > void *bpf_kptr_new(u64 local_type_id__k, struct btf_struct_meta *meta__ign);
> >
> > and in bpf_experimental.h:
> >
> > extern void *bpf_kptr_new(__u64 local_type_id) __ksym;
> >
> > since __ign args are ignored during kfunc type match
> > the bpf progs can use it without #define.
> >
>
> It's ignored during check_kfunc_call, but libbpf doesn't ignore that. The
> prototypes will not be the same. I guess I'll have to teach it to do that during
> type match, but IDK how you feel about that.

libbpf does the full type match, really?
Could you point me to the code?

> Otherwise unless you want people to manually pass something to the ignored
> argument, we have to hide it behind a macro.
>
> I actually like the macro on top, then I don't even pass the type ID but the
> type. But that's a personal preference, and I don't feel strongly about it.
>
> So in C one does malloc(sizeof(*p)), here we'll just write
> bpf_kptr_new(typeof(*p)). YMMV.

bpf_kptr_new(typeof(*p)) is cleaner.

> > > +   p = bpf_mem_alloc(&bpf_global_ma, size);
> > > +   if (!p)
> > > +           return NULL;
> > > +   if (meta)
> > > +           bpf_obj_init(meta->off_arr, p);
> >
> > I'm starting to dislike all that _arr and _tab suffixes in the verifier code base.
> > It reminds me of programming style where people tried to add types into
> > variable names. imo dropping _arr would be just fine.
>
> Ack, I'll do it in v3.
>
> Also, I'd like to invite people to please bikeshed a bit over the naming of the
> APIs, e.g. whether it should be bpf_kptr_drop vs bpf_kptr_delete.

bpf_kptr_drop is more precise.
delete assumes instant free which is not the case here.

How about
extern void *__bpf_obj_new(__u64 local_type_id) __ksym;
extern void bpf_obj_drop(void *obj) __ksym;
#define bpf_obj_new(t) \
 (t *)__bpf_obj_new(bpf_core_type_id_local(t));

kptr means 'kernel pointer'.
Here we have program supplied object.
It feels 'obj' is better than 'kptr' in this context.

> In the BPF list API, it's named bpf_list_del but it's actually distinct from how
> list_del in the kernel works. So it does make sense to give them a different
> name (like pop_front/pop_back and push_front/push_back)?
>
> Because even bpf_list_add takes bpf_list_head, in the kernel there's no
> distinction between node and head, so you can do list_add on a node as well, but
> it won't be possible with the kfunc (unless we overload the head argument to
> also work with nodes).
>
> Later we'll probably have to add bpf_list_node_add etc. that add before or after
> a node to make that work.
>
> The main question is whether it should closely resemble the linked list API in
> the kernel, or can it steer away considerably from that?

If we do doubly linked list we should allow delete in
the middle with
bpf_list_del_any(head, node)

and
bpf_list_pop_front/pop_back(head)

bpf_list_add(node, head) would match kernel style,
but I think it's cleaner to have head as 1st arg.
In that sense new pop/push/_front/_back are cleaner.
And similar for rbtree.

If we keep (node, head) and (rb_node, rb_root) order
we should keep kernel names.
Kumar Kartikeya Dwivedi Oct. 20, 2022, 12:44 a.m. UTC | #4
On Wed, Oct 19, 2022 at 10:01:21PM IST, Alexei Starovoitov wrote:
> On Tue, Oct 18, 2022 at 10:58 PM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > On Wed, Oct 19, 2022 at 08:01:24AM IST, Alexei Starovoitov wrote:
> > > On Thu, Oct 13, 2022 at 11:52:57AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > > +void *bpf_kptr_new_impl(u64 local_type_id__k, u64 flags, void *meta__ign)
> > > > +{
> > > > +   struct btf_struct_meta *meta = meta__ign;
> > > > +   u64 size = local_type_id__k;
> > > > +   void *p;
> > > > +
> > > > +   if (unlikely(flags || !bpf_global_ma_set))
> > > > +           return NULL;
> > >
> > > Unused 'flags' looks weird in unstable api. Just drop it?
> > > And keep it as:
> > > void *bpf_kptr_new(u64 local_type_id__k, struct btf_struct_meta *meta__ign);
> > >
> > > and in bpf_experimental.h:
> > >
> > > extern void *bpf_kptr_new(__u64 local_type_id) __ksym;
> > >
> > > since __ign args are ignored during kfunc type match
> > > the bpf progs can use it without #define.
> > >
> >
> > It's ignored during check_kfunc_call, but libbpf doesn't ignore that. The
> > prototypes will not be the same. I guess I'll have to teach it to do that during
> > type match, but IDK how you feel about that.
>
> libbpf does the full type match, really?
> Could you point me to the code?
>

Not full type match, but the number of arguments must be the same, so it won't
allow having the kfunc as:

void *bpf_kptr_new(u64 local_type_id__k, struct btf_struct_meta *meta__ign);

in the kernel and ksym declaration in the program as:

extern void *bpf_kptr_new(__u64 local_type_id) __ksym;

I get:

libbpf: extern (func ksym) 'bpf_kptr_new_impl': func_proto [25] incompatible with kernel [60043]

The vlen of the func_proto in the kernel type is 2; for us it will be 1.

> > Otherwise unless you want people to manually pass something to the ignored
> > argument, we have to hide it behind a macro.
> >
> > I actually like the macro on top, then I don't even pass the type ID but the
> > type. But that's a personal preference, and I don't feel strongly about it.
> >
> > So in C one does malloc(sizeof(*p)), here we'll just write
> > bpf_kptr_new(typeof(*p)). YMMV.
>
> bpf_kptr_new(typeof(*p)) is cleaner.
>

So if we're having a macro anyway, the thing above can be hidden behind it.

> > > > +   p = bpf_mem_alloc(&bpf_global_ma, size);
> > > > +   if (!p)
> > > > +           return NULL;
> > > > +   if (meta)
> > > > +           bpf_obj_init(meta->off_arr, p);
> > >
> > > I'm starting to dislike all that _arr and _tab suffixes in the verifier code base.
> > > It reminds me of programming style where people tried to add types into
> > variable names. imo dropping _arr would be just fine.
> >
> > Ack, I'll do it in v3.
> >
> > Also, I'd like to invite people to please bikeshed a bit over the naming of the
> > APIs, e.g. whether it should be bpf_kptr_drop vs bpf_kptr_delete.
>
> bpf_kptr_drop is more precise.
> delete assumes instant free which is not the case here.
>
> How about
> extern void *__bpf_obj_new(__u64 local_type_id) __ksym;
> extern void bpf_obj_drop(void *obj) __ksym;
> #define bpf_obj_new(t) \
>  (t *)__bpf_obj_new(bpf_core_type_id_local(t));
>
> kptr means 'kernel pointer'.
> Here we have program supplied object.
> It feels 'obj' is better than 'kptr' in this context.
>

I agree, I'll rename it to bpf_obj_*.

Also, that __bpf_obj_new doesn't work yet in clang [0] but I think we can rename
it to that once clang is fixed.

 [0]: https://reviews.llvm.org/D136041

> > In the BPF list API, it's named bpf_list_del but it's actually distinct from how
> > list_del in the kernel works. So it does make sense to give them a different
> > name (like pop_front/pop_back and push_front/push_back)?
> >
> > Because even bpf_list_add takes bpf_list_head, in the kernel there's no
> > distinction between node and head, so you can do list_add on a node as well, but
> > it won't be possible with the kfunc (unless we overload the head argument to
> > also work with nodes).
> >
> > Later we'll probably have to add bpf_list_node_add etc. that add before or after
> > a node to make that work.
> >
> > The main question is whether it should closely resemble the linked list API in
> > the kernel, or can it steer away considerably from that?
>
> If we do doubly linked list we should allow delete in
> the middle with
> bpf_list_del_any(head, node)
>
> and
> bpf_list_pop_front/pop_back(head)
>
> bpf_list_add(node, head) would match kernel style,
> but I think it's cleaner to have head as 1st arg.
> In that sense new pop/push/_front/_back are cleaner.
> And similar for rbtree.
>
> If we keep (node, head) and (rb_node, rb_root) order
> we should keep kernel names.

There's also tradeoffs in how various operations are done.

Right now we have bpf_list_del and bpf_list_del_tail doing pop_front and
pop_back.

To replicate the same in a kernel-style API, you would do:

struct foo *f = bpf_list_first_entry_or_null(head);
if (!f) {}
bpf_list_del(&f->node);

Between those two calls, you might do bpf_list_last_entry -> bpf_list_del; the
verifier is going to have a hard time proving the aliasing of the two nodes, so
it will have to invalidate all pointers peeking into the list whenever something
modifies it. You would still be able to access the memory but cannot pass it to
list ops anymore.

But I think we will pay this cost eventually anyway, people will add a peek
operation and such analysis would have to be done then. So I think it's unlikely
we can avoid this, and it might be better to make things more consistent and end
up mirroring the kernel list APIs.

WDYT?
Alexei Starovoitov Oct. 20, 2022, 1:11 a.m. UTC | #5
On Wed, Oct 19, 2022 at 5:44 PM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> On Wed, Oct 19, 2022 at 10:01:21PM IST, Alexei Starovoitov wrote:
> > On Tue, Oct 18, 2022 at 10:58 PM Kumar Kartikeya Dwivedi
> > <memxor@gmail.com> wrote:
> > >
> > > On Wed, Oct 19, 2022 at 08:01:24AM IST, Alexei Starovoitov wrote:
> > > > On Thu, Oct 13, 2022 at 11:52:57AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > > > +void *bpf_kptr_new_impl(u64 local_type_id__k, u64 flags, void *meta__ign)
> > > > > +{
> > > > > +   struct btf_struct_meta *meta = meta__ign;
> > > > > +   u64 size = local_type_id__k;
> > > > > +   void *p;
> > > > > +
> > > > > +   if (unlikely(flags || !bpf_global_ma_set))
> > > > > +           return NULL;
> > > >
> > > > Unused 'flags' looks weird in unstable api. Just drop it?
> > > > And keep it as:
> > > > void *bpf_kptr_new(u64 local_type_id__k, struct btf_struct_meta *meta__ign);
> > > >
> > > > and in bpf_experimental.h:
> > > >
> > > > extern void *bpf_kptr_new(__u64 local_type_id) __ksym;
> > > >
> > > > since __ign args are ignored during kfunc type match
> > > > the bpf progs can use it without #define.
> > > >
> > >
> > > It's ignored during check_kfunc_call, but libbpf doesn't ignore that. The
> > > prototypes will not be the same. I guess I'll have to teach it to do that during
> > > type match, but IDK how you feel about that.
> >
> > libbpf does the full type match, really?
> > Could you point me to the code?
> >
>
> Not full type match, but the number of arguments must be same, so it won't allow
> having kfunc as:
>
> void *bpf_kptr_new(u64 local_type_id__k, struct btf_struct_meta *meta__ign);
>
> in the kernel and ksym declaration in the program as:
>
> extern void *bpf_kptr_new(__u64 local_type_id) __ksym;
>
> I get:
>
> libbpf: extern (func ksym) 'bpf_kptr_new_impl': func_proto [25] incompatible with kernel [60043]
>
> vlen of func_proto in kernel type is 2, for us it will be 1.

Ahh. Found it.
__bpf_core_types_are_compat() runs in both kernel and libbpf.
We could hack it for this case, but let's not complicate it.

> > > Otherwise unless you want people to manually pass something to the ignored
> > > argument, we have to hide it behind a macro.
> > >
> > > I actually like the macro on top, then I don't even pass the type ID but the
> > > type. But that's a personal preference, and I don't feel strongly about it.
> > >
> > > So in C one does malloc(sizeof(*p)), here we'll just write
> > > bpf_kptr_new(typeof(*p)). YMMV.
> >
> > bpf_kptr_new(typeof(*p)) is cleaner.
> >
>
> So if we're having a macro anyway, the thing above can be hidden behind it.

Indeed. Since #define is there let's not mess with
__bpf_core_types_are_compat().

> > > > > +   p = bpf_mem_alloc(&bpf_global_ma, size);
> > > > > +   if (!p)
> > > > > +           return NULL;
> > > > > +   if (meta)
> > > > > +           bpf_obj_init(meta->off_arr, p);
> > > >
> > > > I'm starting to dislike all that _arr and _tab suffixes in the verifier code base.
> > > > It reminds me of programming style where people tried to add types into
> > > > variable names. imo dropping _arr would be just fine.
> > >
> > > Ack, I'll do it in v3.
> > >
> > > Also, I'd like to invite people to please bikeshed a bit over the naming of the
> > > APIs, e.g. whether it should be bpf_kptr_drop vs bpf_kptr_delete.
> >
> > bpf_kptr_drop is more precise.
> > delete assumes instant free which is not the case here.
> >
> > How about
> > extern void *__bpf_obj_new(__u64 local_type_id) __ksym;
> > extern void bpf_obj_drop(void *obj) __ksym;
> > #define bpf_obj_new(t) \
> >  (t *)__bpf_obj_new(bpf_core_type_id_local(t));
> >
> > kptr means 'kernel pointer'.
> > Here we have program supplied object.
> > It feels 'obj' is better than 'kptr' in this context.
> >
>
> I agree, I'll rename it to bpf_obj_*.
>
> Also, that __bpf_obj_new doesn't work yet in clang [0] but I think we can rename
> it to that once clang is fixed.

argh. ok. let's keep bpf_obj_new_impl() for now.

>
>  [0]: https://reviews.llvm.org/D136041
>
> > > In the BPF list API, it's named bpf_list_del but it's actually distinct from how
> > > list_del in the kernel works. So it does make sense to give them a different
> > > name (like pop_front/pop_back and push_front/push_back)?
> > >
> > > Because even bpf_list_add takes bpf_list_head, in the kernel there's no
> > > distinction between node and head, so you can do list_add on a node as well, but
> > > it won't be possible with the kfunc (unless we overload the head argument to
> > > also work with nodes).
> > >
> > > Later we'll probably have to add bpf_list_node_add etc. that add before or after
> > > a node to make that work.
> > >
> > > The main question is whether it should closely resemble the linked list API in
> > > the kernel, or can it steer away considerably from that?
> >
> > If we do doubly linked list we should allow delete in
> > the middle with
> > bpf_list_del_any(head, node)
> >
> > and
> > bpf_list_pop_front/pop_back(head)
> >
> > bpf_list_add(node, head) would match kernel style,
> > but I think it's cleaner to have head as 1st arg.
> > In that sense new pop/push/_front/_back are cleaner.
> > And similar for rbtree.
> >
> > If we keep (node, head) and (rb_node, rb_root) order
> > we should keep kernel names.
>
> There's also tradeoffs in how various operations are done.
>
> Right now we have bpf_list_del and bpf_list_del_tail doing pop_front and
> pop_back.
>
> To replicate the same in kernel style API, you would do:
>
> struct foo *f = bpf_list_first_entry_or_null(head);
> if (!f) {}
> bpf_list_del(&f->node);
>
> Between those two calls, you might do bpf_list_last_entry -> bpf_list_del, the
> verifier is going to have a hard time proving the aliasing of the two nodes, so
> it will have to invalidate all pointers peeking into the list whenever something
> modifies it. You would still be able to access memory but cannot pass it to list
> ops anymore.

yeah. ugly.

>
> But I think we will pay this cost eventually anyway, people will add a peek
> operation and such analysis would have to be done then. So I think it's unlikely
> we can avoid this, and it might be better to make things more consistent and end
> up mirroring the kernel list APIs.

Let's keep the verifier simpler for now.
We can go with pop/push/_front/_back, since they are simpler
and later add kernel-style accessors as well with swapped
node/head order.
One doesn't preclude the other.

Patch

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 7ffafa5bb866..29fccf7c8505 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -55,6 +55,8 @@  struct cgroup;
 extern struct idr btf_idr;
 extern spinlock_t btf_idr_lock;
 extern struct kobject *btf_kobj;
+extern struct bpf_mem_alloc bpf_global_ma;
+extern bool bpf_global_ma_set;
 
 typedef u64 (*bpf_callback_t)(u64, u64, u64, u64, u64);
 typedef int (*bpf_iter_init_seq_priv_t)(void *private_data,
@@ -335,16 +337,19 @@  static inline bool btf_type_fields_has_field(const struct btf_type_fields *tab,
 	return tab->field_mask & type;
 }
 
-static inline void check_and_init_map_value(struct bpf_map *map, void *dst)
+static inline void bpf_obj_init(const struct btf_type_fields_off *off_arr, void *obj)
 {
-	if (!IS_ERR_OR_NULL(map->fields_tab)) {
-		struct btf_field *fields = map->fields_tab->fields;
-		u32 cnt = map->fields_tab->cnt;
-		int i;
+	int i;
 
-		for (i = 0; i < cnt; i++)
-			memset(dst + fields[i].offset, 0, btf_field_type_size(fields[i].type));
-	}
+	if (!off_arr)
+		return;
+	for (i = 0; i < off_arr->cnt; i++)
+		memset(obj + off_arr->field_off[i], 0, off_arr->field_sz[i]);
+}
+
+static inline void check_and_init_map_value(struct bpf_map *map, void *dst)
+{
+	bpf_obj_init(map->off_arr, dst);
 }
 
 /* memcpy that is used with 8-byte aligned pointers, power-of-8 size and
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 8b09c3f82071..0cc4679f3f42 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -426,6 +426,8 @@  struct bpf_insn_aux_data {
 		 */
 		struct bpf_loop_inline_state loop_inline_state;
 	};
+	u64 kptr_new_size; /* remember the size of type passed to bpf_kptr_new to rewrite R1 */
+	struct btf_struct_meta *kptr_struct_meta;
 	u64 map_key_state; /* constant (32 bit) key tracking for maps */
 	int ctx_field_size; /* the ctx field size for load insn, maybe 0 */
 	u32 seen; /* this insn was processed by the verifier at env->pass_cnt */
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 711fd293b6de..a8b3263a9a45 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -34,6 +34,7 @@ 
 #include <linux/log2.h>
 #include <linux/bpf_verifier.h>
 #include <linux/nodemask.h>
+#include <linux/bpf_mem_alloc.h>
 
 #include <asm/barrier.h>
 #include <asm/unaligned.h>
@@ -60,6 +61,9 @@ 
 #define CTX	regs[BPF_REG_CTX]
 #define IMM	insn->imm
 
+struct bpf_mem_alloc bpf_global_ma;
+bool bpf_global_ma_set;
+
 /* No hurry in this branch
  *
  * Exported for the bpf jit load helper.
@@ -2740,6 +2744,16 @@  int __weak bpf_arch_text_invalidate(void *dst, size_t len)
 	return -ENOTSUPP;
 }
 
+static int __init bpf_global_ma_init(void)
+{
+	int ret;
+
+	ret = bpf_mem_alloc_init(&bpf_global_ma, 0, false);
+	bpf_global_ma_set = !ret;
+	return ret;
+}
+late_initcall(bpf_global_ma_init);
+
 DEFINE_STATIC_KEY_FALSE(bpf_stats_enabled_key);
 EXPORT_SYMBOL(bpf_stats_enabled_key);
 
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 238103dc6c5e..954e0bf18269 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -19,6 +19,7 @@ 
 #include <linux/proc_ns.h>
 #include <linux/security.h>
 #include <linux/btf_ids.h>
+#include <linux/bpf_mem_alloc.h>
 
 #include "../../lib/kstrtox.h"
 
@@ -1725,8 +1726,11 @@  void bpf_list_head_free(const struct btf_field *field, void *list_head,
 
 		obj -= field->list_head.node_offset;
 		head = head->next;
-		/* TODO: Rework later */
-		kfree(obj);
+		/* The contained type can also have resources, including a
+		 * bpf_list_head which needs to be freed.
+		 */
+		bpf_obj_free_fields(field->list_head.value_tab, obj);
+		bpf_mem_free(&bpf_global_ma, obj);
 	}
 unlock:
 	INIT_LIST_HEAD(head);
@@ -1734,20 +1738,43 @@  void bpf_list_head_free(const struct btf_field *field, void *list_head,
 	local_irq_restore(flags);
 }
 
-BTF_SET8_START(tracing_btf_ids)
+__diag_push();
+__diag_ignore_all("-Wmissing-prototypes",
+		  "Global functions as their definitions will be in vmlinux BTF");
+
+void *bpf_kptr_new_impl(u64 local_type_id__k, u64 flags, void *meta__ign)
+{
+	struct btf_struct_meta *meta = meta__ign;
+	u64 size = local_type_id__k;
+	void *p;
+
+	if (unlikely(flags || !bpf_global_ma_set))
+		return NULL;
+	p = bpf_mem_alloc(&bpf_global_ma, size);
+	if (!p)
+		return NULL;
+	if (meta)
+		bpf_obj_init(meta->off_arr, p);
+	return p;
+}
+
+__diag_pop();
+
+BTF_SET8_START(generic_btf_ids)
 #ifdef CONFIG_KEXEC_CORE
 BTF_ID_FLAGS(func, crash_kexec, KF_DESTRUCTIVE)
 #endif
-BTF_SET8_END(tracing_btf_ids)
+BTF_ID_FLAGS(func, bpf_kptr_new_impl, KF_ACQUIRE | KF_RET_NULL)
+BTF_SET8_END(generic_btf_ids)
 
-static const struct btf_kfunc_id_set tracing_kfunc_set = {
+static const struct btf_kfunc_id_set generic_kfunc_set = {
 	.owner = THIS_MODULE,
-	.set   = &tracing_btf_ids,
+	.set   = &generic_btf_ids,
 };
 
 static int __init kfunc_init(void)
 {
-	return register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &tracing_kfunc_set);
+	return register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &generic_kfunc_set);
 }
 
 late_initcall(kfunc_init);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index b324c1042fb8..9cc01535e391 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7766,6 +7766,11 @@  static bool is_kfunc_arg_sfx_constant(const struct btf *btf, const struct btf_pa
 	return __kfunc_param_match_suffix(btf, arg, "__k");
 }
 
+static bool is_kfunc_arg_sfx_ignore(const struct btf *btf, const struct btf_param *arg)
+{
+	return __kfunc_param_match_suffix(btf, arg, "__ign");
+}
+
 static bool is_kfunc_arg_ret_buf_size(const struct btf *btf,
 				      const struct btf_param *arg,
 				      const struct bpf_reg_state *reg,
@@ -8035,6 +8040,10 @@  static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
 		int kf_arg_type;
 
 		t = btf_type_skip_modifiers(btf, args[i].type, NULL);
+
+		if (is_kfunc_arg_sfx_ignore(btf, &args[i]))
+			continue;
+
 		if (btf_type_is_scalar(t)) {
 			if (reg->type != SCALAR_VALUE) {
 				verbose(env, "R%d is not a scalar\n", regno);
@@ -8212,6 +8221,17 @@  static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
 	return 0;
 }
 
+enum special_kfunc_type {
+	KF_bpf_kptr_new_impl,
+};
+
+BTF_SET_START(special_kfunc_set)
+BTF_ID(func, bpf_kptr_new_impl)
+BTF_SET_END(special_kfunc_set)
+
+BTF_ID_LIST(special_kfunc_list)
+BTF_ID(func, bpf_kptr_new_impl)
+
 static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			    int *insn_idx_p)
 {
@@ -8286,17 +8306,64 @@  static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 	t = btf_type_skip_modifiers(desc_btf, func_proto->type, NULL);
 
 	if (is_kfunc_acquire(&meta) && !btf_type_is_struct_ptr(meta.btf, t)) {
-		verbose(env, "acquire kernel function does not return PTR_TO_BTF_ID\n");
-		return -EINVAL;
+		/* Only exception is bpf_kptr_new_impl */
+		if (meta.btf != btf_vmlinux || meta.func_id != special_kfunc_list[KF_bpf_kptr_new_impl]) {
+			verbose(env, "acquire kernel function does not return PTR_TO_BTF_ID\n");
+			return -EINVAL;
+		}
 	}
 
 	if (btf_type_is_scalar(t)) {
 		mark_reg_unknown(env, regs, BPF_REG_0);
 		mark_btf_func_reg_size(env, BPF_REG_0, t->size);
 	} else if (btf_type_is_ptr(t)) {
-		ptr_type = btf_type_skip_modifiers(desc_btf, t->type,
-						   &ptr_type_id);
-		if (!btf_type_is_struct(ptr_type)) {
+		ptr_type = btf_type_skip_modifiers(desc_btf, t->type, &ptr_type_id);
+
+		if (meta.btf == btf_vmlinux && btf_id_set_contains(&special_kfunc_set, meta.func_id)) {
+			if (!btf_type_is_void(ptr_type)) {
+				verbose(env, "kernel function %s must have void * return type\n",
+					meta.func_name);
+				return -EINVAL;
+			}
+			if (meta.func_id == special_kfunc_list[KF_bpf_kptr_new_impl]) {
+				const struct btf_type *ret_t;
+				struct btf *ret_btf;
+				u32 ret_btf_id;
+
+				if (((u64)(u32)meta.arg_constant.value) != meta.arg_constant.value) {
+					verbose(env, "local type ID argument must be in range [0, U32_MAX]\n");
+					return -EINVAL;
+				}
+
+				ret_btf = env->prog->aux->btf;
+				ret_btf_id = meta.arg_constant.value;
+
+				/* This may be NULL due to user not supplying a BTF */
+				if (!ret_btf) {
+					verbose(env, "bpf_kptr_new requires prog BTF\n");
+					return -EINVAL;
+				}
+
+				ret_t = btf_type_by_id(ret_btf, ret_btf_id);
+				if (!ret_t || !__btf_type_is_struct(ret_t)) {
+					verbose(env, "bpf_kptr_new type ID argument must be of a struct\n");
+					return -EINVAL;
+				}
+
+				mark_reg_known_zero(env, regs, BPF_REG_0);
+				regs[BPF_REG_0].type = PTR_TO_BTF_ID | MEM_TYPE_LOCAL;
+				regs[BPF_REG_0].btf = ret_btf;
+				regs[BPF_REG_0].btf_id = ret_btf_id;
+
+				env->insn_aux_data[insn_idx].kptr_new_size = ret_t->size;
+				env->insn_aux_data[insn_idx].kptr_struct_meta =
+					btf_find_struct_meta(ret_btf, ret_btf_id);
+			} else {
+				verbose(env, "kernel function %s unhandled dynamic return type\n",
+					meta.func_name);
+				return -EFAULT;
+			}
+		} else if (!__btf_type_is_struct(ptr_type)) {
 			if (!meta.r0_size) {
 				ptr_type_name = btf_name_by_offset(desc_btf,
 								   ptr_type->name_off);
@@ -14455,8 +14523,8 @@  static int fixup_call_args(struct bpf_verifier_env *env)
 	return err;
 }
 
-static int fixup_kfunc_call(struct bpf_verifier_env *env,
-			    struct bpf_insn *insn)
+static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
+			    struct bpf_insn *insn_buf, int insn_idx, int *cnt)
 {
 	const struct bpf_kfunc_desc *desc;
 
@@ -14475,8 +14543,21 @@  static int fixup_kfunc_call(struct bpf_verifier_env *env,
 		return -EFAULT;
 	}
 
+	*cnt = 0;
 	insn->imm = desc->imm;
+	if (insn->off)
+		return 0;
+	if (desc->func_id == special_kfunc_list[KF_bpf_kptr_new_impl]) {
+		struct btf_struct_meta *kptr_struct_meta = env->insn_aux_data[insn_idx].kptr_struct_meta;
+		struct bpf_insn addr[2] = { BPF_LD_IMM64(BPF_REG_3, (long)kptr_struct_meta) };
+		u64 kptr_new_size = env->insn_aux_data[insn_idx].kptr_new_size;
 
+		insn_buf[0] = BPF_MOV64_IMM(BPF_REG_1, kptr_new_size);
+		insn_buf[1] = addr[0];
+		insn_buf[2] = addr[1];
+		insn_buf[3] = *insn;
+		*cnt = 4;
+	}
 	return 0;
 }
 
@@ -14618,9 +14699,19 @@  static int do_misc_fixups(struct bpf_verifier_env *env)
 		if (insn->src_reg == BPF_PSEUDO_CALL)
 			continue;
 		if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
-			ret = fixup_kfunc_call(env, insn);
+			ret = fixup_kfunc_call(env, insn, insn_buf, i + delta, &cnt);
 			if (ret)
 				return ret;
+			if (cnt == 0)
+				continue;
+
+			new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
+			if (!new_prog)
+				return -ENOMEM;
+
+			delta	 += cnt - 1;
+			env->prog = prog = new_prog;
+			insn	  = new_prog->insnsi + i + delta;
 			continue;
 		}
 
diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h
index 4e31790e433d..9c7d0badb02e 100644
--- a/tools/testing/selftests/bpf/bpf_experimental.h
+++ b/tools/testing/selftests/bpf/bpf_experimental.h
@@ -20,4 +20,23 @@  struct bpf_list_node {
 #endif
 
 #ifndef __KERNEL__
+
+/* Description
+ *	Allocates a local kptr of the type represented by 'local_type_id' in
+ *	program BTF. Use the bpf_core_type_id_local macro to obtain the type
+ *	ID of a struct defined in program BTF.
+ *
+ *	The 'local_type_id' parameter must be a known constant. The 'flags'
+ *	parameter must be 0.
+ *
+ *	The 'meta__ign' parameter is a hidden argument that is ignored.
+ * Returns
+ *	A local kptr of the type corresponding to 'local_type_id', or NULL on
+ *	allocation failure.
+ */
+extern void *bpf_kptr_new_impl(__u64 local_type_id, __u64 flags, void *meta__ign) __ksym;
+
+/* Convenience macro to wrap over bpf_kptr_new_impl */
+#define bpf_kptr_new(type) bpf_kptr_new_impl(bpf_core_type_id_local(type), 0, NULL)
+
 #endif