From patchwork Thu Oct 10 17:55:57 2024
X-Patchwork-Submitter: Yonghong Song
X-Patchwork-Id: 13830609
X-Patchwork-Delegate: bpf@iogearbox.net
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team@fb.com, Martin KaFai Lau, Tejun Heo
Subject: [PATCH bpf-next v4 01/10] bpf: Allow each subprog having stack size of 512 bytes
Date: Thu, 10 Oct 2024 10:55:57 -0700
Message-ID: <20241010175557.1896301-1-yonghong.song@linux.dev>
In-Reply-To: <20241010175552.1895980-1-yonghong.song@linux.dev>
References: <20241010175552.1895980-1-yonghong.song@linux.dev>

With private stack support, each subprog can have a stack of up to 512
bytes. The 512-byte per-subprog limit is kept to avoid increasing
verifier complexity: going beyond 512 bytes would require big verifier
changes and would increase memory consumption and verification time.

If private stack is supported, then for a bpf prog, especially one with
subprogs, a private stack will be allocated for the main prog and for
each callback subprog. For example,

    main_prog
      subprog1
        calling helper
          subprog10 (callback func)
            subprog11
      subprog2
        calling helper
          subprog10 (callback func)
            subprog11

Separate private stack allocations for main_prog and the callback_fn
subprog10 make things easier, since the helper function itself uses the
kernel stack.

Additional subprog info is also collected so that a private stack can
later be allocated for the main prog and for each callback function.
Note that if any tail_call is used in the prog (including in any of its
subprogs), the private stack is not used.
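In essence, the extra check_max_stack_depth() pass added below does
per-subtree accounting along every call chain. A simplified sketch of
that accounting (names follow the verifier diff below; this is not the
verbatim kernel code):

    /* walking one call chain of the subtree rooted at orig_idx */
    subprog_stack_depth = round_up_stack_depth(env, subprog[idx].stack_depth);
    depth += subprog_stack_depth;
    if (subprog_stack_depth > MAX_BPF_STACK)      /* each subprog still capped at 512 */
            return -EACCES;
    if (depth >= BPF_PRIV_STACK_MIN_SUBTREE_SIZE) /* 128: tiny subtrees keep the kernel stack */
            subprog[orig_idx].priv_stack_eligible = true;
    subprog[orig_idx].subtree_stack_depth =
            max_t(u16, subprog[orig_idx].subtree_stack_depth, depth);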
Signed-off-by: Yonghong Song
---
 include/linux/bpf.h          |  1 +
 include/linux/bpf_verifier.h |  3 ++
 include/linux/filter.h       |  1 +
 kernel/bpf/core.c            |  5 ++
 kernel/bpf/verifier.c        | 94 +++++++++++++++++++++++++++++++-----
 5 files changed, 91 insertions(+), 13 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 19d8ca8ac960..9ef9133e0470 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1483,6 +1483,7 @@ struct bpf_prog_aux {
 	bool xdp_has_frags;
 	bool exception_cb;
 	bool exception_boundary;
+	bool priv_stack_eligible;
 	struct bpf_arena *arena;
 	/* BTF_KIND_FUNC_PROTO for valid attach_btf_id */
 	const struct btf_type *attach_func_proto;
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 4513372c5bc8..bcfe868e3801 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -659,6 +659,8 @@ struct bpf_subprog_info {
 	 * are used for bpf_fastcall spills and fills.
 	 */
 	s16 fastcall_stack_off;
+	u16 subtree_stack_depth;
+	u16 subtree_top_idx;
 	bool has_tail_call: 1;
 	bool tail_call_reachable: 1;
 	bool has_ld_abs: 1;
@@ -668,6 +670,7 @@ struct bpf_subprog_info {
 	bool args_cached: 1;
 	/* true if bpf_fastcall stack region is used by functions that can't be inlined */
 	bool keep_fastcall_stack: 1;
+	bool priv_stack_eligible: 1;

 	u8 arg_cnt;
 	struct bpf_subprog_arg_info args[MAX_BPF_FUNC_REG_ARGS];
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 7d7578a8eac1..3a21947f2fd4 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -1119,6 +1119,7 @@ bool bpf_jit_supports_exceptions(void);
 bool bpf_jit_supports_ptr_xchg(void);
 bool bpf_jit_supports_arena(void);
 bool bpf_jit_supports_insn(struct bpf_insn *insn, bool in_arena);
+bool bpf_jit_supports_private_stack(void);
 u64 bpf_arch_uaddress_limit(void);
 void arch_bpf_stack_walk(bool (*consume_fn)(void *cookie, u64 ip, u64 sp, u64 bp), void *cookie);
 bool bpf_helper_changes_pkt_data(void *func);
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 5e77c58e0601..ba088b58746f 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -3044,6 +3044,11 @@ bool __weak bpf_jit_supports_exceptions(void)
 	return false;
 }

+bool __weak bpf_jit_supports_private_stack(void)
+{
+	return false;
+}
+
 void __weak arch_bpf_stack_walk(bool (*consume_fn)(void *cookie, u64 ip, u64 sp, u64 bp), void *cookie)
 {
 }
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 7d9b38ffd220..3972606f97d2 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -194,6 +194,8 @@ struct bpf_verifier_stack_elem {

 #define BPF_GLOBAL_PERCPU_MA_MAX_SIZE  512

+#define BPF_PRIV_STACK_MIN_SUBTREE_SIZE  128
+
 static int acquire_reference_state(struct bpf_verifier_env *env, int insn_idx);
 static int release_reference(struct bpf_verifier_env *env, int ref_obj_id);
 static void invalidate_non_owning_refs(struct bpf_verifier_env *env);
@@ -5982,6 +5984,41 @@ static int check_ptr_alignment(struct bpf_verifier_env *env,
 			       strict);
 }

+static bool bpf_enable_private_stack(struct bpf_prog *prog)
+{
+	if (!bpf_jit_supports_private_stack())
+		return false;
+
+	switch (prog->aux->prog->type) {
+	case BPF_PROG_TYPE_KPROBE:
+	case BPF_PROG_TYPE_TRACEPOINT:
+	case BPF_PROG_TYPE_PERF_EVENT:
+	case BPF_PROG_TYPE_RAW_TRACEPOINT:
+		return true;
+	case BPF_PROG_TYPE_TRACING:
+		if (prog->expected_attach_type != BPF_TRACE_ITER)
+			return true;
+		fallthrough;
+	default:
+		return false;
+	}
+}
+
+static bool is_priv_stack_supported(struct bpf_verifier_env *env)
+{
+	struct bpf_subprog_info *si = env->subprog_info;
+	bool has_tail_call = false;
+
+	for (int i = 0; i < env->subprog_cnt; i++) {
+		if (si[i].has_tail_call) {
+			has_tail_call = true;
+			break;
+		}
+	}
+
+	return !has_tail_call && bpf_enable_private_stack(env->prog);
+}
+
 static int round_up_stack_depth(struct bpf_verifier_env *env, int stack_depth)
 {
 	if (env->prog->jit_requested)
@@ -5999,16 +6036,21 @@ static int round_up_stack_depth(struct bpf_verifier_env *env, int stack_depth)
  * Since recursion is prevented by check_cfg() this algorithm
  * only needs a local stack of MAX_CALL_FRAMES to remember callsites
  */
-static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx)
+static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx,
+					 bool check_priv_stack, bool priv_stack_supported)
 {
 	struct bpf_subprog_info *subprog = env->subprog_info;
 	struct bpf_insn *insn = env->prog->insnsi;
 	int depth = 0, frame = 0, i, subprog_end;
 	bool tail_call_reachable = false;
+	bool priv_stack_eligible = false;
 	int ret_insn[MAX_CALL_FRAMES];
 	int ret_prog[MAX_CALL_FRAMES];
-	int j;
+	int j, subprog_stack_depth;
+	int orig_idx = idx;

+	if (check_priv_stack)
+		subprog[idx].subtree_top_idx = idx;
 	i = subprog[idx].start;
 process_func:
 	/* protect against potential stack overflow that might happen when
@@ -6030,18 +6072,33 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx)
 	 * tailcall will unwind the current stack frame but it will not get rid
 	 * of caller's stack as shown on the example above.
 	 */
-	if (idx && subprog[idx].has_tail_call && depth >= 256) {
+	if (!check_priv_stack && idx && subprog[idx].has_tail_call && depth >= 256) {
 		verbose(env,
 			"tail_calls are not allowed when call stack of previous frames is %d bytes. Too large\n",
 			depth);
 		return -EACCES;
 	}
-	depth += round_up_stack_depth(env, subprog[idx].stack_depth);
-	if (depth > MAX_BPF_STACK) {
+	subprog_stack_depth = round_up_stack_depth(env, subprog[idx].stack_depth);
+	depth += subprog_stack_depth;
+	if (!check_priv_stack && !priv_stack_supported && depth > MAX_BPF_STACK) {
 		verbose(env, "combined stack size of %d calls is %d. Too large\n",
 			frame + 1, depth);
 		return -EACCES;
 	}
+	if (check_priv_stack) {
+		if (subprog_stack_depth > MAX_BPF_STACK) {
+			verbose(env, "stack size of subprog %d is %d. Too large\n",
				idx, subprog_stack_depth);
+			return -EACCES;
+		}
+
+		if (!priv_stack_eligible && depth >= BPF_PRIV_STACK_MIN_SUBTREE_SIZE) {
+			subprog[orig_idx].priv_stack_eligible = true;
+			env->prog->aux->priv_stack_eligible = priv_stack_eligible = true;
+		}
+		subprog[orig_idx].subtree_stack_depth =
+			max_t(u16, subprog[orig_idx].subtree_stack_depth, depth);
+	}
 continue_func:
 	subprog_end = subprog[idx + 1].start;
 	for (; i < subprog_end; i++) {
@@ -6097,8 +6154,10 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx)
 		}
 		i = next_insn;
 		idx = sidx;
+		if (check_priv_stack)
+			subprog[idx].subtree_top_idx = orig_idx;

-		if (subprog[idx].has_tail_call)
+		if (!check_priv_stack && subprog[idx].has_tail_call)
 			tail_call_reachable = true;

 		frame++;
@@ -6122,7 +6181,7 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx)
 		}
 		subprog[ret_prog[j]].tail_call_reachable = true;
 	}
-	if (subprog[0].tail_call_reachable)
+	if (!check_priv_stack && subprog[0].tail_call_reachable)
 		env->prog->aux->tail_call_reachable = true;

 	/* end of for() loop means the last insn of the 'subprog'
@@ -6137,14 +6196,18 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx)
 	goto continue_func;
 }

-static int check_max_stack_depth(struct bpf_verifier_env *env)
+static int check_max_stack_depth(struct bpf_verifier_env *env, bool check_priv_stack,
+				 bool priv_stack_supported)
 {
 	struct bpf_subprog_info *si = env->subprog_info;
+	bool check_subprog;
 	int ret;

 	for (int i = 0; i < env->subprog_cnt; i++) {
-		if (!i || si[i].is_async_cb) {
-			ret = check_max_stack_depth_subprog(env, i);
+		check_subprog = !i || (check_priv_stack ? si[i].is_cb : si[i].is_async_cb);
+		if (check_subprog) {
+			ret = check_max_stack_depth_subprog(env, i, check_priv_stack,
+							    priv_stack_supported);
 			if (ret < 0)
 				return ret;
 		}
@@ -22298,7 +22361,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 	struct bpf_verifier_env *env;
 	int i, len, ret = -EINVAL, err;
 	u32 log_true_size;
-	bool is_priv;
+	bool is_priv, priv_stack_supported = false;

 	/* no program is valid */
 	if (ARRAY_SIZE(bpf_verifier_ops) == 0)
@@ -22425,8 +22488,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 	if (ret == 0)
 		ret = remove_fastcall_spills_fills(env);

-	if (ret == 0)
-		ret = check_max_stack_depth(env);
+	if (ret == 0) {
+		priv_stack_supported = is_priv_stack_supported(env);
+		ret = check_max_stack_depth(env, false, priv_stack_supported);
+	}

 	/* instruction rewrites happen after this point */
 	if (ret == 0)
@@ -22460,6 +22525,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 			: false;
 	}

+	if (ret == 0 && priv_stack_supported)
+		ret = check_max_stack_depth(env, true, true);
+
 	if (ret == 0)
 		ret = fixup_call_args(env);

From patchwork Thu Oct 10 17:56:02 2024
X-Patchwork-Submitter: Yonghong Song
X-Patchwork-Id: 13830610
X-Patchwork-Delegate: bpf@iogearbox.net
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team@fb.com, Martin KaFai Lau, Tejun Heo
Subject: [PATCH bpf-next v4 02/10] bpf: Mark each subprog with proper private stack modes
Date: Thu, 10 Oct 2024 10:56:02 -0700
Message-ID: <20241010175602.1896674-1-yonghong.song@linux.dev>
In-Reply-To: <20241010175552.1895980-1-yonghong.song@linux.dev>
References: <20241010175552.1895980-1-yonghong.song@linux.dev>

Three private stack modes are used to direct the jit action:

    NO_PRIV_STACK:        do not use the private stack
    PRIV_STACK_SUB_PROG:  adjust the frame pointer address (similar to
                          the normal stack)
    PRIV_STACK_ROOT_PROG: set the frame pointer

Note that for a subtree root prog (the main prog or a callback fn),
PRIV_STACK_ROOT_PROG mode is still used even if the bpf_prog stack size
is 0. This is needed for bpf exception handling. More details can be
found in the subsequent jit support and selftest patches.
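Roughly, a JIT prologue is expected to act on the three modes as in the
sketch below. The emit_* helper names here are hypothetical placeholders;
the real x86 implementation arrives in patch 5:

    /* hypothetical sketch only, not the actual x86 JIT code */
    switch (bpf_prog->aux->priv_stack_mode) {
    case NO_PRIV_STACK:
            /* keep using the normal kernel stack */
            break;
    case PRIV_STACK_ROOT_PROG:
            /* point the frame pointer reg at this CPU's private stack
             * top; done even when stack_depth == 0, so the exception
             * handler can rely on the reg being valid
             */
            emit_set_priv_frame_ptr(&prog, bpf_prog->aux->priv_stack_ptr,
                                    bpf_prog->aux->stack_depth);
            break;
    case PRIV_STACK_SUB_PROG:
            /* inherit the caller's private frame pointer, then bump it
             * by this subprog's own (rounded) stack depth
             */
            emit_adjust_priv_frame_ptr(&prog, round_up(bpf_prog->aux->stack_depth, 8));
            break;
    }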
Signed-off-by: Yonghong Song
---
 include/linux/bpf.h   |  9 +++++++++
 kernel/bpf/core.c     | 19 +++++++++++++++++++
 kernel/bpf/verifier.c | 29 +++++++++++++++++++++++++++++
 3 files changed, 57 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 9ef9133e0470..f22ddb423fd0 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1450,6 +1450,12 @@ struct btf_mod_pair {

 struct bpf_kfunc_desc_tab;

+enum bpf_priv_stack_mode {
+	NO_PRIV_STACK,
+	PRIV_STACK_SUB_PROG,
+	PRIV_STACK_ROOT_PROG,
+};
+
 struct bpf_prog_aux {
 	atomic64_t refcnt;
 	u32 used_map_cnt;
@@ -1466,6 +1472,9 @@ struct bpf_prog_aux {
 	u32 ctx_arg_info_size;
 	u32 max_rdonly_access;
 	u32 max_rdwr_access;
+	enum bpf_priv_stack_mode priv_stack_mode;
+	u16 subtree_stack_depth; /* Subtree stack depth if PRIV_STACK_ROOT_PROG, 0 otherwise */
+	void __percpu *priv_stack_ptr;
 	struct btf *attach_btf;
 	const struct bpf_ctx_arg_aux *ctx_arg_info;
 	struct mutex dst_mutex; /* protects dst_* pointers below, *after* prog becomes visible */
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index ba088b58746f..f79d951a061f 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1239,6 +1239,7 @@ void __weak bpf_jit_free(struct bpf_prog *fp)
 		struct bpf_binary_header *hdr = bpf_jit_binary_hdr(fp);

 		bpf_jit_binary_free(hdr);
+		free_percpu(fp->aux->priv_stack_ptr);
 		WARN_ON_ONCE(!bpf_prog_kallsyms_verify_off(fp));
 	}

@@ -2420,6 +2421,24 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
 	if (*err)
 		return fp;

+	if (fp->aux->priv_stack_eligible) {
+		if (!fp->aux->stack_depth) {
+			fp->aux->priv_stack_mode = NO_PRIV_STACK;
+		} else {
+			void __percpu *priv_stack_ptr;
+
+			fp->aux->priv_stack_mode = PRIV_STACK_ROOT_PROG;
+			priv_stack_ptr =
+				__alloc_percpu_gfp(fp->aux->stack_depth, 8, GFP_KERNEL);
+			if (!priv_stack_ptr) {
+				*err = -ENOMEM;
+				return fp;
+			}
+			fp->aux->subtree_stack_depth = fp->aux->stack_depth;
+			fp->aux->priv_stack_ptr = priv_stack_ptr;
+		}
+	}
+
 	fp = bpf_int_jit_compile(fp);
 	bpf_prog_jit_attempt_done(fp);
 	if (!fp->jited && jit_needed) {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 3972606f97d2..46b0c277c6a8 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -20003,6 +20003,8 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 {
 	struct bpf_prog *prog = env->prog, **func, *tmp;
 	int i, j, subprog_start, subprog_end = 0, len, subprog;
+	int subtree_top_idx, subtree_stack_depth;
+	void __percpu *priv_stack_ptr;
 	struct bpf_map *map_ptr;
 	struct bpf_insn *insn;
 	void *old_bpf_func;
@@ -20081,6 +20083,33 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 		func[i]->is_func = 1;
 		func[i]->sleepable = prog->sleepable;
 		func[i]->aux->func_idx = i;
+
+		subtree_top_idx = env->subprog_info[i].subtree_top_idx;
+		if (env->subprog_info[subtree_top_idx].priv_stack_eligible) {
+			if (subtree_top_idx == i)
+				func[i]->aux->subtree_stack_depth =
+					env->subprog_info[i].subtree_stack_depth;
+
+			subtree_stack_depth = func[i]->aux->subtree_stack_depth;
+			if (subtree_top_idx != i) {
+				if (env->subprog_info[subtree_top_idx].subtree_stack_depth)
+					func[i]->aux->priv_stack_mode = PRIV_STACK_SUB_PROG;
+				else
+					func[i]->aux->priv_stack_mode = NO_PRIV_STACK;
+			} else if (!subtree_stack_depth) {
+				func[i]->aux->priv_stack_mode = PRIV_STACK_ROOT_PROG;
+			} else {
+				func[i]->aux->priv_stack_mode = PRIV_STACK_ROOT_PROG;
+				priv_stack_ptr =
+					__alloc_percpu_gfp(subtree_stack_depth, 8, GFP_KERNEL);
+				if (!priv_stack_ptr) {
+					err = -ENOMEM;
+					goto out_free;
+				}
+				func[i]->aux->priv_stack_ptr = priv_stack_ptr;
+			}
+		}
+
 		/* Below members will be freed only at prog->aux */
 		func[i]->aux->btf = prog->aux->btf;
 		func[i]->aux->func_info = prog->aux->func_info;
From patchwork Thu Oct 10 17:56:07 2024
X-Patchwork-Submitter: Yonghong Song
X-Patchwork-Id: 13830612
X-Patchwork-Delegate: bpf@iogearbox.net
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team@fb.com, Martin KaFai Lau, Tejun Heo
Subject: [PATCH bpf-next v4 03/10] bpf, x86: Refactor func emit_prologue
Date: Thu, 10 Oct 2024 10:56:07 -0700
Message-ID: <20241010175607.1896910-1-yonghong.song@linux.dev>
In-Reply-To: <20241010175552.1895980-1-yonghong.song@linux.dev>
References: <20241010175552.1895980-1-yonghong.song@linux.dev>

Refactor the function emit_prologue() so that it takes bpf_prog as one
of its arguments. This reduces the total number of arguments, since
later patches will add more arguments to this function. Also add a
local variable 'stack_depth' to hold the value of
bpf_prog->aux->stack_depth, which simplifies the code.
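The signature change, as a quick before/after (taken directly from the
diff below; the bool flags are now derived from bpf_prog inside the
function):

    /* before */
    static void emit_prologue(u8 **pprog, u32 stack_depth, bool ebpf_from_cbpf,
                              bool tail_call_reachable, bool is_subprog,
                              bool is_exception_cb);

    /* after */
    static void emit_prologue(u8 **pprog, u32 stack_depth, struct bpf_prog *bpf_prog,
                              bool tail_call_reachable);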
Signed-off-by: Yonghong Song
---
 arch/x86/net/bpf_jit_comp.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 06b080b61aa5..6d24389e58a1 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -489,10 +489,12 @@ static void emit_prologue_tail_call(u8 **pprog, bool is_subprog)
  * bpf_tail_call helper will skip the first X86_TAIL_CALL_OFFSET bytes
  * while jumping to another program
  */
-static void emit_prologue(u8 **pprog, u32 stack_depth, bool ebpf_from_cbpf,
-			  bool tail_call_reachable, bool is_subprog,
-			  bool is_exception_cb)
+static void emit_prologue(u8 **pprog, u32 stack_depth, struct bpf_prog *bpf_prog,
+			  bool tail_call_reachable)
 {
+	bool ebpf_from_cbpf = bpf_prog_was_classic(bpf_prog);
+	bool is_exception_cb = bpf_prog->aux->exception_cb;
+	bool is_subprog = bpf_is_subprog(bpf_prog);
 	u8 *prog = *pprog;

 	emit_cfi(&prog, is_subprog ? cfi_bpf_subprog_hash : cfi_bpf_hash);
@@ -1424,17 +1426,18 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
 	u64 arena_vm_start, user_vm_start;
 	int i, excnt = 0;
 	int ilen, proglen = 0;
+	u32 stack_depth;
 	u8 *prog = temp;
 	int err;

+	stack_depth = bpf_prog->aux->stack_depth;
+
 	arena_vm_start = bpf_arena_get_kern_vm_start(bpf_prog->aux->arena);
 	user_vm_start = bpf_arena_get_user_vm_start(bpf_prog->aux->arena);

 	detect_reg_usage(insn, insn_cnt, callee_regs_used);

-	emit_prologue(&prog, bpf_prog->aux->stack_depth,
-		      bpf_prog_was_classic(bpf_prog), tail_call_reachable,
-		      bpf_is_subprog(bpf_prog), bpf_prog->aux->exception_cb);
+	emit_prologue(&prog, stack_depth, bpf_prog, tail_call_reachable);
 	/* Exception callback will clobber callee regs for its own use, and
 	 * restore the original callee regs from main prog's stack frame.
 	 */
@@ -2128,7 +2131,7 @@ st:			if (is_imm8(insn->off))
 			func = (u8 *) __bpf_call_base + imm32;
 			if (tail_call_reachable) {
-				LOAD_TAIL_CALL_CNT_PTR(bpf_prog->aux->stack_depth);
+				LOAD_TAIL_CALL_CNT_PTR(stack_depth);
 				ip += 7;
 			}
 			if (!imm32)
@@ -2145,13 +2148,13 @@ st:			if (is_imm8(insn->off))
 							  &bpf_prog->aux->poke_tab[imm32 - 1],
 							  &prog, image + addrs[i - 1],
 							  callee_regs_used,
-							  bpf_prog->aux->stack_depth,
+							  stack_depth,
 							  ctx);
 			else
 				emit_bpf_tail_call_indirect(bpf_prog,
 							    &prog,
 							    callee_regs_used,
-							    bpf_prog->aux->stack_depth,
+							    stack_depth,
 							    image + addrs[i - 1],
 							    ctx);
 			break;
From patchwork Thu Oct 10 17:56:13 2024
X-Patchwork-Submitter: Yonghong Song
X-Patchwork-Id: 13830613
X-Patchwork-Delegate: bpf@iogearbox.net
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team@fb.com, Martin KaFai Lau, Tejun Heo
Subject: [PATCH bpf-next v4 04/10] bpf, x86: Create a helper for certain "reg <op>= imm" operations
Date: Thu, 10 Oct 2024 10:56:13 -0700
Message-ID: <20241010175613.1897761-1-yonghong.song@linux.dev>
In-Reply-To: <20241010175552.1895980-1-yonghong.song@linux.dev>
References: <20241010175552.1895980-1-yonghong.song@linux.dev>

Create a helper to generate jited code for certain "reg <op>= imm"
operations, where <op> is add/sub/and/or/xor. This helper will be used
in a subsequent patch.
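As an illustration of the encodings the helper selects (per the opcode
table in the diff below; standard x86-64 forms, shown here for
orientation only):

    /* BPF_ALU64|BPF_ADD|BPF_K, dst = any reg, imm = 8
     *     -> REX.W + 0x83 /0 ib     add reg, 8       (imm8 form)
     * BPF_ALU64|BPF_ADD|BPF_K, dst = rax, imm = 0x11223344
     *     -> REX.W + 0x05 id        add rax, imm32   (eax/rax short form)
     * BPF_ALU64|BPF_XOR|BPF_K, dst = any other reg, imm = 0x11223344
     *     -> REX.W + 0x81 /6 id     xor reg, imm32   (general form)
     */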
Signed-off-by: Yonghong Song
---
 arch/x86/net/bpf_jit_comp.c | 82 +++++++++++++++++++++----------------
 1 file changed, 46 insertions(+), 36 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 6d24389e58a1..f01fdabf786e 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1406,6 +1406,51 @@ static void emit_shiftx(u8 **pprog, u32 dst_reg, u8 src_reg, bool is64, u8 op)
 	*pprog = prog;
 }

+/* emit ADD/SUB/AND/OR/XOR 'reg <op>= imm' operations */
+static void emit_alu_helper_1(u8 **pprog, u8 insn_code, u32 dst_reg, s32 imm32)
+{
+	u8 b2 = 0, b3 = 0;
+	u8 *prog = *pprog;
+
+	maybe_emit_1mod(&prog, dst_reg, BPF_CLASS(insn_code) == BPF_ALU64);
+
+	/*
+	 * b3 holds 'normal' opcode, b2 short form only valid
+	 * in case dst is eax/rax.
+	 */
+	switch (BPF_OP(insn_code)) {
+	case BPF_ADD:
+		b3 = 0xC0;
+		b2 = 0x05;
+		break;
+	case BPF_SUB:
+		b3 = 0xE8;
+		b2 = 0x2D;
+		break;
+	case BPF_AND:
+		b3 = 0xE0;
+		b2 = 0x25;
+		break;
+	case BPF_OR:
+		b3 = 0xC8;
+		b2 = 0x0D;
+		break;
+	case BPF_XOR:
+		b3 = 0xF0;
+		b2 = 0x35;
+		break;
+	}
+
+	if (is_imm8(imm32))
+		EMIT3(0x83, add_1reg(b3, dst_reg), imm32);
+	else if (is_axreg(dst_reg))
+		EMIT1_off32(b2, imm32);
+	else
+		EMIT2_off32(0x81, add_1reg(b3, dst_reg), imm32);
+
+	*pprog = prog;
+}
+
 #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp)))

 #define __LOAD_TCC_PTR(off) \
@@ -1567,42 +1612,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
 		case BPF_ALU64 | BPF_AND | BPF_K:
 		case BPF_ALU64 | BPF_OR | BPF_K:
 		case BPF_ALU64 | BPF_XOR | BPF_K:
-			maybe_emit_1mod(&prog, dst_reg,
-					BPF_CLASS(insn->code) == BPF_ALU64);
-
-			/*
-			 * b3 holds 'normal' opcode, b2 short form only valid
-			 * in case dst is eax/rax.
-			 */
-			switch (BPF_OP(insn->code)) {
-			case BPF_ADD:
-				b3 = 0xC0;
-				b2 = 0x05;
-				break;
-			case BPF_SUB:
-				b3 = 0xE8;
-				b2 = 0x2D;
-				break;
-			case BPF_AND:
-				b3 = 0xE0;
-				b2 = 0x25;
-				break;
-			case BPF_OR:
-				b3 = 0xC8;
-				b2 = 0x0D;
-				break;
-			case BPF_XOR:
-				b3 = 0xF0;
-				b2 = 0x35;
-				break;
-			}
-
-			if (is_imm8(imm32))
-				EMIT3(0x83, add_1reg(b3, dst_reg), imm32);
-			else if (is_axreg(dst_reg))
-				EMIT1_off32(b2, imm32);
-			else
-				EMIT2_off32(0x81, add_1reg(b3, dst_reg), imm32);
+			emit_alu_helper_1(&prog, insn->code, dst_reg, imm32);
 			break;

 		case BPF_ALU64 | BPF_MOV | BPF_K:

From patchwork Thu Oct 10 17:56:18 2024
X-Patchwork-Submitter: Yonghong Song
X-Patchwork-Id: 13830611
X-Patchwork-Delegate: bpf@iogearbox.net
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team@fb.com, Martin KaFai Lau, Tejun Heo
Subject: [PATCH bpf-next v4 05/10] bpf, x86: Add jit support for private stack
Date: Thu, 10 Oct 2024 10:56:18 -0700
Message-ID: <20241010175618.1897998-1-yonghong.song@linux.dev>
In-Reply-To: <20241010175552.1895980-1-yonghong.song@linux.dev>
References: <20241010175552.1895980-1-yonghong.song@linux.dev>

Add jit support for the private stack. For a particular subtree, e.g.,

    subtree_root    <== stack depth 120
      subprog1      <== stack depth 80
        subprog2    <== stack depth 40
      subprog3      <== stack depth 160

let us say that priv_stack_ptr is the memory address allocated for the
private stack. The frame pointer for each prog above is then calculated
as:

    subtree_root  <== subtree_root_fp     = priv_stack_ptr + 120
    subprog1      <== subtree_subprog1_fp = subtree_root_fp + 80
    subprog2      <== subtree_subprog2_fp = subtree_subprog1_fp + 40
    subprog3      <== subtree_subprog3_fp = subtree_root_fp + 160

For any call to a helper/kfunc, the prog's frame pointer is pushed
before the call and popped afterwards, in order to preserve the frame
pointer value. To deal with exception handling, push/pop of the frame
pointer also surrounds calls to subsequent subprogs. For example, with

    subtree_root
      subprog1
        ...
        insn: call bpf_throw
        ...

after jit we will have

    subtree_root
      insn: push r9
      subprog1
        ...
        insn: push r9
        insn: call bpf_throw
        insn: pop r9
        ...
      insn: pop r9
    exception_handler
      pop r9
      ...

where r9 holds the frame pointer for each subprog.
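In C-like pseudocode, the prologue computes the per-CPU frame pointer
roughly as follows (a sketch of what the emitted instructions in the
diff below do, not literal kernel code):

    /* PRIV_STACK_ROOT_PROG: once, in the subtree root's prologue */
    r9 = bpf_prog->aux->priv_stack_ptr + round_up(stack_depth, 8); /* movabs */
    #ifdef CONFIG_SMP
    r9 += gs:[&this_cpu_off];  /* turn the per-cpu offset into this CPU's address */
    #endif

    /* PRIV_STACK_SUB_PROG: in each non-root subprog's prologue */
    r9 += round_up(stack_depth, 8); /* carve this subprog's frame below the caller's */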
Signed-off-by: Yonghong Song
---
 arch/x86/net/bpf_jit_comp.c | 88 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 86 insertions(+), 2 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index f01fdabf786e..a6ba85cec49a 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -325,6 +325,22 @@ struct jit_context {
 /* Number of bytes that will be skipped on tailcall */
 #define X86_TAIL_CALL_OFFSET	(12 + ENDBR_INSN_SIZE)

+static void push_r9(u8 **pprog)
+{
+	u8 *prog = *pprog;
+
+	EMIT2(0x41, 0x51);   /* push r9 */
+	*pprog = prog;
+}
+
+static void pop_r9(u8 **pprog)
+{
+	u8 *prog = *pprog;
+
+	EMIT2(0x41, 0x59);   /* pop r9 */
+	*pprog = prog;
+}
+
 static void push_r12(u8 **pprog)
 {
 	u8 *prog = *pprog;
@@ -484,13 +500,17 @@ static void emit_prologue_tail_call(u8 **pprog, bool is_subprog)
 	*pprog = prog;
 }

+static void emit_priv_frame_ptr(u8 **pprog, struct bpf_prog *bpf_prog,
+				enum bpf_priv_stack_mode priv_stack_mode);
+
 /*
  * Emit x86-64 prologue code for BPF program.
  * bpf_tail_call helper will skip the first X86_TAIL_CALL_OFFSET bytes
  * while jumping to another program
  */
 static void emit_prologue(u8 **pprog, u32 stack_depth, struct bpf_prog *bpf_prog,
-			  bool tail_call_reachable)
+			  bool tail_call_reachable,
+			  enum bpf_priv_stack_mode priv_stack_mode)
 {
 	bool ebpf_from_cbpf = bpf_prog_was_classic(bpf_prog);
 	bool is_exception_cb = bpf_prog->aux->exception_cb;
@@ -520,6 +540,8 @@ static void emit_prologue(u8 **pprog, u32 stack_depth, struct bpf_prog *bpf_prog
 		 * first restore those callee-saved regs from stack, before
 		 * reusing the stack frame.
 		 */
+		if (priv_stack_mode != NO_PRIV_STACK)
+			pop_r9(&prog);
 		pop_callee_regs(&prog, all_callee_regs_used);
 		pop_r12(&prog);
 		/* Reset the stack frame. */
@@ -532,6 +554,8 @@ static void emit_prologue(u8 **pprog, u32 stack_depth, struct bpf_prog *bpf_prog
 	/* X86_TAIL_CALL_OFFSET is here */
 	EMIT_ENDBR();

+	emit_priv_frame_ptr(&prog, bpf_prog, priv_stack_mode);
+
 	/* sub rsp, rounded_stack_depth */
 	if (stack_depth)
 		EMIT3_off32(0x48, 0x81, 0xEC, round_up(stack_depth, 8));
@@ -1451,6 +1475,42 @@ static void emit_alu_helper_1(u8 **pprog, u8 insn_code, u32 dst_reg, s32 imm32)
 	*pprog = prog;
 }

+static void emit_root_priv_frame_ptr(u8 **pprog, struct bpf_prog *bpf_prog,
+				     u32 orig_stack_depth)
+{
+	void __percpu *priv_frame_ptr;
+	u8 *prog = *pprog;
+
+	priv_frame_ptr = bpf_prog->aux->priv_stack_ptr + orig_stack_depth;
+
+	/* movabs r9, priv_frame_ptr */
+	emit_mov_imm64(&prog, X86_REG_R9, (long) priv_frame_ptr >> 32,
+		       (u32) (long) priv_frame_ptr);
+#ifdef CONFIG_SMP
+	/* add <r9>, gs:[<off>] */
+	EMIT2(0x65, 0x4c);
+	EMIT3(0x03, 0x0c, 0x25);
+	EMIT((u32)(unsigned long)&this_cpu_off, 4);
+#endif
+	*pprog = prog;
+}
+
+static void emit_priv_frame_ptr(u8 **pprog, struct bpf_prog *bpf_prog,
+				enum bpf_priv_stack_mode priv_stack_mode)
+{
+	u32 orig_stack_depth = round_up(bpf_prog->aux->stack_depth, 8);
+	u8 *prog = *pprog;
+
+	if (priv_stack_mode == PRIV_STACK_ROOT_PROG)
+		emit_root_priv_frame_ptr(&prog, bpf_prog, orig_stack_depth);
+	else if (priv_stack_mode == PRIV_STACK_SUB_PROG && orig_stack_depth)
+		/* r9 += orig_stack_depth */
+		emit_alu_helper_1(&prog, BPF_ALU64 | BPF_ADD | BPF_K, X86_REG_R9,
+				  orig_stack_depth);
+
+	*pprog = prog;
+}
+
 #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp)))

 #define __LOAD_TCC_PTR(off) \
@@ -1464,6 +1524,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
 {
 	bool tail_call_reachable = bpf_prog->aux->tail_call_reachable;
 	struct bpf_insn *insn = bpf_prog->insnsi;
+	enum bpf_priv_stack_mode priv_stack_mode;
 	bool callee_regs_used[4] = {};
 	int insn_cnt = bpf_prog->len;
 	bool seen_exit = false;
@@ -1476,13 +1537,17 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
 	int err;

 	stack_depth = bpf_prog->aux->stack_depth;
+	priv_stack_mode = bpf_prog->aux->priv_stack_mode;
+	if (priv_stack_mode != NO_PRIV_STACK)
+		stack_depth = 0;

 	arena_vm_start = bpf_arena_get_kern_vm_start(bpf_prog->aux->arena);
 	user_vm_start = bpf_arena_get_user_vm_start(bpf_prog->aux->arena);

 	detect_reg_usage(insn, insn_cnt, callee_regs_used);

-	emit_prologue(&prog, stack_depth, bpf_prog, tail_call_reachable);
+	emit_prologue(&prog, stack_depth, bpf_prog, tail_call_reachable,
+		      priv_stack_mode);
 	/* Exception callback will clobber callee regs for its own use, and
 	 * restore the original callee regs from main prog's stack frame.
 	 */
@@ -1521,6 +1586,14 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
 		u8 *func;
 		int nops;

+		if (priv_stack_mode != NO_PRIV_STACK) {
+			if (src_reg == BPF_REG_FP)
+				src_reg = X86_REG_R9;
+
+			if (dst_reg == BPF_REG_FP)
+				dst_reg = X86_REG_R9;
+		}
+
 		switch (insn->code) {
 			/* ALU */
 		case BPF_ALU | BPF_ADD | BPF_X:
@@ -2146,9 +2219,15 @@ st:			if (is_imm8(insn->off))
 			}
 			if (!imm32)
 				return -EINVAL;
+			if (priv_stack_mode != NO_PRIV_STACK) {
+				push_r9(&prog);
+				ip += 2;
+			}
 			ip += x86_call_depth_emit_accounting(&prog, func, ip);
 			if (emit_call(&prog, func, ip))
 				return -EINVAL;
+			if (priv_stack_mode != NO_PRIV_STACK)
+				pop_r9(&prog);
 			break;
 		}

@@ -3572,6 +3651,11 @@ bool bpf_jit_supports_exceptions(void)
 	return IS_ENABLED(CONFIG_UNWINDER_ORC);
 }

+bool bpf_jit_supports_private_stack(void)
+{
+	return true;
+}
+
 void arch_bpf_stack_walk(bool (*consume_fn)(void *cookie, u64 ip, u64 sp, u64 bp), void *cookie)
 {
 #if defined(CONFIG_UNWINDER_ORC)

From patchwork Thu Oct 10 17:56:23 2024
X-Patchwork-Submitter: Yonghong Song
X-Patchwork-Id: 13830615
X-Patchwork-Delegate: bpf@iogearbox.net
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team@fb.com, Martin KaFai Lau, Tejun Heo
Subject: [PATCH bpf-next v4 06/10] selftests/bpf: Add private stack tests
Date: Thu, 10 Oct 2024 10:56:23 -0700
Message-ID: <20241010175623.1898269-1-yonghong.song@linux.dev>
In-Reply-To: <20241010175552.1895980-1-yonghong.song@linux.dev>
References: <20241010175552.1895980-1-yonghong.song@linux.dev>

Several private stack tests are added, including:
- a prog with stack size greater than BPF_PRIV_STACK_MIN_SUBTREE_SIZE,
- a prog with stack size less than BPF_PRIV_STACK_MIN_SUBTREE_SIZE,
- a prog with one subprog having MAX_BPF_STACK stack size and another
  subprog having a non-zero stack size,
- a prog with a callback function,
- a prog with an exception in the main prog or in a subprog.

The tests assert on the JITed code, as shown in the sketch after this
list.
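A minimal shape of these checks (assuming the __jited() semantics from
bpf_misc.h: each string must match the disassembled JITed image in
order, with {{.*}} acting as a regex wildcard, e.g. for an address):

    SEC("kprobe")
    __success __arch_x86_64
    __jited("	movabsq	$0x{{.*}}, %r9")    /* r9 = priv_stack_ptr + depth */
    __jited("	addq	%gs:0x{{.*}}, %r9") /* r9 += this_cpu_off */
    __naked void sketch(void)
    {
            asm volatile ("r0 = 0; exit;" ::: __clobber_all);
    }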
Signed-off-by: Yonghong Song
---
 .../selftests/bpf/prog_tests/verifier.c      |   2 +
 .../bpf/progs/verifier_private_stack.c       | 216 ++++++++++++++++++
 2 files changed, 218 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_private_stack.c

diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
index e26b5150fc43..635ff3509403 100644
--- a/tools/testing/selftests/bpf/prog_tests/verifier.c
+++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
@@ -59,6 +59,7 @@
 #include "verifier_or_jmp32_k.skel.h"
 #include "verifier_precision.skel.h"
 #include "verifier_prevent_map_lookup.skel.h"
+#include "verifier_private_stack.skel.h"
 #include "verifier_raw_stack.skel.h"
 #include "verifier_raw_tp_writable.skel.h"
 #include "verifier_reg_equal.skel.h"
@@ -185,6 +186,7 @@ void test_verifier_bpf_fastcall(void) { RUN(verifier_bpf_fastcall); }
 void test_verifier_or_jmp32_k(void) { RUN(verifier_or_jmp32_k); }
 void test_verifier_precision(void) { RUN(verifier_precision); }
 void test_verifier_prevent_map_lookup(void) { RUN(verifier_prevent_map_lookup); }
+void test_verifier_private_stack(void) { RUN(verifier_private_stack); }
 void test_verifier_raw_stack(void) { RUN(verifier_raw_stack); }
 void test_verifier_raw_tp_writable(void) { RUN(verifier_raw_tp_writable); }
 void test_verifier_reg_equal(void) { RUN(verifier_reg_equal); }
diff --git a/tools/testing/selftests/bpf/progs/verifier_private_stack.c b/tools/testing/selftests/bpf/progs/verifier_private_stack.c
new file mode 100644
index 000000000000..e8de565f8b34
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/verifier_private_stack.c
@@ -0,0 +1,216 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+#include "bpf_experimental.h"
+
+/* From include/linux/filter.h */
+#define MAX_BPF_STACK	512
+
+#if defined(__TARGET_ARCH_x86)
+
+SEC("kprobe")
+__description("Private stack, single prog")
+__success
+__arch_x86_64
+__jited("	movabsq	$0x{{.*}}, %r9")
+__jited("	addq	%gs:0x{{.*}}, %r9")
+__jited("	movl	$0x2a, %edi")
+__jited("	movq	%rdi, -0x100(%r9)")
+__naked void private_stack_single_prog(void)
+{
+	asm volatile (
+	"r1 = 42;"
+	"*(u64 *)(r10 - 256) = r1;"
+	"r0 = 0;"
+	"exit;"
+	:
+	:
+	: __clobber_all);
+}
+
+__used
+__naked static void cumulative_stack_depth_subprog(void)
+{
+	asm volatile (
+	"r1 = 41;"
+	"*(u64 *)(r10 - 32) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"exit;"
+	:: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("kprobe")
+__description("Private stack, subtree > MAX_BPF_STACK")
+__success
+__arch_x86_64
+/* private stack fp for the main prog */
+__jited("	movabsq	$0x{{.*}}, %r9")
+__jited("	addq	%gs:0x{{.*}}, %r9")
+__jited("	movl	$0x2a, %edi")
+__jited("	movq	%rdi, -0x200(%r9)")
+__jited("	pushq	%r9")
+__jited("	callq	0x{{.*}}")
+__jited("	popq	%r9")
+__jited("	xorl	%eax, %eax")
+__naked void private_stack_nested_1(void)
+{
+	asm volatile (
+	"r1 = 42;"
+	"*(u64 *)(r10 - %[max_bpf_stack]) = r1;"
+	"call cumulative_stack_depth_subprog;"
+	"r0 = 0;"
+	"exit;"
+	:
+	: __imm_const(max_bpf_stack, MAX_BPF_STACK)
+	: __clobber_all);
+}
+
+SEC("kprobe")
+__description("Private stack, subtree > MAX_BPF_STACK")
+__success
+__arch_x86_64
+/* private stack fp for the subprog */
+__jited("	addq	$0x20, %r9")
+__naked void private_stack_nested_2(void)
+{
+	asm volatile (
+	"r1 = 42;"
+	"*(u64 *)(r10 - %[max_bpf_stack]) = r1;"
+	"call cumulative_stack_depth_subprog;"
+	"r0 = 0;"
+	"exit;"
+	:
+	: __imm_const(max_bpf_stack, MAX_BPF_STACK)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__description("No private stack, nested")
+__success
+__arch_x86_64
+__jited("	subq	$0x8, %rsp")
+__naked void no_private_stack_nested(void)
+{
+	asm volatile (
+	"r1 = 42;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"call cumulative_stack_depth_subprog;"
+	"r0 = 0;"
+	"exit;"
+	:
+	:
+	: __clobber_all);
+}
+
+__naked __noinline __used
+static unsigned long loop_callback(void)
+{
+	asm volatile (
+	"call %[bpf_get_prandom_u32];"
+	"r1 = 42;"
+	"*(u64 *)(r10 - 512) = r1;"
+	"call cumulative_stack_depth_subprog;"
+	"r0 = 0;"
+	"exit;"
+	:
+	: __imm(bpf_get_prandom_u32)
+	: __clobber_common);
+}
+
+SEC("raw_tp")
+__description("Private stack, callback")
+__success
+__arch_x86_64
+/* for func loop_callback */
+__jited("func #1")
+__jited("	endbr64")
+__jited("	nopl	(%rax,%rax)")
+__jited("	nopl	(%rax)")
+__jited("	pushq	%rbp")
+__jited("	movq	%rsp, %rbp")
+__jited("	endbr64")
+__jited("	movabsq	$0x{{.*}}, %r9")
+__jited("	addq	%gs:0x{{.*}}, %r9")
+__jited("	pushq	%r9")
+__jited("	callq")
+__jited("	popq	%r9")
+__jited("	movl	$0x2a, %edi")
+__jited("	movq	%rdi, -0x200(%r9)")
+__jited("	pushq	%r9")
+__jited("	callq")
+__jited("	popq	%r9")
+__naked void private_stack_callback(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"r2 = %[loop_callback];"
+	"r3 = 0;"
+	"r4 = 0;"
+	"call %[bpf_loop];"
+	"r0 = 0;"
+	"exit;"
+	:
+	: __imm_ptr(loop_callback),
+	  __imm(bpf_loop)
+	: __clobber_common);
+}
+
+SEC("fentry/bpf_fentry_test9")
+__description("Private stack, exception in main prog")
+__success __retval(0)
+__arch_x86_64
+__jited("	pushq	%r9")
+__jited("	callq")
+__jited("	popq	%r9")
+int private_stack_exception_main_prog(void)
+{
+	asm volatile (
+	"r1 = 42;"
+	"*(u64 *)(r10 - 512) = r1;"
+	::: __clobber_common);
+
+	bpf_throw(0);
+	return 0;
+}
+
+__used static int subprog_exception(void)
+{
+	bpf_throw(0);
+	return 0;
+}
+
+SEC("fentry/bpf_fentry_test9")
+__description("Private stack, exception in subprog")
+__success __retval(0)
+__arch_x86_64
+__jited("	movq	%rdi, -0x200(%r9)")
+__jited("	pushq	%r9")
+__jited("	callq")
+__jited("	popq	%r9")
+int private_stack_exception_sub_prog(void)
+{
+	asm volatile (
+	"r1 = 42;"
+	"*(u64 *)(r10 - 512) = r1;"
+	"call subprog_exception;"
+	::: __clobber_common);
+
+	return 0;
+}
+
+#else
+
+SEC("kprobe")
+__description("private stack is not supported, use a dummy test")
+__success
+int dummy_test(void)
+{
+	return 0;
+}
+
+#endif
+
+char _license[] SEC("license") = "GPL";

From patchwork Thu Oct 10 17:56:28 2024
X-Patchwork-Submitter: Yonghong Song
X-Patchwork-Id: 13830614
X-Patchwork-Delegate: bpf@iogearbox.net
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team@fb.com, Martin KaFai Lau, Tejun Heo
Subject: [PATCH bpf-next v4 07/10] bpf: Support calling non-tailcall bpf prog
Date: Thu, 10 Oct 2024 10:56:28 -0700
Message-ID: <20241010175628.1898648-1-yonghong.song@linux.dev>
In-Reply-To: <20241010175552.1895980-1-yonghong.song@linux.dev>
References: <20241010175552.1895980-1-yonghong.song@linux.dev>

A kfunc bpf_prog_call() is introduced so that one bpf prog can call
another bpf prog from within a bpf prog. It takes the same parameters
as bpf_tail_call() but acts like a normal function call. Unlike a tail
call, however, bpf_prog_call() may recurse into the caller prog itself,
so a bpf prog that calls bpf_prog_call() uses private stacks with a
maximum recursion (nesting) level of 4. Four levels should be enough
for most practical cases.

bpf_prog_call() cannot be used in a prog that also uses tail_call,
since tail_call does not use the private stack. If both prog_call and
tail_call appear in the same prog, verification will fail.
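A minimal usage sketch, mirroring how bpf_tail_call() is used with a
BPF_MAP_TYPE_PROG_ARRAY. The map name, attach point, and index here are
made up for illustration; the kfunc prototype matches the one added
below:

    #include <vmlinux.h>
    #include <bpf/bpf_helpers.h>

    extern int bpf_prog_call(void *ctx, struct bpf_map *p__map, u32 index) __ksym;

    struct {
            __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
            __uint(max_entries, 1);
            __uint(key_size, sizeof(__u32));
            __uint(value_size, sizeof(__u32));
    } prog_array SEC(".maps");

    SEC("kprobe/do_nanosleep")
    int caller(void *ctx)
    {
            /* unlike bpf_tail_call(), execution returns here with the
             * callee's return value, or -EINVAL/-E2BIG/-ENOENT on error
             */
            int ret = bpf_prog_call(ctx, (struct bpf_map *)&prog_array, 0);

            return ret < 0 ? 0 : ret;
    }

    char _license[] SEC("license") = "GPL";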
Signed-off-by: Yonghong Song
---
 include/linux/bpf.h   |  2 ++
 kernel/bpf/core.c     |  7 +++++--
 kernel/bpf/helpers.c  | 20 ++++++++++++++++++++
 kernel/bpf/verifier.c | 30 ++++++++++++++++++++++++++----
 4 files changed, 53 insertions(+), 6 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index f22ddb423fd0..952cb398eb30 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1493,6 +1493,7 @@ struct bpf_prog_aux {
 	bool exception_cb;
 	bool exception_boundary;
 	bool priv_stack_eligible;
+	bool has_prog_call;
 	struct bpf_arena *arena;
 	/* BTF_KIND_FUNC_PROTO for valid attach_btf_id */
 	const struct btf_type *attach_func_proto;
@@ -1929,6 +1930,7 @@ struct bpf_array {

 #define BPF_COMPLEXITY_LIMIT_INSNS	1000000 /* yes. 1M insns */
 #define MAX_TAIL_CALL_CNT 33
+#define BPF_MAX_PRIV_STACK_NEST_LEVEL 4

 /* Maximum number of loops for bpf_loop and bpf_iter_num.
  * It's enum to expose it (and thus make it discoverable) through BTF.
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index f79d951a061f..0d2c97f63ecf 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2426,10 +2426,13 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
 			fp->aux->priv_stack_mode = NO_PRIV_STACK;
 		} else {
 			void __percpu *priv_stack_ptr;
+			int nest_level = 1;

+			if (fp->aux->has_prog_call)
+				nest_level = BPF_MAX_PRIV_STACK_NEST_LEVEL;
 			fp->aux->priv_stack_mode = PRIV_STACK_ROOT_PROG;
-			priv_stack_ptr =
-				__alloc_percpu_gfp(fp->aux->stack_depth, 8, GFP_KERNEL);
+			priv_stack_ptr = __alloc_percpu_gfp(
+				fp->aux->stack_depth * nest_level, 8, GFP_KERNEL);
 			if (!priv_stack_ptr) {
 				*err = -ENOMEM;
 				return fp;
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 4053f279ed4c..9cc880dc213e 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2749,6 +2749,25 @@ __bpf_kfunc void bpf_rcu_read_unlock(void)
 	rcu_read_unlock();
 }

+__bpf_kfunc int bpf_prog_call(void *ctx, struct bpf_map *p__map, u32 index)
+{
+	struct bpf_array *array;
+	struct bpf_prog *prog;
+
+	if (p__map->map_type != BPF_MAP_TYPE_PROG_ARRAY)
+		return -EINVAL;
+
+	array = container_of(p__map, struct bpf_array, map);
+	if (unlikely(index >= array->map.max_entries))
+		return -E2BIG;
+
+	prog = READ_ONCE(array->ptrs[index]);
+	if (!prog)
+		return -ENOENT;
+
+	return bpf_prog_run(prog, ctx);
+}
+
 struct bpf_throw_ctx {
 	struct bpf_prog_aux *aux;
 	u64 sp;
@@ -3035,6 +3054,7 @@ BTF_ID_FLAGS(func, bpf_task_get_cgroup1, KF_ACQUIRE | KF_RCU | KF_RET_NULL)
 #endif
 BTF_ID_FLAGS(func, bpf_task_from_pid, KF_ACQUIRE | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_throw)
+BTF_ID_FLAGS(func, bpf_prog_call)
 BTF_KFUNCS_END(generic_btf_ids)

 static const struct btf_kfunc_id_set generic_kfunc_set = {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 46b0c277c6a8..e3d9820618a1 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5986,6 +5986,9 @@ static int check_ptr_alignment(struct bpf_verifier_env *env,

 static bool bpf_enable_private_stack(struct bpf_prog *prog)
 {
+	if (prog->aux->has_prog_call)
+		return true;
+
 	if (!bpf_jit_supports_private_stack())
 		return false;

@@ -6092,7 +6095,9 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx,
 			return -EACCES;
 		}

-		if (!priv_stack_eligible && depth >= BPF_PRIV_STACK_MIN_SUBTREE_SIZE) {
+		if (!priv_stack_eligible &&
+		    (depth >= BPF_PRIV_STACK_MIN_SUBTREE_SIZE ||
+		     env->prog->aux->has_prog_call)) {
 			subprog[orig_idx].priv_stack_eligible = true;
 			env->prog->aux->priv_stack_eligible = priv_stack_eligible = true;
 		}
@@ -6181,8 +6186,13 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx,
 		}
 		subprog[ret_prog[j]].tail_call_reachable = true;
 	}
-	if (!check_priv_stack && subprog[0].tail_call_reachable)
+	if (!check_priv_stack && subprog[0].tail_call_reachable) {
+		if (env->prog->aux->has_prog_call) {
+			verbose(env, "cannot do prog call and tail call in the same prog\n");
+			return -EINVAL;
+		}
 		env->prog->aux->tail_call_reachable = true;
+	}

 	/* end of for() loop means the last insn of the 'subprog'
	 * was reached. Doesn't matter whether it was JA or EXIT
@@ -11322,6 +11332,7 @@ enum special_kfunc_type {
 	KF_bpf_preempt_enable,
 	KF_bpf_iter_css_task_new,
 	KF_bpf_session_cookie,
+	KF_bpf_prog_call,
 };

 BTF_SET_START(special_kfunc_set)
@@ -11387,6 +11398,7 @@ BTF_ID(func, bpf_session_cookie)
 #else
 BTF_ID_UNUSED
 #endif
+BTF_ID(func, bpf_prog_call)

 static bool is_kfunc_ret_null(struct bpf_kfunc_call_arg_meta *meta)
 {
@@ -11433,6 +11445,11 @@ get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
 	if (meta->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx])
 		return KF_ARG_PTR_TO_CTX;

+	if (meta->func_id == special_kfunc_list[KF_bpf_prog_call] && argno == 0) {
+		env->prog->aux->has_prog_call = true;
+		return KF_ARG_PTR_TO_CTX;
+	}
+
 	/* In this function, we verify the kfunc's BTF as per the argument type,
 	 * leaving the rest of the verification with respect to the register
 	 * type to our caller. When a set of conditions hold in the BTF type of
@@ -20009,6 +20026,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 	struct bpf_insn *insn;
 	void *old_bpf_func;
 	int err, num_exentries;
+	int nest_level = 1;

 	if (env->subprog_cnt <= 1)
 		return 0;
@@ -20099,9 +20117,13 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 			} else if (!subtree_stack_depth) {
 				func[i]->aux->priv_stack_mode = PRIV_STACK_ROOT_PROG;
 			} else {
+				if (env->prog->aux->has_prog_call) {
+					func[i]->aux->has_prog_call = true;
+					nest_level = BPF_MAX_PRIV_STACK_NEST_LEVEL;
+				}
 				func[i]->aux->priv_stack_mode = PRIV_STACK_ROOT_PROG;
-				priv_stack_ptr =
-					__alloc_percpu_gfp(subtree_stack_depth, 8, GFP_KERNEL);
+				priv_stack_ptr = __alloc_percpu_gfp(
+					subtree_stack_depth * nest_level, 8, GFP_KERNEL);
 				if (!priv_stack_ptr) {
 					err = -ENOMEM;
 					goto out_free;

From patchwork Thu Oct 10 17:56:33 2024
X-Patchwork-Submitter: Yonghong Song
X-Patchwork-Id: 13830616
X-Patchwork-Delegate: bpf@iogearbox.net
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team@fb.com, Martin KaFai Lau, Tejun Heo
Subject: [PATCH bpf-next v4 08/10] bpf, x86: Create two helpers for some arith operations
Date: Thu, 10 Oct 2024 10:56:33 -0700
Message-ID: <20241010175633.1898994-1-yonghong.song@linux.dev>
In-Reply-To: <20241010175552.1895980-1-yonghong.song@linux.dev>
References: <20241010175552.1895980-1-yonghong.song@linux.dev>

Two helpers are extracted from the bpf/x86 jit:
- a helper to handle 'reg1 <op>= reg2', where <op> is add/sub/and/or/xor,
- a helper to handle 'reg *= imm'.

Both helpers will be used in the subsequent patch.
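As an illustration of the imul encodings the second helper picks
(standard x86-64 forms; shown for orientation only, per the diff below):

    /* BPF_ALU64|BPF_MUL|BPF_K, dst = reg, imm = 16
     *     -> REX.W + 0x6B /r ib     imul reg, reg, 16      (imm8 form)
     * BPF_ALU64|BPF_MUL|BPF_K, dst = reg, imm = 100000
     *     -> REX.W + 0x69 /r id     imul reg, reg, 100000  (imm32 form)
     */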
From patchwork Thu Oct 10 17:56:33 2024
X-Patchwork-Submitter: Yonghong Song
X-Patchwork-Id: 13830616
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team@fb.com,
    Martin KaFai Lau, Tejun Heo
Subject: [PATCH bpf-next v4 08/10] bpf, x86: Create two helpers for some arith operations
Date: Thu, 10 Oct 2024 10:56:33 -0700
Message-ID: <20241010175633.1898994-1-yonghong.song@linux.dev>
In-Reply-To: <20241010175552.1895980-1-yonghong.song@linux.dev>
References: <20241010175552.1895980-1-yonghong.song@linux.dev>

Two helpers are extracted from the bpf x86 jit:
 - a helper to handle 'reg1 <op>= reg2' where <op> is add/sub/and/or/xor
 - a helper to handle 'reg *= imm'
Both helpers will be used in the subsequent patch.

Signed-off-by: Yonghong Song
---
 arch/x86/net/bpf_jit_comp.c | 51 ++++++++++++++++++++++++-------------
 1 file changed, 34 insertions(+), 17 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index a6ba85cec49a..297dd64f4b6a 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1475,6 +1475,37 @@ static void emit_alu_helper_1(u8 **pprog, u8 insn_code, u32 dst_reg, s32 imm32)
 	*pprog = prog;
 }
 
+/* emit ADD/SUB/AND/OR/XOR 'reg1 <op>= reg2' operations */
+static void emit_alu_helper_2(u8 **pprog, u8 insn_code, u32 dst_reg, u32 src_reg)
+{
+	u8 b2 = 0;
+	u8 *prog = *pprog;
+
+	maybe_emit_mod(&prog, dst_reg, src_reg,
+		       BPF_CLASS(insn_code) == BPF_ALU64);
+	b2 = simple_alu_opcodes[BPF_OP(insn_code)];
+	EMIT2(b2, add_2reg(0xC0, dst_reg, src_reg));
+
+	*pprog = prog;
+}
+
+/* emit 'reg *= imm' operations */
+static void emit_alu_helper_3(u8 **pprog, u8 insn_code, u32 dst_reg, s32 imm32)
+{
+	u8 *prog = *pprog;
+
+	maybe_emit_mod(&prog, dst_reg, dst_reg, BPF_CLASS(insn_code) == BPF_ALU64);
+
+	if (is_imm8(imm32))
+		/* imul dst_reg, dst_reg, imm8 */
+		EMIT3(0x6B, add_2reg(0xC0, dst_reg, dst_reg), imm32);
+	else
+		/* imul dst_reg, dst_reg, imm32 */
+		EMIT2_off32(0x69, add_2reg(0xC0, dst_reg, dst_reg), imm32);
+
+	*pprog = prog;
+}
+
 static void emit_root_priv_frame_ptr(u8 **pprog, struct bpf_prog *bpf_prog,
 				     u32 orig_stack_depth)
 {
@@ -1578,7 +1609,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
 		const s32 imm32 = insn->imm;
 		u32 dst_reg = insn->dst_reg;
 		u32 src_reg = insn->src_reg;
-		u8 b2 = 0, b3 = 0;
+		u8 b3 = 0;
 		u8 *start_of_ldx;
 		s64 jmp_offset;
 		s16 insn_off;
@@ -1606,10 +1637,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
 		case BPF_ALU64 | BPF_AND | BPF_X:
 		case BPF_ALU64 | BPF_OR | BPF_X:
 		case BPF_ALU64 | BPF_XOR | BPF_X:
-			maybe_emit_mod(&prog, dst_reg, src_reg,
-				       BPF_CLASS(insn->code) == BPF_ALU64);
-			b2 = simple_alu_opcodes[BPF_OP(insn->code)];
-			EMIT2(b2, add_2reg(0xC0, dst_reg, src_reg));
+			emit_alu_helper_2(&prog, insn->code, dst_reg, src_reg);
 			break;
 
 		case BPF_ALU64 | BPF_MOV | BPF_X:
@@ -1772,18 +1800,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
 
 		case BPF_ALU | BPF_MUL | BPF_K:
 		case BPF_ALU64 | BPF_MUL | BPF_K:
-			maybe_emit_mod(&prog, dst_reg, dst_reg,
-				       BPF_CLASS(insn->code) == BPF_ALU64);
-
-			if (is_imm8(imm32))
-				/* imul dst_reg, dst_reg, imm8 */
-				EMIT3(0x6B, add_2reg(0xC0, dst_reg, dst_reg),
-				      imm32);
-			else
-				/* imul dst_reg, dst_reg, imm32 */
-				EMIT2_off32(0x69,
-					    add_2reg(0xC0, dst_reg, dst_reg),
-					    imm32);
+			emit_alu_helper_3(&prog, insn->code, dst_reg, imm32);
 			break;
 
 		case BPF_ALU | BPF_MUL | BPF_X:
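
For illustration only (not part of the patch), a minimal sketch of how a jit
caller drives the two extracted helpers, following the pprog convention used
above; the opcode encodings are the standard BPF insn codes:

	/* sketch: emit 'dst_reg += src_reg' then 'dst_reg *= 24' (64-bit) */
	static void emit_example(u8 **pprog, u32 dst_reg, u32 src_reg)
	{
		emit_alu_helper_2(pprog, BPF_ALU64 | BPF_ADD | BPF_X, dst_reg, src_reg);
		emit_alu_helper_3(pprog, BPF_ALU64 | BPF_MUL | BPF_K, dst_reg, 24);
	}
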
From patchwork Thu Oct 10 17:56:38 2024
X-Patchwork-Submitter: Yonghong Song
X-Patchwork-Id: 13830618
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team@fb.com,
    Martin KaFai Lau, Tejun Heo
Subject: [PATCH bpf-next v4 09/10] bpf, x86: Jit support for nested bpf_prog_call
Date: Thu, 10 Oct 2024 10:56:38 -0700
Message-ID: <20241010175638.1899406-1-yonghong.song@linux.dev>
In-Reply-To: <20241010175552.1895980-1-yonghong.song@linux.dev>
References: <20241010175552.1895980-1-yonghong.song@linux.dev>

Two functions are added to the kernel:
 - int notrace __bpf_prog_enter_recur_limited(struct bpf_prog *prog)
 - void notrace __bpf_prog_exit_recur_limited(struct bpf_prog *prog)
They are called from bpf progs through the jit. __bpf_prog_enter_recur_limited()
returns 0 if the maximum recursion level has been reached, in which case the
bpf prog returns to its caller directly. Otherwise, it returns the current
recursion level, which the jit uses to calculate the proper frame pointer for
that level.
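
In C terms, the emitted prologue check behaves roughly like the sketch below
(illustrative only; the real sequence is emitted as x86 instructions, and the
private stack base setup comes from the earlier patches):

	int cnt = __bpf_prog_enter_recur_limited(prog);
	if (cnt == 0)
		return;		/* nesting limit hit, return to caller */
	/* frame ptr r9 = priv_stack_base + (cnt - 1) * subtree_stack_depth */

with a matching __bpf_prog_exit_recur_limited(prog) emitted on the exit path
to decrement the per-cpu counter.
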
Signed-off-by: Yonghong Song
---
 arch/x86/net/bpf_jit_comp.c | 94 +++++++++++++++++++++++++++++++++----
 include/linux/bpf.h         |  2 +
 kernel/bpf/trampoline.c     | 16 +++++++
 3 files changed, 104 insertions(+), 8 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 297dd64f4b6a..a763e018e87f 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -501,7 +501,8 @@ static void emit_prologue_tail_call(u8 **pprog, bool is_subprog)
 }
 
 static void emit_priv_frame_ptr(u8 **pprog, struct bpf_prog *bpf_prog,
-				enum bpf_priv_stack_mode priv_stack_mode);
+				enum bpf_priv_stack_mode priv_stack_mode,
+				bool is_subprog, u8 *image, u8 *temp);
 
 /*
  * Emit x86-64 prologue code for BPF program.
@@ -510,7 +511,8 @@ static void emit_priv_frame_ptr(u8 **pprog, struct bpf_prog *bpf_prog,
  */
 static void emit_prologue(u8 **pprog, u32 stack_depth, struct bpf_prog *bpf_prog,
 			  bool tail_call_reachable,
-			  enum bpf_priv_stack_mode priv_stack_mode)
+			  enum bpf_priv_stack_mode priv_stack_mode, u8 *image,
+			  u8 *temp)
 {
 	bool ebpf_from_cbpf = bpf_prog_was_classic(bpf_prog);
 	bool is_exception_cb = bpf_prog->aux->exception_cb;
@@ -554,7 +556,7 @@ static void emit_prologue(u8 **pprog, u32 stack_depth, struct bpf_prog *bpf_prog
 		/* X86_TAIL_CALL_OFFSET is here */
 		EMIT_ENDBR();
 
-	emit_priv_frame_ptr(&prog, bpf_prog, priv_stack_mode);
+	emit_priv_frame_ptr(&prog, bpf_prog, priv_stack_mode, is_subprog, image, temp);
 
 	/* sub rsp, rounded_stack_depth */
 	if (stack_depth)
@@ -696,6 +698,15 @@ static void emit_return(u8 **pprog, u8 *ip)
 	*pprog = prog;
 }
 
+static int num_bytes_of_emit_return(void)
+{
+	if (cpu_feature_enabled(X86_FEATURE_RETHUNK))
+		return 5;
+	if (IS_ENABLED(CONFIG_MITIGATION_SLS))
+		return 2;
+	return 1;
+}
+
 #define BPF_TAIL_CALL_CNT_PTR_STACK_OFF(stack)	(-16 - round_up(stack, 8))
 
 /*
@@ -1527,17 +1538,67 @@ static void emit_root_priv_frame_ptr(u8 **pprog, struct bpf_prog *bpf_prog,
 }
 
 static void emit_priv_frame_ptr(u8 **pprog, struct bpf_prog *bpf_prog,
-				enum bpf_priv_stack_mode priv_stack_mode)
+				enum bpf_priv_stack_mode priv_stack_mode,
+				bool is_subprog, u8 *image, u8 *temp)
 {
 	u32 orig_stack_depth = round_up(bpf_prog->aux->stack_depth, 8);
 	u8 *prog = *pprog;
 
-	if (priv_stack_mode == PRIV_STACK_ROOT_PROG)
-		emit_root_priv_frame_ptr(&prog, bpf_prog, orig_stack_depth);
-	else if (priv_stack_mode == PRIV_STACK_SUB_PROG && orig_stack_depth)
+	if (priv_stack_mode == PRIV_STACK_ROOT_PROG) {
+		int offs;
+		u8 *func;
+
+		if (!bpf_prog->aux->has_prog_call) {
+			emit_root_priv_frame_ptr(&prog, bpf_prog, orig_stack_depth);
+		} else {
+			EMIT1(0x57);		/* push rdi */
+			if (is_subprog) {
+				/* subprog may have up to 5 arguments */
+				EMIT1(0x56);		/* push rsi */
+				EMIT1(0x52);		/* push rdx */
+				EMIT1(0x51);		/* push rcx */
+				EMIT2(0x41, 0x50);	/* push r8 */
+			}
+			emit_mov_imm64(&prog, BPF_REG_1, (long) bpf_prog >> 32,
+				       (u32) (long) bpf_prog);
+			func = (u8 *)__bpf_prog_enter_recur_limited;
+			offs = prog - temp;
+			offs += x86_call_depth_emit_accounting(&prog, func, image + offs);
+			emit_call(&prog, func, image + offs);
+			if (is_subprog) {
+				EMIT2(0x41, 0x58);	/* pop r8 */
+				EMIT1(0x59);		/* pop rcx */
+				EMIT1(0x5a);		/* pop rdx */
+				EMIT1(0x5e);		/* pop rsi */
+			}
+			EMIT1(0x5f);		/* pop rdi */
+
+			EMIT4(0x48, 0x83, 0xf8, 0x0);	/* cmp rax,0x0 */
+			EMIT2(X86_JNE, num_bytes_of_emit_return() + 1);
+
+			/* return if stack recursion has been reached */
+			EMIT1(0xC9);	/* leave */
+			emit_return(&prog, image + (prog - temp));
+
+			/* cnt -= 1 */
+			emit_alu_helper_1(&prog, BPF_ALU64 | BPF_SUB | BPF_K,
+					  BPF_REG_0, 1);
+
+			/* accum_stack_depth = cnt * subtree_stack_depth */
+			emit_alu_helper_3(&prog, BPF_ALU64 | BPF_MUL | BPF_K, BPF_REG_0,
+					  bpf_prog->aux->subtree_stack_depth);
+
+			emit_root_priv_frame_ptr(&prog, bpf_prog, orig_stack_depth);
+
+			/* r9 += accum_stack_depth */
+			emit_alu_helper_2(&prog, BPF_ALU64 | BPF_ADD | BPF_X, X86_REG_R9,
+					  BPF_REG_0);
+		}
+	} else if (priv_stack_mode == PRIV_STACK_SUB_PROG && orig_stack_depth) {
 		/* r9 += orig_stack_depth */
 		emit_alu_helper_1(&prog, BPF_ALU64 | BPF_ADD | BPF_K, X86_REG_R9,
 				  orig_stack_depth);
+	}
 
 	*pprog = prog;
 }
@@ -1578,7 +1639,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
 	detect_reg_usage(insn, insn_cnt, callee_regs_used);
 
 	emit_prologue(&prog, stack_depth, bpf_prog, tail_call_reachable,
-		      priv_stack_mode);
+		      priv_stack_mode, image, temp);
 	/* Exception callback will clobber callee regs for its own use, and
 	 * restore the original callee regs from main prog's stack frame.
 	 */
@@ -2519,6 +2580,23 @@ st:			if (is_imm8(insn->off))
 				if (arena_vm_start)
 					pop_r12(&prog);
 			}
+
+			if (bpf_prog->aux->has_prog_call) {
+				u8 *func, *ip;
+				int offs;
+
+				ip = image + addrs[i - 1];
+				/* save and restore the return value */
+				EMIT1(0x50);	/* push rax */
+				emit_mov_imm64(&prog, BPF_REG_1, (long) bpf_prog >> 32,
+					       (u32) (long) bpf_prog);
+				func = (u8 *)__bpf_prog_exit_recur_limited;
+				offs = prog - temp;
+				offs += x86_call_depth_emit_accounting(&prog, func, ip + offs);
+				emit_call(&prog, func, ip + offs);
+				EMIT1(0x58);	/* pop rax */
+			}
+
 			EMIT1(0xC9);	/* leave */
 			emit_return(&prog, image + addrs[i - 1] + (prog - temp));
 			break;
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 952cb398eb30..605004cba9f7 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1148,6 +1148,8 @@ u64 notrace __bpf_prog_enter_sleepable_recur(struct bpf_prog *prog,
 					     struct bpf_tramp_run_ctx *run_ctx);
 void notrace __bpf_prog_exit_sleepable_recur(struct bpf_prog *prog, u64 start,
 					     struct bpf_tramp_run_ctx *run_ctx);
+int notrace __bpf_prog_enter_recur_limited(struct bpf_prog *prog);
+void notrace __bpf_prog_exit_recur_limited(struct bpf_prog *prog);
 void notrace __bpf_tramp_enter(struct bpf_tramp_image *tr);
 void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);
 typedef u64 (*bpf_trampoline_enter_t)(struct bpf_prog *prog,
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index f8302a5ca400..d9e7260e4b39 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -960,6 +960,22 @@ void notrace __bpf_prog_exit_sleepable_recur(struct bpf_prog *prog, u64 start,
 	rcu_read_unlock_trace();
 }
 
+int notrace __bpf_prog_enter_recur_limited(struct bpf_prog *prog)
+{
+	int cnt = this_cpu_inc_return(*(prog->active));
+
+	if (cnt > BPF_MAX_PRIV_STACK_NEST_LEVEL) {
+		bpf_prog_inc_misses_counter(prog);
+		return 0;
+	}
+	return cnt;
+}
+
+void notrace __bpf_prog_exit_recur_limited(struct bpf_prog *prog)
+{
+	this_cpu_dec(*(prog->active));
+}
+
 static u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog,
 					      struct bpf_tramp_run_ctx *run_ctx)
 {

From patchwork Thu Oct 10 17:56:44 2024
X-Patchwork-Submitter: Yonghong Song
X-Patchwork-Id: 13830617
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team@fb.com,
    Martin KaFai Lau, Tejun Heo
Subject: [PATCH bpf-next v4 10/10] selftests/bpf: Add tests for bpf_prog_call()
Date: Thu, 10 Oct 2024 10:56:44 -0700
Message-ID: <20241010175644.1900546-1-yonghong.song@linux.dev>
In-Reply-To: <20241010175552.1895980-1-yonghong.song@linux.dev>
References: <20241010175552.1895980-1-yonghong.song@linux.dev>

Add subtests for nested bpf_prog_call(): recursion entered from the main prog
(with and without a subprog) and recursion entered from a callback func, plus
a negative test combining bpf_prog_call() with a tail_call.
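
With the maximum recursion level of 4, each recursion test follows the same
pattern: nesting level k (0-based) stores a[10] = vali (== k), increments
vali, then re-enters itself via bpf_prog_call(); the attempt to enter level 4
is rejected, and on unwind each level adds its saved a[10] to valj, i.e.
3 + 2 + 1 + 0. The tests below therefore expect vali == 4 and valj == 6.
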
Signed-off-by: Yonghong Song
---
 .../selftests/bpf/prog_tests/prog_call.c      | 78 ++++++++++++++++
 tools/testing/selftests/bpf/progs/prog_call.c | 92 +++++++++++++++++++
 2 files changed, 170 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/prog_call.c
 create mode 100644 tools/testing/selftests/bpf/progs/prog_call.c

diff --git a/tools/testing/selftests/bpf/prog_tests/prog_call.c b/tools/testing/selftests/bpf/prog_tests/prog_call.c
new file mode 100644
index 000000000000..573c67c9af12
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/prog_call.c
@@ -0,0 +1,78 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <test_progs.h>
+#include <network_helpers.h>
+#include "prog_call.skel.h"
+
+static void test_nest_prog_call(int prog_index)
+{
+	LIBBPF_OPTS(bpf_test_run_opts, topts,
+		    .data_in = &pkt_v4,
+		    .data_size_in = sizeof(pkt_v4),
+	);
+	int err, idx = 0, prog_fd, map_fd;
+	struct prog_call *skel;
+	struct bpf_program *prog;
+
+	skel = prog_call__open();
+	if (!ASSERT_OK_PTR(skel, "prog_call__open"))
+		return;
+
+	switch (prog_index) {
+	case 0:
+		prog = skel->progs.entry_no_subprog;
+		break;
+	case 1:
+		prog = skel->progs.entry_subprog;
+		break;
+	case 2:
+		prog = skel->progs.entry_callback;
+		break;
+	}
+
+	bpf_program__set_autoload(prog, true);
+
+	err = prog_call__load(skel);
+	if (!ASSERT_OK(err, "prog_call__load"))
+		return;
+
+	map_fd = bpf_map__fd(skel->maps.jmp_table);
+	prog_fd = bpf_program__fd(prog);
+	/* maximum recursion level 4 */
+	err = bpf_map_update_elem(map_fd, &idx, &prog_fd, 0);
+	if (!ASSERT_OK(err, "bpf_map_update_elem"))
+		goto out;
+
+	err = bpf_prog_test_run_opts(prog_fd, &topts);
+	ASSERT_OK(err, "test_run");
+	ASSERT_EQ(skel->bss->vali, 4, "i");
+	ASSERT_EQ(skel->bss->valj, 6, "j");
+out:
+	prog_call__destroy(skel);
+}
+
+static void test_prog_call_with_tailcall(void)
+{
+	struct prog_call *skel;
+	int err;
+
+	skel = prog_call__open();
+	if (!ASSERT_OK_PTR(skel, "prog_call__open"))
+		return;
+
+	bpf_program__set_autoload(skel->progs.entry_tail_call, true);
+	err = prog_call__load(skel);
+	if (!ASSERT_ERR(err, "prog_call__load"))
+		prog_call__destroy(skel);
+}
+
+void test_prog_call(void)
+{
+	if (test__start_subtest("single_main_prog"))
+		test_nest_prog_call(0);
+	if (test__start_subtest("sub_prog"))
+		test_nest_prog_call(1);
+	if (test__start_subtest("callback_fn"))
+		test_nest_prog_call(2);
+	if (test__start_subtest("with_tailcall"))
+		test_prog_call_with_tailcall();
+}
diff --git a/tools/testing/selftests/bpf/progs/prog_call.c b/tools/testing/selftests/bpf/progs/prog_call.c
new file mode 100644
index 000000000000..c494cfcf653b
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/prog_call.c
@@ -0,0 +1,92 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+
+struct {
+	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
+	__uint(max_entries, 3);
+	__uint(key_size, sizeof(__u32));
+	__uint(value_size, sizeof(__u32));
+} jmp_table SEC(".maps");
+
+struct callback_ctx {
+	struct __sk_buff *skb;
+};
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(max_entries, 1);
+	__type(key, __u32);
+	__type(value, __u64);
+} arraymap SEC(".maps");
+
+int vali, valj;
+
+int glb;
+__noinline static void subprog2(volatile int *a)
+{
+	glb = a[20] + a[10];
+}
+
+__noinline static void subprog1(struct __sk_buff *skb)
+{
+	volatile int a[100] = {};
+
+	a[10] = vali;
+	subprog2(a);
+	vali++;
+	bpf_prog_call(skb, (struct bpf_map *)&jmp_table, 0);
+	valj += a[10];
+}
+
+SEC("?tc")
+int entry_no_subprog(struct __sk_buff *skb)
+{
+	volatile int a[100] = {};
+
+	a[10] = vali;
+	subprog2(a);
+	vali++;
+	bpf_prog_call(skb, (struct bpf_map *)&jmp_table, 0);
+	valj += a[10];
+	return 0;
+}
+
+SEC("?tc")
+int entry_subprog(struct __sk_buff *skb)
+{
+	subprog1(skb);
+	return 0;
+}
+
+static __u64
+check_array_elem(struct bpf_map *map, __u32 *key, __u64 *val,
+		 struct callback_ctx *data)
+{
+	subprog1(data->skb);
+	return 0;
+}
+
+SEC("?tc")
+int entry_callback(struct __sk_buff *skb)
+{
+	struct callback_ctx data;
+
+	data.skb = skb;
+	bpf_for_each_map_elem(&arraymap, check_array_elem, &data, 0);
+	return 0;
+}
+
+SEC("?tc")
+int entry_tail_call(struct __sk_buff *skb)
+{
+	struct callback_ctx data;
+
+	bpf_tail_call_static(skb, &jmp_table, 0);
+
+	data.skb = skb;
+	bpf_for_each_map_elem(&arraymap, check_array_elem, &data, 0);
+	return 0;
+}
+
+char __license[] SEC("license") = "GPL";
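
The subtests can be exercised with the usual selftests runner (a sketch,
assuming a kernel and selftests built from this series):

	$ cd tools/testing/selftests/bpf
	$ ./test_progs -t prog_call

which runs the four subtests registered by test_prog_call() above.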