From patchwork Thu Oct 17 22:31:59 2024
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team@fb.com, Martin KaFai Lau, Tejun Heo
Subject: [PATCH bpf-next v5 4/9] bpf: Mark each subprog with proper private stack modes
Date: Thu, 17 Oct 2024 15:31:59 -0700
Message-ID: <20241017223159.3176904-1-yonghong.song@linux.dev>
In-Reply-To: <20241017223138.3175885-1-yonghong.song@linux.dev>
References: <20241017223138.3175885-1-yonghong.song@linux.dev>

Three private stack modes are used to direct jit action:
  NO_PRIV_STACK:        do not use private stack
  PRIV_STACK_SUB_PROG:  adjust frame pointer address (similar to normal stack)
  PRIV_STACK_ROOT_PROG: set the frame pointer

Note that for subtree root prog (main prog or callback fn), even if the
bpf_prog stack size is 0, PRIV_STACK_ROOT_PROG mode is still used.
This is for bpf exception handling. More details can be found in
subsequent jit support and selftest patches.

Signed-off-by: Yonghong Song
---
 include/linux/bpf.h   |  9 +++++++++
 kernel/bpf/core.c     | 19 +++++++++++++++++++
 kernel/bpf/verifier.c | 29 +++++++++++++++++++++++++++++
 3 files changed, 57 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index a789cd2f5d6a..2c07a2e311f4 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1457,6 +1457,12 @@ struct btf_mod_pair {
 
 struct bpf_kfunc_desc_tab;
 
+enum bpf_priv_stack_mode {
+	NO_PRIV_STACK,
+	PRIV_STACK_SUB_PROG,
+	PRIV_STACK_ROOT_PROG,
+};
+
 struct bpf_prog_aux {
 	atomic64_t refcnt;
 	u32 used_map_cnt;
@@ -1473,6 +1479,9 @@ struct bpf_prog_aux {
 	u32 ctx_arg_info_size;
 	u32 max_rdonly_access;
 	u32 max_rdwr_access;
+	enum bpf_priv_stack_mode priv_stack_mode;
+	u16 subtree_stack_depth; /* Subtree stack depth if PRIV_STACK_ROOT_PROG, 0 otherwise */
+	void __percpu *priv_stack_ptr;
 	struct btf *attach_btf;
 	const struct bpf_ctx_arg_aux *ctx_arg_info;
 	struct mutex dst_mutex; /* protects dst_* pointers below, *after* prog becomes visible */
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 14d9288441f2..aee0055def4f 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1240,6 +1240,7 @@ void __weak bpf_jit_free(struct bpf_prog *fp)
 		struct bpf_binary_header *hdr = bpf_jit_binary_hdr(fp);
 
 		bpf_jit_binary_free(hdr);
+		free_percpu(fp->aux->priv_stack_ptr);
 		WARN_ON_ONCE(!bpf_prog_kallsyms_verify_off(fp));
 	}
 
@@ -2421,6 +2422,24 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
 		if (*err)
 			return fp;
 
+		if (fp->aux->priv_stack_eligible) {
+			if (!fp->aux->stack_depth) {
+				fp->aux->priv_stack_mode = NO_PRIV_STACK;
+			} else {
+				void __percpu *priv_stack_ptr;
+
+				fp->aux->priv_stack_mode = PRIV_STACK_ROOT_PROG;
+				priv_stack_ptr =
+					__alloc_percpu_gfp(fp->aux->stack_depth, 8, GFP_KERNEL);
+				if (!priv_stack_ptr) {
+					*err = -ENOMEM;
+					return fp;
+				}
+				fp->aux->subtree_stack_depth = fp->aux->stack_depth;
+				fp->aux->priv_stack_ptr = priv_stack_ptr;
+			}
+		}
+
 		fp = bpf_int_jit_compile(fp);
 		bpf_prog_jit_attempt_done(fp);
 		if (!fp->jited && jit_needed) {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a14857015ad4..274b0b92177d 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -20010,6 +20010,8 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 {
 	struct bpf_prog *prog = env->prog, **func, *tmp;
 	int i, j, subprog_start, subprog_end = 0, len, subprog;
+	int subtree_top_idx, subtree_stack_depth;
+	void __percpu *priv_stack_ptr;
 	struct bpf_map *map_ptr;
 	struct bpf_insn *insn;
 	void *old_bpf_func;
@@ -20088,6 +20090,33 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 		func[i]->is_func = 1;
 		func[i]->sleepable = prog->sleepable;
 		func[i]->aux->func_idx = i;
+
+		subtree_top_idx = env->subprog_info[i].subtree_top_idx;
+		if (env->subprog_info[subtree_top_idx].priv_stack_eligible) {
+			if (subtree_top_idx == i)
+				func[i]->aux->subtree_stack_depth =
+					env->subprog_info[i].subtree_stack_depth;
+
+			subtree_stack_depth = func[i]->aux->subtree_stack_depth;
+			if (subtree_top_idx != i) {
+				if (env->subprog_info[subtree_top_idx].subtree_stack_depth)
+					func[i]->aux->priv_stack_mode = PRIV_STACK_SUB_PROG;
+				else
+					func[i]->aux->priv_stack_mode = NO_PRIV_STACK;
+			} else if (!subtree_stack_depth) {
+				func[i]->aux->priv_stack_mode = PRIV_STACK_ROOT_PROG;
+			} else {
+				func[i]->aux->priv_stack_mode = PRIV_STACK_ROOT_PROG;
+				priv_stack_ptr =
+					__alloc_percpu_gfp(subtree_stack_depth, 8, GFP_KERNEL);
+				if (!priv_stack_ptr) {
+					err = -ENOMEM;
+					goto out_free;
+				}
+				func[i]->aux->priv_stack_ptr = priv_stack_ptr;
+			}
+		}
+
 		/* Below members will be freed only at prog->aux */
 		func[i]->aux->btf = prog->aux->btf;
 		func[i]->aux->func_info = prog->aux->func_info;
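
Illustration only, not part of the patch: the mode selection that the
jit_subprogs() hunk performs can be modeled in plain userspace C. This is
a minimal sketch under stated assumptions. struct subprog_model and
model_priv_stack_mode() are hypothetical names that do not exist in the
kernel; only the enum and the three fields mirror names used in the diff.
The sketch also assumes that a subprog whose subtree root is not eligible
keeps the default mode, taken here to be NO_PRIV_STACK (enum value 0),
since the patch leaves priv_stack_mode untouched in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the enum this patch adds to include/linux/bpf.h. */
enum bpf_priv_stack_mode {
	NO_PRIV_STACK,
	PRIV_STACK_SUB_PROG,
	PRIV_STACK_ROOT_PROG,
};

/* Hypothetical stand-in for the per-subprog verifier info used in the diff. */
struct subprog_model {
	size_t subtree_top_idx;             /* index of this subprog's subtree root */
	unsigned short subtree_stack_depth; /* meaningful for the subtree root */
	bool priv_stack_eligible;           /* meaningful for the subtree root */
};

/*
 * Model of the decision made for subprog i in jit_subprogs().
 * *needs_alloc is set to true only where the patch allocates a
 * per-CPU private stack (root of a subtree with nonzero depth).
 */
static enum bpf_priv_stack_mode
model_priv_stack_mode(const struct subprog_model *info, size_t n, size_t i,
		      bool *needs_alloc)
{
	size_t top;

	assert(i < n);
	top = info[i].subtree_top_idx;
	assert(top < n);
	*needs_alloc = false;

	/* Assumption: the mode is left at its zero default in this case. */
	if (!info[top].priv_stack_eligible)
		return NO_PRIV_STACK;

	/* Non-root subprog: follows whether the root's subtree uses any stack. */
	if (top != i)
		return info[top].subtree_stack_depth ? PRIV_STACK_SUB_PROG
						     : NO_PRIV_STACK;

	/* Subtree root: always ROOT mode, even when the depth is zero. */
	*needs_alloc = info[i].subtree_stack_depth != 0;
	return PRIV_STACK_ROOT_PROG;
}
```

Calling model_priv_stack_mode() for a subtree root with depth 0 returns
PRIV_STACK_ROOT_PROG without requesting an allocation, which matches the
note in the commit message about keeping root mode for exception handling.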