
[RFC,v1,01/14] bpf: Mark subprogs as throw reachable before do_check pass

Message ID 20240201042109.1150490-2-memxor@gmail.com (mailing list archive)
State RFC
Delegated to: BPF
Series Exceptions - Resource Cleanup

Checks

Context Check Description
bpf/vmtest-bpf-next-PR success PR summary
netdev/tree_selection success Guessing tree name failed - patch did not apply, async
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-7 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-9 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-4 fail Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for aarch64-gcc / test
bpf/vmtest-bpf-next-VM_Test-26 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-next-VM_Test-34 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-next-VM_Test-8 fail Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for s390x-gcc / test
bpf/vmtest-bpf-next-VM_Test-11 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-12 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-13 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-14 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-17 pending Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-25 pending Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-30 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18 and -O2 optimization
bpf/vmtest-bpf-next-VM_Test-32 pending Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-33 pending Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-36 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-next-VM_Test-15 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-19 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17 and -O2 optimization
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-27 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-31 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-35 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18

Commit Message

Kumar Kartikeya Dwivedi Feb. 1, 2024, 4:20 a.m. UTC
The motivation of this patch is to figure out which subprogs participate
in exception propagation; in other words, which subprogs' execution can
lead to an exception being thrown, either directly or indirectly (by way
of calling other subprogs).

With the current exceptions support, the runtime performs stack
unwinding when bpf_throw is called. For now, any resources acquired by
the program cannot be released during unwinding; therefore, bpf_throw
calls made with non-zero acquired references must be rejected during
verification.

However, there currently exists a loophole in this restriction due to
the way the verification procedure is structured. The verifier will
first walk over the main subprog's instructions, but not descend into
subprog calls to ones with global linkage. These global subprogs will
then be independently verified instead. Therefore, in a situation where
a global subprog ends up throwing an exception (either directly by
calling bpf_throw, or indirectly by way of calling another subprog that
does so), the verifier will fail to notice this fact and may permit
throwing BPF exceptions with non-zero acquired references.

Therefore, to fix this, we add a summarization pass before the do_check
stage which walks all call chains of the program and marks all of the
subprogs that are reachable from a bpf_throw call which unwinds the
program stack.

We only do so if we actually see a bpf_throw call in the program,
since we do not want to walk all instructions unless we need to. Once we
analyze all possible call chains of the program, we will be able to mark
the subprogs as 'is_throw_reachable' in their subprog_info.

After performing this step, we need to make another change to how
subprog call verification occurs. In the case of a global subprog, we
will need to explore an alternate program path where processing of the
global subprog's call instruction immediately throws an exception. We
will thus simulate a normal path without any exceptions, and one where
the exception is thrown and the program proceeds no further. In this
way, the verifier will be able to detect whether any acquired references
or locks exist in the verifier state and thus reject the program if
needed.

Fixes: f18b03fabaa9 ("bpf: Implement BPF exceptions")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/linux/bpf_verifier.h |  2 +
 kernel/bpf/verifier.c        | 86 ++++++++++++++++++++++++++++++++++++
 2 files changed, 88 insertions(+)
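The marking pass described in the commit message can be illustrated outside the kernel as a plain C depth-first walk over a toy call graph. This is an editorial sketch, not kernel code: the names (`struct subprog`, `mark_throw_reachable`, `calls_throw`) are illustrative, and it uses recursion where the kernel pass uses an explicit `ret_insn[]`/`ret_prog[]` stack bounded by MAX_CALL_FRAMES.

```c
#include <stdbool.h>

/* Toy model of the summarization pass: subprogs form a call graph, and a
 * subprog is throw-reachable if it calls bpf_throw directly or can reach
 * one that does.
 */
#define MAX_SUBPROGS 8

struct subprog {
	bool calls_throw;          /* contains a bpf_throw call */
	bool is_throw_reachable;   /* result of the marking pass */
	int callees[MAX_SUBPROGS]; /* indices of called subprogs */
	int ncallees;
};

/* Depth-first walk: returns true if subprog @idx can reach a throw, and
 * marks every subprog along such a path, mirroring how the kernel pass
 * marks all ret_prog[] entries when it sees is_bpf_throw_kfunc().
 */
static bool mark_throw_reachable(struct subprog *sp, int idx)
{
	bool reach = sp[idx].calls_throw;

	for (int i = 0; i < sp[idx].ncallees; i++)
		reach |= mark_throw_reachable(sp, sp[idx].callees[i]);
	if (reach)
		sp[idx].is_throw_reachable = true;
	return reach;
}
```

With a chain main -> a -> b where only b throws, all three end up marked, while a subprog off the chain stays unmarked; this is exactly the summary do_check later consults when it sees a call to a global subprog.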

Comments

David Vernet Feb. 12, 2024, 7:35 p.m. UTC | #1
On Thu, Feb 01, 2024 at 04:20:56AM +0000, Kumar Kartikeya Dwivedi wrote:
> The motivation of this patch is to figure out which subprogs participate
> in exception propagation. In other words, whichever subprog's execution
> can lead to an exception being thrown either directly or indirectly (by
> way of calling other subprogs).
> 
> With the current exceptions support, the runtime performs stack
> unwinding when bpf_throw is called. For now, any resources acquired by
> the program cannot be released, therefore bpf_throw calls made with
> non-zero acquired references must be rejected during verification.
> 
> However, there currently exists a loophole in this restriction due to
> the way the verification procedure is structured. The verifier will
> first walk over the main subprog's instructions, but not descend into
> subprog calls to ones with global linkage. These global subprogs will
> then be independently verified instead. Therefore, in a situation where
> a global subprog ends up throwing an exception (either directly by
> calling bpf_throw, or indirectly by way of calling another subprog that
> does so), the verifier will fail to notice this fact and may permit
> throwing BPF exceptions with non-zero acquired references.
> 
> Therefore, to fix this, we add a summarization pass before the do_check
> stage which walks all call chains of the program and marks all of the
> subprogs that are reachable from a bpf_throw call which unwinds the
> program stack.
> 
> We only do so if we actually see a bpf_throw call in the program though,
> since we do not want to walk all instructions unless we need to.  One we

s/One we/Once we/

> analyze all possible call chains of the program, we will be able to mark
> them as 'is_throw_reachable' in their subprog_info.
> 
> After performing this step, we need to make another change as to how
> subprog call verification occurs. In case of global subprog, we will
> need to explore an alternate program path where the call instruction
> processing of a global subprog's call will immediately throw an
> exception. We will thus simulate a normal path without any exceptions,
> and one where the exception is thrown and the program proceeds no
> further. In this way, the verifier will be able to detect the whether
> any acquired references or locks exist in the verifier state and thus
> reject the program if needed.
> 
> Fixes: f18b03fabaa9 ("bpf: Implement BPF exceptions")
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>

Just had a few nits and one question. Looks reasonable to me overall.

> ---
>  include/linux/bpf_verifier.h |  2 +
>  kernel/bpf/verifier.c        | 86 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 88 insertions(+)
> 
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 0dcde339dc7e..1d666b6c21e6 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -626,6 +626,7 @@ struct bpf_subprog_info {
>  	bool is_async_cb: 1;
>  	bool is_exception_cb: 1;
>  	bool args_cached: 1;
> +	bool is_throw_reachable: 1;
>  
>  	u8 arg_cnt;
>  	struct bpf_subprog_arg_info args[MAX_BPF_FUNC_REG_ARGS];
> @@ -691,6 +692,7 @@ struct bpf_verifier_env {
>  	bool bypass_spec_v4;
>  	bool seen_direct_write;
>  	bool seen_exception;
> +	bool seen_throw_insn;
>  	struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */
>  	const struct bpf_line_info *prev_linfo;
>  	struct bpf_verifier_log log;
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index cd4d780e5400..bba53c4e3a0c 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -2941,6 +2941,8 @@ static int check_subprogs(struct bpf_verifier_env *env)
>  		    insn[i].src_reg == 0 &&
>  		    insn[i].imm == BPF_FUNC_tail_call)
>  			subprog[cur_subprog].has_tail_call = true;
> +		if (!env->seen_throw_insn && is_bpf_throw_kfunc(&insn[i]))
> +			env->seen_throw_insn = true;
>  		if (BPF_CLASS(code) == BPF_LD &&
>  		    (BPF_MODE(code) == BPF_ABS || BPF_MODE(code) == BPF_IND))
>  			subprog[cur_subprog].has_ld_abs = true;
> @@ -5866,6 +5868,9 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx)
>  
>  			if (!is_bpf_throw_kfunc(insn + i))
>  				continue;
> +			/* When this is allowed, don't forget to update logic for sync and
> +			 * async callbacks in mark_exception_reachable_subprogs.
> +			 */
>  			if (subprog[idx].is_cb)
>  				err = true;
>  			for (int c = 0; c < frame && !err; c++) {
> @@ -16205,6 +16210,83 @@ static int check_btf_info(struct bpf_verifier_env *env,
>  	return 0;
>  }
>  
> +/* We walk the call graph of the program in this function, and mark everything in
> + * the call chain as 'is_throw_reachable'. This allows us to know which subprog
> + * calls may propagate an exception and generate exception frame descriptors for
> + * those call instructions. We already do that for bpf_throw calls made directly,
> + * but we need to mark the subprogs as we won't be able to see the call chains
> + * during symbolic execution in do_check_common due to global subprogs.
> + *
> + * Note that unlike check_max_stack_depth, we don't explore the async callbacks
> + * apart from main subprogs, as we don't support throwing from them for now, but

Comment ending prematurely

> + */
> +static int mark_exception_reachable_subprogs(struct bpf_verifier_env *env)
> +{
> +	struct bpf_subprog_info *subprog = env->subprog_info;
> +	struct bpf_insn *insn = env->prog->insnsi;
> +	int idx = 0, frame = 0, i, subprog_end;
> +	int ret_insn[MAX_CALL_FRAMES];
> +	int ret_prog[MAX_CALL_FRAMES];
> +
> +	/* No need if we never saw any bpf_throw() call in the program. */
> +	if (!env->seen_throw_insn)
> +		return 0;
> +
> +	i = subprog[idx].start;
> +restart:
> +	subprog_end = subprog[idx + 1].start;
> +	for (; i < subprog_end; i++) {
> +		int next_insn, sidx;
> +
> +		if (bpf_pseudo_kfunc_call(insn + i) && !insn[i].off) {

When should a kfunc call ever have a nonzero offset? We use the
immediate for the BTF ID, don't we?

> +			if (!is_bpf_throw_kfunc(insn + i))
> +				continue;
> +			subprog[idx].is_throw_reachable = true;
> +			for (int j = 0; j < frame; j++)
> +				subprog[ret_prog[j]].is_throw_reachable = true;
> +		}
> +
> +		if (!bpf_pseudo_call(insn + i) && !bpf_pseudo_func(insn + i))
> +			continue;
> +		/* remember insn and function to return to */
> +		ret_insn[frame] = i + 1;
> +		ret_prog[frame] = idx;
> +
> +		/* find the callee */
> +		next_insn = i + insn[i].imm + 1;
> +		sidx = find_subprog(env, next_insn);
> +		if (sidx < 0) {
> +			WARN_ONCE(1, "verifier bug. No program starts at insn %d\n", next_insn);
> +			return -EFAULT;
> +		}
> +		/* We cannot distinguish between sync or async cb, so we need to follow
> +		 * both.  Async callbacks don't really propagate exceptions but calling
> +		 * bpf_throw from them is not allowed anyway, so there is no harm in
> +		 * exploring them.
> +		 * TODO: To address this properly, we will have to move is_cb,
> +		 * is_async_cb markings to the stage before do_check.
> +		 */
> +		i = next_insn;
> +		idx = sidx;
> +
> +		frame++;
> +		if (frame >= MAX_CALL_FRAMES) {
> +			verbose(env, "the call stack of %d frames is too deep !\n", frame);
> +			return -E2BIG;
> +		}
> +		goto restart;
> +	}
> +	/* end of for() loop means the last insn of the 'subprog'
> +	 * was reached. Doesn't matter whether it was JA or EXIT
> +	 */
> +	if (frame == 0)
> +		return 0;
> +	frame--;
> +	i = ret_insn[frame];
> +	idx = ret_prog[frame];
> +	goto restart;
> +}

If you squint your eyes there's a non-trivial amount of duplicated
intent / logic here compared to check_max_stack_depth_subprog(). Do you
think it would be possible to combine them somehow?

> +
>  /* check %cur's range satisfies %old's */
>  static bool range_within(struct bpf_reg_state *old,
>  			 struct bpf_reg_state *cur)
> @@ -20939,6 +21021,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
>  	if (ret < 0)
>  		goto skip_full_check;
>  
> +	ret = mark_exception_reachable_subprogs(env);
> +	if (ret < 0)
> +		goto skip_full_check;
> +
>  	ret = do_check_main(env);
>  	ret = ret ?: do_check_subprogs(env);
>  
> -- 
> 2.40.1
>
Kumar Kartikeya Dwivedi Feb. 12, 2024, 10:28 p.m. UTC | #2
On Mon, 12 Feb 2024 at 20:35, David Vernet <void@manifault.com> wrote:
>
> On Thu, Feb 01, 2024 at 04:20:56AM +0000, Kumar Kartikeya Dwivedi wrote:
> > The motivation of this patch is to figure out which subprogs participate
> > in exception propagation. In other words, whichever subprog's execution
> > can lead to an exception being thrown either directly or indirectly (by
> > way of calling other subprogs).
> >
> > With the current exceptions support, the runtime performs stack
> > unwinding when bpf_throw is called. For now, any resources acquired by
> > the program cannot be released, therefore bpf_throw calls made with
> > non-zero acquired references must be rejected during verification.
> >
> > However, there currently exists a loophole in this restriction due to
> > the way the verification procedure is structured. The verifier will
> > first walk over the main subprog's instructions, but not descend into
> > subprog calls to ones with global linkage. These global subprogs will
> > then be independently verified instead. Therefore, in a situation where
> > a global subprog ends up throwing an exception (either directly by
> > calling bpf_throw, or indirectly by way of calling another subprog that
> > does so), the verifier will fail to notice this fact and may permit
> > throwing BPF exceptions with non-zero acquired references.
> >
> > Therefore, to fix this, we add a summarization pass before the do_check
> > stage which walks all call chains of the program and marks all of the
> > subprogs that are reachable from a bpf_throw call which unwinds the
> > program stack.
> >
> > We only do so if we actually see a bpf_throw call in the program though,
> > since we do not want to walk all instructions unless we need to.  One we
>
> s/Once/once
>

Ack, will fix.

> > analyze all possible call chains of the program, we will be able to mark
> > them as 'is_throw_reachable' in their subprog_info.
> >
> > After performing this step, we need to make another change as to how
> > subprog call verification occurs. In case of global subprog, we will
> > need to explore an alternate program path where the call instruction
> > processing of a global subprog's call will immediately throw an
> > exception. We will thus simulate a normal path without any exceptions,
> > and one where the exception is thrown and the program proceeds no
> > further. In this way, the verifier will be able to detect the whether
> > any acquired references or locks exist in the verifier state and thus
> > reject the program if needed.
> >
> > Fixes: f18b03fabaa9 ("bpf: Implement BPF exceptions")
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
>
> Just had a few nits and one question. Looks reasonable to me overall.
>
> > ---
> >  include/linux/bpf_verifier.h |  2 +
> >  kernel/bpf/verifier.c        | 86 ++++++++++++++++++++++++++++++++++++
> >  2 files changed, 88 insertions(+)
> >
> > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > index 0dcde339dc7e..1d666b6c21e6 100644
> > --- a/include/linux/bpf_verifier.h
> > +++ b/include/linux/bpf_verifier.h
> > @@ -626,6 +626,7 @@ struct bpf_subprog_info {
> >       bool is_async_cb: 1;
> >       bool is_exception_cb: 1;
> >       bool args_cached: 1;
> > +     bool is_throw_reachable: 1;
> >
> >       u8 arg_cnt;
> >       struct bpf_subprog_arg_info args[MAX_BPF_FUNC_REG_ARGS];
> > @@ -691,6 +692,7 @@ struct bpf_verifier_env {
> >       bool bypass_spec_v4;
> >       bool seen_direct_write;
> >       bool seen_exception;
> > +     bool seen_throw_insn;
> >       struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */
> >       const struct bpf_line_info *prev_linfo;
> >       struct bpf_verifier_log log;
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index cd4d780e5400..bba53c4e3a0c 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -2941,6 +2941,8 @@ static int check_subprogs(struct bpf_verifier_env *env)
> >                   insn[i].src_reg == 0 &&
> >                   insn[i].imm == BPF_FUNC_tail_call)
> >                       subprog[cur_subprog].has_tail_call = true;
> > +             if (!env->seen_throw_insn && is_bpf_throw_kfunc(&insn[i]))
> > +                     env->seen_throw_insn = true;
> >               if (BPF_CLASS(code) == BPF_LD &&
> >                   (BPF_MODE(code) == BPF_ABS || BPF_MODE(code) == BPF_IND))
> >                       subprog[cur_subprog].has_ld_abs = true;
> > @@ -5866,6 +5868,9 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx)
> >
> >                       if (!is_bpf_throw_kfunc(insn + i))
> >                               continue;
> > +                     /* When this is allowed, don't forget to update logic for sync and
> > +                      * async callbacks in mark_exception_reachable_subprogs.
> > +                      */
> >                       if (subprog[idx].is_cb)
> >                               err = true;
> >                       for (int c = 0; c < frame && !err; c++) {
> > @@ -16205,6 +16210,83 @@ static int check_btf_info(struct bpf_verifier_env *env,
> >       return 0;
> >  }
> >
> > +/* We walk the call graph of the program in this function, and mark everything in
> > + * the call chain as 'is_throw_reachable'. This allows us to know which subprog
> > + * calls may propagate an exception and generate exception frame descriptors for
> > + * those call instructions. We already do that for bpf_throw calls made directly,
> > + * but we need to mark the subprogs as we won't be able to see the call chains
> > + * during symbolic execution in do_check_common due to global subprogs.
> > + *
> > + * Note that unlike check_max_stack_depth, we don't explore the async callbacks
> > + * apart from main subprogs, as we don't support throwing from them for now, but
>
> Comment ending prematurely
>

Ack.

> > + */
> > +static int mark_exception_reachable_subprogs(struct bpf_verifier_env *env)
> > +{
> > +     struct bpf_subprog_info *subprog = env->subprog_info;
> > +     struct bpf_insn *insn = env->prog->insnsi;
> > +     int idx = 0, frame = 0, i, subprog_end;
> > +     int ret_insn[MAX_CALL_FRAMES];
> > +     int ret_prog[MAX_CALL_FRAMES];
> > +
> > +     /* No need if we never saw any bpf_throw() call in the program. */
> > +     if (!env->seen_throw_insn)
> > +             return 0;
> > +
> > +     i = subprog[idx].start;
> > +restart:
> > +     subprog_end = subprog[idx + 1].start;
> > +     for (; i < subprog_end; i++) {
> > +             int next_insn, sidx;
> > +
> > +             if (bpf_pseudo_kfunc_call(insn + i) && !insn[i].off) {
>
> When should a kfunc call ever have a nonzero offset? We use the
> immediate for the BTF ID, don't we?
>

So in kfuncs, insn.off is used to indicate a vmlinux vs module kfunc.
If off is non-zero, it points to the index in the bpf_attr::fd_array
of the module BTF from which this kfunc comes.
But I think it might be easier to just remove this extra test and do
is_bpf_throw_kfunc directly.

> > +                     if (!is_bpf_throw_kfunc(insn + i))
> > +                             continue;
> > +                     subprog[idx].is_throw_reachable = true;
> > +                     for (int j = 0; j < frame; j++)
> > +                             subprog[ret_prog[j]].is_throw_reachable = true;
> > +             }
> > +
> > +             if (!bpf_pseudo_call(insn + i) && !bpf_pseudo_func(insn + i))
> > +                     continue;
> > +             /* remember insn and function to return to */
> > +             ret_insn[frame] = i + 1;
> > +             ret_prog[frame] = idx;
> > +
> > +             /* find the callee */
> > +             next_insn = i + insn[i].imm + 1;
> > +             sidx = find_subprog(env, next_insn);
> > +             if (sidx < 0) {
> > +                     WARN_ONCE(1, "verifier bug. No program starts at insn %d\n", next_insn);
> > +                     return -EFAULT;
> > +             }
> > +             /* We cannot distinguish between sync or async cb, so we need to follow
> > +              * both.  Async callbacks don't really propagate exceptions but calling
> > +              * bpf_throw from them is not allowed anyway, so there is no harm in
> > +              * exploring them.
> > +              * TODO: To address this properly, we will have to move is_cb,
> > +              * is_async_cb markings to the stage before do_check.
> > +              */
> > +             i = next_insn;
> > +             idx = sidx;
> > +
> > +             frame++;
> > +             if (frame >= MAX_CALL_FRAMES) {
> > +                     verbose(env, "the call stack of %d frames is too deep !\n", frame);
> > +                     return -E2BIG;
> > +             }
> > +             goto restart;
> > +     }
> > +     /* end of for() loop means the last insn of the 'subprog'
> > +      * was reached. Doesn't matter whether it was JA or EXIT
> > +      */
> > +     if (frame == 0)
> > +             return 0;
> > +     frame--;
> > +     i = ret_insn[frame];
> > +     idx = ret_prog[frame];
> > +     goto restart;
> > +}
>
> If you squint youre eyes there's a non-trivial amount of duplicated
> intent / logic here compared to check_max_stack_depth_subprog(). Do you
> think it would be possible to combine them somehow?
>

I agree, this function is mostly a copy-paste of that function with
some modifications.
I will take a stab at unifying both in the next version, though I
think they will end up calling some common logic from different points
as we need to do this marking before verification, and stack depth
checking after verification. Also, stack depth checks have some other
exception_cb and async_cb related checks as well, so this would
probably be a good opportunity to refactor that code as well.

Basically, just have a common way to iterate over all instructions and
call chains starting from some instruction in a subprog.
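
Such a shared walker might take a per-instruction callback, along these lines. This is a userspace sketch with hypothetical names (`walk_call_chains`, `visit_fn`, a toy `struct insn` with a direct callee index instead of the kernel's imm-relative target); the real refactor would live in verifier.c and reuse find_subprog():

```c
#include <stddef.h>

#define MAX_CALL_FRAMES 8

struct walk_env { int visited; };  /* stand-in for bpf_verifier_env */

struct insn {
	int callee;                /* called subprog index, or -1 if not a call */
};

struct subprog_span { int start, end; };

/* Per-instruction hook: receives the current frame stack so callers
 * (throw-reachability marking, stack depth checks) can apply their own
 * per-chain logic.
 */
typedef int (*visit_fn)(struct walk_env *env, int insn_idx,
			const int *frames, int depth);

/* Generic DFS over all call chains starting at @start_idx, using an
 * explicit return stack like the existing kernel walkers.
 */
static int walk_call_chains(struct walk_env *env,
			    const struct insn *insns,
			    const struct subprog_span *subprog,
			    int start_idx, visit_fn visit)
{
	int ret_insn[MAX_CALL_FRAMES], ret_prog[MAX_CALL_FRAMES];
	int frame = 0, idx = start_idx;
	int i = subprog[idx].start;

restart:
	for (; i < subprog[idx].end; i++) {
		int err = visit(env, i, ret_prog, frame);

		if (err)
			return err;
		if (insns[i].callee < 0)
			continue;
		/* remember insn and subprog to return to */
		ret_insn[frame] = i + 1;
		ret_prog[frame] = idx;
		idx = insns[i].callee;
		i = subprog[idx].start;
		if (++frame >= MAX_CALL_FRAMES)
			return -1; /* call chain too deep */
		goto restart;
	}
	if (frame == 0)
		return 0;
	frame--;
	i = ret_insn[frame];
	idx = ret_prog[frame];
	goto restart;
}

/* Example callback: count every instruction the walker visits. */
static int count_visit(struct walk_env *env, int insn_idx,
		       const int *frames, int depth)
{
	(void)insn_idx; (void)frames; (void)depth;
	env->visited++;
	return 0;
}
```

Both mark_exception_reachable_subprogs() and check_max_stack_depth_subprog() would then reduce to different `visit` callbacks over the same traversal.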

> > [...]
> >
Eduard Zingerman Feb. 15, 2024, 1:01 a.m. UTC | #3
On Thu, 2024-02-01 at 04:20 +0000, Kumar Kartikeya Dwivedi wrote:

[...]

> +static int mark_exception_reachable_subprogs(struct bpf_verifier_env *env)
> +{

[...]

> +restart:
> +	subprog_end = subprog[idx + 1].start;
> +	for (; i < subprog_end; i++) {

[...]

> +		if (!bpf_pseudo_call(insn + i) && !bpf_pseudo_func(insn + i))
> +			continue;
> +		/* remember insn and function to return to */
> +		ret_insn[frame] = i + 1;
> +		ret_prog[frame] = idx;
> +
> +		/* find the callee */
> +		next_insn = i + insn[i].imm + 1;
> +		sidx = find_subprog(env, next_insn);
> +		if (sidx < 0) {
> +			WARN_ONCE(1, "verifier bug. No program starts at insn %d\n", next_insn);
> +			return -EFAULT;
> +		}

For programs like:

  foo():
    bar()
    bar()

this algorithm would scan bar() multiple times.
Would it be possible to remember if subprogram had been scanned
already and reuse collected .is_throw_reachable info?

[...]
Kumar Kartikeya Dwivedi Feb. 16, 2024, 9:34 p.m. UTC | #4
On Thu, 15 Feb 2024 at 02:01, Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2024-02-01 at 04:20 +0000, Kumar Kartikeya Dwivedi wrote:
>
> [...]
>
> > +static int mark_exception_reachable_subprogs(struct bpf_verifier_env *env)
> > +{
>
> [...]
>
> > +restart:
> > +     subprog_end = subprog[idx + 1].start;
> > +     for (; i < subprog_end; i++) {
>
> [...]
>
> > +             if (!bpf_pseudo_call(insn + i) && !bpf_pseudo_func(insn + i))
> > +                     continue;
> > +             /* remember insn and function to return to */
> > +             ret_insn[frame] = i + 1;
> > +             ret_prog[frame] = idx;
> > +
> > +             /* find the callee */
> > +             next_insn = i + insn[i].imm + 1;
> > +             sidx = find_subprog(env, next_insn);
> > +             if (sidx < 0) {
> > +                     WARN_ONCE(1, "verifier bug. No program starts at insn %d\n", next_insn);
> > +                     return -EFAULT;
> > +             }
>
> For programs like:
>
>   foo():
>     bar()
>     bar()
>
> this algorithm would scan bar() multiple times.
> Would it be possible to remember if subprogram had been scanned
> already and reuse collected .is_throw_reachable info?
>

Good idea, I will look into avoiding this. I think
check_max_stack_depth would also benefit from such a change, and since
I plan on consolidating both to use similar logic, I will make the
change for both.
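
A memoized variant along the lines Eduard suggests could look like this userspace sketch (illustrative names, not kernel code; assumes the call graph is acyclic, which holds since the verifier rejects recursive subprog call chains): a `scanned` bit per subprog lets the walk reuse an earlier result instead of re-descending.

```c
#include <stdbool.h>

#define MAX_SUBPROGS 8

struct subprog {
	bool calls_throw;          /* contains a bpf_throw call */
	bool is_throw_reachable;   /* result of the marking pass */
	bool scanned;              /* memoization: already fully explored */
	int callees[MAX_SUBPROGS];
	int ncallees;
};

static int scan_count;             /* how many subprog bodies we walk */

static bool mark_throw_reachable(struct subprog *sp, int idx)
{
	bool reach;

	if (sp[idx].scanned)       /* reuse the earlier result */
		return sp[idx].is_throw_reachable;
	scan_count++;
	reach = sp[idx].calls_throw;
	for (int i = 0; i < sp[idx].ncallees; i++)
		reach |= mark_throw_reachable(sp, sp[idx].callees[i]);
	sp[idx].is_throw_reachable = reach;
	sp[idx].scanned = true;
	return reach;
}
```

For foo() calling bar() twice, bar's body is walked only once; the second call site just reads the cached bit, which is the saving Eduard points out for both this pass and the stack depth check.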

> [...]
>
>
Patch

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 0dcde339dc7e..1d666b6c21e6 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -626,6 +626,7 @@  struct bpf_subprog_info {
 	bool is_async_cb: 1;
 	bool is_exception_cb: 1;
 	bool args_cached: 1;
+	bool is_throw_reachable: 1;
 
 	u8 arg_cnt;
 	struct bpf_subprog_arg_info args[MAX_BPF_FUNC_REG_ARGS];
@@ -691,6 +692,7 @@  struct bpf_verifier_env {
 	bool bypass_spec_v4;
 	bool seen_direct_write;
 	bool seen_exception;
+	bool seen_throw_insn;
 	struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */
 	const struct bpf_line_info *prev_linfo;
 	struct bpf_verifier_log log;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index cd4d780e5400..bba53c4e3a0c 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2941,6 +2941,8 @@  static int check_subprogs(struct bpf_verifier_env *env)
 		    insn[i].src_reg == 0 &&
 		    insn[i].imm == BPF_FUNC_tail_call)
 			subprog[cur_subprog].has_tail_call = true;
+		if (!env->seen_throw_insn && is_bpf_throw_kfunc(&insn[i]))
+			env->seen_throw_insn = true;
 		if (BPF_CLASS(code) == BPF_LD &&
 		    (BPF_MODE(code) == BPF_ABS || BPF_MODE(code) == BPF_IND))
 			subprog[cur_subprog].has_ld_abs = true;
@@ -5866,6 +5868,9 @@  static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx)
 
 			if (!is_bpf_throw_kfunc(insn + i))
 				continue;
+			/* When this is allowed, don't forget to update logic for sync and
+			 * async callbacks in mark_exception_reachable_subprogs.
+			 */
 			if (subprog[idx].is_cb)
 				err = true;
 			for (int c = 0; c < frame && !err; c++) {
@@ -16205,6 +16210,83 @@  static int check_btf_info(struct bpf_verifier_env *env,
 	return 0;
 }
 
+/* We walk the call graph of the program in this function, and mark everything in
+ * the call chain as 'is_throw_reachable'. This allows us to know which subprog
+ * calls may propagate an exception and generate exception frame descriptors for
+ * those call instructions. We already do that for bpf_throw calls made directly,
+ * but we need to mark the subprogs as we won't be able to see the call chains
+ * during symbolic execution in do_check_common due to global subprogs.
+ *
+ * Note that unlike check_max_stack_depth, we don't explore the async callbacks
+ * apart from main subprogs, as we don't support throwing from them for now, but
+ */
+static int mark_exception_reachable_subprogs(struct bpf_verifier_env *env)
+{
+	struct bpf_subprog_info *subprog = env->subprog_info;
+	struct bpf_insn *insn = env->prog->insnsi;
+	int idx = 0, frame = 0, i, subprog_end;
+	int ret_insn[MAX_CALL_FRAMES];
+	int ret_prog[MAX_CALL_FRAMES];
+
+	/* No need if we never saw any bpf_throw() call in the program. */
+	if (!env->seen_throw_insn)
+		return 0;
+
+	i = subprog[idx].start;
+restart:
+	subprog_end = subprog[idx + 1].start;
+	for (; i < subprog_end; i++) {
+		int next_insn, sidx;
+
+		if (bpf_pseudo_kfunc_call(insn + i) && !insn[i].off) {
+			if (!is_bpf_throw_kfunc(insn + i))
+				continue;
+			subprog[idx].is_throw_reachable = true;
+			for (int j = 0; j < frame; j++)
+				subprog[ret_prog[j]].is_throw_reachable = true;
+		}
+
+		if (!bpf_pseudo_call(insn + i) && !bpf_pseudo_func(insn + i))
+			continue;
+		/* remember insn and function to return to */
+		ret_insn[frame] = i + 1;
+		ret_prog[frame] = idx;
+
+		/* find the callee */
+		next_insn = i + insn[i].imm + 1;
+		sidx = find_subprog(env, next_insn);
+		if (sidx < 0) {
+			WARN_ONCE(1, "verifier bug. No program starts at insn %d\n", next_insn);
+			return -EFAULT;
+		}
+		/* We cannot distinguish between sync or async cb, so we need to follow
+		 * both.  Async callbacks don't really propagate exceptions but calling
+		 * bpf_throw from them is not allowed anyway, so there is no harm in
+		 * exploring them.
+		 * TODO: To address this properly, we will have to move is_cb,
+		 * is_async_cb markings to the stage before do_check.
+		 */
+		i = next_insn;
+		idx = sidx;
+
+		frame++;
+		if (frame >= MAX_CALL_FRAMES) {
+			verbose(env, "the call stack of %d frames is too deep !\n", frame);
+			return -E2BIG;
+		}
+		goto restart;
+	}
+	/* end of for() loop means the last insn of the 'subprog'
+	 * was reached. Doesn't matter whether it was JA or EXIT
+	 */
+	if (frame == 0)
+		return 0;
+	frame--;
+	i = ret_insn[frame];
+	idx = ret_prog[frame];
+	goto restart;
+}
+
 /* check %cur's range satisfies %old's */
 static bool range_within(struct bpf_reg_state *old,
 			 struct bpf_reg_state *cur)
@@ -20939,6 +21021,10 @@  int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 	if (ret < 0)
 		goto skip_full_check;
 
+	ret = mark_exception_reachable_subprogs(env);
+	if (ret < 0)
+		goto skip_full_check;
+
 	ret = do_check_main(env);
 	ret = ret ?: do_check_subprogs(env);