diff mbox series

[bpf-next,v3,11/12] bpf: do check_nocsr_stack_contract() for ARG_ANYTHING helper params

Message ID 20240715230201.3901423-12-eddyz87@gmail.com (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series no_caller_saved_registers attribute for helper calls | expand

Checks

Context Check Description
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-next-VM_Test-4 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-11 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-13 success Logs for s390x-gcc / test (test_maps, false, 360) / test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-18 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-19 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-21 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17-O2
bpf/vmtest-bpf-next-VM_Test-30 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-31 success Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-32 success Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-33 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-34 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-next-VM_Test-35 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-36 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18-O2
bpf/vmtest-bpf-next-VM_Test-37 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-38 success Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-39 success Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-40 success Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-41 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-42 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-next-VM_Test-14 success Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 816 this patch: 816
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 6 maintainers not CCed: kpsingh@kernel.org haoluo@google.com jolsa@kernel.org song@kernel.org sdf@fomichev.me john.fastabend@gmail.com
netdev/build_clang success Errors and warnings before: 821 this patch: 821
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 831 this patch: 831
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 15 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Eduard Zingerman July 15, 2024, 11:02 p.m. UTC
There is a number of BPF helper functions that use ARG_ANYTHING to
mark parameters that are used as memory addresses.
An address of BPF stack slot could be passed as such parameter
for two such helper functions:
- bpf_probe_read_kernel
- bpf_probe_read_kernel_str

This might lead to a surprising behavior in combination with nocsr
rewrites, e.g. consider the program below:

     1: r1 = 1;
        /* nocsr pattern with stack offset -16 */
     2: *(u64 *)(r10 - 16) = r1;
     3: call %[bpf_get_smp_processor_id];
     4: r1 = *(u64 *)(r10 - 16);
     5: r1 = r10;
     6: r1 += -8;
     7: r2 = 1;
     8: r3 = r10;
     9: r3 += -16;
        /* bpf_probe_read_kernel(dst: &fp[-8], size: 1, src: &fp[-16]) */
    10: call %[bpf_probe_read_kernel];
    11: exit;

Here nocsr rewrite logic would remove instructions (2) and (4).
However, (2) writes a value that is later read by a call at (10).

Function check_func_arg() is called from check_helper_call() and is
responsible for memory access checks, when helper argument type is
declared as ARG_PTR_TO_... .

However, for ARG_ANYTHING this function returns early and does not
call check_stack_{read,write}().

This patch opts to add a check_nocsr_stack_contract() to the
ARG_ANYTHING return path if passed parameter is a pointer to stack.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 kernel/bpf/verifier.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Alexei Starovoitov July 16, 2024, 2 a.m. UTC | #1
On Mon, Jul 15, 2024 at 4:02 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> There is a number of BPF helper functions that use ARG_ANYTHING to
> mark parameters that are used as memory addresses.
> An address of BPF stack slot could be passed as such parameter
> for two such helper functions:
> - bpf_probe_read_kernel
> - bpf_probe_read_kernel_str
>
> This might lead to a surprising behavior in combination with nocsr
> rewrites, e.g. consider the program below:
>
>      1: r1 = 1;
>         /* nocsr pattern with stack offset -16 */
>      2: *(u64 *)(r10 - 16) = r1;
>      3: call %[bpf_get_smp_processor_id];
>      4: r1 = *(u64 *)(r10 - 16);
>      5: r1 = r10;
>      6: r1 += -8;
>      7: r2 = 1;
>      8: r3 = r10;
>      9: r3 += -16;
>         /* bpf_probe_read_kernel(dst: &fp[-8], size: 1, src: &fp[-16]) */
>     10: call %[bpf_probe_read_kernel];
>     11: exit;
>
> Here nocsr rewrite logic would remove instructions (2) and (4).
> However, (2) writes a value that is later read by a call at (10).

This makes no sense to me.
This bpf prog is broken.
If probe_read is used to read stack it will read garbage.
JITs and the verifier are allowed to do any transformation
that keeps the program semantics and safety.

pw-bot: cr
Eduard Zingerman July 16, 2024, 10:03 a.m. UTC | #2
On Mon, 2024-07-15 at 19:00 -0700, Alexei Starovoitov wrote:
> On Mon, Jul 15, 2024 at 4:02 PM Eduard Zingerman <eddyz87@gmail.com> wrote:

[...]

> > This might lead to a surprising behavior in combination with nocsr
> > rewrites, e.g. consider the program below:
> > 
> >      1: r1 = 1;
> >         /* nocsr pattern with stack offset -16 */
> >      2: *(u64 *)(r10 - 16) = r1;
> >      3: call %[bpf_get_smp_processor_id];
> >      4: r1 = *(u64 *)(r10 - 16);
> >      5: r1 = r10;
> >      6: r1 += -8;
> >      7: r2 = 1;
> >      8: r3 = r10;
> >      9: r3 += -16;
> >         /* bpf_probe_read_kernel(dst: &fp[-8], size: 1, src: &fp[-16]) */
> >     10: call %[bpf_probe_read_kernel];
> >     11: exit;
> > 
> > Here nocsr rewrite logic would remove instructions (2) and (4).
> > However, (2) writes a value that is later read by a call at (10).
> 
> This makes no sense to me.
> This bpf prog is broken.
> If probe_read is used to read stack it will read garbage.
> JITs and the verifier are allowed to do any transformation
> that keeps the program semantics and safety.

I tried to run the following program
(should have run it earlier):

SEC("raw_tp")
__retval(42)
__success
int bpf_probe_read_kernel_stack_ptr(void *ctx)
{
	unsigned long a = 17;
	unsigned long b = 42;
	int err;

	err = bpf_probe_read_kernel(&a, 8, &b);
	if (err)
		return -1;
	return a;
}

And indeed, it does not produce expected result,
the retval varies around 22079.
However, I don't really understand why this program is broken.
E.g. from C compiler pov pointer &b escapes, and compiler is not
really allowed to replace object at that offset with garbage.
Is it a limitation of the bpf_probe_read_kernel() that it cannot read
BPF program stack?
Eduard Zingerman July 16, 2024, 6:15 p.m. UTC | #3
On Tue, 2024-07-16 at 03:03 -0700, Eduard Zingerman wrote:
> On Mon, 2024-07-15 at 19:00 -0700, Alexei Starovoitov wrote:
> > On Mon, Jul 15, 2024 at 4:02 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> 
> [...]
> 
> > > This might lead to a surprising behavior in combination with nocsr
> > > rewrites, e.g. consider the program below:
> > > 
> > >      1: r1 = 1;
> > >         /* nocsr pattern with stack offset -16 */
> > >      2: *(u64 *)(r10 - 16) = r1;
> > >      3: call %[bpf_get_smp_processor_id];
> > >      4: r1 = *(u64 *)(r10 - 16);
> > >      5: r1 = r10;
> > >      6: r1 += -8;
> > >      7: r2 = 1;
> > >      8: r3 = r10;
> > >      9: r3 += -16;
> > >         /* bpf_probe_read_kernel(dst: &fp[-8], size: 1, src: &fp[-16]) */
> > >     10: call %[bpf_probe_read_kernel];
> > >     11: exit;
> > > 
> > > Here nocsr rewrite logic would remove instructions (2) and (4).
> > > However, (2) writes a value that is later read by a call at (10).
> > 
> > This makes no sense to me.
> > This bpf prog is broken.
> > If probe_read is used to read stack it will read garbage.
> > JITs and the verifier are allowed to do any transformation
> > that keeps the program semantics and safety.

Ok, my bad, the following program works at the moment:

SEC("socket") // <---- used wrong program type
__retval(42)
__success
int bpf_probe_read_kernel_stack_ptr(void *ctx)
{
	unsigned long a = 17;
	unsigned long b = 42;
	int err;

	err = bpf_probe_read_kernel(&a, 8, &b);
	if (err)
		return 1;
	return a;
}

And it is compiled to BPF as one would expect:

       ... fp[-8,-16] setup ...
       4:	r1 = r10
       5:	r1 += -0x8
       6:	r3 = r10
       7:	r3 += -0x10
       8:	w2 = 0x8
       9:	call 0x71
       ... return check ...

So, the point stands: from C compiler pov pointer &b escapes,
and compiler is not really allowed to replace object at that offset
with garbage. Why do you think the program is broken?

I don't mind dropping the patch in question, but I agree with Andrii's
viewpoint that there is nothing wrong with this use case.
Alexei Starovoitov July 20, 2024, 1:54 a.m. UTC | #4
On Tue, Jul 16, 2024 at 11:15 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2024-07-16 at 03:03 -0700, Eduard Zingerman wrote:
> > On Mon, 2024-07-15 at 19:00 -0700, Alexei Starovoitov wrote:
> > > On Mon, Jul 15, 2024 at 4:02 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > [...]
> >
> > > > This might lead to a surprising behavior in combination with nocsr
> > > > rewrites, e.g. consider the program below:
> > > >
> > > >      1: r1 = 1;
> > > >         /* nocsr pattern with stack offset -16 */
> > > >      2: *(u64 *)(r10 - 16) = r1;
> > > >      3: call %[bpf_get_smp_processor_id];
> > > >      4: r1 = *(u64 *)(r10 - 16);
> > > >      5: r1 = r10;
> > > >      6: r1 += -8;
> > > >      7: r2 = 1;
> > > >      8: r3 = r10;
> > > >      9: r3 += -16;
> > > >         /* bpf_probe_read_kernel(dst: &fp[-8], size: 1, src: &fp[-16]) */
> > > >     10: call %[bpf_probe_read_kernel];
> > > >     11: exit;
> > > >
> > > > Here nocsr rewrite logic would remove instructions (2) and (4).
> > > > However, (2) writes a value that is later read by a call at (10).
> > >
> > > This makes no sense to me.
> > > This bpf prog is broken.
> > > If probe_read is used to read stack it will read garbage.
> > > JITs and the verifier are allowed to do any transformation
> > > that keeps the program semantics and safety.
>
> Ok, my bad, the following program works at the moment:
>
> SEC("socket") // <---- used wrong program type
> __retval(42)
> __success
> int bpf_probe_read_kernel_stack_ptr(void *ctx)
> {
>         unsigned long a = 17;
>         unsigned long b = 42;
>         int err;
>
>         err = bpf_probe_read_kernel(&a, 8, &b);
>         if (err)
>                 return 1;
>         return a;
> }
>
> And it is compiled to BPF as one would expect:
>
>        ... fp[-8,-16] setup ...
>        4:       r1 = r10
>        5:       r1 += -0x8
>        6:       r3 = r10
>        7:       r3 += -0x10
>        8:       w2 = 0x8
>        9:       call 0x71
>        ... return check ...
>
> So, the point stands: from C compiler pov pointer &b escapes,
> and compiler is not really allowed to replace object at that offset
> with garbage. Why do you think the program is broken?

This is apples to oranges.
Compiler sees that the address of 'b' is taken and passed
into a function with side effect.
Whether 3rd arg of bpf_probe_read_kernel() is void * or long
is irrelevant. Compilers will do it, because it's a C language
requirement.

> I don't mind dropping the patch in question, but I agree with Andrii's
> viewpoint that there is nothing wrong with this use case.

bpf_probe_read_kernel() is not special and it's 3rd argument is
some kernel address. Whether it's stack pointer or anything else
is irrelevant.
JITs and verifier are allowed to do any optimizations on stack
and any other memory completely ignoring presence of
bpf_probe_read_kernel() and what is being passed into it.

Tomorrow we will teach arm64 JIT to replace stack spill/fill with
spare register read/write. There is no way we're going to special case
a particular fp-16 slot because fp-16 was passed into probe_read.
Eduard Zingerman July 20, 2024, 1:58 a.m. UTC | #5
On Fri, 2024-07-19 at 18:54 -0700, Alexei Starovoitov wrote:

[...]

> > So, the point stands: from C compiler pov pointer &b escapes,
> > and compiler is not really allowed to replace object at that offset
> > with garbage. Why do you think the program is broken?
> 
> This is apples to oranges.
> Compiler sees that the address of 'b' is taken and passed
> into a function with side effect.
> Whether 3rd arg of bpf_probe_read_kernel() is void * or long
> is irrelevant. Compilers will do it, because it's a C language
> requirement.
> 
> > I don't mind dropping the patch in question, but I agree with Andrii's
> > viewpoint that there is nothing wrong with this use case.
> 
> bpf_probe_read_kernel() is not special and it's 3rd argument is
> some kernel address. Whether it's stack pointer or anything else
> is irrelevant.
> JITs and verifier are allowed to do any optimizations on stack
> and any other memory completely ignoring presence of
> bpf_probe_read_kernel() and what is being passed into it.
> 
> Tomorrow we will teach arm64 JIT to replace stack spill/fill with
> spare register read/write. There is no way we're going to special case
> a particular fp-16 slot because fp-16 was passed into probe_read.

Ok, I will re-submit w/o these two patches.
Andrii also requested to re-structure the check contract function to
reset .nocsr_pattern and .spills_num marks to 0 upon contract violation.
diff mbox series

Patch

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 438daf36a694..77affc563a64 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -8684,11 +8684,15 @@  static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 		return err;
 
 	if (arg_type == ARG_ANYTHING) {
+		/* return value depends on env->allow_ptr_leaks */
 		if (is_pointer_value(env, regno)) {
 			verbose(env, "R%d leaks addr into helper function\n",
 				regno);
 			return -EACCES;
 		}
+		if (reg->type == PTR_TO_STACK)
+			check_nocsr_stack_contract(env, cur_func(env), insn_idx,
+						   reg->smin_value + reg->off);
 		return 0;
 	}