[bpf] bpf: Do more tight ALU bounds tracking

Message ID	20220729224254.1798-1-liulin063@gmail.com (mailing list archive)
State	Changes Requested
Delegated to:	BPF
Headers	Return-Path: <bpf-owner@kernel.org> From: Youlin Li <liulin063@gmail.com> To: ast@kernel.org Cc: daniel@iogearbox.net, john.fastabend@gmail.com, andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com, kpsingh@kernel.org, sdf@google.com, haoluo@google.com, jolsa@kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, Youlin Li <liulin063@gmail.com> Subject: [PATCH bpf] bpf: Do more tight ALU bounds tracking Date: Sat, 30 Jul 2022 06:42:54 +0800 Message-Id: <20220729224254.1798-1-liulin063@gmail.com> In-Reply-To: <CA+khW7iknv0hcn-D2tRt8HFseUnyTV7BwpohQHtEyctbA1k27w@mail.gmail.com> References: <CA+khW7iknv0hcn-D2tRt8HFseUnyTV7BwpohQHtEyctbA1k27w@mail.gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	[bpf] bpf: Do more tight ALU bounds tracking

Context	Check	Description
netdev/tree_selection	success	Clearly marked for bpf
netdev/fixes_present	fail	Series targets non-next tree, but doesn't contain any Fixes tags
netdev/subject_prefix	success	Link
netdev/cover_letter	success	Single patches do not need cover letters
netdev/patch_count	success	Link
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 20 this patch: 20
netdev/cc_maintainers	success	CCed 12 of 12 maintainers
netdev/build_clang	success	Errors and warnings before: 6 this patch: 6
netdev/module_param	success	Was 0 now: 0
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	No Fixes tag
netdev/build_allmodconfig_warn	success	Errors and warnings before: 20 this patch: 20
netdev/checkpatch	success	total: 0 errors, 0 warnings, 0 checks, 29 lines checked
netdev/kdoc	success	Errors and warnings before: 0 this patch: 0
netdev/source_inline	success	Was 0 now: 0
bpf/vmtest-bpf-PR	success	PR summary
bpf/vmtest-bpf-VM_Test-1	success	Logs for build for s390x with gcc
bpf/vmtest-bpf-VM_Test-2	success	Logs for build for x86_64 with gcc
bpf/vmtest-bpf-VM_Test-3	success	Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-4	success	Logs for llvm-toolchain
bpf/vmtest-bpf-VM_Test-5	success	Logs for set-matrix
bpf/vmtest-bpf-VM_Test-6	success	Logs for test_maps on s390x with gcc
bpf/vmtest-bpf-VM_Test-7	success	Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-8	success	Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-9	success	Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-VM_Test-10	success	Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-11	success	Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-12	success	Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-VM_Test-13	success	Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-14	success	Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-15	success	Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-VM_Test-16	success	Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-17	success	Logs for test_verifier on x86_64 with llvm-16

Youlin Li July 29, 2022, 10:42 p.m. UTC

In adjust_scalar_min_max_vals(), let the 32bit bounds learn from the 64bit
bounds to get tighter bounds tracking. A similar operation can be found in
reg_set_min_max().

Also, we can now fold reg_bounds_sync() into zext_32_to_64().

Before:

    func#0 @0
    0: R1=ctx(off=0,imm=0) R10=fp0
    0: (b7) r0 = 0                        ; R0_w=0
    1: (b7) r1 = 0                        ; R1_w=0
    2: (87) r1 = -r1                      ; R1_w=scalar()
    3: (87) r1 = -r1                      ; R1_w=scalar()
    4: (c7) r1 s>>= 63                    ; R1_w=scalar(smin=-1,smax=0)
    5: (07) r1 += 2                       ; R1_w=scalar(umin=1,umax=2,var_off=(0x0; 0xffffffff))  <--- [*]
    6: (95) exit

It can be seen that even though the 64bit bounds are known here, the 32bit
bounds are still in the 'UNKNOWN' state.

After:

    func#0 @0
    0: R1=ctx(off=0,imm=0) R10=fp0
    0: (b7) r0 = 0                        ; R0_w=0
    1: (b7) r1 = 0                        ; R1_w=0
    2: (87) r1 = -r1                      ; R1_w=scalar()
    3: (87) r1 = -r1                      ; R1_w=scalar()
    4: (c7) r1 s>>= 63                    ; R1_w=scalar(smin=-1,smax=0)
    5: (07) r1 += 2                       ; R1_w=scalar(umin=1,umax=2,var_off=(0x0; 0x3))  <--- [*]
    6: (95) exit

Signed-off-by: Youlin Li <liulin063@gmail.com>
---
 kernel/bpf/verifier.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)
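
For readers following the log output, here is a minimal userspace sketch of
how a var_off such as (0x0; 0x3) can be obtained from umin=1,umax=2. It is
modeled on the kernel's tnum_range() helper, but the names tnum_sketch,
fls64_sketch and tnum_range_sketch are hypothetical and this is not the code
touched by the patch; treat it as an illustration under those assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Tracked number: bits set in mask are unknown, all other bits equal value. */
struct tnum_sketch {
	uint64_t value;
	uint64_t mask;
};

/* 1-based position of the highest set bit; 0 when x == 0. */
static int fls64_sketch(uint64_t x)
{
	int bits = 0;

	while (x) {
		bits++;
		x >>= 1;
	}
	return bits;
}

/* Smallest tnum that covers every value in [min, max]. */
static struct tnum_sketch tnum_range_sketch(uint64_t min, uint64_t max)
{
	struct tnum_sketch t = { .value = 0, .mask = UINT64_MAX };
	uint64_t chi, delta;
	int bits;

	assert(min <= max);
	chi = min ^ max;		/* bits in which min and max differ */
	bits = fls64_sketch(chi);
	if (bits > 63)			/* 1ULL << 64 would be undefined */
		return t;		/* fully unknown */
	delta = (1ULL << bits) - 1;	/* everything below the top differing bit */
	t.value = min & ~delta;
	t.mask = delta;
	return t;
}
```

With min=1 and max=2 the two values differ in bits 0 and 1, so the sketch
yields value=0x0 and mask=0x3, which matches the var_off in the "After" log.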

Hao Luo July 29, 2022, 10:48 p.m. UTC | #1

On Fri, Jul 29, 2022 at 3:43 PM Youlin Li <liulin063@gmail.com> wrote:
>
> In adjust_scalar_min_max_vals(), let 32bit bounds learn from 64bit bounds
> to get more tight bounds tracking. Similar operation can be found in
> reg_set_min_max().
>
> Also, we can now fold reg_bounds_sync() into zext_32_to_64().
>
> Before:
>
>     func#0 @0
>     0: R1=ctx(off=0,imm=0) R10=fp0
>     0: (b7) r0 = 0                        ; R0_w=0
>     1: (b7) r1 = 0                        ; R1_w=0
>     2: (87) r1 = -r1                      ; R1_w=scalar()
>     3: (87) r1 = -r1                      ; R1_w=scalar()
>     4: (c7) r1 s>>= 63                    ; R1_w=scalar(smin=-1,smax=0)
>     5: (07) r1 += 2                       ; R1_w=scalar(umin=1,umax=2,var_off=(0x0; 0xffffffff))  <--- [*]
>     6: (95) exit
>
> It can be seen that even if the 64bit bounds is clear here, the 32bit
> bounds is still in the state of 'UNKNOWN'.
>
> After:
>
>     func#0 @0
>     0: R1=ctx(off=0,imm=0) R10=fp0
>     0: (b7) r0 = 0                        ; R0_w=0
>     1: (b7) r1 = 0                        ; R1_w=0
>     2: (87) r1 = -r1                      ; R1_w=scalar()
>     3: (87) r1 = -r1                      ; R1_w=scalar()
>     4: (c7) r1 s>>= 63                    ; R1_w=scalar(smin=-1,smax=0)
>     5: (07) r1 += 2                       ; R1_w=scalar(umin=1,umax=2,var_off=(0x0; 0x3))  <--- [*]
>     6: (95) exit
>
> Signed-off-by: Youlin Li <liulin063@gmail.com>

Looks good to me. Thanks Youlin.

Acked-by: Hao Luo <haoluo@google.com>

Hao

Daniel Borkmann Aug. 8, 2022, 1:25 p.m. UTC | #2

On 7/30/22 12:48 AM, Hao Luo wrote:
> On Fri, Jul 29, 2022 at 3:43 PM Youlin Li <liulin063@gmail.com> wrote:
>>
>> In adjust_scalar_min_max_vals(), let 32bit bounds learn from 64bit bounds
>> to get more tight bounds tracking. Similar operation can be found in
>> reg_set_min_max().
>>
>> Also, we can now fold reg_bounds_sync() into zext_32_to_64().
>>
>> Before:
>>
>>      func#0 @0
>>      0: R1=ctx(off=0,imm=0) R10=fp0
>>      0: (b7) r0 = 0                        ; R0_w=0
>>      1: (b7) r1 = 0                        ; R1_w=0
>>      2: (87) r1 = -r1                      ; R1_w=scalar()
>>      3: (87) r1 = -r1                      ; R1_w=scalar()
>>      4: (c7) r1 s>>= 63                    ; R1_w=scalar(smin=-1,smax=0)
>>      5: (07) r1 += 2                       ; R1_w=scalar(umin=1,umax=2,var_off=(0x0; 0xffffffff))  <--- [*]
>>      6: (95) exit
>>
>> It can be seen that even if the 64bit bounds is clear here, the 32bit
>> bounds is still in the state of 'UNKNOWN'.
>>
>> After:
>>
>>      func#0 @0
>>      0: R1=ctx(off=0,imm=0) R10=fp0
>>      0: (b7) r0 = 0                        ; R0_w=0
>>      1: (b7) r1 = 0                        ; R1_w=0
>>      2: (87) r1 = -r1                      ; R1_w=scalar()
>>      3: (87) r1 = -r1                      ; R1_w=scalar()
>>      4: (c7) r1 s>>= 63                    ; R1_w=scalar(smin=-1,smax=0)
>>      5: (07) r1 += 2                       ; R1_w=scalar(umin=1,umax=2,var_off=(0x0; 0x3))  <--- [*]
>>      6: (95) exit
>>
>> Signed-off-by: Youlin Li <liulin063@gmail.com>
> 
> Looks good to me. Thanks Youlin.
> 
> Acked-by: Hao Luo <haoluo@google.com>

Thanks Youlin! Looks like the patch breaks CI [0] e.g.:

   #142/p bounds check after truncation of non-boundary-crossing range FAIL
   Failed to load prog 'Permission denied'!
   invalid access to map value, value_size=8 off=16777215 size=1
   R0 max value is outside of the allowed memory range
   verification time 296 usec
   stack depth 8
   processed 15 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

Please take a look. Also, it would be great to add a test_verifier selftest to
assert the above case from the commit log against future changes.

Thanks,
Daniel

   [0] https://github.com/kernel-patches/bpf/runs/7696324041?check_suite_focus=true

Youlin Li Aug. 8, 2022, 3:14 p.m. UTC | #3

---------- Forwarded message ---------
From: Kuee k1r0a <liulin063@gmail.com>
Date: Mon, Aug 8, 2022 at 11:11 PM
Subject: Re: [PATCH bpf] bpf: Do more tight ALU bounds tracking
To: Daniel Borkmann <daniel@iogearbox.net>


On Mon, Aug 8, 2022 at 9:25 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 7/30/22 12:48 AM, Hao Luo wrote:
> > On Fri, Jul 29, 2022 at 3:43 PM Youlin Li <liulin063@gmail.com> wrote:
> >>
> >> In adjust_scalar_min_max_vals(), let 32bit bounds learn from 64bit bounds
> >> to get more tight bounds tracking. Similar operation can be found in
> >> reg_set_min_max().
> >>
> >> Also, we can now fold reg_bounds_sync() into zext_32_to_64().
> >>
> >> Before:
> >>
> >>      func#0 @0
> >>      0: R1=ctx(off=0,imm=0) R10=fp0
> >>      0: (b7) r0 = 0                        ; R0_w=0
> >>      1: (b7) r1 = 0                        ; R1_w=0
> >>      2: (87) r1 = -r1                      ; R1_w=scalar()
> >>      3: (87) r1 = -r1                      ; R1_w=scalar()
> >>      4: (c7) r1 s>>= 63                    ; R1_w=scalar(smin=-1,smax=0)
> >>      5: (07) r1 += 2                       ; R1_w=scalar(umin=1,umax=2,var_off=(0x0; 0xffffffff))  <--- [*]
> >>      6: (95) exit
> >>
> >> It can be seen that even if the 64bit bounds is clear here, the 32bit
> >> bounds is still in the state of 'UNKNOWN'.
> >>
> >> After:
> >>
> >>      func#0 @0
> >>      0: R1=ctx(off=0,imm=0) R10=fp0
> >>      0: (b7) r0 = 0                        ; R0_w=0
> >>      1: (b7) r1 = 0                        ; R1_w=0
> >>      2: (87) r1 = -r1                      ; R1_w=scalar()
> >>      3: (87) r1 = -r1                      ; R1_w=scalar()
> >>      4: (c7) r1 s>>= 63                    ; R1_w=scalar(smin=-1,smax=0)
> >>      5: (07) r1 += 2                       ; R1_w=scalar(umin=1,umax=2,var_off=(0x0; 0x3))  <--- [*]
> >>      6: (95) exit
> >>
> >> Signed-off-by: Youlin Li <liulin063@gmail.com>
> >
> > Looks good to me. Thanks Youlin.
> >
> > Acked-by: Hao Luo <haoluo@google.com>
>
> Thanks Youlin! Looks like the patch breaks CI [0] e.g.:
>
>    #142/p bounds check after truncation of non-boundary-crossing range FAIL
>    Failed to load prog 'Permission denied'!
>    invalid access to map value, value_size=8 off=16777215 size=1
>    R0 max value is outside of the allowed memory range
>    verification time 296 usec
>    stack depth 8
>    processed 15 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>
> Please take a look. Also it would be great to add a test_verifier selftest to
> assert above case from commit log against future changes.
>
> Thanks,
> Daniel
>
>    [0] https://github.com/kernel-patches/bpf/runs/7696324041?check_suite_focus=true

This test case fails because the 32bit boundary information is lost
after the 11th instruction is executed:

Before:

    11: (07) r1 += 2147483647             ; R1_w=scalar(umin=70866960383,umax=70866960638,var_off=(0x1000000000; 0xffffffff),u32_min=2147483647,u32_max=-2147483394)

After:

    11: (07) r1 += 2147483647             ; R1_w=scalar(umin=70866960383,umax=70866960638,var_off=(0x1000000000; 0xffffffff))

This may be because, in previous versions of the code, whenever
__reg_combine_64_into_32() was called, the 32bit bounds were deduced
entirely from the 64bit bounds, which is why __reg_combine_64_into_32()
starts with a call to __mark_reg32_unbounded().

But now, by the time adjust_scalar_min_max_vals() calls
__reg_combine_64_into_32(), the 32bit bounds have already been calculated
to some extent, and __mark_reg32_unbounded() throws this information
away.

Simply duplicating the function without the __mark_reg32_unbounded() call
should work, but perhaps it would be more elegant to introduce a flag into
__reg_combine_64_into_32()?

Sorry for not completing the tests: I could not get 'make selftests' to
run successfully, and so I posted code that causes this failure.
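
To make the information loss described above concrete, here is a small
userspace model. It is only a sketch: struct reg_sketch,
combine_64_into_32_sketch() and insn11_state() are hypothetical names, only
unsigned bounds are modeled, and the real __reg_combine_64_into_32() does
more than this. The reset_first parameter stands in for the flag idea
mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified register state: unsigned bounds only. */
struct reg_sketch {
	uint64_t umin, umax;		/* bounds on the full 64-bit value */
	uint32_t u32_min, u32_max;	/* bounds on the low 32 bits */
};

/*
 * Let the 32-bit bounds learn from the 64-bit bounds. This is only sound
 * when the whole 64-bit range fits in 32 bits. With reset_first set, any
 * 32-bit bounds computed earlier are discarded before learning.
 */
static void combine_64_into_32_sketch(struct reg_sketch *reg, bool reset_first)
{
	assert(reg->umin <= reg->umax);

	if (reset_first) {
		reg->u32_min = 0;
		reg->u32_max = UINT32_MAX;
	}
	if (reg->umin <= UINT32_MAX && reg->umax <= UINT32_MAX) {
		/* Intersect with what is already known. */
		if ((uint32_t)reg->umin > reg->u32_min)
			reg->u32_min = (uint32_t)reg->umin;
		if ((uint32_t)reg->umax < reg->u32_max)
			reg->u32_max = (uint32_t)reg->umax;
	}
}

/* State after insn 11, taken from the "Before" log line. */
static struct reg_sketch insn11_state(void)
{
	struct reg_sketch reg = {
		.umin = 70866960383ULL,
		.umax = 70866960638ULL,
		.u32_min = 2147483647u,
		.u32_max = 2147483902u,	/* printed as -2147483394 in the log */
	};

	return reg;
}
```

In this model the 64-bit bounds at insn 11 do not fit in 32 bits, so nothing
can be learned from them. Calling combine_64_into_32_sketch() with
reset_first=true therefore leaves the 32-bit bounds fully unknown, while
reset_first=false keeps the u32_min/u32_max computed earlier.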

Daniel Borkmann Aug. 8, 2022, 3:42 p.m. UTC | #4

On 8/8/22 5:14 PM, Kuee k1r0a wrote:
> ---------- Forwarded message ---------
> From: Kuee k1r0a <liulin063@gmail.com>
> Date: Mon, Aug 8, 2022 at 11:11 PM
> Subject: Re: [PATCH bpf] bpf: Do more tight ALU bounds tracking
> To: Daniel Borkmann <daniel@iogearbox.net>
> 
> 
> On Mon, Aug 8, 2022 at 9:25 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>
>> On 7/30/22 12:48 AM, Hao Luo wrote:
>>> On Fri, Jul 29, 2022 at 3:43 PM Youlin Li <liulin063@gmail.com> wrote:
>>>>
>>>> In adjust_scalar_min_max_vals(), let 32bit bounds learn from 64bit bounds
>>>> to get more tight bounds tracking. Similar operation can be found in
>>>> reg_set_min_max().
>>>>
>>>> Also, we can now fold reg_bounds_sync() into zext_32_to_64().
>>>>
>>>> Before:
>>>>
>>>>       func#0 @0
>>>>       0: R1=ctx(off=0,imm=0) R10=fp0
>>>>       0: (b7) r0 = 0                        ; R0_w=0
>>>>       1: (b7) r1 = 0                        ; R1_w=0
>>>>       2: (87) r1 = -r1                      ; R1_w=scalar()
>>>>       3: (87) r1 = -r1                      ; R1_w=scalar()
>>>>       4: (c7) r1 s>>= 63                    ; R1_w=scalar(smin=-1,smax=0)
>>>>       5: (07) r1 += 2                       ; R1_w=scalar(umin=1,umax=2,var_off=(0x0; 0xffffffff))  <--- [*]
>>>>       6: (95) exit
>>>>
>>>> It can be seen that even if the 64bit bounds is clear here, the 32bit
>>>> bounds is still in the state of 'UNKNOWN'.
>>>>
>>>> After:
>>>>
>>>>       func#0 @0
>>>>       0: R1=ctx(off=0,imm=0) R10=fp0
>>>>       0: (b7) r0 = 0                        ; R0_w=0
>>>>       1: (b7) r1 = 0                        ; R1_w=0
>>>>       2: (87) r1 = -r1                      ; R1_w=scalar()
>>>>       3: (87) r1 = -r1                      ; R1_w=scalar()
>>>>       4: (c7) r1 s>>= 63                    ; R1_w=scalar(smin=-1,smax=0)
>>>>       5: (07) r1 += 2                       ; R1_w=scalar(umin=1,umax=2,var_off=(0x0; 0x3))  <--- [*]
>>>>       6: (95) exit
>>>>
>>>> Signed-off-by: Youlin Li <liulin063@gmail.com>
>>>
>>> Looks good to me. Thanks Youlin.
>>>
>>> Acked-by: Hao Luo <haoluo@google.com>
>>
>> Thanks Youlin! Looks like the patch breaks CI [0] e.g.:
>>
>>     #142/p bounds check after truncation of non-boundary-crossing range FAIL
>>     Failed to load prog 'Permission denied'!
>>     invalid access to map value, value_size=8 off=16777215 size=1
>>     R0 max value is outside of the allowed memory range
>>     verification time 296 usec
>>     stack depth 8
>>     processed 15 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>>
>> Please take a look. Also it would be great to add a test_verifier selftest to
>> assert above case from commit log against future changes.
>>
>> Thanks,
>> Daniel
>>
>>     [0] https://github.com/kernel-patches/bpf/runs/7696324041?check_suite_focus=true
> 
> This test case fails because the 32bit boundary information is lost
> after the 11th instruction is executed:
> Before:
>      11: (07) r1 += 2147483647             ;
> R1_w=scalar(umin=70866960383,umax=70866960638,var_off=(0x1000000000;
> 0xffffffff),u32_min=2147483647,u32_max=-2147483394)
> After:
>      11: (07) r1 += 2147483647             ;
> R1_w=scalar(umin=70866960383,umax=70866960638,var_off=(0x1000000000;
> 0xffffffff))
> 
> This may be because, in previous versions of the code, when
> __reg_combine_64_into_32() was called, the 32bit boundary was
> completely deduced from the 64bit boundary, so there was a call to
> __mark_reg32_unbounded() in __reg_combine_64_into_32().
> 
> But now, before adjust_scalar_min_max_vals() calls
> __reg_combine_64_into_32() , the 32bit bounds are already calculated
> to some extent, and __mark_reg32_unbounded() will eliminate these
> information.
> 
> Simply copying a code without __mark_reg32_unbounded() should work,
> perhaps it would be more elegant to introduce a flag into
> __reg_combine_64_into_32()?
> 
> Sorry for not completing the tests because I did not 'make selftests'
> successfully, and uploaded the code that caused the error.

Under tools/testing/selftests/bpf/, you can run test_progs and test_verifier
through the vmtest script; e.g. `./vmtest.sh -- ./test_progs` should make
running them easier. The whole `make selftests` is not necessary, given that
here we only care about BPF. CI runs these; 2 of them failed and need
investigation:

           test_progs: PASS
  test_progs-no_alu32: FAIL (returned 1)
            test_maps: PASS
        test_verifier: FAIL (returned 1)

Fwiw, for the test_verifier failure case at least, we should then adapt it
in a separate commit with an analysis explaining why it is okay to alter the
test; plus a 3rd commit adding new test cases as mentioned earlier.

Thanks a lot, Kuee!
Daniel
