diff mbox series

[bpf] bpf: Fix DEVMAP_HASH overflow check on 32-bit arches

Message ID 20240227152740.35120-1-toke@redhat.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Headers show
Series [bpf] bpf: Fix DEVMAP_HASH overflow check on 32-bit arches | expand

Checks

Context Check Description
bpf/vmtest-bpf-PR success PR summary
bpf/vmtest-bpf-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-VM_Test-4 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-VM_Test-11 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-VM_Test-10 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-VM_Test-13 success Logs for s390x-gcc / test (test_maps, false, 360) / test_maps on s390x with gcc
bpf/vmtest-bpf-VM_Test-12 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-VM_Test-16 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-VM_Test-18 success Logs for set-matrix
bpf/vmtest-bpf-VM_Test-20 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-VM_Test-19 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-VM_Test-17 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-VM_Test-15 success Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-VM_Test-22 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-23 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-21 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-25 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-29 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17 and -O2 optimization
bpf/vmtest-bpf-VM_Test-24 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-27 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-28 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-26 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-30 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-31 success Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-32 success Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-33 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-35 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-34 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-VM_Test-36 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18 and -O2 optimization
bpf/vmtest-bpf-VM_Test-39 success Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-40 success Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-38 success Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-41 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-42 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-VM_Test-37 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-9 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-7 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-6 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-8 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-14 success Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for bpf
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 958 this patch: 958
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 16 of 16 maintainers
netdev/build_clang success Errors and warnings before: 974 this patch: 974
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 975 this patch: 975
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 16 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Toke Høiland-Jørgensen Feb. 27, 2024, 3:27 p.m. UTC
The devmap code allocates a number hash buckets equal to the next power of two
of the max_entries value provided when creating the map. When rounding up to the
next power of two, the 32-bit variable storing the number of buckets can
overflow, and the code checks for overflow by checking if the truncated 32-bit value
is equal to 0. However, on 32-bit arches the rounding up itself can overflow
mid-way through, because it ends up doing a left-shift of 32 bits on an unsigned
long value. If the size of an unsigned long is four bytes, this is undefined
behaviour, so there is no guarantee that we'll end up with a nice and tidy
0-value at the end.

Syzbot managed to turn this into a crash on arm32 by creating a DEVMAP_HASH with
max_entries > 0x80000000 and then trying to update it. Fix this by moving the
overflow check to before the rounding up operation.

Fixes: 6f9d451ab1a3 ("xdp: Add devmap_hash map type for looking up devices by hashed index")
Link: https://lore.kernel.org/r/000000000000ed666a0611af6818@google.com
Reported-and-tested-by: syzbot+8cd36f6b65f3cafd400a@syzkaller.appspotmail.com
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 kernel/bpf/devmap.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

Comments

John Fastabend Feb. 28, 2024, 9:26 p.m. UTC | #1
Toke Høiland-Jørgensen wrote:
> The devmap code allocates a number hash buckets equal to the next power of two
> of the max_entries value provided when creating the map. When rounding up to the
> next power of two, the 32-bit variable storing the number of buckets can
> overflow, and the code checks for overflow by checking if the truncated 32-bit value
> is equal to 0. However, on 32-bit arches the rounding up itself can overflow
> mid-way through, because it ends up doing a left-shift of 32 bits on an unsigned
> long value. If the size of an unsigned long is four bytes, this is undefined
> behaviour, so there is no guarantee that we'll end up with a nice and tidy
> 0-value at the end.
> 
> Syzbot managed to turn this into a crash on arm32 by creating a DEVMAP_HASH with
> max_entries > 0x80000000 and then trying to update it. Fix this by moving the
> overflow check to before the rounding up operation.
> 
> Fixes: 6f9d451ab1a3 ("xdp: Add devmap_hash map type for looking up devices by hashed index")
> Link: https://lore.kernel.org/r/000000000000ed666a0611af6818@google.com
> Reported-and-tested-by: syzbot+8cd36f6b65f3cafd400a@syzkaller.appspotmail.com
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> ---
>  kernel/bpf/devmap.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
> index a936c704d4e7..9b2286f9c6da 100644
> --- a/kernel/bpf/devmap.c
> +++ b/kernel/bpf/devmap.c
> @@ -130,13 +130,11 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr)
>  	bpf_map_init_from_attr(&dtab->map, attr);
>  
>  	if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) {
> -		dtab->n_buckets = roundup_pow_of_two(dtab->map.max_entries);
> -
> -		if (!dtab->n_buckets) /* Overflow check */
> +		if (dtab->map.max_entries > U32_MAX / 2)
>  			return -EINVAL;
> -	}
>  
> -	if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) {
> +		dtab->n_buckets = roundup_pow_of_two(dtab->map.max_entries);
> +
>  		dtab->dev_index_head = dev_map_create_hash(dtab->n_buckets,
>  							   dtab->map.numa_node);
>  		if (!dtab->dev_index_head)
> -- 
> 2.43.2
> 

I'm fairly sure this code was just taken from the hashtab implementation.
Do we also need a fix there?

        /* hash table size must be power of 2 */
        htab->n_buckets = roundup_pow_of_two(htab->map.max_entries);

The u32 check in hashtab is,

        /* prevent zero size kmalloc and check for u32 overflow */
        if (htab->n_buckets == 0 ||
            htab->n_buckets > U32_MAX / sizeof(struct bucket))
                goto free_htab;                 
                                  
Thanks,
John
Toke Høiland-Jørgensen Feb. 29, 2024, 10:16 a.m. UTC | #2
John Fastabend <john.fastabend@gmail.com> writes:

> Toke Høiland-Jørgensen wrote:
>> The devmap code allocates a number hash buckets equal to the next power of two
>> of the max_entries value provided when creating the map. When rounding up to the
>> next power of two, the 32-bit variable storing the number of buckets can
>> overflow, and the code checks for overflow by checking if the truncated 32-bit value
>> is equal to 0. However, on 32-bit arches the rounding up itself can overflow
>> mid-way through, because it ends up doing a left-shift of 32 bits on an unsigned
>> long value. If the size of an unsigned long is four bytes, this is undefined
>> behaviour, so there is no guarantee that we'll end up with a nice and tidy
>> 0-value at the end.
>> 
>> Syzbot managed to turn this into a crash on arm32 by creating a DEVMAP_HASH with
>> max_entries > 0x80000000 and then trying to update it. Fix this by moving the
>> overflow check to before the rounding up operation.
>> 
>> Fixes: 6f9d451ab1a3 ("xdp: Add devmap_hash map type for looking up devices by hashed index")
>> Link: https://lore.kernel.org/r/000000000000ed666a0611af6818@google.com
>> Reported-and-tested-by: syzbot+8cd36f6b65f3cafd400a@syzkaller.appspotmail.com
>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>> ---
>>  kernel/bpf/devmap.c | 8 +++-----
>>  1 file changed, 3 insertions(+), 5 deletions(-)
>> 
>> diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
>> index a936c704d4e7..9b2286f9c6da 100644
>> --- a/kernel/bpf/devmap.c
>> +++ b/kernel/bpf/devmap.c
>> @@ -130,13 +130,11 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr)
>>  	bpf_map_init_from_attr(&dtab->map, attr);
>>  
>>  	if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) {
>> -		dtab->n_buckets = roundup_pow_of_two(dtab->map.max_entries);
>> -
>> -		if (!dtab->n_buckets) /* Overflow check */
>> +		if (dtab->map.max_entries > U32_MAX / 2)
>>  			return -EINVAL;
>> -	}
>>  
>> -	if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) {
>> +		dtab->n_buckets = roundup_pow_of_two(dtab->map.max_entries);
>> +
>>  		dtab->dev_index_head = dev_map_create_hash(dtab->n_buckets,
>>  							   dtab->map.numa_node);
>>  		if (!dtab->dev_index_head)
>> -- 
>> 2.43.2
>> 
>
> I'm fairly sure this code was just taken from the hashtab implementation.

Yup, it was :)

> Do we also need a fix there?
>
>         /* hash table size must be power of 2 */
>         htab->n_buckets = roundup_pow_of_two(htab->map.max_entries);
>
> The u32 check in hashtab is,
>
>         /* prevent zero size kmalloc and check for u32 overflow */
>         if (htab->n_buckets == 0 ||
>             htab->n_buckets > U32_MAX / sizeof(struct bucket))
>                 goto free_htab;

Yeah, I don't see any reason why this wouldn't be susceptible to the
same issue. I'll send a follow-up to apply the same fix there.

-Toke
John Fastabend Feb. 29, 2024, 8:39 p.m. UTC | #3
Toke Høiland-Jørgensen wrote:
> John Fastabend <john.fastabend@gmail.com> writes:
> 
> > Toke Høiland-Jørgensen wrote:
> >> The devmap code allocates a number hash buckets equal to the next power of two
> >> of the max_entries value provided when creating the map. When rounding up to the
> >> next power of two, the 32-bit variable storing the number of buckets can
> >> overflow, and the code checks for overflow by checking if the truncated 32-bit value
> >> is equal to 0. However, on 32-bit arches the rounding up itself can overflow
> >> mid-way through, because it ends up doing a left-shift of 32 bits on an unsigned
> >> long value. If the size of an unsigned long is four bytes, this is undefined
> >> behaviour, so there is no guarantee that we'll end up with a nice and tidy
> >> 0-value at the end.

Hi Toke, dumb question where is this left-shift noted above? It looks like fls_long
tries to account by having a check for sizeof(l) == 4. I'm asking mostly because
I've found a few more spots without this check.

> >> 
> >> Syzbot managed to turn this into a crash on arm32 by creating a DEVMAP_HASH with
> >> max_entries > 0x80000000 and then trying to update it. Fix this by moving the
> >> overflow check to before the rounding up operation.
> >> 
> >> Fixes: 6f9d451ab1a3 ("xdp: Add devmap_hash map type for looking up devices by hashed index")
> >> Link: https://lore.kernel.org/r/000000000000ed666a0611af6818@google.com
> >> Reported-and-tested-by: syzbot+8cd36f6b65f3cafd400a@syzkaller.appspotmail.com
> >> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> >> ---
> >>  kernel/bpf/devmap.c | 8 +++-----
> >>  1 file changed, 3 insertions(+), 5 deletions(-)
> >> 
> >> diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
> >> index a936c704d4e7..9b2286f9c6da 100644
> >> --- a/kernel/bpf/devmap.c
> >> +++ b/kernel/bpf/devmap.c
> >> @@ -130,13 +130,11 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr)
> >>  	bpf_map_init_from_attr(&dtab->map, attr);
> >>  
> >>  	if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) {
> >> -		dtab->n_buckets = roundup_pow_of_two(dtab->map.max_entries);
> >> -
> >> -		if (!dtab->n_buckets) /* Overflow check */
> >> +		if (dtab->map.max_entries > U32_MAX / 2)
> >>  			return -EINVAL;
> >> -	}
> >>  
> >> -	if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) {
> >> +		dtab->n_buckets = roundup_pow_of_two(dtab->map.max_entries);
> >> +
> >>  		dtab->dev_index_head = dev_map_create_hash(dtab->n_buckets,
> >>  							   dtab->map.numa_node);
> >>  		if (!dtab->dev_index_head)
> >> -- 
> >> 2.43.2
> >> 
> >
> > I'm fairly sure this code was just taken from the hashtab implementation.
> 
> Yup, it was :)
> 
> > Do we also need a fix there?
> >
> >         /* hash table size must be power of 2 */
> >         htab->n_buckets = roundup_pow_of_two(htab->map.max_entries);
> >
> > The u32 check in hashtab is,
> >
> >         /* prevent zero size kmalloc and check for u32 overflow */
> >         if (htab->n_buckets == 0 ||
> >             htab->n_buckets > U32_MAX / sizeof(struct bucket))
> >                 goto free_htab;
> 
> Yeah, I don't see any reason why this wouldn't be susceptible to the
> same issue. I'll send a follow-up to apply the same fix there.

Cool thanks one more question above.

> 
> -Toke
>
Toke Høiland-Jørgensen March 1, 2024, 1:02 p.m. UTC | #4
John Fastabend <john.fastabend@gmail.com> writes:

> Toke Høiland-Jørgensen wrote:
>> John Fastabend <john.fastabend@gmail.com> writes:
>> 
>> > Toke Høiland-Jørgensen wrote:
>> >> The devmap code allocates a number hash buckets equal to the next power of two
>> >> of the max_entries value provided when creating the map. When rounding up to the
>> >> next power of two, the 32-bit variable storing the number of buckets can
>> >> overflow, and the code checks for overflow by checking if the truncated 32-bit value
>> >> is equal to 0. However, on 32-bit arches the rounding up itself can overflow
>> >> mid-way through, because it ends up doing a left-shift of 32 bits on an unsigned
>> >> long value. If the size of an unsigned long is four bytes, this is undefined
>> >> behaviour, so there is no guarantee that we'll end up with a nice and tidy
>> >> 0-value at the end.
>
> Hi Toke, dumb question where is this left-shift noted above? It looks
> like fls_long tries to account by having a check for sizeof(l) == 4.
> I'm asking mostly because I've found a few more spots without this
> check.

That check in fls_long only switches between too different
implementations of the fls op itself (fls() vs fls64()). AFAICT this is
mostly meaningful for the generic (non-ASM) version that iterates over
the bits instead of just emitting a single instruction.

The shift is in the caller:

static inline __attribute__((const))
unsigned long __roundup_pow_of_two(unsigned long n)
{
	return 1UL << fls_long(n - 1);
}

If this is called with a value > 0x80000000, fls_long() will (correctly)
return 32, leading to the ub[0] shift when sizeof(unsigned long) == 4.

-Toke

[0] https://wiki.sei.cmu.edu/confluence/display/c/int34-c.+do+not+shift+an+expression+by+a+negative+number+of+bits+or+by+greater+than+or+equal+to+the+number+of+bits+that+exist+in+the+operand
John Fastabend March 1, 2024, 5:22 p.m. UTC | #5
Toke Høiland-Jørgensen wrote:
> John Fastabend <john.fastabend@gmail.com> writes:
> 
> > Toke Høiland-Jørgensen wrote:
> >> John Fastabend <john.fastabend@gmail.com> writes:
> >> 
> >> > Toke Høiland-Jørgensen wrote:
> >> >> The devmap code allocates a number hash buckets equal to the next power of two
> >> >> of the max_entries value provided when creating the map. When rounding up to the
> >> >> next power of two, the 32-bit variable storing the number of buckets can
> >> >> overflow, and the code checks for overflow by checking if the truncated 32-bit value
> >> >> is equal to 0. However, on 32-bit arches the rounding up itself can overflow
> >> >> mid-way through, because it ends up doing a left-shift of 32 bits on an unsigned
> >> >> long value. If the size of an unsigned long is four bytes, this is undefined
> >> >> behaviour, so there is no guarantee that we'll end up with a nice and tidy
> >> >> 0-value at the end.
> >
> > Hi Toke, dumb question where is this left-shift noted above? It looks
> > like fls_long tries to account by having a check for sizeof(l) == 4.
> > I'm asking mostly because I've found a few more spots without this
> > check.
> 
> That check in fls_long only switches between too different
> implementations of the fls op itself (fls() vs fls64()). AFAICT this is
> mostly meaningful for the generic (non-ASM) version that iterates over
> the bits instead of just emitting a single instruction.
> 
> The shift is in the caller:
> 
> static inline __attribute__((const))
> unsigned long __roundup_pow_of_two(unsigned long n)
> {
> 	return 1UL << fls_long(n - 1);
> }
> 
> If this is called with a value > 0x80000000, fls_long() will (correctly)
> return 32, leading to the ub[0] shift when sizeof(unsigned long) == 4.

Yep thanks I was looking in fls_long there walked past the pow-of_two()
bits. Thanks.
diff mbox series

Patch

diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index a936c704d4e7..9b2286f9c6da 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -130,13 +130,11 @@  static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr)
 	bpf_map_init_from_attr(&dtab->map, attr);
 
 	if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) {
-		dtab->n_buckets = roundup_pow_of_two(dtab->map.max_entries);
-
-		if (!dtab->n_buckets) /* Overflow check */
+		if (dtab->map.max_entries > U32_MAX / 2)
 			return -EINVAL;
-	}
 
-	if (attr->map_type == BPF_MAP_TYPE_DEVMAP_HASH) {
+		dtab->n_buckets = roundup_pow_of_two(dtab->map.max_entries);
+
 		dtab->dev_index_head = dev_map_create_hash(dtab->n_buckets,
 							   dtab->map.numa_node);
 		if (!dtab->dev_index_head)