
[bpf-next,1/4] bpf: Use __GFP_NOWARN for kvcalloc when attaching multiple uprobes

Message ID 20231211112843.4147157-2-houtao@huaweicloud.com (mailing list archive)
State Superseded
Delegated to: BPF
Series bpf: Fix warnings in kvmalloc_node()

Checks

Context                           Check    Description
bpf/vmtest-bpf-next-PR            success  PR summary
bpf/vmtest-bpf-next-VM_Test-2     success  Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-3     success  Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-0     success  Logs for Lint
bpf/vmtest-bpf-next-VM_Test-1     success  Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-8     success  Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-24    success  Logs for x86_64-llvm-16 / build / build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-6     success  Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9     success  Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-15    success  Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-14    success  Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-13    success  Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16    success  Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-5     success  Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-7     success  Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-4     success  Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-18    success  Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-17    success  Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21    success  Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-19    success  Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-20    success  Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22    success  Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23    success  Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-25    success  Logs for x86_64-llvm-16 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-26    success  Logs for x86_64-llvm-16 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-27    success  Logs for x86_64-llvm-16 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-28    success  Logs for x86_64-llvm-16 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29    success  Logs for x86_64-llvm-16 / veristat
bpf/vmtest-bpf-next-VM_Test-12    success  Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
netdev/series_format              success  Posting correctly formatted
netdev/tree_selection             success  Clearly marked for bpf-next
netdev/ynl                        success  SINGLE THREAD; Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present              success  Fixes tag not required for -next series
netdev/header_inline              success  No static functions without inline keyword in header files
netdev/build_32bit                success  Errors and warnings before: 1132 this patch: 1132
netdev/cc_maintainers             fail     1 blamed authors not CCed: laoar.shao@gmail.com; 5 maintainers not CCed: mhiramat@kernel.org mathieu.desnoyers@efficios.com rostedt@goodmis.org linux-trace-kernel@vger.kernel.org laoar.shao@gmail.com
netdev/build_clang                success  Errors and warnings before: 1143 this patch: 1143
netdev/verify_signedoff           success  Signed-off-by tag matches author and committer
netdev/deprecated_api             success  None detected
netdev/check_selftest             success  No net selftest shell script
netdev/verify_fixes               success  Fixes tag looks correct
netdev/build_allmodconfig_warn    success  Errors and warnings before: 1159 this patch: 1159
netdev/checkpatch                 success  total: 0 errors, 0 warnings, 0 checks, 8 lines checked
netdev/build_clang_rust           success  No Rust files in patch. Skipping build
netdev/kdoc                       success  Errors and warnings before: 0 this patch: 0
netdev/source_inline              success  Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-11    success  Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-10    success  Logs for s390x-gcc / test (test_maps, false, 360) / test_maps on s390x with gcc

Commit Message

Hou Tao Dec. 11, 2023, 11:28 a.m. UTC
From: Hou Tao <houtao1@huawei.com>

An abnormally big cnt may be passed to link_create.uprobe_multi.cnt,
and it will trigger the following warning in kvmalloc_node():

	if (unlikely(size > INT_MAX)) {
		WARN_ON_ONCE(!(flags & __GFP_NOWARN));
		return NULL;
	}

Fix the warning by using __GFP_NOWARN when invoking kvcalloc() in
bpf_uprobe_multi_link_attach().

Fixes: 89ae89f53d20 ("bpf: Add multi uprobe link")
Reported-by: xingwei lee <xrivendell7@gmail.com>
Closes: https://lore.kernel.org/bpf/CABOYnLwwJY=yFAGie59LFsUsBAgHfroVqbzZ5edAXbFE3YiNVA@mail.gmail.com
Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 kernel/trace/bpf_trace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
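
The guard quoted in the commit message can be modelled in userspace to see
which cnt values reach the warning. This is a minimal sketch, not kernel
code: ELEM_SIZE is a hypothetical stand-in for sizeof(*uprobes), which the
thread does not state, cnt is assumed to be a 32-bit unsigned value, and
would_warn() is an illustrative helper that reproduces only the quoted size
check.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical element size; the real sizeof(*uprobes) is not given in the thread. */
#define ELEM_SIZE 64u

/*
 * Userspace model of the quoted guard in kvmalloc_node(): returns true when
 * a request for cnt elements would hit WARN_ON_ONCE(), i.e. the byte size
 * exceeds INT_MAX and the caller did not pass __GFP_NOWARN.
 */
static bool would_warn(uint32_t cnt, bool gfp_nowarn)
{
	uint64_t size = (uint64_t)cnt * ELEM_SIZE; /* 32-bit * 64 fits in 64 bits */

	return size > (uint64_t)INT_MAX && !gfp_nowarn;
}

/* Usage: with 64-byte elements the threshold sits at cnt = 1 << 25. */
void would_warn_example(void)
{
	assert(!would_warn((1u << 25) - 1, false)); /* 2^31 - 64 bytes: no warning */
	assert(would_warn(1u << 25, false));        /* 2^31 bytes: warning fires */
	assert(!would_warn(1u << 25, true));        /* same size, warning suppressed */
}
```

In this model the allocation is refused in both of the last two cases; the
flag changes only whether the warning is emitted.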

Comments

Jiri Olsa Dec. 11, 2023, 12:55 p.m. UTC | #1
On Mon, Dec 11, 2023 at 07:28:40PM +0800, Hou Tao wrote:
> From: Hou Tao <houtao1@huawei.com>
> 
> An abnormally big cnt may be passed to link_create.uprobe_multi.cnt,
> and it will trigger the following warning in kvmalloc_node():
> 
> 	if (unlikely(size > INT_MAX)) {
> 		WARN_ON_ONCE(!(flags & __GFP_NOWARN));
> 		return NULL;
> 	}
> 
> Fix the warning by using __GFP_NOWARN when invoking kvcalloc() in
> bpf_uprobe_multi_link_attach().
> 
> Fixes: 89ae89f53d20 ("bpf: Add multi uprobe link")
> Reported-by: xingwei lee <xrivendell7@gmail.com>
> Closes: https://lore.kernel.org/bpf/CABOYnLwwJY=yFAGie59LFsUsBAgHfroVqbzZ5edAXbFE3YiNVA@mail.gmail.com
> Signed-off-by: Hou Tao <houtao1@huawei.com>

Acked-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka

> ---
>  kernel/trace/bpf_trace.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 774cf476a892..07b9b5896d6c 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -3378,7 +3378,7 @@ int bpf_uprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr
>  	err = -ENOMEM;
>  
>  	link = kzalloc(sizeof(*link), GFP_KERNEL);
> -	uprobes = kvcalloc(cnt, sizeof(*uprobes), GFP_KERNEL);
> +	uprobes = kvcalloc(cnt, sizeof(*uprobes), GFP_KERNEL | __GFP_NOWARN);
>  
>  	if (!uprobes || !link)
>  		goto error_free;
> -- 
> 2.29.2
>
Alexei Starovoitov Dec. 11, 2023, 4:50 p.m. UTC | #2
On Mon, Dec 11, 2023 at 3:27 AM Hou Tao <houtao@huaweicloud.com> wrote:
>
> From: Hou Tao <houtao1@huawei.com>
>
> An abnormally big cnt may be passed to link_create.uprobe_multi.cnt,
> and it will trigger the following warning in kvmalloc_node():
>
>         if (unlikely(size > INT_MAX)) {
>                 WARN_ON_ONCE(!(flags & __GFP_NOWARN));
>                 return NULL;
>         }
>
> Fix the warning by using __GFP_NOWARN when invoking kvcalloc() in
> bpf_uprobe_multi_link_attach().
>
> Fixes: 89ae89f53d20 ("bpf: Add multi uprobe link")
> Reported-by: xingwei lee <xrivendell7@gmail.com>
> Closes: https://lore.kernel.org/bpf/CABOYnLwwJY=yFAGie59LFsUsBAgHfroVqbzZ5edAXbFE3YiNVA@mail.gmail.com
> Signed-off-by: Hou Tao <houtao1@huawei.com>
> ---
>  kernel/trace/bpf_trace.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 774cf476a892..07b9b5896d6c 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -3378,7 +3378,7 @@ int bpf_uprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr
>         err = -ENOMEM;
>
>         link = kzalloc(sizeof(*link), GFP_KERNEL);
> -       uprobes = kvcalloc(cnt, sizeof(*uprobes), GFP_KERNEL);
> +       uprobes = kvcalloc(cnt, sizeof(*uprobes), GFP_KERNEL | __GFP_NOWARN);

__GFP_NOWARN will hide actual malloc failures.
Let's limit cnt instead. Both for k and u multi probes.
Hou Tao Dec. 12, 2023, 3:44 a.m. UTC | #3
Hi,

On 12/12/2023 12:50 AM, Alexei Starovoitov wrote:
> __GFP_NOWARN will hide actual malloc failures.
> Let's limit cnt instead. Both for k and u multi probes.

Do you mean that there will be no warning message when the allocation
request cannot be fulfilled? kvcalloc() still returns NULL when
__GFP_NOWARN is used, so the attach path still returns -ENOMEM and
userspace knows the allocation failed. I also found out that __GFP_NOWARN
only affects the invocation of vmalloc(), because kvmalloc_node() already
enables __GFP_NOWARN for kmalloc() by default when the passed size is
greater than PAGE_SIZE.

I also thought about fixing the problem by limiting cnt, but I could not
find good reference values for the limits. For multiple kprobes, maybe
the number of kallsyms entries can be used as an anchor (e.g., it is
207617 on my local dev machine). How about using
__roundup_pow_of_two(207617 * 4) = 1 << 20 for multiple kprobes? For
multiple uprobes, maybe 1 << 20 is also suitable.
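
The rounding arithmetic behind the proposed limit can be checked with a
small userspace sketch. roundup_pow_of_two() below is an illustrative
stand-in for the kernel's __roundup_pow_of_two(), not the kernel
implementation, and the symbol count is the one reported above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Smallest power of two that is >= n. Userspace stand-in for the kernel
 * helper; returns 0 when n is 0 or too large to round up in 64 bits.
 */
static uint64_t roundup_pow_of_two(uint64_t n)
{
	uint64_t p = 1;

	if (n == 0 || n > (1ULL << 63))
		return 0;
	while (p < n)
		p <<= 1;
	return p;
}

/* Usage: 207617 * 4 = 830468 lies between 2^19 and 2^20. */
void roundup_example(void)
{
	assert((1ULL << 19) < 207617ULL * 4);
	assert(roundup_pow_of_two(207617ULL * 4) == (1ULL << 20));
}
```

The factor of 4 leaves headroom above the measured symbol count, and the
rounded result is 1048576 entries.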
Jiri Olsa Dec. 12, 2023, 9:54 a.m. UTC | #4
On Tue, Dec 12, 2023 at 11:44:34AM +0800, Hou Tao wrote:
> Hi,
> 
> On 12/12/2023 12:50 AM, Alexei Starovoitov wrote:
> > __GFP_NOWARN will hide actual malloc failures.
> > Let's limit cnt instead. Both for k and u multi probes.
> 
> Do you mean that there will be no warning message when the allocation
> request cannot be fulfilled? kvcalloc() still returns NULL when
> __GFP_NOWARN is used, so the attach path still returns -ENOMEM and
> userspace knows the allocation failed. I also found out that __GFP_NOWARN
> only affects the invocation of vmalloc(), because kvmalloc_node() already
> enables __GFP_NOWARN for kmalloc() by default when the passed size is
> greater than PAGE_SIZE.
> 
> I also thought about fixing the problem by limiting cnt, but I could not
> find good reference values for the limits. For multiple kprobes, maybe
> the number of kallsyms entries can be used as an anchor (e.g., it is
> 207617 on my local dev machine). How about using
> __roundup_pow_of_two(207617 * 4) = 1 << 20 for multiple kprobes? For
> multiple uprobes, maybe 1 << 20 is also suitable.

I think available_filter_functions is more relevant, because kallsyms
has everything

on fedora kernel:
  # cat available_filter_functions | wc -l
  80177

anyway to be on the safe side with some other configs and possible
huge kernel modules the '1 << 20' looks good to me, also for uprobe
multi

jirka
Hou Tao Dec. 12, 2023, 2:05 p.m. UTC | #5
Hi,

On 12/12/2023 5:54 PM, Jiri Olsa wrote:
> On Tue, Dec 12, 2023 at 11:44:34AM +0800, Hou Tao wrote:
>> Hi,
>>
>> On 12/12/2023 12:50 AM, Alexei Starovoitov wrote:
>>> __GFP_NOWARN will hide actual malloc failures.
>>> Let's limit cnt instead. Both for k and u multi probes.
>> Do you mean that there will be no warning message when the allocation
>> request cannot be fulfilled? kvcalloc() still returns NULL when
>> __GFP_NOWARN is used, so the attach path still returns -ENOMEM and
>> userspace knows the allocation failed. I also found out that __GFP_NOWARN
>> only affects the invocation of vmalloc(), because kvmalloc_node() already
>> enables __GFP_NOWARN for kmalloc() by default when the passed size is
>> greater than PAGE_SIZE.
>>
>> I also thought about fixing the problem by limiting cnt, but I could not
>> find good reference values for the limits. For multiple kprobes, maybe
>> the number of kallsyms entries can be used as an anchor (e.g., it is
>> 207617 on my local dev machine). How about using
>> __roundup_pow_of_two(207617 * 4) = 1 << 20 for multiple kprobes? For
>> multiple uprobes, maybe 1 << 20 is also suitable.
> I think available_filter_functions is more relevant, because kallsyms
> has everything
>
> on fedora kernel:
>   # cat available_filter_functions | wc -l
>   80177

Agreed. Only functions in available_filter_functions could use kprobe.
>
> anyway to be on the safe side with some other configs and possible
> huge kernel modules the '1 << 20' looks good to me, also for uprobe
> multi

Thanks. Will post v2 if Alexei is also fine with such limitations.
>
> jirka
Alexei Starovoitov Dec. 12, 2023, 4:58 p.m. UTC | #6
On Tue, Dec 12, 2023 at 6:06 AM Hou Tao <houtao@huaweicloud.com> wrote:
>
> > anyway to be on the safe side with some other configs and possible
> > huge kernel modules the '1 << 20' looks good to me, also for uprobe
> > multi
>
> Thanks. Will post v2 if Alexei is also fine with such limitations.

Yeah. The limit looks fine.

> >> request cannot be fulfilled? kvcalloc() still returns NULL when
> >> __GFP_NOWARN is used, so the attach path still returns -ENOMEM and
> >> userspace knows the allocation failed. I also found out that __GFP_NOWARN
> >> only affects the invocation of vmalloc(), because kvmalloc_node() already
> >> enables __GFP_NOWARN for kmalloc() by default when the passed size is
> >> greater than PAGE_SIZE.

Right, because kmalloc of more than a page will likely fail.
But vmalloc() may fail for all kinds of other reasons.
Suppressing them all prevents those messages appearing in the future.
The warn is there by default, so that users think about memory
allocations they request. So it served its exact purpose.
Just returning ENOMEM to user space would have been unnoticed
and cnt > INT_MAX would continue to be acceptable which potentially
opens up DoS, and other abuse.
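
The alternative agreed on in the thread, rejecting an oversized cnt before
any allocation is attempted, might look roughly like the sketch below.
MAX_MULTI_CNT, check_multi_cnt() and the choice of -EINVAL and -E2BIG are
assumptions for illustration; the thread settles only on the value 1 << 20,
and the actual v2 patch may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical name; only the value 1 << 20 comes from the discussion. */
#define MAX_MULTI_CNT (1u << 20)

/*
 * Validate a user-supplied probe count before allocating anything.
 * Returns 0 when cnt is acceptable and a negative errno otherwise.
 */
static int check_multi_cnt(uint32_t cnt)
{
	if (cnt == 0)
		return -EINVAL;	/* nothing to attach */
	if (cnt > MAX_MULTI_CNT)
		return -E2BIG;	/* rejected up front, no allocation, no warning */
	return 0;
}

/* Usage: the limit itself is accepted, one past it is rejected. */
void check_multi_cnt_example(void)
{
	assert(check_multi_cnt(MAX_MULTI_CNT) == 0);
	assert(check_multi_cnt(MAX_MULTI_CNT + 1) == -E2BIG);
}
```

With a bound like this, a cnt within the limit never reaches the
size > INT_MAX guard for reasonably sized elements, so the allocation
warning stays enabled for genuine failures.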

Patch

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 774cf476a892..07b9b5896d6c 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -3378,7 +3378,7 @@  int bpf_uprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr
 	err = -ENOMEM;
 
 	link = kzalloc(sizeof(*link), GFP_KERNEL);
-	uprobes = kvcalloc(cnt, sizeof(*uprobes), GFP_KERNEL);
+	uprobes = kvcalloc(cnt, sizeof(*uprobes), GFP_KERNEL | __GFP_NOWARN);
 
 	if (!uprobes || !link)
 		goto error_free;