
bpf: Ensure BPF programs testing skb context initialization

Message ID 20240710084633.2229015-1-michal.switala@infogain.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Series bpf: Ensure BPF programs testing skb context initialization

Checks

Context Check Description
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-8 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-7 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-33 fail Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-4 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 fail Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for s390x-gcc / test (test_maps, false, 360) / test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-10 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-20 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-11 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-35 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-16 fail Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-26 fail Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17-O2
bpf/vmtest-bpf-next-VM_Test-18 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-19 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-34 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-next-VM_Test-12 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-17 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-36 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18-O2
bpf/vmtest-bpf-next-VM_Test-42 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-next-VM_Test-22 fail Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 fail Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-38 fail Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-21 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-39 fail Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-25 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-15 fail Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-31 fail Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-14 fail Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-41 fail Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-37 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-40 fail Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-32 fail Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit fail Errors and warnings before: 18 this patch: 18
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 15 maintainers not CCed: kpsingh@kernel.org haoluo@google.com edumazet@google.com kuba@kernel.org daniel@iogearbox.net andrii@kernel.org john.fastabend@gmail.com jolsa@kernel.org ast@kernel.org yonghong.song@linux.dev martin.lau@linux.dev song@kernel.org eddyz87@gmail.com pabeni@redhat.com sdf@fomichev.me
netdev/build_clang success Errors and warnings before: 821 this patch: 821
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 839 this patch: 839
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 42 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 1 this patch: 1
netdev/source_inline success Was 0 now: 0

Commit Message

Michal Switala July 10, 2024, 8:46 a.m. UTC
This commit addresses an issue where a netdevice was found to be uninitialized.
To mitigate this case, the change ensures that BPF programs designed to test
skb context initialization thoroughly verify the availability of a fully
initialized context before execution. The root cause of a NULL ctx stems from
the initialization process in bpf_ctx_init(). This function returns NULL if
the user initializes the bpf_attr variables ctx_in and ctx_out with invalid
pointers or sets them to NULL. These variables are directly controlled by user
input, and if both are NULL, the context cannot be initialized, resulting in a
NULL ctx.

Reported-by: syzbot+cca39e6e84a367a7e6f6@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=cca39e6e84a367a7e6f6
Link: https://lore.kernel.org/all/000000000000b95d41061cbf302a@google.com/
Signed-off-by: Michal Switala <michal.switala@infogain.com>
---
 net/bpf/test_run.c | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)
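
The commit message describes two things: a context initializer that yields NULL when neither ctx_in nor ctx_out is supplied, and a guard that rejects a NULL ctx for program types that need one. The following is a minimal user-space sketch of that description, not the kernel code. The names model_test_attr, model_ctx_init and model_check_ctx are hypothetical and introduced only for illustration; the sketch covers only the both-NULL case and does not model copying from user memory.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the ctx_in/ctx_out fields passed by user space. */
struct model_test_attr {
	uint64_t ctx_in;   /* user pointer to the input context, 0 if unset */
	uint64_t ctx_out;  /* user pointer to the output context, 0 if unset */
};

/*
 * Model of the early exit described in the commit message: when the caller
 * supplies neither ctx_in nor ctx_out, no context is built and NULL is
 * returned. Otherwise the caller-provided storage stands in for the buffer
 * the kernel would allocate and fill.
 */
static void *model_ctx_init(const struct model_test_attr *attr, void *storage)
{
	if (!attr->ctx_in && !attr->ctx_out)
		return NULL;
	return storage;
}

/*
 * Model of the guard the patch adds: a NULL ctx is an error only when the
 * program type was classified as needing a context.
 */
static int model_check_ctx(const void *ctx, bool ctx_needed)
{
	if (!ctx && ctx_needed)
		return -EINVAL;
	return 0;
}
```

With both fields left at zero, model_ctx_init() returns NULL and model_check_ctx() turns that into -EINVAL whenever ctx_needed is true, which is the behaviour the patch proposes for bpf_prog_test_run_skb.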

Comments

Alexei Starovoitov July 10, 2024, 6:38 p.m. UTC | #1
On Wed, Jul 10, 2024 at 4:58 AM Michal Switala
<michal.switala@infogain.com> wrote:
>
> This commit addresses an issue where a netdevice was found to be uninitialized.
> To mitigate this case, the change ensures that BPF programs designed to test
> skb context initialization thoroughly verify the availability of a fully
> initialized context before execution.The root cause of a NULL ctx stems from
> the initialization process in bpf_ctx_init(). This function returns NULL if
> the user initializes the bpf_attr variables ctx_in and ctx_out with invalid
> pointers or sets them to NULL. These variables are directly controlled by user
> input, and if both are NULL, the context cannot be initialized, resulting in a
> NULL ctx.
>
> Reported-by: syzbot+cca39e6e84a367a7e6f6@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=cca39e6e84a367a7e6f6
> Link: https://lore.kernel.org/all/000000000000b95d41061cbf302a@google.com/

Something doesn't add up.
This syzbot report is about:

dev_map_enqueue+0x31/0x3e0 kernel/bpf/devmap.c:539
__xdp_do_redirect_frame net/core/filter.c:4397 [inline]
bpf_prog_test_run_xdp

while you're fixing bpf_prog_test_run_skb ?

pw-bot: cr

> Signed-off-by: Michal Switala <michal.switala@infogain.com>
> ---
>  net/bpf/test_run.c | 30 +++++++++++++++++++++++++++++-
>  1 file changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index 36ae54f57bf5..8b2efcee059f 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -970,7 +970,7 @@ static struct proto bpf_dummy_proto = {
>  int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
>                           union bpf_attr __user *uattr)
>  {
> -       bool is_l2 = false, is_direct_pkt_access = false;
> +       bool is_l2 = false, is_direct_pkt_access = false, ctx_needed = false;
>         struct net *net = current->nsproxy->net_ns;
>         struct net_device *dev = net->loopback_dev;
>         u32 size = kattr->test.data_size_in;
> @@ -998,6 +998,34 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
>                 return PTR_ERR(ctx);
>         }
>
> +       switch (prog->type) {
> +       case BPF_PROG_TYPE_SOCKET_FILTER:
> +       case BPF_PROG_TYPE_SCHED_CLS:
> +       case BPF_PROG_TYPE_SCHED_ACT:
> +       case BPF_PROG_TYPE_XDP:
> +       case BPF_PROG_TYPE_CGROUP_SKB:
> +       case BPF_PROG_TYPE_CGROUP_SOCK:
> +       case BPF_PROG_TYPE_SOCK_OPS:
> +       case BPF_PROG_TYPE_SK_SKB:
> +       case BPF_PROG_TYPE_SK_MSG:
> +       case BPF_PROG_TYPE_CGROUP_SOCK_ADDR:
> +       case BPF_PROG_TYPE_LWT_SEG6LOCAL:
> +       case BPF_PROG_TYPE_SK_REUSEPORT:
> +       case BPF_PROG_TYPE_NETFILTER:
> +       case BPF_PROG_TYPE_LWT_IN:
> +       case BPF_PROG_TYPE_LWT_OUT:
> +       case BPF_PROG_TYPE_LWT_XMIT:
> +               ctx_needed = true;
> +               break;
> +       default:
> +               break;
> +       }
> +
> +       if (!ctx && ctx_needed) {
> +               kfree(data);
> +               return -EINVAL;
> +       }
> +
>         switch (prog->type) {
>         case BPF_PROG_TYPE_SCHED_CLS:
>         case BPF_PROG_TYPE_SCHED_ACT:
> --
> 2.43.0
>
>

Michal Switala July 15, 2024, 6:13 p.m. UTC | #2
Hi,

The reproducer calls both bpf_prog_test_run_xdp and bpf_prog_test_run_skb, and
both lead to the invocation of dev_map_enqueue. For the former, the backtrace
is recorded in its entirety, whereas for the latter it is not. I think the bug
might be incorrectly reported on syzkaller, because during GDB debugging the
problem occurred in functions called from bpf_prog_test_run_skb. I also tested
my patch on syzkaller and the tests passed.

Regards
Michal

Martin KaFai Lau July 15, 2024, 9:59 p.m. UTC | #3
On 7/15/24 11:13 AM, Michal Switala wrote:

 >> Reported-by: syzbot+cca39e6e84a367a7e6f6@syzkaller.appspotmail.com
 >> Closes: https://syzkaller.appspot.com/bug?extid=cca39e6e84a367a7e6f6
 >> Link: https://lore.kernel.org/all/000000000000b95d41061cbf302a@google.com/
 >
 > Something doesn't add up.
 > This syzbot report is about:
 >
 > dev_map_enqueue+0x31/0x3e0 kernel/bpf/devmap.c:539
 > __xdp_do_redirect_frame net/core/filter.c:4397 [inline]
 > bpf_prog_test_run_xdp
 >
 > why you're fixing bpf_prog_test_run_skb ?


[ Please keep the relevant email context in the reply ]


> The reproducer calls the methods bpf_prog_test_run_xdp and
> bpf_prog_test_run_skb. Both lead to the invocation of dev_map_enqueue, in the

The syzbot report is triggering from the bpf_prog_test_run_xdp. I agree with 
Alexei that fixing the bpf_prog_test_run_skb does not make sense. At least I 
don't see how dev_map_enqueue can be used from bpf_prog_test_run_skb.

It looks very similar to 
https://lore.kernel.org/bpf/000000000000f6531b061494e696@google.com/. It has 
been fixed in commit 5bcf0dcbf906 ("xdp: use flags field to disambiguate 
broadcast redirect")

I tried the C repro. I can reproduce in the bpf tree also which should have the 
fix. I cannot reproduce in the bpf-next though.

Cc Toke who knows more details here.

> case of the former, the backtrace is recorded in its entirety, whereas for the
> latter it is not. I think the bug might be incorrectly reported on syzkaller, as
> during GDB debugging, the problem occurred in functions called from
> bpf_prog_test_run_skb. I also ran testing of my patch on syzkaller and the tests
> passed.

Toke Høiland-Jørgensen July 17, 2024, 1:28 p.m. UTC | #4
Martin KaFai Lau <martin.lau@linux.dev> writes:

> On 7/15/24 11:13 AM, Michal Switala wrote:
>
>  >> Reported-by: syzbot+cca39e6e84a367a7e6f6@syzkaller.appspotmail.com
>  >> Closes: https://syzkaller.appspot.com/bug?extid=cca39e6e84a367a7e6f6
>  >> Link: https://lore.kernel.org/all/000000000000b95d41061cbf302a@google.com/
>  >
>  > Something doesn't add up.
>  > This syzbot report is about:
>  >
>  > dev_map_enqueue+0x31/0x3e0 kernel/bpf/devmap.c:539
>  > __xdp_do_redirect_frame net/core/filter.c:4397 [inline]
>  > bpf_prog_test_run_xdp
>  >
>  > why you're fixing bpf_prog_test_run_skb ?
>
>
> [ Please keep the relevant email context in the reply ]
>
>
>> The reproducer calls the methods bpf_prog_test_run_xdp and
>> bpf_prog_test_run_skb. Both lead to the invocation of dev_map_enqueue, in the
>
> The syzbot report is triggering from the bpf_prog_test_run_xdp. I agree with 
> Alexei that fixing the bpf_prog_test_run_skb does not make sense. At least I 
> don't see how dev_map_enqueue can be used from bpf_prog_test_run_skb.

Me neither.

> It looks very similar to 
> https://lore.kernel.org/bpf/000000000000f6531b061494e696@google.com/. It has 
> been fixed in commit 5bcf0dcbf906 ("xdp: use flags field to disambiguate 
> broadcast redirect")
>
> I tried the C repro. I can reproduce in the bpf tree also which should have the 
> fix. I cannot reproduce in the bpf-next though.
>
> Cc Toke who knows more details here.

Hmm, yeah, it does look kinda similar. Do you mean that the C repro from
this new report triggers the crash for you on the current -bpf tree?

-Toke

Martin KaFai Lau July 17, 2024, 7:16 p.m. UTC | #5
On 7/17/24 6:28 AM, Toke Høiland-Jørgensen wrote:
>> It looks very similar to
>> https://lore.kernel.org/bpf/000000000000f6531b061494e696@google.com/. It has
>> been fixed in commit 5bcf0dcbf906 ("xdp: use flags field to disambiguate
>> broadcast redirect")
>>
>> I tried the C repro. I can reproduce in the bpf tree also which should have the
>> fix. I cannot reproduce in the bpf-next though.
>>
>> Cc Toke who knows more details here.
> 
> Hmm, yeah, it does look kinda similar. Do you mean that the C repro from
> this new report triggers the crash for you on the current -bpf tree?

I was able to repro in bpf tree ~two days ago but not now. The bpf tree has been 
fast forwarded and has the 6.10 changes. I just tried linux-stable/linux-6.9.y 
which has the fix in the commit 5bcf0dcbf906. The syzbot report (against the 
36534d3c5453) also has that fix.

In particular, the syzbot repro I tried:
https://syzkaller.appspot.com/text?tag=ReproC&x=17caa30a980000

Patch

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 36ae54f57bf5..8b2efcee059f 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -970,7 +970,7 @@  static struct proto bpf_dummy_proto = {
 int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
 			  union bpf_attr __user *uattr)
 {
-	bool is_l2 = false, is_direct_pkt_access = false;
+	bool is_l2 = false, is_direct_pkt_access = false, ctx_needed = false;
 	struct net *net = current->nsproxy->net_ns;
 	struct net_device *dev = net->loopback_dev;
 	u32 size = kattr->test.data_size_in;
@@ -998,6 +998,34 @@  int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
 		return PTR_ERR(ctx);
 	}
 
+	switch (prog->type) {
+	case BPF_PROG_TYPE_SOCKET_FILTER:
+	case BPF_PROG_TYPE_SCHED_CLS:
+	case BPF_PROG_TYPE_SCHED_ACT:
+	case BPF_PROG_TYPE_XDP:
+	case BPF_PROG_TYPE_CGROUP_SKB:
+	case BPF_PROG_TYPE_CGROUP_SOCK:
+	case BPF_PROG_TYPE_SOCK_OPS:
+	case BPF_PROG_TYPE_SK_SKB:
+	case BPF_PROG_TYPE_SK_MSG:
+	case BPF_PROG_TYPE_CGROUP_SOCK_ADDR:
+	case BPF_PROG_TYPE_LWT_SEG6LOCAL:
+	case BPF_PROG_TYPE_SK_REUSEPORT:
+	case BPF_PROG_TYPE_NETFILTER:
+	case BPF_PROG_TYPE_LWT_IN:
+	case BPF_PROG_TYPE_LWT_OUT:
+	case BPF_PROG_TYPE_LWT_XMIT:
+		ctx_needed = true;
+		break;
+	default:
+		break;
+	}
+
+	if (!ctx && ctx_needed) {
+		kfree(data);
+		return -EINVAL;
+	}
+
 	switch (prog->type) {
 	case BPF_PROG_TYPE_SCHED_CLS:
 	case BPF_PROG_TYPE_SCHED_ACT: