Message ID | 20241019071149.81696-1-danielyangkang@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Headers | show |
Series | [net] Drop packets with invalid headers to prevent KMSAN infoleak | expand |
On 10/19/24 12:11 AM, Daniel Yang wrote: > KMSAN detects uninitialized memory stored to memory by > bpf_clone_redirect(). Adding a check to the transmission path to find > malformed headers prevents this issue. Specifically, we check if the length > of the data stored in skb is less than the minimum device header length. > If so, drop the packet since the skb cannot contain a valid device header. > Also check if mac_header_len(skb) is outside the range provided of valid > device header lengths. > > Testing this patch with syzbot removes the bug. > > Fixes: 88264981f208 ("Merge tag 'sched_ext-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext") I am pretty sure this is the wrong tag. A test in selftests/bpf is needed to reproduce and better understand this. Only bpf_clone_redirect() is needed to reproduce or other bpf_skb_*() helpers calls are needed to reproduce? > Reported-by: syzbot+346474e3bf0b26bd3090@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/bug?extid=346474e3bf0b26bd3090 > Signed-off-by: Daniel Yang <danielyangkang@gmail.com> > --- > net/core/filter.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/net/core/filter.c b/net/core/filter.c > index cd3524cb3..92d8f2098 100644 > --- a/net/core/filter.c > +++ b/net/core/filter.c > @@ -2191,6 +2191,13 @@ static int __bpf_redirect_common(struct sk_buff *skb, struct net_device *dev, > return -ERANGE; > } > > + if (unlikely(skb->len < dev->min_header_len || > + skb_mac_header_len(skb) < dev->min_header_len || > + skb_mac_header_len(skb) > dev->hard_header_len)) { > + kfree_skb(skb); > + return -ERANGE; > + } > + > bpf_push_mac_rcsum(skb); > return flags & BPF_F_INGRESS ? > __bpf_rx_skb(dev, skb) : __bpf_tx_skb(dev, skb);
> A test in selftests/bpf is needed to reproduce and better understand this. I don't know much about self tests but I've just been using the syzbot repro and #syz test at the link in the patch: https://syzkaller.appspot.com/bug?extid=346474e3bf0b26bd3090. Testing the patch showed that the uninitialized memory was not getting written to memory. > Only bpf_clone_redirect() is needed to reproduce or other bpf_skb_*() helpers calls > are needed to reproduce? From what I can see in the crash report here: https://syzkaller.appspot.com/text?tag=CrashReport&x=10ba3ca9980000, only bpf_clone_redirect() is needed to trigger this issue. The issue seems to be that bpf_try_make_head_writable clones the skb and creates uninitialized memory but __bpf_tx_skb() gets called and the ethernet header never got written, resulting in the skb having a data section without a proper mac header. Current check: if (unlikely(skb->mac_header >= skb->network_header || skb->len == 0)) { **drop packet** } in __bpf_redirect_common() is insufficient since it only checks if the mac header is misordered or if the data length is 0. So, any packet with a malformed MAC header that is not 14 bytes but is not 0 doesn't get dropped. Adding bounds checks for mac header size should fix this. And from what I see in the syz test of this patch, it does. Are there any possible unexpected issues that can be caused by this?
On 10/22/24 03:37, Daniel Yang wrote:
> Are there any possible unexpected issues that can be caused by this?
This patch is apparently the cause of BPF self-tests failures:
test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress
[redirect_ingress] unexpected ret: veth ETH_HLEN+1 packet ingress
[redirect_ingress]: actual -34 != expected 0
test_empty_skb:PASS:err: veth ETH_HLEN+1 packet ingress
[redirect_egress] 0 nsec
test_empty_skb:FAIL:ret: veth ETH_HLEN+1 packet ingress
[redirect_egress] unexpected ret: veth ETH_HLEN+1 packet ingress
[redirect_egress]: actual -34 != expected 1
Before submitting an eventual next revision, please very that BPF
self-tests are happy.
Thanks,
Paolo
On 10/21/24 6:37 PM, Daniel Yang wrote: >> A test in selftests/bpf is needed to reproduce and better understand this. > I don't know much about self tests but I've just been using the syzbot > repro and #syz test at the link in the patch: > https://syzkaller.appspot.com/bug?extid=346474e3bf0b26bd3090. Testing > the patch showed that the uninitialized memory was not getting written > to memory. > >> Only bpf_clone_redirect() is needed to reproduce or other bpf_skb_*() helpers calls >> are needed to reproduce? If only bpf_clone_redirect() is needed, it should be simple to write a selftest to reproduce it. It also helps to catch future regression. Please tag the next respin as "bpf" also. > > From what I can see in the crash report here: > https://syzkaller.appspot.com/text?tag=CrashReport&x=10ba3ca9980000, > only bpf_clone_redirect() is needed to trigger this issue. The issue > seems to be that bpf_try_make_head_writable clones the skb and creates > uninitialized memory but __bpf_tx_skb() gets called and the ethernet > header never got written, resulting in the skb having a data section > without a proper mac header. Current check: > > if (unlikely(skb->mac_header >= skb->network_header || skb->len == 0)) > { > **drop packet** > } > > in __bpf_redirect_common() is insufficient since it only checks if the > mac header is misordered or if the data length is 0. So, any packet > with a malformed MAC header that is not 14 bytes but is not 0 doesn't > get dropped. Adding bounds checks for mac header size should fix this. > And from what I see in the syz test of this patch, it does. > > Are there any possible unexpected issues that can be caused by this?
On Tue, Oct 22, 2024 at 11:14 AM Martin KaFai Lau <martin.lau@linux.dev> wrote: > > On 10/21/24 6:37 PM, Daniel Yang wrote: > >> A test in selftests/bpf is needed to reproduce and better understand this. > > I don't know much about self tests but I've just been using the syzbot > > repro and #syz test at the link in the patch: > > https://syzkaller.appspot.com/bug?extid=346474e3bf0b26bd3090. Testing > > the patch showed that the uninitialized memory was not getting written > > to memory. > > > >> Only bpf_clone_redirect() is needed to reproduce or other bpf_skb_*() helpers calls > >> are needed to reproduce? > > If only bpf_clone_redirect() is needed, it should be simple to write a selftest > to reproduce it. It also helps to catch future regression. > > Please tag the next respin as "bpf" also. I have a problem. I can't seem to build the bpf kselftests for some reason. There is always a struct definition error: In file included from progs/profiler1.c:5: progs/profiler.inc.h:599:49: error: declaration of 'struct syscall_trace_enter' will not be visible outside of t] 599 | int tracepoint__syscalls__sys_enter_kill(struct syscall_trace_enter* ctx) | ^ progs/profiler.inc.h:604:15: error: incomplete definition of type 'struct syscall_trace_enter' 604 | int pid = ctx->args[0]; | ~~~^ progs/profiler.inc.h:599:49: note: forward declaration of 'struct syscall_trace_enter' 599 | int tracepoint__syscalls__sys_enter_kill(struct syscall_trace_enter* ctx) | ^ progs/profiler.inc.h:605:15: error: incomplete definition of type 'struct syscall_trace_enter' 605 | int sig = ctx->args[1]; | ~~~^ progs/profiler.inc.h:599:49: note: forward declaration of 'struct syscall_trace_enter' 599 | int tracepoint__syscalls__sys_enter_kill(struct syscall_trace_enter* ctx) I just run the following to build: $ cd tools/testing/selftests/bpf/ $ make I can't find anyone else encountering the same error.
On 10/27/24 1:49 AM, Daniel Yang wrote: > On Tue, Oct 22, 2024 at 11:14 AM Martin KaFai Lau <martin.lau@linux.dev> wrote: >> On 10/21/24 6:37 PM, Daniel Yang wrote: >>>> A test in selftests/bpf is needed to reproduce and better understand this. >>> I don't know much about self tests but I've just been using the syzbot >>> repro and #syz test at the link in the patch: >>> https://syzkaller.appspot.com/bug?extid=346474e3bf0b26bd3090. Testing >>> the patch showed that the uninitialized memory was not getting written >>> to memory. >>> >>>> Only bpf_clone_redirect() is needed to reproduce or other bpf_skb_*() helpers calls >>>> are needed to reproduce? >> If only bpf_clone_redirect() is needed, it should be simple to write a selftest >> to reproduce it. It also helps to catch future regression. >> >> Please tag the next respin as "bpf" also. > I have a problem. I can't seem to build the bpf kselftests for some > reason. There is always a struct definition error: > In file included from progs/profiler1.c:5: > progs/profiler.inc.h:599:49: error: declaration of 'struct > syscall_trace_enter' will not be visible outside of t] > 599 | int tracepoint__syscalls__sys_enter_kill(struct > syscall_trace_enter* ctx) > | ^ > progs/profiler.inc.h:604:15: error: incomplete definition of type > 'struct syscall_trace_enter' > 604 | int pid = ctx->args[0]; > | ~~~^ > progs/profiler.inc.h:599:49: note: forward declaration of 'struct > syscall_trace_enter' > 599 | int tracepoint__syscalls__sys_enter_kill(struct > syscall_trace_enter* ctx) > | ^ > progs/profiler.inc.h:605:15: error: incomplete definition of type > 'struct syscall_trace_enter' > 605 | int sig = ctx->args[1]; > | ~~~^ > progs/profiler.inc.h:599:49: note: forward declaration of 'struct > syscall_trace_enter' > 599 | int tracepoint__syscalls__sys_enter_kill(struct > syscall_trace_enter* ctx) > > I just run the following to build: > $ cd tools/testing/selftests/bpf/ > $ make It might be due to your .config file. The 'struct syscall_trace_enter' is defined in kernel/trace/trace.h, which is used in kernel/trace/trace_syscalls.c. Maybe your config does not have CONFIG_FTRACE_SYSCALLS? > > I can't find anyone else encountering the same error.
From: Daniel Yang <danielyangkang@gmail.com> Date: Sat, 19 Oct 2024 00:11:39 -0700 > KMSAN detects uninitialized memory stored to memory by > bpf_clone_redirect(). Adding a check to the transmission path to find > malformed headers prevents this issue. Specifically, we check if the length > of the data stored in skb is less than the minimum device header length. > If so, drop the packet since the skb cannot contain a valid device header. > Also check if mac_header_len(skb) is outside the range provided of valid > device header lengths. > > Testing this patch with syzbot removes the bug. > > Fixes: 88264981f208 ("Merge tag 'sched_ext-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext") > Reported-by: syzbot+346474e3bf0b26bd3090@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/bug?extid=346474e3bf0b26bd3090 > Signed-off-by: Daniel Yang <danielyangkang@gmail.com> > --- > net/core/filter.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/net/core/filter.c b/net/core/filter.c > index cd3524cb3..92d8f2098 100644 > --- a/net/core/filter.c > +++ b/net/core/filter.c > @@ -2191,6 +2191,13 @@ static int __bpf_redirect_common(struct sk_buff *skb, struct net_device *dev, > return -ERANGE; > } > > + if (unlikely(skb->len < dev->min_header_len || > + skb_mac_header_len(skb) < dev->min_header_len || > + skb_mac_header_len(skb) > dev->hard_header_len)) { > + kfree_skb(skb); > + return -ERANGE; > + } I believe this should go under IS_ENABLED(CONFIG_KMSAN) or CONFIG_DEBUG_NET or so to not affect the regular configurations. Or does this fix some real bug? > + > bpf_push_mac_rcsum(skb); > return flags & BPF_F_INGRESS ? > __bpf_rx_skb(dev, skb) : __bpf_tx_skb(dev, skb); Thanks, Olek
On Sun, Oct 27, 2024 at 10:42 PM Yonghong Song <yonghong.song@linux.dev> wrote: > It might be due to your .config file. > The 'struct syscall_trace_enter' is defined in kernel/trace/trace.h, > which is used in kernel/trace/trace_syscalls.c. Maybe your config > does not have CONFIG_FTRACE_SYSCALLS? I did add it to my config but another error popped up after I tried to rerun the tests. I just ended up using the vmtest.sh script and got the tests working. Seems like there should be a template config file or something to make building locally more convenient. Anyways, thanks for the help. - Daniel
On Tue, Oct 29, 2024 at 9:41 AM Alexander Lobakin <aleksander.lobakin@intel.com> wrote: > > + if (unlikely(skb->len < dev->min_header_len || > > + skb_mac_header_len(skb) < dev->min_header_len || > > + skb_mac_header_len(skb) > dev->hard_header_len)) { > > + kfree_skb(skb); > > + return -ERANGE; > > + } > > I believe this should go under IS_ENABLED(CONFIG_KMSAN) or > CONFIG_DEBUG_NET or so to not affect the regular configurations. > Or does this fix some real bug? Well in my opinion, an infoleak is still an infoleak. But, this would likely not get triggered as long as an skb with a properly initialized eth header is passed into the bpf_clone_redirect function. We could initialize the memory to 0 but the performance hit would be too much. If the bpf_clone_redirect() function cannot be called from user space with user-crafted skbs as input, I don't think this is really an issue and we can just put it under the macros to get rid of the syzbot error. - Daniel
diff --git a/net/core/filter.c b/net/core/filter.c index cd3524cb3..92d8f2098 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -2191,6 +2191,13 @@ static int __bpf_redirect_common(struct sk_buff *skb, struct net_device *dev, return -ERANGE; } + if (unlikely(skb->len < dev->min_header_len || + skb_mac_header_len(skb) < dev->min_header_len || + skb_mac_header_len(skb) > dev->hard_header_len)) { + kfree_skb(skb); + return -ERANGE; + } + bpf_push_mac_rcsum(skb); return flags & BPF_F_INGRESS ? __bpf_rx_skb(dev, skb) : __bpf_tx_skb(dev, skb);
KMSAN detects uninitialized memory stored to memory by bpf_clone_redirect(). Adding a check to the transmission path to find malformed headers prevents this issue. Specifically, we check if the length of the data stored in skb is less than the minimum device header length. If so, drop the packet since the skb cannot contain a valid device header. Also check if mac_header_len(skb) is outside the range provided of valid device header lengths. Testing this patch with syzbot removes the bug. Fixes: 88264981f208 ("Merge tag 'sched_ext-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext") Reported-by: syzbot+346474e3bf0b26bd3090@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=346474e3bf0b26bd3090 Signed-off-by: Daniel Yang <danielyangkang@gmail.com> --- net/core/filter.c | 7 +++++++ 1 file changed, 7 insertions(+)