bpf: lwt: do not return NET_XMIT_xxx values on bpf_redirect

Message ID ZLbYdpWC8zt9EJtq@debian.debian (mailing list archive)
State Changes Requested
Delegated to: BPF
Series bpf: lwt: do not return NET_XMIT_xxx values on bpf_redirect

Checks

Context Check Description
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for veristat
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-14 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for test_maps on s390x with gcc
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1368 this patch: 1368
netdev/cc_maintainers success CCed 17 of 17 maintainers
netdev/build_clang success Errors and warnings before: 1365 this patch: 1365
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1391 this patch: 1391
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 11 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-6 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for x86_64 with gcc

Commit Message

Yan Zhai July 18, 2023, 6:22 p.m. UTC
skb_do_redirect returns error codes from both the rx and tx paths.
The tx path codes are special, e.g. NET_XMIT_CN: they are
non-negative, and can conflict with LWTUNNEL_XMIT_xxx values. Directly
returning such a code can cause unexpected behavior. We found at least
one bug that panics the kernel, as shown by a KASAN report, when we
accidentally redirect packets to a down or carrier-down device at the
lwt xmit hook:

https://gist.github.com/zhaiyan920/8fbac245b261fe316a7ef04c9b1eba48

The above bug is hit because NET_XMIT_CN is returned by the noop_qdisc
of the down device, and it propagates from dev_queue_xmit all the way to
the lwt logic. Although the skb has already been freed by the qdisc, it
still continues to the neighbor subsystem and triggers the bug.

This change converts the tx code to proper errors that lwt can consume.

Reported-by: Jordan Griege <jgriege@cloudflare.com>
Signed-off-by: Yan Zhai <yan@cloudflare.com>
---
 net/core/filter.c | 5 +++++
 1 file changed, 5 insertions(+)
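
The failure mode described in the commit message can be sketched in
userspace C. This is a simplified model, not the kernel code. The
NET_XMIT_* values and net_xmit_errno() are mirrored from
include/linux/netdevice.h. The LWTUNNEL_XMIT_* values are an assumption
about the enum at the time of this patch. caller_stops() and
convert_tx_code() are hypothetical helpers that stand in for a caller of
the lwt xmit hook and for the patched tail of __bpf_tx_skb().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Mirrored from include/linux/netdevice.h; verify against your tree. */
#define NET_XMIT_SUCCESS 0x00
#define NET_XMIT_DROP    0x01 /* skb dropped */
#define NET_XMIT_CN      0x02 /* congestion notification */
#define net_xmit_errno(e) ((e) != NET_XMIT_CN ? -ENOBUFS : 0)

/* Assumption: lwtunnel xmit codes as they were when the patch was posted. */
enum { LWTUNNEL_XMIT_DONE, LWTUNNEL_XMIT_CONTINUE };

/*
 * Hypothetical model of a caller of the lwt xmit hook: it stops touching
 * the skb only on a negative errno or on LWTUNNEL_XMIT_DONE. Any other
 * value lets processing continue with the skb.
 */
static bool caller_stops(int res)
{
	return res < 0 || res == LWTUNNEL_XMIT_DONE;
}

/* Hypothetical stand-in for the check the patch adds after dev_queue_xmit(). */
static int convert_tx_code(int ret)
{
	if (ret != NET_XMIT_SUCCESS)
		ret = net_xmit_errno(ret);
	return ret;
}

/* Documents the behavior the model is meant to show. */
static void check_tx_conversion(void)
{
	/* Raw NET_XMIT_CN is positive and not DONE: the caller keeps going. */
	assert(!caller_stops(NET_XMIT_CN));
	/* After conversion, NET_XMIT_CN becomes 0 and the caller stops. */
	assert(caller_stops(convert_tx_code(NET_XMIT_CN)));
	/* NET_XMIT_DROP becomes -ENOBUFS, which the caller treats as an error. */
	assert(caller_stops(convert_tx_code(NET_XMIT_DROP)));
}
```

Calling check_tx_conversion() from a test harness passes silently. Note
that net_xmit_errno() maps NET_XMIT_CN to 0 and every other nonzero code
to -ENOBUFS, so in this model a congestion notification is reported as
success rather than as an error.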

Comments

Stanislav Fomichev July 18, 2023, 8:28 p.m. UTC | #1
On Tue, Jul 18, 2023 at 11:22 AM Yan Zhai <yan@cloudflare.com> wrote:
>
> skb_do_redirect handles returns error code from both rx and tx path.
> The tx path codes are special, e.g. NET_XMIT_CN: they are
> non-negative, and can conflict with LWTUNNEL_XMIT_xxx values. Directly
> returning such code can cause unexpected behavior. We found at least
> one bug that will panic the kernel through KASAN report when we
> accidentally redirect packets to a down or carrier-down device at lwt
> xmit hook:
>
> https://gist.github.com/zhaiyan920/8fbac245b261fe316a7ef04c9b1eba48
>
> Above bug is hit because NET_XMIT_CN is returned by noop_qdisc of the
> down device, and it propagates from dev_queue_xmit all way to the lwt
> logic. Although skb has been freed by the qdisc, it still continues to
> neighbor subsystem and triggers the bug.
>
> This change converts the tx code to proper errors that lwt can consume.
>
> Reported-by: Jordan Griege <jgriege@cloudflare.com>
> Signed-off-by: Yan Zhai <yan@cloudflare.com>
> ---
>  net/core/filter.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 06ba0e56e369..c9cc501ecdc0 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -2129,6 +2129,11 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb)
>         ret = dev_queue_xmit(skb);
>         dev_xmit_recursion_dec();
>
> +       // We should not return NET_XMIT_xxx here since it will conflict with
> +       // LWTUNNEL_XMIT_xxx values. Convert the return value to errno instead.

C++ comments; should be /* */. But, also, maybe they are not really needed?

ret = dev_queue_xmit(skb);
if (ret)
        ret = net_xmit_errno(ret);

We have a bunch of places with the pattern like this, so probably can
do the same here?

> +       if (unlikely(ret != NET_XMIT_SUCCESS))
> +               ret = net_xmit_errno(ret);
> +
>         return ret;
>  }
>
> --
> 2.30.2
>
Yan Zhai July 19, 2023, 3:21 a.m. UTC | #2
On Tue, Jul 18, 2023 at 3:29 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> On Tue, Jul 18, 2023 at 11:22 AM Yan Zhai <yan@cloudflare.com> wrote:
> >
> > skb_do_redirect handles returns error code from both rx and tx path.
> > The tx path codes are special, e.g. NET_XMIT_CN: they are
> > non-negative, and can conflict with LWTUNNEL_XMIT_xxx values. Directly
> > returning such code can cause unexpected behavior. We found at least
> > one bug that will panic the kernel through KASAN report when we
> > accidentally redirect packets to a down or carrier-down device at lwt
> > xmit hook:
> >
> > https://gist.github.com/zhaiyan920/8fbac245b261fe316a7ef04c9b1eba48
> >
> > Above bug is hit because NET_XMIT_CN is returned by noop_qdisc of the
> > down device, and it propagates from dev_queue_xmit all way to the lwt
> > logic. Although skb has been freed by the qdisc, it still continues to
> > neighbor subsystem and triggers the bug.
> >
> > This change converts the tx code to proper errors that lwt can consume.
> >
> > Reported-by: Jordan Griege <jgriege@cloudflare.com>
> > Signed-off-by: Yan Zhai <yan@cloudflare.com>
> > ---
> >  net/core/filter.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 06ba0e56e369..c9cc501ecdc0 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -2129,6 +2129,11 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb)
> >         ret = dev_queue_xmit(skb);
> >         dev_xmit_recursion_dec();
> >
> > +       // We should not return NET_XMIT_xxx here since it will conflict with
> > +       // LWTUNNEL_XMIT_xxx values. Convert the return value to errno instead.
>
> C++ comments; should be /* */. But, also, maybe they are not really needed?
>
*facepalm* yes I think we can remove them since the commit message
already covers it...

> ret = dev_queue_xmit(skb);
> if (ret)
>         ret = net_xmit_errno(ret);
>
> We have a bunch of places with the pattern like this, so probably can
> do the same here?
>
Personally I like an explicit name better, since not all the return
codes use 0 to signal success, e.g. XDP_PASS, TC_ACT_PIPE. But I'd
leave that for future improvements now that all other places use 0 on
this.

thanks
Yan

> > +       if (unlikely(ret != NET_XMIT_SUCCESS))
> > +               ret = net_xmit_errno(ret);
> > +
> >         return ret;
> >  }
> >
> > --
> > 2.30.2
> >

Patch

diff --git a/net/core/filter.c b/net/core/filter.c
index 06ba0e56e369..c9cc501ecdc0 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2129,6 +2129,11 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb)
 	ret = dev_queue_xmit(skb);
 	dev_xmit_recursion_dec();
 
+	// We should not return NET_XMIT_xxx here since it will conflict with
+	// LWTUNNEL_XMIT_xxx values. Convert the return value to errno instead.
+	if (unlikely(ret != NET_XMIT_SUCCESS))
+		ret = net_xmit_errno(ret);
+
 	return ret;
 }