Message ID | ZLbYdpWC8zt9EJtq@debian.debian (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Headers | show |
Series | bpf: lwt: do not return NET_XMIT_xxx values on bpf_redirect | expand |
On Tue, Jul 18, 2023 at 11:22 AM Yan Zhai <yan@cloudflare.com> wrote: > > skb_do_redirect handles returns error code from both rx and tx path. > The tx path codes are special, e.g. NET_XMIT_CN: they are > non-negative, and can conflict with LWTUNNEL_XMIT_xxx values. Directly > returning such code can cause unexpected behavior. We found at least > one bug that will panic the kernel through KASAN report when we > accidentally redirect packets to a down or carrier-down device at lwt > xmit hook: > > https://gist.github.com/zhaiyan920/8fbac245b261fe316a7ef04c9b1eba48 > > Above bug is hit because NET_XMIT_CN is returned by noop_qdisc of the > down device, and it propagates from dev_queue_xmit all way to the lwt > logic. Although skb has been freed by the qdisc, it still continues to > neighbor subsystem and triggers the bug. > > This change converts the tx code to proper errors that lwt can consume. > > Reported-by: Jordan Griege <jgriege@cloudflare.com> > Signed-off-by: Yan Zhai <yan@cloudflare.com> > --- > net/core/filter.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/net/core/filter.c b/net/core/filter.c > index 06ba0e56e369..c9cc501ecdc0 100644 > --- a/net/core/filter.c > +++ b/net/core/filter.c > @@ -2129,6 +2129,11 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb) > ret = dev_queue_xmit(skb); > dev_xmit_recursion_dec(); > > + // We should not return NET_XMIT_xxx here since it will conflict with > + // LWTUNNEL_XMIT_xxx values. Convert the return value to errno instead. C++ comments; should be /* */. But, also, maybe they are not really needed? ret = dev_queue_xmit(skb); if (ret) ret = net_xmit_errno(ret); We have a bunch of places with the pattern like this, so probably can do the same here? > + if (unlikely(ret != NET_XMIT_SUCCESS)) > + ret = net_xmit_errno(ret); > + > return ret; > } > > -- > 2.30.2 >
On Tue, Jul 18, 2023 at 3:29 PM Stanislav Fomichev <sdf@google.com> wrote: > > On Tue, Jul 18, 2023 at 11:22 AM Yan Zhai <yan@cloudflare.com> wrote: > > > > skb_do_redirect handles returns error code from both rx and tx path. > > The tx path codes are special, e.g. NET_XMIT_CN: they are > > non-negative, and can conflict with LWTUNNEL_XMIT_xxx values. Directly > > returning such code can cause unexpected behavior. We found at least > > one bug that will panic the kernel through KASAN report when we > > accidentally redirect packets to a down or carrier-down device at lwt > > xmit hook: > > > > https://gist.github.com/zhaiyan920/8fbac245b261fe316a7ef04c9b1eba48 > > > > Above bug is hit because NET_XMIT_CN is returned by noop_qdisc of the > > down device, and it propagates from dev_queue_xmit all way to the lwt > > logic. Although skb has been freed by the qdisc, it still continues to > > neighbor subsystem and triggers the bug. > > > > This change converts the tx code to proper errors that lwt can consume. > > > > Reported-by: Jordan Griege <jgriege@cloudflare.com> > > Signed-off-by: Yan Zhai <yan@cloudflare.com> > > --- > > net/core/filter.c | 5 +++++ > > 1 file changed, 5 insertions(+) > > > > diff --git a/net/core/filter.c b/net/core/filter.c > > index 06ba0e56e369..c9cc501ecdc0 100644 > > --- a/net/core/filter.c > > +++ b/net/core/filter.c > > @@ -2129,6 +2129,11 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb) > > ret = dev_queue_xmit(skb); > > dev_xmit_recursion_dec(); > > > > + // We should not return NET_XMIT_xxx here since it will conflict with > > + // LWTUNNEL_XMIT_xxx values. Convert the return value to errno instead. > > C++ comments; should be /* */. But, also, maybe they are not really needed? > *facepalm* yes I think we can remove them since the commit message already covers it... > ret = dev_queue_xmit(skb); > if (ret) > ret = net_xmit_errno(ret); > > We have a bunch of places with the pattern like this, so probably can > do the same here? > Personally I like an explicit name better, since not all the return codes use 0 to signal success, e.g. XDP_PASS, TC_ACT_PIPE. But I'd leave that for future improvements now that all other places use 0 on this. thanks Yan > > + if (unlikely(ret != NET_XMIT_SUCCESS)) > > + ret = net_xmit_errno(ret); > > + > > return ret; > > } > > > > -- > > 2.30.2 > >
diff --git a/net/core/filter.c b/net/core/filter.c index 06ba0e56e369..c9cc501ecdc0 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -2129,6 +2129,11 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb) ret = dev_queue_xmit(skb); dev_xmit_recursion_dec(); + // We should not return NET_XMIT_xxx here since it will conflict with + // LWTUNNEL_XMIT_xxx values. Convert the return value to errno instead. + if (unlikely(ret != NET_XMIT_SUCCESS)) + ret = net_xmit_errno(ret); + return ret; }
skb_do_redirect handles returns error code from both rx and tx path. The tx path codes are special, e.g. NET_XMIT_CN: they are non-negative, and can conflict with LWTUNNEL_XMIT_xxx values. Directly returning such code can cause unexpected behavior. We found at least one bug that will panic the kernel through KASAN report when we accidentally redirect packets to a down or carrier-down device at lwt xmit hook: https://gist.github.com/zhaiyan920/8fbac245b261fe316a7ef04c9b1eba48 Above bug is hit because NET_XMIT_CN is returned by noop_qdisc of the down device, and it propagates from dev_queue_xmit all way to the lwt logic. Although skb has been freed by the qdisc, it still continues to neighbor subsystem and triggers the bug. This change converts the tx code to proper errors that lwt can consume. Reported-by: Jordan Griege <jgriege@cloudflare.com> Signed-off-by: Yan Zhai <yan@cloudflare.com> --- net/core/filter.c | 5 +++++ 1 file changed, 5 insertions(+)