Message ID | 20241001154400.22787-2-aroulin@nvidia.com (mailing list archive) |
---|---|
State | Accepted |
Commit | f9ff7665cd128012868098bbd07e28993e314fdb |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | netfilter: br_netfilter: fix panic with metadata_dst skb | expand |
On 10/1/24 18:43, Andy Roulin wrote: > Fix a kernel panic in the br_netfilter module when sending untagged > traffic via a VxLAN device. > This happens during the check for fragmentation in br_nf_dev_queue_xmit. > > It is dependent on: > 1) the br_netfilter module being loaded; > 2) net.bridge.bridge-nf-call-iptables set to 1; > 3) a bridge with a VxLAN (single-vxlan-device) netdevice as a bridge port; > 4) untagged frames with size higher than the VxLAN MTU forwarded/flooded > > When forwarding the untagged packet to the VxLAN bridge port, before > the netfilter hooks are called, br_handle_egress_vlan_tunnel is called and > changes the skb_dst to the tunnel dst. The tunnel_dst is a metadata type > of dst, i.e., skb_valid_dst(skb) is false, and metadata->dst.dev is NULL. > > Then in the br_netfilter hooks, in br_nf_dev_queue_xmit, there's a check > for frames that needs to be fragmented: frames with higher MTU than the > VxLAN device end up calling br_nf_ip_fragment, which in turns call > ip_skb_dst_mtu. > > The ip_dst_mtu tries to use the skb_dst(skb) as if it was a valid dst > with valid dst->dev, thus the crash. > > This case was never supported in the first place, so drop the packet > instead. > > PING 10.0.0.2 (10.0.0.2) from 0.0.0.0 h1-eth0: 2000(2028) bytes of data. > [ 176.291791] Unable to handle kernel NULL pointer dereference at > virtual address 0000000000000110 > [ 176.292101] Mem abort info: > [ 176.292184] ESR = 0x0000000096000004 > [ 176.292322] EC = 0x25: DABT (current EL), IL = 32 bits > [ 176.292530] SET = 0, FnV = 0 > [ 176.292709] EA = 0, S1PTW = 0 > [ 176.292862] FSC = 0x04: level 0 translation fault > [ 176.293013] Data abort info: > [ 176.293104] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 > [ 176.293488] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 > [ 176.293787] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 > [ 176.293995] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000043ef5000 > [ 176.294166] [0000000000000110] pgd=0000000000000000, > p4d=0000000000000000 > [ 176.294827] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP > [ 176.295252] Modules linked in: vxlan ip6_udp_tunnel udp_tunnel veth > br_netfilter bridge stp llc ipv6 crct10dif_ce > [ 176.295923] CPU: 0 PID: 188 Comm: ping Not tainted > 6.8.0-rc3-g5b3fbd61b9d1 #2 > [ 176.296314] Hardware name: linux,dummy-virt (DT) > [ 176.296535] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS > BTYPE=--) > [ 176.296808] pc : br_nf_dev_queue_xmit+0x390/0x4ec [br_netfilter] > [ 176.297382] lr : br_nf_dev_queue_xmit+0x2ac/0x4ec [br_netfilter] > [ 176.297636] sp : ffff800080003630 > [ 176.297743] x29: ffff800080003630 x28: 0000000000000008 x27: > ffff6828c49ad9f8 > [ 176.298093] x26: ffff6828c49ad000 x25: 0000000000000000 x24: > 00000000000003e8 > [ 176.298430] x23: 0000000000000000 x22: ffff6828c4960b40 x21: > ffff6828c3b16d28 > [ 176.298652] x20: ffff6828c3167048 x19: ffff6828c3b16d00 x18: > 0000000000000014 > [ 176.298926] x17: ffffb0476322f000 x16: ffffb7e164023730 x15: > 0000000095744632 > [ 176.299296] x14: ffff6828c3f1c880 x13: 0000000000000002 x12: > ffffb7e137926a70 > [ 176.299574] x11: 0000000000000001 x10: ffff6828c3f1c898 x9 : > 0000000000000000 > [ 176.300049] x8 : ffff6828c49bf070 x7 : 0008460f18d5f20e x6 : > f20e0100bebafeca > [ 176.300302] x5 : ffff6828c7f918fe x4 : ffff6828c49bf070 x3 : > 0000000000000000 > [ 176.300586] x2 : 0000000000000000 x1 : ffff6828c3c7ad00 x0 : > ffff6828c7f918f0 > [ 176.300889] Call trace: > [ 176.301123] br_nf_dev_queue_xmit+0x390/0x4ec [br_netfilter] > [ 176.301411] br_nf_post_routing+0x2a8/0x3e4 [br_netfilter] > [ 176.301703] nf_hook_slow+0x48/0x124 > [ 176.302060] br_forward_finish+0xc8/0xe8 [bridge] > [ 176.302371] br_nf_hook_thresh+0x124/0x134 [br_netfilter] > [ 176.302605] br_nf_forward_finish+0x118/0x22c [br_netfilter] > [ 176.302824] br_nf_forward_ip.part.0+0x264/0x290 [br_netfilter] > [ 176.303136] br_nf_forward+0x2b8/0x4e0 [br_netfilter] > [ 176.303359] nf_hook_slow+0x48/0x124 > [ 176.303803] __br_forward+0xc4/0x194 [bridge] > [ 176.304013] br_flood+0xd4/0x168 [bridge] > [ 176.304300] br_handle_frame_finish+0x1d4/0x5c4 [bridge] > [ 176.304536] br_nf_hook_thresh+0x124/0x134 [br_netfilter] > [ 176.304978] br_nf_pre_routing_finish+0x29c/0x494 [br_netfilter] > [ 176.305188] br_nf_pre_routing+0x250/0x524 [br_netfilter] > [ 176.305428] br_handle_frame+0x244/0x3cc [bridge] > [ 176.305695] __netif_receive_skb_core.constprop.0+0x33c/0xecc > [ 176.306080] __netif_receive_skb_one_core+0x40/0x8c > [ 176.306197] __netif_receive_skb+0x18/0x64 > [ 176.306369] process_backlog+0x80/0x124 > [ 176.306540] __napi_poll+0x38/0x17c > [ 176.306636] net_rx_action+0x124/0x26c > [ 176.306758] __do_softirq+0x100/0x26c > [ 176.307051] ____do_softirq+0x10/0x1c > [ 176.307162] call_on_irq_stack+0x24/0x4c > [ 176.307289] do_softirq_own_stack+0x1c/0x2c > [ 176.307396] do_softirq+0x54/0x6c > [ 176.307485] __local_bh_enable_ip+0x8c/0x98 > [ 176.307637] __dev_queue_xmit+0x22c/0xd28 > [ 176.307775] neigh_resolve_output+0xf4/0x1a0 > [ 176.308018] ip_finish_output2+0x1c8/0x628 > [ 176.308137] ip_do_fragment+0x5b4/0x658 > [ 176.308279] ip_fragment.constprop.0+0x48/0xec > [ 176.308420] __ip_finish_output+0xa4/0x254 > [ 176.308593] ip_finish_output+0x34/0x130 > [ 176.308814] ip_output+0x6c/0x108 > [ 176.308929] ip_send_skb+0x50/0xf0 > [ 176.309095] ip_push_pending_frames+0x30/0x54 > [ 176.309254] raw_sendmsg+0x758/0xaec > [ 176.309568] inet_sendmsg+0x44/0x70 > [ 176.309667] __sys_sendto+0x110/0x178 > [ 176.309758] __arm64_sys_sendto+0x28/0x38 > [ 176.309918] invoke_syscall+0x48/0x110 > [ 176.310211] el0_svc_common.constprop.0+0x40/0xe0 > [ 176.310353] do_el0_svc+0x1c/0x28 > [ 176.310434] el0_svc+0x34/0xb4 > [ 176.310551] el0t_64_sync_handler+0x120/0x12c > [ 176.310690] el0t_64_sync+0x190/0x194 > [ 176.311066] Code: f9402e61 79402aa2 927ff821 f9400023 (f9408860) > [ 176.315743] ---[ end trace 0000000000000000 ]--- > [ 176.316060] Kernel panic - not syncing: Oops: Fatal exception in > interrupt > [ 176.316371] Kernel Offset: 0x37e0e3000000 from 0xffff800080000000 > [ 176.316564] PHYS_OFFSET: 0xffff97d780000000 > [ 176.316782] CPU features: 0x0,88000203,3c020000,0100421b > [ 176.317210] Memory Limit: none > [ 176.317527] ---[ end Kernel panic - not syncing: Oops: Fatal > Exception in interrupt ]---\ > > Fixes: 11538d039ac6 ("bridge: vlan dst_metadata hooks in ingress and egress paths") > Reviewed-by: Ido Schimmel <idosch@nvidia.com> > Signed-off-by: Andy Roulin <aroulin@nvidia.com> > --- > net/bridge/br_netfilter_hooks.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c > index 0e8bc0ea6175..1d458e9da660 100644 > --- a/net/bridge/br_netfilter_hooks.c > +++ b/net/bridge/br_netfilter_hooks.c > @@ -33,6 +33,7 @@ > #include <net/ip.h> > #include <net/ipv6.h> > #include <net/addrconf.h> > +#include <net/dst_metadata.h> > #include <net/route.h> > #include <net/netfilter/br_netfilter.h> > #include <net/netns/generic.h> > @@ -879,6 +880,10 @@ static int br_nf_dev_queue_xmit(struct net *net, struct sock *sk, struct sk_buff > return br_dev_queue_push_xmit(net, sk, skb); > } > > + /* Fragmentation on metadata/template dst is not supported */ > + if (unlikely(!skb_valid_dst(skb)))> + goto drop; > + > /* This is wrong! We should preserve the original fragment > * boundaries by preserving frag_list rather than refragmenting. > */ This helper's name is a bit misleading. :) But looking at it, it seems only metadata dsts are not considered valid, so looks good to me. Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c index 0e8bc0ea6175..1d458e9da660 100644 --- a/net/bridge/br_netfilter_hooks.c +++ b/net/bridge/br_netfilter_hooks.c @@ -33,6 +33,7 @@ #include <net/ip.h> #include <net/ipv6.h> #include <net/addrconf.h> +#include <net/dst_metadata.h> #include <net/route.h> #include <net/netfilter/br_netfilter.h> #include <net/netns/generic.h> @@ -879,6 +880,10 @@ static int br_nf_dev_queue_xmit(struct net *net, struct sock *sk, struct sk_buff return br_dev_queue_push_xmit(net, sk, skb); } + /* Fragmentation on metadata/template dst is not supported */ + if (unlikely(!skb_valid_dst(skb))) + goto drop; + /* This is wrong! We should preserve the original fragment * boundaries by preserving frag_list rather than refragmenting. */