Message ID | 20231213040641.2653812-1-liujian56@huawei.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Series | [net,v2] net: check vlan filter feature in vlan_vids_add_by_dev() and vlan_vids_del_by_dev() |
On Wed, 13 Dec 2023 12:06:41 +0800 Liu Jian wrote:
> I got the bleow warning trace:

s/bleow/below/

> WARNING: CPU: 4 PID: 4056 at net/core/dev.c:11066 unregister_netdevice_many_notify
> CPU: 4 PID: 4056 Comm: ip Not tainted 6.7.0-rc4+ #15
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> RIP: 0010:unregister_netdevice_many_notify+0x9a4/0x9b0
> Call Trace:
>  rtnl_dellink
>  rtnetlink_rcv_msg
>  netlink_rcv_skb
>  netlink_unicast
>  netlink_sendmsg
>  __sock_sendmsg
>  ____sys_sendmsg
>  ___sys_sendmsg
>  __sys_sendmsg
>  do_syscall_64
>  entry_SYSCALL_64_after_hwframe
>
> It can be repoduced via:
>
>     ip netns add ns1
>     ip netns exec ns1 ip link add bond0 type bond mode 0
>     ip netns exec ns1 ip link add bond_slave_1 type veth peer veth2
>     ip netns exec ns1 ip link set bond_slave_1 master bond0
> [1] ip netns exec ns1 ethtool -K bond0 rx-vlan-filter off
> [2] ip netns exec ns1 ip link add link bond_slave_1 name bond_slave_1.0 type vlan id 0
> [3] ip netns exec ns1 ip link add link bond0 name bond0.0 type vlan id 0
> [4] ip netns exec ns1 ip link set bond_slave_1 nomaster
> [5] ip netns exec ns1 ip link del veth2
>     ip netns del ns1

Could you construct a selftest based on those commands?

> This is all caused by command [1] turning off the rx-vlan-filter function
> of bond0. The reason is the same as commit 01f4fd270870 ("bonding: Fix
> incorrect deletion of ETH_P_8021AD protocol vid from slaves"). Commands
> [2] [3] add the same vid to slave and master respectively, causing
> command [4] to empty slave->vlan_info. The following command [5] triggers
> this problem.
>
> To fix this problem, we should add VLAN_FILTER feature checks in
> vlan_vids_add_by_dev() and vlan_vids_del_by_dev() to prevent incorrect
> addition or deletion of vlan_vid information.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")

Did the STAG/CTAG features exist in 2.6? I thought I saw the commit
that added them in git at some point. Could be misremembering...

> Signed-off-by: Liu Jian <liujian56@huawei.com>
> ---
> v1->v2: Modify patch title and commit message.
> 	Remove superfluous operations in ethtool/features.c and ioctl.c
>  net/8021q/vlan_core.c | 21 ++++++++++++++++++++-
>  1 file changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
> index 0beb44f2fe1f..e94b509386bb 100644
> --- a/net/8021q/vlan_core.c
> +++ b/net/8021q/vlan_core.c
> @@ -407,6 +407,12 @@ int vlan_vids_add_by_dev(struct net_device *dev,
>  		return 0;
>
>  	list_for_each_entry(vid_info, &vlan_info->vid_list, list) {
> +		if (!(by_dev->features & NETIF_F_HW_VLAN_CTAG_FILTER) &&
> +		    vid_info->proto == htons(ETH_P_8021Q))
> +			continue;
> +		if (!(by_dev->features & NETIF_F_HW_VLAN_STAG_FILTER) &&
> +		    vid_info->proto == htons(ETH_P_8021AD))
> +			continue;

this code is copied 3 times, could you please factor it out to a helper
taking dev and vid_info and deciding if the walk should skip?
On 2023/12/15 10:36, Jakub Kicinski wrote:
> On Wed, 13 Dec 2023 12:06:41 +0800 Liu Jian wrote:
>> I got the bleow warning trace:
>
> s/bleow/below/
>
>> WARNING: CPU: 4 PID: 4056 at net/core/dev.c:11066 unregister_netdevice_many_notify
>> CPU: 4 PID: 4056 Comm: ip Not tainted 6.7.0-rc4+ #15
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
>> RIP: 0010:unregister_netdevice_many_notify+0x9a4/0x9b0
>> Call Trace:
>>  rtnl_dellink
>>  rtnetlink_rcv_msg
>>  netlink_rcv_skb
>>  netlink_unicast
>>  netlink_sendmsg
>>  __sock_sendmsg
>>  ____sys_sendmsg
>>  ___sys_sendmsg
>>  __sys_sendmsg
>>  do_syscall_64
>>  entry_SYSCALL_64_after_hwframe
>>
>> It can be repoduced via:
>>
>>     ip netns add ns1
>>     ip netns exec ns1 ip link add bond0 type bond mode 0
>>     ip netns exec ns1 ip link add bond_slave_1 type veth peer veth2
>>     ip netns exec ns1 ip link set bond_slave_1 master bond0
>> [1] ip netns exec ns1 ethtool -K bond0 rx-vlan-filter off
>> [2] ip netns exec ns1 ip link add link bond_slave_1 name bond_slave_1.0 type vlan id 0
>> [3] ip netns exec ns1 ip link add link bond0 name bond0.0 type vlan id 0
>> [4] ip netns exec ns1 ip link set bond_slave_1 nomaster
>> [5] ip netns exec ns1 ip link del veth2
>>     ip netns del ns1
>
> Could you construct a selftest based on those commands?

OK.

>
>> This is all caused by command [1] turning off the rx-vlan-filter function
>> of bond0. The reason is the same as commit 01f4fd270870 ("bonding: Fix
>> incorrect deletion of ETH_P_8021AD protocol vid from slaves"). Commands
>> [2] [3] add the same vid to slave and master respectively, causing
>> command [4] to empty slave->vlan_info. The following command [5] triggers
>> this problem.
>>
>> To fix this problem, we should add VLAN_FILTER feature checks in
>> vlan_vids_add_by_dev() and vlan_vids_del_by_dev() to prevent incorrect
>> addition or deletion of vlan_vid information.
>>
>> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
>
> Did the STAG/CTAG features exist in 2.6? I thought I saw the commit
> that added them in git at some point. Could be misremembering...

I only saw the feature NETIF_F_HW_VLAN_FILTER (NETIF_F_HW_VLAN_CTAG_FILTER)
in this tag. I now think the following Fixes tag is more suitable:

348a1443cc43 ("vlan: introduce functions to do mass addition/deletion of vids by another device")

>
>> Signed-off-by: Liu Jian <liujian56@huawei.com>
>> ---
>> v1->v2: Modify patch title and commit message.
>> 	Remove superfluous operations in ethtool/features.c and ioctl.c
>>  net/8021q/vlan_core.c | 21 ++++++++++++++++++++-
>>  1 file changed, 20 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
>> index 0beb44f2fe1f..e94b509386bb 100644
>> --- a/net/8021q/vlan_core.c
>> +++ b/net/8021q/vlan_core.c
>> @@ -407,6 +407,12 @@ int vlan_vids_add_by_dev(struct net_device *dev,
>>  		return 0;
>>
>>  	list_for_each_entry(vid_info, &vlan_info->vid_list, list) {
>> +		if (!(by_dev->features & NETIF_F_HW_VLAN_CTAG_FILTER) &&
>> +		    vid_info->proto == htons(ETH_P_8021Q))
>> +			continue;
>> +		if (!(by_dev->features & NETIF_F_HW_VLAN_STAG_FILTER) &&
>> +		    vid_info->proto == htons(ETH_P_8021AD))
>> +			continue;
>
> this code is copied 3 times, could you please factor it out to a helper
> taking dev and vid_info and deciding if the walk should skip?

I found a suitable existing function, vlan_hw_filter_capable().

Thanks for your review.
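For illustration only, the helper-based skip discussed above might look roughly like the userspace model below. It is a sketch, not the eventual patch: every model_* name, both structs and the two feature-bit values are hypothetical stand-ins for struct net_device, struct vlan_vid_info and the NETIF_F_HW_VLAN_*_FILTER flags, and model_hw_filter_capable() only mirrors what vlan_hw_filter_capable() is assumed to check (protocol matched against the corresponding filter feature).

```c
#include <arpa/inet.h>	/* htons() */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Real EtherType values for 802.1Q (C-tag) and 802.1ad (S-tag). */
#define MODEL_ETH_P_8021Q	0x8100
#define MODEL_ETH_P_8021AD	0x88A8

/* Hypothetical feature bits; the kernel's values differ. */
#define MODEL_F_CTAG_FILTER	(1u << 0)
#define MODEL_F_STAG_FILTER	(1u << 1)

/* Hypothetical stand-in for struct net_device. */
struct model_dev {
	uint32_t features;
};

/* Hypothetical stand-in for struct vlan_vid_info. */
struct model_vid {
	uint16_t proto;	/* network byte order, like vid_info->proto */
	uint16_t vid;
};

/* True only if the device filters VLANs of the given protocol. */
static bool model_hw_filter_capable(const struct model_dev *dev,
				    uint16_t proto)
{
	if (proto == htons(MODEL_ETH_P_8021Q) &&
	    (dev->features & MODEL_F_CTAG_FILTER))
		return true;
	if (proto == htons(MODEL_ETH_P_8021AD) &&
	    (dev->features & MODEL_F_STAG_FILTER))
		return true;
	return false;
}

/*
 * Walk by_dev's vid list the way the add/del loops would, skipping
 * entries whose protocol by_dev does not filter. Returns how many
 * vids would be propagated to the other device.
 */
static size_t model_count_propagated(const struct model_dev *by_dev,
				     const struct model_vid *vids,
				     size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		if (!model_hw_filter_capable(by_dev, vids[i].proto))
			continue;
		count++;
	}
	return count;
}
```

With one check shared by all three loops, the add path, its unwind path and the delete path skip exactly the same entries, so a vid that was never added to the slave is never deleted from it.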
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index 0beb44f2fe1f..e94b509386bb 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -407,6 +407,12 @@ int vlan_vids_add_by_dev(struct net_device *dev,
 		return 0;
 
 	list_for_each_entry(vid_info, &vlan_info->vid_list, list) {
+		if (!(by_dev->features & NETIF_F_HW_VLAN_CTAG_FILTER) &&
+		    vid_info->proto == htons(ETH_P_8021Q))
+			continue;
+		if (!(by_dev->features & NETIF_F_HW_VLAN_STAG_FILTER) &&
+		    vid_info->proto == htons(ETH_P_8021AD))
+			continue;
 		err = vlan_vid_add(dev, vid_info->proto, vid_info->vid);
 		if (err)
 			goto unwind;
@@ -417,6 +423,12 @@ int vlan_vids_add_by_dev(struct net_device *dev,
 	list_for_each_entry_continue_reverse(vid_info,
 					     &vlan_info->vid_list,
 					     list) {
+		if (!(by_dev->features & NETIF_F_HW_VLAN_CTAG_FILTER) &&
+		    vid_info->proto == htons(ETH_P_8021Q))
+			continue;
+		if (!(by_dev->features & NETIF_F_HW_VLAN_STAG_FILTER) &&
+		    vid_info->proto == htons(ETH_P_8021AD))
+			continue;
 		vlan_vid_del(dev, vid_info->proto, vid_info->vid);
 	}
 
@@ -436,8 +448,15 @@ void vlan_vids_del_by_dev(struct net_device *dev,
 	if (!vlan_info)
 		return;
 
-	list_for_each_entry(vid_info, &vlan_info->vid_list, list)
+	list_for_each_entry(vid_info, &vlan_info->vid_list, list) {
+		if (!(by_dev->features & NETIF_F_HW_VLAN_CTAG_FILTER) &&
+		    vid_info->proto == htons(ETH_P_8021Q))
+			continue;
+		if (!(by_dev->features & NETIF_F_HW_VLAN_STAG_FILTER) &&
+		    vid_info->proto == htons(ETH_P_8021AD))
+			continue;
 		vlan_vid_del(dev, vid_info->proto, vid_info->vid);
+	}
 }
 EXPORT_SYMBOL(vlan_vids_del_by_dev);
I got the bleow warning trace:

WARNING: CPU: 4 PID: 4056 at net/core/dev.c:11066 unregister_netdevice_many_notify
CPU: 4 PID: 4056 Comm: ip Not tainted 6.7.0-rc4+ #15
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
RIP: 0010:unregister_netdevice_many_notify+0x9a4/0x9b0
Call Trace:
 rtnl_dellink
 rtnetlink_rcv_msg
 netlink_rcv_skb
 netlink_unicast
 netlink_sendmsg
 __sock_sendmsg
 ____sys_sendmsg
 ___sys_sendmsg
 __sys_sendmsg
 do_syscall_64
 entry_SYSCALL_64_after_hwframe

It can be repoduced via:

    ip netns add ns1
    ip netns exec ns1 ip link add bond0 type bond mode 0
    ip netns exec ns1 ip link add bond_slave_1 type veth peer veth2
    ip netns exec ns1 ip link set bond_slave_1 master bond0
[1] ip netns exec ns1 ethtool -K bond0 rx-vlan-filter off
[2] ip netns exec ns1 ip link add link bond_slave_1 name bond_slave_1.0 type vlan id 0
[3] ip netns exec ns1 ip link add link bond0 name bond0.0 type vlan id 0
[4] ip netns exec ns1 ip link set bond_slave_1 nomaster
[5] ip netns exec ns1 ip link del veth2
    ip netns del ns1

This is all caused by command [1] turning off the rx-vlan-filter function
of bond0. The reason is the same as commit 01f4fd270870 ("bonding: Fix
incorrect deletion of ETH_P_8021AD protocol vid from slaves"). Commands
[2] [3] add the same vid to slave and master respectively, causing
command [4] to empty slave->vlan_info. The following command [5] triggers
this problem.

To fix this problem, we should add VLAN_FILTER feature checks in
vlan_vids_add_by_dev() and vlan_vids_del_by_dev() to prevent incorrect
addition or deletion of vlan_vid information.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Liu Jian <liujian56@huawei.com>
---
v1->v2: Modify patch title and commit message.
	Remove superfluous operations in ethtool/features.c and ioctl.c
 net/8021q/vlan_core.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)