Message ID | 20250308044726.1193222-1-sdf@fomichev.me (mailing list archive) |
---|---|
State | Accepted |
Commit | 0a13c1e0a449917b29c45d90701eededa69c99d3 |
Delegated to: | Netdev Maintainers |
Series | [net-next] net: revert to lockless TC_SETUP_BLOCK and TC_SETUP_FT |
On Sat, Mar 8, 2025 at 5:47 AM Stanislav Fomichev <sdf@fomichev.me> wrote:
>
> There is a couple of places from which we can arrive to ndo_setup_tc
> with TC_SETUP_BLOCK/TC_SETUP_FT:
> - netlink
> - netlink notifier
> - netdev notifier
>
> Locking netdev too deep in this call chain seems to be problematic
> (especially assuming some/all of the call_netdevice_notifiers
> NETDEV_UNREGISTER) might soon be running with the instance lock).
> Revert to lockless ndo_setup_tc for TC_SETUP_BLOCK/TC_SETUP_FT. NFT
> framework already takes care of most of the locking. Document
> the assumptions.
>
>
> Fixes: c4f0f30b424e ("net: hold netdev instance lock during nft ndo_setup_tc")
> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>

I think you forgot to mention syzbot.

Reported-by: syzbot+0afb4bcf91e5a1afdcad@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/67cb88d1.050a0220.d8275.022d.GAE@google.com/T/#u

Thanks.
On 03/08, Eric Dumazet wrote:
> On Sat, Mar 8, 2025 at 5:47 AM Stanislav Fomichev <sdf@fomichev.me> wrote:
> >
> > There is a couple of places from which we can arrive to ndo_setup_tc
> > with TC_SETUP_BLOCK/TC_SETUP_FT:
> > - netlink
> > - netlink notifier
> > - netdev notifier
> >
> > Locking netdev too deep in this call chain seems to be problematic
> > (especially assuming some/all of the call_netdevice_notifiers
> > NETDEV_UNREGISTER) might soon be running with the instance lock).
> > Revert to lockless ndo_setup_tc for TC_SETUP_BLOCK/TC_SETUP_FT. NFT
> > framework already takes care of most of the locking. Document
> > the assumptions.
> >
> >
> >
> > Fixes: c4f0f30b424e ("net: hold netdev instance lock during nft ndo_setup_tc")
> > Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
>
> I think you forgot to mention syzbot.
>
> Reported-by: syzbot+0afb4bcf91e5a1afdcad@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/netdev/67cb88d1.050a0220.d8275.022d.GAE@google.com/T/#u

Ah, yes, I was waiting for a repro, but should have attached the proper tags, thanks!
On Fri, Mar 07, 2025 at 08:47:26PM -0800, Stanislav Fomichev wrote:
> There is a couple of places from which we can arrive to ndo_setup_tc
> with TC_SETUP_BLOCK/TC_SETUP_FT:
> - netlink
> - netlink notifier
> - netdev notifier
>
> Locking netdev too deep in this call chain seems to be problematic
> (especially assuming some/all of the call_netdevice_notifiers
> NETDEV_UNREGISTER) might soon be running with the instance lock).
> Revert to lockless ndo_setup_tc for TC_SETUP_BLOCK/TC_SETUP_FT. NFT
> framework already takes care of most of the locking. Document
> the assumptions.
>
> ndo_setup_tc TC_SETUP_BLOCK
> nft_block_offload_cmd
> nft_chain_offload_cmd
> nft_flow_block_chain
> nft_flow_offload_chain
> nft_flow_rule_offload_abort
> nft_flow_rule_offload_commit
> nft_flow_rule_offload_commit
> nf_tables_commit
> nfnetlink_rcv_batch
> nfnetlink_rcv_skb_batch
> nfnetlink_rcv
> nft_offload_netdev_event
> NETDEV_UNREGISTER notifier
>
> ndo_setup_tc TC_SETUP_FT
> nf_flow_table_offload_cmd
> nf_flow_table_offload_setup
> nft_unregister_flowtable_hook
> nft_register_flowtable_net_hooks
> nft_flowtable_update
> nf_tables_newflowtable
> nfnetlink_rcv_batch (.call NFNL_CB_BATCH)
> nft_flowtable_update
> nf_tables_newflowtable
> nft_flowtable_event
> nf_tables_flowtable_event
> NETDEV_UNREGISTER notifier
> __nft_unregister_flowtable_net_hooks
> nft_unregister_flowtable_net_hooks
> nf_tables_commit
> nfnetlink_rcv_batch (.call NFNL_CB_BATCH)
> __nf_tables_abort
> nf_tables_abort
> nfnetlink_rcv_batch
> __nft_release_hook
> __nft_release_hooks
> nf_tables_pre_exit_net -> module unload
> nft_rcv_nl_event
> netlink_register_notifier (oh boy)
> nft_register_flowtable_net_hooks
> nft_flowtable_update
> nf_tables_newflowtable
> nf_tables_newflowtable
>
> Fixes: c4f0f30b424e ("net: hold netdev instance lock during nft ndo_setup_tc")
> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>

Thanks Stan,

Thinking aloud: the dev_setup_tc() helper checked if ndo_setup_tc is non-NULL, but that is not the case here. But that seems ok because it was also the case prior to the cited commit.

Reviewed-by: Simon Horman <horms@kernel.org>

...
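For readers following the remark above about the dropped NULL check, here is a minimal userspace sketch of what a guarded, lockless call could look like. It is not part of the patch and not kernel code: the struct definitions are simplified stand-ins, and `setup_tc_checked()` and `demo_setup_tc()` are hypothetical names used only for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumption, not kernel code). */
enum tc_setup_type { TC_SETUP_BLOCK, TC_SETUP_FT };

struct net_device;

struct net_device_ops {
	int (*ndo_setup_tc)(struct net_device *dev, enum tc_setup_type type,
			    void *type_data);
};

struct net_device {
	const struct net_device_ops *netdev_ops;
};

/* Hypothetical driver callback that accepts every request. */
static int demo_setup_tc(struct net_device *dev, enum tc_setup_type type,
			 void *type_data)
{
	(void)dev;
	(void)type;
	(void)type_data;
	return 0;
}

/*
 * Hypothetical helper: calls ndo_setup_tc without taking any lock, but
 * keeps the non-NULL check that the removed dev_setup_tc() performed.
 */
static int setup_tc_checked(struct net_device *dev, enum tc_setup_type type,
			    void *type_data)
{
	const struct net_device_ops *ops = dev->netdev_ops;

	if (!ops->ndo_setup_tc)
		return -EOPNOTSUPP;

	return ops->ndo_setup_tc(dev, type, type_data);
}
```

In this model a device whose ops leave `ndo_setup_tc` unset gets `-EOPNOTSUPP` instead of a NULL dereference. Whether the real call sites need such a guard is exactly the question raised in the review, and this sketch takes no position on it.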
Hello:

This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Fri, 7 Mar 2025 20:47:26 -0800 you wrote:
> There is a couple of places from which we can arrive to ndo_setup_tc
> with TC_SETUP_BLOCK/TC_SETUP_FT:
> - netlink
> - netlink notifier
> - netdev notifier
>
> Locking netdev too deep in this call chain seems to be problematic
> (especially assuming some/all of the call_netdevice_notifiers
> NETDEV_UNREGISTER) might soon be running with the instance lock).
> Revert to lockless ndo_setup_tc for TC_SETUP_BLOCK/TC_SETUP_FT. NFT
> framework already takes care of most of the locking. Document
> the assumptions.
>
> [...]

Here is the summary with links:
  - [net-next] net: revert to lockless TC_SETUP_BLOCK and TC_SETUP_FT
    https://git.kernel.org/netdev/net-next/c/0a13c1e0a449

You are awesome, thank you!
diff --git a/Documentation/networking/netdevices.rst b/Documentation/networking/netdevices.rst
index d235a7364893..ebb868f50ac2 100644
--- a/Documentation/networking/netdevices.rst
+++ b/Documentation/networking/netdevices.rst
@@ -290,6 +290,12 @@ struct net_device synchronization rules
 	Synchronization: netif_addr_lock spinlock.
 	Context: BHs disabled
 
+ndo_setup_tc:
+	``TC_SETUP_BLOCK`` and ``TC_SETUP_FT`` are running under NFT locks
+	(i.e. no ``rtnl_lock`` and no device instance lock). The rest of
+	``tc_setup_type`` types run under netdev instance lock if the driver
+	implements queue management or shaper API.
+
 Most ndo callbacks not specified in the list above are running
 under ``rtnl_lock``. In addition, netdev instance lock is taken as well if
 the driver implements queue management or shaper API.
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index d206c9592b60..87fd3b1f2b99 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3395,8 +3395,6 @@ int dev_open(struct net_device *dev, struct netlink_ext_ack *extack);
 void netif_close(struct net_device *dev);
 void dev_close(struct net_device *dev);
 void dev_close_many(struct list_head *head, bool unlink);
-int dev_setup_tc(struct net_device *dev, enum tc_setup_type type,
-		 void *type_data);
 void netif_disable_lro(struct net_device *dev);
 void dev_disable_lro(struct net_device *dev);
 int dev_loopback_xmit(struct net *net, struct sock *sk, struct sk_buff *newskb);
diff --git a/net/core/dev.c b/net/core/dev.c
index a0f75a1d1f5a..8b3603b764a7 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1760,25 +1760,6 @@ void netif_close(struct net_device *dev)
 	}
 }
 
-int dev_setup_tc(struct net_device *dev, enum tc_setup_type type,
-		 void *type_data)
-{
-	const struct net_device_ops *ops = dev->netdev_ops;
-	int ret;
-
-	ASSERT_RTNL();
-
-	if (!ops->ndo_setup_tc)
-		return -EOPNOTSUPP;
-
-	netdev_lock_ops(dev);
-	ret = ops->ndo_setup_tc(dev, type, type_data);
-	netdev_unlock_ops(dev);
-
-	return ret;
-}
-EXPORT_SYMBOL(dev_setup_tc);
-
 void netif_disable_lro(struct net_device *dev)
 {
 	struct net_device *lower_dev;
diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
index 0ec4abded10d..e06bc36f49fe 100644
--- a/net/netfilter/nf_flow_table_offload.c
+++ b/net/netfilter/nf_flow_table_offload.c
@@ -1175,7 +1175,7 @@ static int nf_flow_table_offload_cmd(struct flow_block_offload *bo,
 	nf_flow_table_block_offload_init(bo, dev_net(dev), cmd, flowtable,
 					 extack);
 	down_write(&flowtable->flow_block_lock);
-	err = dev_setup_tc(dev, TC_SETUP_FT, bo);
+	err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_FT, bo);
 	up_write(&flowtable->flow_block_lock);
 	if (err < 0)
 		return err;
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index b761899c143c..64675f1c7f29 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -390,7 +390,7 @@ static int nft_block_offload_cmd(struct nft_base_chain *chain,
 
 	nft_flow_block_offload_init(&bo, dev_net(dev), cmd, chain, &extack);
 
-	err = dev_setup_tc(dev, TC_SETUP_BLOCK, &bo);
+	err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
 	if (err < 0)
 		return err;
 
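The locking rule that the netdevices.rst hunk documents can be restated as a small lookup. The sketch below is a userspace toy, not kernel code: the `toy_` enums cover only a subset of `tc_setup_type` with made-up values, and `setup_tc_lock_ctx()` is a hypothetical name. The last context (driver without queue management or shaper API) is not spelled out in the new paragraph itself; the sketch only records that the instance lock is not taken in that case.

```c
#include <stdbool.h>

/* Toy subset of the kernel's enum tc_setup_type; values are illustrative. */
enum toy_tc_setup_type {
	TOY_TC_SETUP_QDISC_MQPRIO,
	TOY_TC_SETUP_BLOCK,
	TOY_TC_SETUP_FT,
};

/* Locking context in which ndo_setup_tc is invoked, per the documentation. */
enum toy_lock_ctx {
	TOY_CTX_NFT_LOCKS,        /* no rtnl_lock and no device instance lock */
	TOY_CTX_INSTANCE_LOCK,    /* netdev instance lock is held */
	TOY_CTX_NO_INSTANCE_LOCK, /* instance lock not taken for this driver */
};

/*
 * Hypothetical helper. driver_uses_instance_lock models "the driver
 * implements queue management or shaper API".
 */
static enum toy_lock_ctx setup_tc_lock_ctx(enum toy_tc_setup_type type,
					   bool driver_uses_instance_lock)
{
	switch (type) {
	case TOY_TC_SETUP_BLOCK:
	case TOY_TC_SETUP_FT:
		/* Reached from the NFT framework, which does its own locking. */
		return TOY_CTX_NFT_LOCKS;
	default:
		return driver_uses_instance_lock ? TOY_CTX_INSTANCE_LOCK
						 : TOY_CTX_NO_INSTANCE_LOCK;
	}
}
```

A driver author could read this as: handlers for `TC_SETUP_BLOCK` and `TC_SETUP_FT` must not assume the instance lock is held, even when the same driver's other `ndo_setup_tc` types do run under it.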
There is a couple of places from which we can arrive to ndo_setup_tc
with TC_SETUP_BLOCK/TC_SETUP_FT:
- netlink
- netlink notifier
- netdev notifier

Locking netdev too deep in this call chain seems to be problematic
(especially assuming some/all of the call_netdevice_notifiers
NETDEV_UNREGISTER) might soon be running with the instance lock).
Revert to lockless ndo_setup_tc for TC_SETUP_BLOCK/TC_SETUP_FT. NFT
framework already takes care of most of the locking. Document
the assumptions.

ndo_setup_tc TC_SETUP_BLOCK
nft_block_offload_cmd
nft_chain_offload_cmd
nft_flow_block_chain
nft_flow_offload_chain
nft_flow_rule_offload_abort
nft_flow_rule_offload_commit
nft_flow_rule_offload_commit
nf_tables_commit
nfnetlink_rcv_batch
nfnetlink_rcv_skb_batch
nfnetlink_rcv
nft_offload_netdev_event
NETDEV_UNREGISTER notifier

ndo_setup_tc TC_SETUP_FT
nf_flow_table_offload_cmd
nf_flow_table_offload_setup
nft_unregister_flowtable_hook
nft_register_flowtable_net_hooks
nft_flowtable_update
nf_tables_newflowtable
nfnetlink_rcv_batch (.call NFNL_CB_BATCH)
nft_flowtable_update
nf_tables_newflowtable
nft_flowtable_event
nf_tables_flowtable_event
NETDEV_UNREGISTER notifier
__nft_unregister_flowtable_net_hooks
nft_unregister_flowtable_net_hooks
nf_tables_commit
nfnetlink_rcv_batch (.call NFNL_CB_BATCH)
__nf_tables_abort
nf_tables_abort
nfnetlink_rcv_batch
__nft_release_hook
__nft_release_hooks
nf_tables_pre_exit_net -> module unload
nft_rcv_nl_event
netlink_register_notifier (oh boy)
nft_register_flowtable_net_hooks
nft_flowtable_update
nf_tables_newflowtable
nf_tables_newflowtable

Fixes: c4f0f30b424e ("net: hold netdev instance lock during nft ndo_setup_tc")
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
 Documentation/networking/netdevices.rst |  6 ++++++
 include/linux/netdevice.h               |  2 --
 net/core/dev.c                          | 19 -------------------
 net/netfilter/nf_flow_table_offload.c   |  2 +-
 net/netfilter/nf_tables_offload.c       |  2 +-
 5 files changed, 8 insertions(+), 23 deletions(-)
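To make the "locking too deep" concern concrete, here is a toy userspace model. It rests on the assumption stated in the commit message, namely that NETDEV_UNREGISTER notifiers may run with the instance lock already held. Everything in it is hypothetical: `toy_lock` is a flag that reports a would-be self-deadlock instead of blocking, and the function names are illustrative rather than kernel symbols.

```c
#include <stdbool.h>

/* Toy non-recursive lock: acquiring it twice would deadlock a real mutex. */
struct toy_lock {
	bool held;
};

/* Returns false where a real non-recursive lock would self-deadlock. */
static bool toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void toy_lock_release(struct toy_lock *l)
{
	l->held = false;
}

struct toy_netdev {
	struct toy_lock instance_lock;
	int setup_tc_calls;
};

/* Variant A: the offload command takes the instance lock itself. */
static bool offload_cmd_locked(struct toy_netdev *dev)
{
	if (!toy_lock_acquire(&dev->instance_lock))
		return false; /* caller already holds the lock */
	dev->setup_tc_calls++;
	toy_lock_release(&dev->instance_lock);
	return true;
}

/* Variant B: lockless call, relying on the caller's own locking. */
static bool offload_cmd_lockless(struct toy_netdev *dev)
{
	dev->setup_tc_calls++;
	return true;
}

/* Models a notifier that runs with the instance lock already held. */
static bool unregister_notifier(struct toy_netdev *dev,
				bool (*cmd)(struct toy_netdev *))
{
	bool ok;

	if (!toy_lock_acquire(&dev->instance_lock))
		return false;
	ok = cmd(dev);
	toy_lock_release(&dev->instance_lock);
	return ok;
}
```

In this model the locked variant works when entered directly, as on the netlink path, but fails when entered through the notifier, while the lockless variant works from both. The real kernel situation involves more locks and more call paths than this sketch covers.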