
[net-next] net: revert to lockless TC_SETUP_BLOCK and TC_SETUP_FT

Message ID: 20250308044726.1193222-1-sdf@fomichev.me
State: Accepted
Commit: 0a13c1e0a449917b29c45d90701eededa69c99d3
Delegated to: Netdev Maintainers
Series: [net-next] net: revert to lockless TC_SETUP_BLOCK and TC_SETUP_FT

Checks

Context                         Check    Description
netdev/series_format            success  Single patches do not need cover letters
netdev/tree_selection           success  Clearly marked for net-next, async
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 38 this patch: 38
netdev/build_tools              success  Errors and warnings before: 26 (+0) this patch: 26 (+0)
netdev/cc_maintainers           success  CCed 13 of 13 maintainers
netdev/build_clang              success  Errors and warnings before: 64 this patch: 64
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 4096 this patch: 4096
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 61 lines checked
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 85 this patch: 85
netdev/source_inline            success  Was 0 now: 0
netdev/contest                  success  net-next-2025-03-08--06-00 (tests: 894)

Commit Message

Stanislav Fomichev March 8, 2025, 4:47 a.m. UTC
There are a couple of places from which we can arrive at ndo_setup_tc
with TC_SETUP_BLOCK/TC_SETUP_FT:
- netlink
- netlink notifier
- netdev notifier

Locking netdev too deep in this call chain seems to be problematic
(especially assuming some/all of the call_netdevice_notifiers
NETDEV_UNREGISTER calls might soon be running with the instance lock).
Revert to lockless ndo_setup_tc for TC_SETUP_BLOCK/TC_SETUP_FT. The NFT
framework already takes care of most of the locking. Document
the assumptions.
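
One way to read the concern above: if a NETDEV_UNREGISTER notifier already
holds the instance lock when it reaches ndo_setup_tc, a callee that takes the
same lock again would deadlock. The following is a minimal userspace sketch of
that reading, not kernel code. It assumes a toy non-recursive lock that returns
-EDEADLK on relock instead of hanging, and every mock_* name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy non-recursive lock (hypothetical): reports a relock instead of hanging. */
struct mock_lock {
	bool held;
};

struct mock_netdev {
	struct mock_lock instance_lock;
	int setup_calls;	/* how many times the "driver callback" body ran */
};

static int mock_lock_acquire(struct mock_lock *lock)
{
	if (lock->held)
		return -EDEADLK;	/* a real mutex would block forever here */
	lock->held = true;
	return 0;
}

static void mock_lock_release(struct mock_lock *lock)
{
	assert(lock->held);	/* releasing an unheld lock is a bug */
	lock->held = false;
}

/* Variant that locks deep in the call chain, as the reverted helper did. */
static int mock_setup_tc_locked(struct mock_netdev *dev)
{
	int err = mock_lock_acquire(&dev->instance_lock);

	if (err)
		return err;
	dev->setup_calls++;
	mock_lock_release(&dev->instance_lock);
	return 0;
}

/* Lockless variant: relies on whatever locks the caller already holds. */
static int mock_setup_tc_lockless(struct mock_netdev *dev)
{
	dev->setup_calls++;
	return 0;
}

typedef int (*mock_setup_fn)(struct mock_netdev *dev);

/* Entry path that holds no instance lock (modelled on the netlink path). */
static int mock_netlink_path(struct mock_netdev *dev, mock_setup_fn setup)
{
	return setup(dev);
}

/*
 * Entry path that already holds the instance lock (the assumed future
 * behaviour of NETDEV_UNREGISTER notifiers).
 */
static int mock_notifier_path(struct mock_netdev *dev, mock_setup_fn setup)
{
	int ret = mock_lock_acquire(&dev->instance_lock);

	if (ret)
		return ret;
	ret = setup(dev);
	mock_lock_release(&dev->instance_lock);
	return ret;
}
```

In this model, mock_notifier_path(&dev, mock_setup_tc_locked) returns -EDEADLK,
while both entry paths succeed with mock_setup_tc_lockless.
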

ndo_setup_tc TC_SETUP_BLOCK
  nft_block_offload_cmd
    nft_chain_offload_cmd
      nft_flow_block_chain
        nft_flow_offload_chain
          nft_flow_rule_offload_abort
            nft_flow_rule_offload_commit
          nft_flow_rule_offload_commit
            nf_tables_commit
              nfnetlink_rcv_batch
                nfnetlink_rcv_skb_batch
                  nfnetlink_rcv
        nft_offload_netdev_event
          NETDEV_UNREGISTER notifier

ndo_setup_tc TC_SETUP_FT
  nf_flow_table_offload_cmd
    nf_flow_table_offload_setup
      nft_unregister_flowtable_hook
        nft_register_flowtable_net_hooks
          nft_flowtable_update
          nf_tables_newflowtable
            nfnetlink_rcv_batch (.call NFNL_CB_BATCH)
        nft_flowtable_update
          nf_tables_newflowtable
        nft_flowtable_event
          nf_tables_flowtable_event
            NETDEV_UNREGISTER notifier
      __nft_unregister_flowtable_net_hooks
        nft_unregister_flowtable_net_hooks
          nf_tables_commit
            nfnetlink_rcv_batch (.call NFNL_CB_BATCH)
          __nf_tables_abort
            nf_tables_abort
              nfnetlink_rcv_batch
        __nft_release_hook
          __nft_release_hooks
            nf_tables_pre_exit_net -> module unload
          nft_rcv_nl_event
            netlink_register_notifier (oh boy)
      nft_register_flowtable_net_hooks
        nft_flowtable_update
          nf_tables_newflowtable
        nf_tables_newflowtable

Fixes: c4f0f30b424e ("net: hold netdev instance lock during nft ndo_setup_tc")
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
 Documentation/networking/netdevices.rst |  6 ++++++
 include/linux/netdevice.h               |  2 --
 net/core/dev.c                          | 19 -------------------
 net/netfilter/nf_flow_table_offload.c   |  2 +-
 net/netfilter/nf_tables_offload.c       |  2 +-
 5 files changed, 8 insertions(+), 23 deletions(-)

Comments

Eric Dumazet March 8, 2025, 6:41 a.m. UTC | #1
On Sat, Mar 8, 2025 at 5:47 AM Stanislav Fomichev <sdf@fomichev.me> wrote:
>
> There is a couple of places from which we can arrive to ndo_setup_tc
> with TC_SETUP_BLOCK/TC_SETUP_FT:
> - netlink
> - netlink notifier
> - netdev notifier
>
> Locking netdev too deep in this call chain seems to be problematic
> (especially assuming some/all of the call_netdevice_notifiers
> NETDEV_UNREGISTER) might soon be running with the instance lock).
> Revert to lockless ndo_setup_tc for TC_SETUP_BLOCK/TC_SETUP_FT. NFT
> framework already takes care of most of the locking. Document
> the assumptions.
>


>
> Fixes: c4f0f30b424e ("net: hold netdev instance lock during nft ndo_setup_tc")
> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>

I think you forgot to mention syzbot.

Reported-by: syzbot+0afb4bcf91e5a1afdcad@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/67cb88d1.050a0220.d8275.022d.GAE@google.com/T/#u

Thanks.

Stanislav Fomichev March 8, 2025, 5:08 p.m. UTC | #2
On 03/08, Eric Dumazet wrote:
> On Sat, Mar 8, 2025 at 5:47 AM Stanislav Fomichev <sdf@fomichev.me> wrote:
> >
> > There is a couple of places from which we can arrive to ndo_setup_tc
> > with TC_SETUP_BLOCK/TC_SETUP_FT:
> > - netlink
> > - netlink notifier
> > - netdev notifier
> >
> > Locking netdev too deep in this call chain seems to be problematic
> > (especially assuming some/all of the call_netdevice_notifiers
> > NETDEV_UNREGISTER) might soon be running with the instance lock).
> > Revert to lockless ndo_setup_tc for TC_SETUP_BLOCK/TC_SETUP_FT. NFT
> > framework already takes care of most of the locking. Document
> > the assumptions.
> >
> 
> 
> >
> > Fixes: c4f0f30b424e ("net: hold netdev instance lock during nft ndo_setup_tc")
> > Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
> 
> I think you forgot to mention syzbot.
> 
> Reported-by: syzbot+0afb4bcf91e5a1afdcad@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/netdev/67cb88d1.050a0220.d8275.022d.GAE@google.com/T/#u

Ah, yes, I was waiting for a repro, but should have attached the proper
tags, thanks!

Simon Horman March 12, 2025, 12:43 p.m. UTC | #3
On Fri, Mar 07, 2025 at 08:47:26PM -0800, Stanislav Fomichev wrote:
> There is a couple of places from which we can arrive to ndo_setup_tc
> with TC_SETUP_BLOCK/TC_SETUP_FT:
> - netlink
> - netlink notifier
> - netdev notifier
> 
> Locking netdev too deep in this call chain seems to be problematic
> (especially assuming some/all of the call_netdevice_notifiers
> NETDEV_UNREGISTER) might soon be running with the instance lock).
> Revert to lockless ndo_setup_tc for TC_SETUP_BLOCK/TC_SETUP_FT. NFT
> framework already takes care of most of the locking. Document
> the assumptions.
> 
> ndo_setup_tc TC_SETUP_BLOCK
>   nft_block_offload_cmd
>     nft_chain_offload_cmd
>       nft_flow_block_chain
>         nft_flow_offload_chain
> 	  nft_flow_rule_offload_abort
> 	    nft_flow_rule_offload_commit
> 	  nft_flow_rule_offload_commit
> 	    nf_tables_commit
> 	      nfnetlink_rcv_batch
> 	        nfnetlink_rcv_skb_batch
> 		  nfnetlink_rcv
> 	nft_offload_netdev_event
> 	  NETDEV_UNREGISTER notifier
> 
> ndo_setup_tc TC_SETUP_FT
>   nf_flow_table_offload_cmd
>     nf_flow_table_offload_setup
>       nft_unregister_flowtable_hook
>         nft_register_flowtable_net_hooks
> 	  nft_flowtable_update
> 	  nf_tables_newflowtable
> 	    nfnetlink_rcv_batch (.call NFNL_CB_BATCH)
> 	nft_flowtable_update
> 	  nf_tables_newflowtable
> 	nft_flowtable_event
> 	  nf_tables_flowtable_event
> 	    NETDEV_UNREGISTER notifier
>       __nft_unregister_flowtable_net_hooks
>         nft_unregister_flowtable_net_hooks
> 	  nf_tables_commit
> 	    nfnetlink_rcv_batch (.call NFNL_CB_BATCH)
> 	  __nf_tables_abort
> 	    nf_tables_abort
> 	      nfnetlink_rcv_batch
> 	__nft_release_hook
> 	  __nft_release_hooks
> 	    nf_tables_pre_exit_net -> module unload
> 	  nft_rcv_nl_event
> 	    netlink_register_notifier (oh boy)
>       nft_register_flowtable_net_hooks
>       	nft_flowtable_update
> 	  nf_tables_newflowtable
>         nf_tables_newflowtable
> 
> Fixes: c4f0f30b424e ("net: hold netdev instance lock during nft ndo_setup_tc")
> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>

Thanks Stan,

Thinking aloud: the dev_setup_tc() helper checked if ndo_setup_tc is
non-NULL, but that is not the case here. But that seems ok because it was
also the case prior to the cited commit.
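
The difference described above can be sketched in userspace: the removed helper
returned -EOPNOTSUPP when the callback was missing, whereas a direct call
through the ops pointer assumes the caller has already established that the
callback exists. The mock_* types and functions below are hypothetical
stand-ins for illustration, not the kernel definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel types. */
enum mock_tc_setup_type { MOCK_TC_SETUP_BLOCK, MOCK_TC_SETUP_FT };

struct mock_net_device;

struct mock_net_device_ops {
	int (*ndo_setup_tc)(struct mock_net_device *dev,
			    enum mock_tc_setup_type type, void *type_data);
};

struct mock_net_device {
	const struct mock_net_device_ops *netdev_ops;
};

/*
 * Guarded call, mirroring the NULL check that the removed dev_setup_tc()
 * performed (without its locking).
 */
static int mock_setup_tc_checked(struct mock_net_device *dev,
				 enum mock_tc_setup_type type, void *type_data)
{
	const struct mock_net_device_ops *ops;

	assert(dev && dev->netdev_ops);
	ops = dev->netdev_ops;

	if (!ops->ndo_setup_tc)
		return -EOPNOTSUPP;

	return ops->ndo_setup_tc(dev, type, type_data);
}

/* Example "driver" callback: counts invocations through type_data. */
static int mock_driver_setup_tc(struct mock_net_device *dev,
				enum mock_tc_setup_type type, void *type_data)
{
	(void)dev;
	(void)type;
	(*(int *)type_data)++;
	return 0;
}
```

A device whose ops leave ndo_setup_tc NULL yields -EOPNOTSUPP from
mock_setup_tc_checked(), whereas an unguarded
dev->netdev_ops->ndo_setup_tc(...) call on the same device would dereference a
NULL function pointer.
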

Reviewed-by: Simon Horman <horms@kernel.org>

...

patchwork-bot+netdevbpf@kernel.org March 12, 2025, 8:20 p.m. UTC | #4
Hello:

This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Fri,  7 Mar 2025 20:47:26 -0800 you wrote:
> There is a couple of places from which we can arrive to ndo_setup_tc
> with TC_SETUP_BLOCK/TC_SETUP_FT:
> - netlink
> - netlink notifier
> - netdev notifier
> 
> Locking netdev too deep in this call chain seems to be problematic
> (especially assuming some/all of the call_netdevice_notifiers
> NETDEV_UNREGISTER) might soon be running with the instance lock).
> Revert to lockless ndo_setup_tc for TC_SETUP_BLOCK/TC_SETUP_FT. NFT
> framework already takes care of most of the locking. Document
> the assumptions.
> 
> [...]

Here is the summary with links:
  - [net-next] net: revert to lockless TC_SETUP_BLOCK and TC_SETUP_FT
    https://git.kernel.org/netdev/net-next/c/0a13c1e0a449

You are awesome, thank you!

Patch

diff --git a/Documentation/networking/netdevices.rst b/Documentation/networking/netdevices.rst
index d235a7364893..ebb868f50ac2 100644
--- a/Documentation/networking/netdevices.rst
+++ b/Documentation/networking/netdevices.rst
@@ -290,6 +290,12 @@  struct net_device synchronization rules
 	Synchronization: netif_addr_lock spinlock.
 	Context: BHs disabled
 
+ndo_setup_tc:
+	``TC_SETUP_BLOCK`` and ``TC_SETUP_FT`` are running under NFT locks
+	(i.e. no ``rtnl_lock`` and no device instance lock). The rest of
+	``tc_setup_type`` types run under netdev instance lock if the driver
+	implements queue management or shaper API.
+
 Most ndo callbacks not specified in the list above are running
 under ``rtnl_lock``. In addition, netdev instance lock is taken as well if
 the driver implements queue management or shaper API.
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index d206c9592b60..87fd3b1f2b99 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3395,8 +3395,6 @@  int dev_open(struct net_device *dev, struct netlink_ext_ack *extack);
 void netif_close(struct net_device *dev);
 void dev_close(struct net_device *dev);
 void dev_close_many(struct list_head *head, bool unlink);
-int dev_setup_tc(struct net_device *dev, enum tc_setup_type type,
-		 void *type_data);
 void netif_disable_lro(struct net_device *dev);
 void dev_disable_lro(struct net_device *dev);
 int dev_loopback_xmit(struct net *net, struct sock *sk, struct sk_buff *newskb);
diff --git a/net/core/dev.c b/net/core/dev.c
index a0f75a1d1f5a..8b3603b764a7 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1760,25 +1760,6 @@  void netif_close(struct net_device *dev)
 	}
 }
 
-int dev_setup_tc(struct net_device *dev, enum tc_setup_type type,
-		 void *type_data)
-{
-	const struct net_device_ops *ops = dev->netdev_ops;
-	int ret;
-
-	ASSERT_RTNL();
-
-	if (!ops->ndo_setup_tc)
-		return -EOPNOTSUPP;
-
-	netdev_lock_ops(dev);
-	ret = ops->ndo_setup_tc(dev, type, type_data);
-	netdev_unlock_ops(dev);
-
-	return ret;
-}
-EXPORT_SYMBOL(dev_setup_tc);
-
 void netif_disable_lro(struct net_device *dev)
 {
 	struct net_device *lower_dev;
diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
index 0ec4abded10d..e06bc36f49fe 100644
--- a/net/netfilter/nf_flow_table_offload.c
+++ b/net/netfilter/nf_flow_table_offload.c
@@ -1175,7 +1175,7 @@  static int nf_flow_table_offload_cmd(struct flow_block_offload *bo,
 	nf_flow_table_block_offload_init(bo, dev_net(dev), cmd, flowtable,
 					 extack);
 	down_write(&flowtable->flow_block_lock);
-	err = dev_setup_tc(dev, TC_SETUP_FT, bo);
+	err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_FT, bo);
 	up_write(&flowtable->flow_block_lock);
 	if (err < 0)
 		return err;
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index b761899c143c..64675f1c7f29 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -390,7 +390,7 @@  static int nft_block_offload_cmd(struct nft_base_chain *chain,
 
 	nft_flow_block_offload_init(&bo, dev_net(dev), cmd, chain, &extack);
 
-	err = dev_setup_tc(dev, TC_SETUP_BLOCK, &bo);
+	err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
 	if (err < 0)
 		return err;