diff mbox series

[net,5/6] net/mlx5: Bridge, fix the crash caused by LAG state check

Message ID 1741644104-97767-6-git-send-email-tariqt@nvidia.com (mailing list archive)
State New
Headers show
Series mlx5 misc fixes 2025-03-10 | expand

Commit Message

Tariq Toukan March 10, 2025, 10:01 p.m. UTC
From: Jianbo Liu <jianbol@nvidia.com>

When removing LAG device from bridge, NETDEV_CHANGEUPPER event is
triggered. Driver finds the lower devices (PFs) to flush all the
offloaded entries. And mlx5_lag_is_shared_fdb is checked, it returns
false if one of PF is unloaded. In such case,
mlx5_esw_bridge_lag_rep_get() and its caller return NULL, instead of
the alive PF, and the flush is skipped.

Besides, the bridge fdb entry's lastuse is updated in mlx5 bridge
event handler. But this SWITCHDEV_FDB_ADD_TO_BRIDGE event can be
ignored in this case because the upper interface for bond is deleted,
and the entry will never be aged because lastuse is never updated.

To make things worse, as the entry is alive, mlx5 bridge workqueue
keeps sending that event, which is then handled by kernel bridge
notifier. It causes the following crash when accessing the passed bond
netdev which is already destroyed.

To fix this issue, remove such checks. LAG state is already checked in
commit 15f8f168952f ("net/mlx5: Bridge, verify LAG state when adding
bond to bridge"), driver still need to skip offload if LAG becomes
invalid state after initialization.

 Oops: stack segment: 0000 [#1] SMP
 CPU: 3 UID: 0 PID: 23695 Comm: kworker/u40:3 Tainted: G           OE      6.11.0_mlnx #1
 Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 Workqueue: mlx5_bridge_wq mlx5_esw_bridge_update_work [mlx5_core]
 RIP: 0010:br_switchdev_event+0x2c/0x110 [bridge]
 Code: 44 00 00 48 8b 02 48 f7 00 00 02 00 00 74 69 41 54 55 53 48 83 ec 08 48 8b a8 08 01 00 00 48 85 ed 74 4a 48 83 fe 02 48 89 d3 <4c> 8b 65 00 74 23 76 49 48 83 fe 05 74 7e 48 83 fe 06 75 2f 0f b7
 RSP: 0018:ffffc900092cfda0 EFLAGS: 00010297
 RAX: ffff888123bfe000 RBX: ffffc900092cfe08 RCX: 00000000ffffffff
 RDX: ffffc900092cfe08 RSI: 0000000000000001 RDI: ffffffffa0c585f0
 RBP: 6669746f6e690a30 R08: 0000000000000000 R09: ffff888123ae92c8
 R10: 0000000000000000 R11: fefefefefefefeff R12: ffff888123ae9c60
 R13: 0000000000000001 R14: ffffc900092cfe08 R15: 0000000000000000
 FS:  0000000000000000(0000) GS:ffff88852c980000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f15914c8734 CR3: 0000000002830005 CR4: 0000000000770ef0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 PKRU: 55555554
 Call Trace:
  <TASK>
  ? __die_body+0x1a/0x60
  ? die+0x38/0x60
  ? do_trap+0x10b/0x120
  ? do_error_trap+0x64/0xa0
  ? exc_stack_segment+0x33/0x50
  ? asm_exc_stack_segment+0x22/0x30
  ? br_switchdev_event+0x2c/0x110 [bridge]
  ? sched_balance_newidle.isra.149+0x248/0x390
  notifier_call_chain+0x4b/0xa0
  atomic_notifier_call_chain+0x16/0x20
  mlx5_esw_bridge_update+0xec/0x170 [mlx5_core]
  mlx5_esw_bridge_update_work+0x19/0x40 [mlx5_core]
  process_scheduled_works+0x81/0x390
  worker_thread+0x106/0x250
  ? bh_worker+0x110/0x110
  kthread+0xb7/0xe0
  ? kthread_park+0x80/0x80
  ret_from_fork+0x2d/0x50
  ? kthread_park+0x80/0x80
  ret_from_fork_asm+0x11/0x20
  </TASK>

Fixes: ff9b7521468b ("net/mlx5: Bridge, support LAG")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
 .../net/ethernet/mellanox/mlx5/core/en/rep/bridge.c  | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

Comments

Michal Swiatkowski March 11, 2025, 8:02 a.m. UTC | #1
On Tue, Mar 11, 2025 at 12:01:43AM +0200, Tariq Toukan wrote:
> From: Jianbo Liu <jianbol@nvidia.com>
> 
> When removing LAG device from bridge, NETDEV_CHANGEUPPER event is
> triggered. Driver finds the lower devices (PFs) to flush all the
> offloaded entries. And mlx5_lag_is_shared_fdb is checked, it returns
> false if one of PF is unloaded. In such case,
> mlx5_esw_bridge_lag_rep_get() and its caller return NULL, instead of
> the alive PF, and the flush is skipped.
> 
> Besides, the bridge fdb entry's lastuse is updated in mlx5 bridge
> event handler. But this SWITCHDEV_FDB_ADD_TO_BRIDGE event can be
> ignored in this case because the upper interface for bond is deleted,
> and the entry will never be aged because lastuse is never updated.
> 
> To make things worse, as the entry is alive, mlx5 bridge workqueue
> keeps sending that event, which is then handled by kernel bridge
> notifier. It causes the following crash when accessing the passed bond
> netdev which is already destroyed.
> 
> To fix this issue, remove such checks. LAG state is already checked in
> commit 15f8f168952f ("net/mlx5: Bridge, verify LAG state when adding
> bond to bridge"), driver still need to skip offload if LAG becomes
> invalid state after initialization.
> 
>  Oops: stack segment: 0000 [#1] SMP
>  CPU: 3 UID: 0 PID: 23695 Comm: kworker/u40:3 Tainted: G           OE      6.11.0_mlnx #1
>  Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
>  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>  Workqueue: mlx5_bridge_wq mlx5_esw_bridge_update_work [mlx5_core]
>  RIP: 0010:br_switchdev_event+0x2c/0x110 [bridge]
>  Code: 44 00 00 48 8b 02 48 f7 00 00 02 00 00 74 69 41 54 55 53 48 83 ec 08 48 8b a8 08 01 00 00 48 85 ed 74 4a 48 83 fe 02 48 89 d3 <4c> 8b 65 00 74 23 76 49 48 83 fe 05 74 7e 48 83 fe 06 75 2f 0f b7
>  RSP: 0018:ffffc900092cfda0 EFLAGS: 00010297
>  RAX: ffff888123bfe000 RBX: ffffc900092cfe08 RCX: 00000000ffffffff
>  RDX: ffffc900092cfe08 RSI: 0000000000000001 RDI: ffffffffa0c585f0
>  RBP: 6669746f6e690a30 R08: 0000000000000000 R09: ffff888123ae92c8
>  R10: 0000000000000000 R11: fefefefefefefeff R12: ffff888123ae9c60
>  R13: 0000000000000001 R14: ffffc900092cfe08 R15: 0000000000000000
>  FS:  0000000000000000(0000) GS:ffff88852c980000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 00007f15914c8734 CR3: 0000000002830005 CR4: 0000000000770ef0
>  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>  PKRU: 55555554
>  Call Trace:
>   <TASK>
>   ? __die_body+0x1a/0x60
>   ? die+0x38/0x60
>   ? do_trap+0x10b/0x120
>   ? do_error_trap+0x64/0xa0
>   ? exc_stack_segment+0x33/0x50
>   ? asm_exc_stack_segment+0x22/0x30
>   ? br_switchdev_event+0x2c/0x110 [bridge]
>   ? sched_balance_newidle.isra.149+0x248/0x390
>   notifier_call_chain+0x4b/0xa0
>   atomic_notifier_call_chain+0x16/0x20
>   mlx5_esw_bridge_update+0xec/0x170 [mlx5_core]
>   mlx5_esw_bridge_update_work+0x19/0x40 [mlx5_core]
>   process_scheduled_works+0x81/0x390
>   worker_thread+0x106/0x250
>   ? bh_worker+0x110/0x110
>   kthread+0xb7/0xe0
>   ? kthread_park+0x80/0x80
>   ret_from_fork+0x2d/0x50
>   ? kthread_park+0x80/0x80
>   ret_from_fork_asm+0x11/0x20
>   </TASK>
> 
> Fixes: ff9b7521468b ("net/mlx5: Bridge, support LAG")
> Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
> Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
> ---
>  .../net/ethernet/mellanox/mlx5/core/en/rep/bridge.c  | 12 +++++-------
>  1 file changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
> index 5d128c5b4529..0f5d7ea8956f 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
> @@ -48,15 +48,10 @@ mlx5_esw_bridge_lag_rep_get(struct net_device *dev, struct mlx5_eswitch *esw)
>  	struct list_head *iter;
>  
>  	netdev_for_each_lower_dev(dev, lower, iter) {
> -		struct mlx5_core_dev *mdev;
> -		struct mlx5e_priv *priv;
> -
>  		if (!mlx5e_eswitch_rep(lower))
>  			continue;
>  
> -		priv = netdev_priv(lower);
> -		mdev = priv->mdev;
> -		if (mlx5_lag_is_shared_fdb(mdev) && mlx5_esw_bridge_dev_same_esw(lower, esw))
> +		if (mlx5_esw_bridge_dev_same_esw(lower, esw))
>  			return lower;
>  	}
>  
> @@ -125,7 +120,7 @@ static bool mlx5_esw_bridge_is_local(struct net_device *dev, struct net_device *
>  	priv = netdev_priv(rep);
>  	mdev = priv->mdev;
>  	if (netif_is_lag_master(dev))
> -		return mlx5_lag_is_shared_fdb(mdev) && mlx5_lag_is_master(mdev);
> +		return mlx5_lag_is_master(mdev);
>  	return true;
>  }
>  
> @@ -455,6 +450,9 @@ static int mlx5_esw_bridge_switchdev_event(struct notifier_block *nb,
>  	if (!rep)
>  		return NOTIFY_DONE;
>  
> +	if (netif_is_lag_master(dev) && !mlx5_lag_is_shared_fdb(esw->dev))
> +		return NOTIFY_DONE;
> +
>  	switch (event) {
>  	case SWITCHDEV_FDB_ADD_TO_BRIDGE:
>  		fdb_info = container_of(info,

Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>

> -- 
> 2.31.1
diff mbox series

Patch

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
index 5d128c5b4529..0f5d7ea8956f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c
@@ -48,15 +48,10 @@  mlx5_esw_bridge_lag_rep_get(struct net_device *dev, struct mlx5_eswitch *esw)
 	struct list_head *iter;
 
 	netdev_for_each_lower_dev(dev, lower, iter) {
-		struct mlx5_core_dev *mdev;
-		struct mlx5e_priv *priv;
-
 		if (!mlx5e_eswitch_rep(lower))
 			continue;
 
-		priv = netdev_priv(lower);
-		mdev = priv->mdev;
-		if (mlx5_lag_is_shared_fdb(mdev) && mlx5_esw_bridge_dev_same_esw(lower, esw))
+		if (mlx5_esw_bridge_dev_same_esw(lower, esw))
 			return lower;
 	}
 
@@ -125,7 +120,7 @@  static bool mlx5_esw_bridge_is_local(struct net_device *dev, struct net_device *
 	priv = netdev_priv(rep);
 	mdev = priv->mdev;
 	if (netif_is_lag_master(dev))
-		return mlx5_lag_is_shared_fdb(mdev) && mlx5_lag_is_master(mdev);
+		return mlx5_lag_is_master(mdev);
 	return true;
 }
 
@@ -455,6 +450,9 @@  static int mlx5_esw_bridge_switchdev_event(struct notifier_block *nb,
 	if (!rep)
 		return NOTIFY_DONE;
 
+	if (netif_is_lag_master(dev) && !mlx5_lag_is_shared_fdb(esw->dev))
+		return NOTIFY_DONE;
+
 	switch (event) {
 	case SWITCHDEV_FDB_ADD_TO_BRIDGE:
 		fdb_info = container_of(info,