[net] net/mlx5: Fix management PF condition

Message ID 20230420035652.295680-1-saeed@kernel.org (mailing list archive)
State Deferred
Delegated to: Netdev Maintainers

Checks

Context                 Check    Description
netdev/tree_selection   success  Clearly marked for net
netdev/apply            fail     Patch does not apply to net

Commit Message

Saeed Mahameed April 20, 2023, 3:56 a.m. UTC
From: Saeed Mahameed <saeedm@nvidia.com>

Paul reports that the commit in the Fixes tag below causes a regression
with IB on CX4 and FW 12.18.1000.

Management PF capabilities can appear set on old FW because they reuse
old reserved bits. To avoid such issues, explicitly check for new
BlueField devices as well as for the management PF capabilities, since
the management PF is a minimal eth device meant for communication between
the ARM cores of the BlueField chip and the BMC node.

This should fix the issue, since BlueField devices have relatively new
firmware that does not have this bug.

Fixes: fe998a3c77b9 ("net/mlx5: Enable management PF initialization")
Reported-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/all/CAHC9VhQ7A4+msL38WpbOMYjAqLp0EtOjeLh4Dc6SQtD6OUvCQg@mail.gmail.com/
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---

Notes:
    This patch has a couple of TODOs. Since this is a fix, it takes the
    shortest path to a solution; the refactoring will follow on net-next.

    I hope Paul can test it before tomorrow's net PR, and I will ask Shay to
    test it internally today if he can.

 drivers/net/ethernet/mellanox/mlx5/core/main.c | 14 +++++++++++---
 include/linux/mlx5/driver.h                    |  7 ++++++-
 2 files changed, 17 insertions(+), 4 deletions(-)

Comments

Paul Moore April 20, 2023, 5:22 p.m. UTC | #1
On Wed, Apr 19, 2023 at 11:57 PM Saeed Mahameed <saeed@kernel.org> wrote:
> From: Saeed Mahameed <saeedm@nvidia.com>
>
> Paul reports that the commit in the Fixes tag below causes a regression
> with IB on CX4 and FW 12.18.1000.
>
> Management PF capabilities can appear set on old FW because they reuse
> old reserved bits. To avoid such issues, explicitly check for new
> BlueField devices as well as for the management PF capabilities, since
> the management PF is a minimal eth device meant for communication between
> the ARM cores of the BlueField chip and the BMC node.
>
> This should fix the issue, since BlueField devices have relatively new
> firmware that does not have this bug.
>
> Fixes: fe998a3c77b9 ("net/mlx5: Enable management PF initialization")
> Reported-by: Paul Moore <paul@paul-moore.com>
> Link: https://lore.kernel.org/all/CAHC9VhQ7A4+msL38WpbOMYjAqLp0EtOjeLh4Dc6SQtD6OUvCQg@mail.gmail.com/
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> ---
>
> Notes:
>     This patch has a couple of TODOs. Since this is a fix, it takes the
>     shortest path to a solution; the refactoring will follow on net-next.
>
>     I hope Paul can test it before tomorrow's net PR, and I will ask Shay to
>     test it internally today if he can.
>
>  drivers/net/ethernet/mellanox/mlx5/core/main.c | 14 +++++++++++---
>  include/linux/mlx5/driver.h                    |  7 ++++++-
>  2 files changed, 17 insertions(+), 4 deletions(-)

I'm not sure where this patch stands given the current timing, but I
ran this through my automated testing this morning and everything
looked good on my end.

Tested-by: Paul Moore <paul@paul-moore.com>

Patch

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index f1de152a6113..95818c5a132b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1737,8 +1737,13 @@  static int probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
 	dev->device = &pdev->dev;
 	dev->pdev = pdev;
 
+	/** TODO: don't maintain both coredev_type and has_mpf fields, just copy
+	 * the value from id->driver_data into a new instance dev->driver_data
+	 * and use it as is in the helper functions.
+	 */
 	dev->coredev_type = id->driver_data & MLX5_PCI_DEV_IS_VF ?
 			 MLX5_COREDEV_VF : MLX5_COREDEV_PF;
+	dev->priv.has_mpf = id->driver_data & MLX5_DD_HAS_MPF;
 
 	dev->priv.adev_idx = mlx5_adev_idx_alloc();
 	if (dev->priv.adev_idx < 0) {
@@ -2026,9 +2031,12 @@  static const struct pci_device_id mlx5_core_pci_table[] = {
 	{ PCI_VDEVICE(MELLANOX, 0x1023) },			/* ConnectX-8 */
 	{ PCI_VDEVICE(MELLANOX, 0xa2d2) },			/* BlueField integrated ConnectX-5 network controller */
 	{ PCI_VDEVICE(MELLANOX, 0xa2d3), MLX5_PCI_DEV_IS_VF},	/* BlueField integrated ConnectX-5 network controller VF */
-	{ PCI_VDEVICE(MELLANOX, 0xa2d6) },			/* BlueField-2 integrated ConnectX-6 Dx network controller */
-	{ PCI_VDEVICE(MELLANOX, 0xa2dc) },			/* BlueField-3 integrated ConnectX-7 network controller */
-	{ PCI_VDEVICE(MELLANOX, 0xa2df) },			/* BlueField-4 integrated ConnectX-8 network controller */
+	/* BlueField-2 integrated ConnectX-6 Dx network controller */
+	{ PCI_VDEVICE(MELLANOX, 0xa2d6), MLX5_DD_HAS_MPF},
+	/* BlueField-3 integrated ConnectX-7 network controller */
+	{ PCI_VDEVICE(MELLANOX, 0xa2dc), MLX5_DD_HAS_MPF},
+	/* BlueField-4 integrated ConnectX-8 network controller */
+	{ PCI_VDEVICE(MELLANOX, 0xa2df), MLX5_DD_HAS_MPF},
 	{ 0, }
 };
 
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index f33389b42209..149e7e5a2cf7 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -633,6 +633,7 @@  struct mlx5_priv {
 
 	struct mlx5_bfreg_data		bfregs;
 	struct mlx5_uars_page	       *uar;
+	bool has_mpf; /* TODO: Merge with mdev->coredev_type */
 #ifdef CONFIG_MLX5_SF
 	struct mlx5_vhca_state_notifier *vhca_state_notifier;
 	struct mlx5_sf_dev_table *sf_dev_table;
@@ -1197,8 +1198,9 @@  int mlx5_rdma_rn_get_params(struct mlx5_core_dev *mdev,
 			    struct ib_device *device,
 			    struct rdma_netdev_alloc_params *params);
 
-enum {
+enum { /* Per Device data */
 	MLX5_PCI_DEV_IS_VF		= 1 << 0,
+	MLX5_DD_HAS_MPF			= 1 << 1, /* Device has a management PF */
 };
 
 static inline bool mlx5_core_is_pf(const struct mlx5_core_dev *dev)
@@ -1213,6 +1215,9 @@  static inline bool mlx5_core_is_vf(const struct mlx5_core_dev *dev)
 
 static inline bool mlx5_core_is_management_pf(const struct mlx5_core_dev *dev)
 {
+	if (!dev->priv.has_mpf) /* Can this device support a management PF? */
+		return false;
+	/* Is this an MPF function? */
 	return MLX5_CAP_GEN(dev, num_ports) == 1 && !MLX5_CAP_GEN(dev, native_port_num);
 }
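
For readers who want to see the gating logic outside the driver, the following is a minimal userspace sketch of what the patch appears to do: a per-device flag from the PCI ID table is copied into the device at probe time, and the management PF check trusts the capability bits only when that flag is set. All demo_* and DEMO_* names are hypothetical stand-ins, and the two integer fields are assumed substitutes for the MLX5_CAP_GEN() queries; this is not the mlx5 code itself.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the per-device driver_data flags in the patch. */
enum {
	DEMO_PCI_DEV_IS_VF = 1 << 0,
	DEMO_DD_HAS_MPF    = 1 << 1, /* device can have a management PF */
};

/* Hypothetical, simplified device state; not the real struct mlx5_core_dev. */
struct demo_dev {
	bool is_vf;
	bool has_mpf;                 /* copied from the ID table entry at probe */
	unsigned int num_ports;       /* assumed stand-in for the num_ports cap */
	unsigned int native_port_num; /* assumed stand-in for native_port_num */
};

/* Mirrors the probe-time step: decode the ID table flags into the device. */
static void demo_probe(struct demo_dev *dev, unsigned long driver_data,
		       unsigned int num_ports, unsigned int native_port_num)
{
	dev->is_vf = driver_data & DEMO_PCI_DEV_IS_VF;
	dev->has_mpf = driver_data & DEMO_DD_HAS_MPF;
	dev->num_ports = num_ports;
	dev->native_port_num = native_port_num;
}

/* Mirrors the patched helper: consult the caps only on flagged devices. */
static bool demo_is_management_pf(const struct demo_dev *dev)
{
	if (!dev->has_mpf)
		return false; /* old FW may report stale reserved bits */
	return dev->num_ports == 1 && !dev->native_port_num;
}
```

In this sketch, a device probed with driver_data 0 is never treated as a management PF, even if its capability fields happen to read num_ports == 1 and native_port_num == 0, which is the situation the commit message describes for old firmware.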