[net-next,3/3] mlxsw: pci: Lock configuration space of upstream bridge during reset

Message ID: b2090f454fbde67d47c6204e0c127a07fdeb8ca1.1719849427.git.petrm@nvidia.com
State: Changes Requested
Delegated to: Netdev Maintainers
Series: mlxsw: Improvements

Checks

Context                         Check    Description
netdev/series_format            success  Posting correctly formatted
netdev/tree_selection           success  Clearly marked for net-next
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 839 this patch: 839
netdev/build_tools              success  No tools touched, skip
netdev/cc_maintainers           success  CCed 6 of 6 maintainers
netdev/build_clang              success  Errors and warnings before: 846 this patch: 846
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 846 this patch: 846
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 24 lines checked
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0
netdev/contest                  success  net-next-2024-07-01--21-00 (tests: 665)

Commit Message

Petr Machata July 1, 2024, 4:41 p.m. UTC
From: Ido Schimmel <idosch@nvidia.com>

The driver triggers a "Secondary Bus Reset" (SBR) by calling
__pci_reset_function_locked() which asserts the SBR bit in the "Bridge
Control Register" in the configuration space of the upstream bridge for
2ms. This is done without locking the configuration space of the
upstream bridge port, allowing user space to access it concurrently.

Linux 6.11 will start warning about such unlocked resets [1][2]:

pcieport 0000:00:01.0: unlocked secondary bus reset via: pci_reset_bus_function+0x51c/0x6a0

Avoid the warning by locking the configuration space of the upstream
bridge prior to the reset and unlocking it afterwards.

[1] https://lore.kernel.org/all/171711746953.1628941.4692125082286867825.stgit@dwillia2-xfh.jf.intel.com/
[2] https://lore.kernel.org/all/20240531213150.GA610983@bhelgaas/

Cc: linux-pci@vger.kernel.org
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlxsw/pci.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Przemek Kitszel July 2, 2024, 7:35 a.m. UTC | #1
On 7/1/24 18:41, Petr Machata wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> The driver triggers a "Secondary Bus Reset" (SBR) by calling
> __pci_reset_function_locked() which asserts the SBR bit in the "Bridge
> Control Register" in the configuration space of the upstream bridge for
> 2ms. This is done without locking the configuration space of the
> upstream bridge port, allowing user space to access it concurrently.

This means your patch is a bugfix.

> 
> Linux 6.11 will start warning about such unlocked resets [1][2]:
> 
> pcieport 0000:00:01.0: unlocked secondary bus reset via: pci_reset_bus_function+0x51c/0x6a0
> 
> Avoid the warning by locking the configuration space of the upstream
> bridge prior to the reset and unlocking it afterwards.

You are not avoiding the warning but protecting against concurrent
access; please add a Fixes tag.

> 
> [1] https://lore.kernel.org/all/171711746953.1628941.4692125082286867825.stgit@dwillia2-xfh.jf.intel.com/
> [2] https://lore.kernel.org/all/20240531213150.GA610983@bhelgaas/
> 
> Cc: linux-pci@vger.kernel.org
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> Signed-off-by: Petr Machata <petrm@nvidia.com>
> ---
>   drivers/net/ethernet/mellanox/mlxsw/pci.c | 6 ++++++
>   1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci.c b/drivers/net/ethernet/mellanox/mlxsw/pci.c
> index 0320dabd1380..060e5b939211 100644
> --- a/drivers/net/ethernet/mellanox/mlxsw/pci.c
> +++ b/drivers/net/ethernet/mellanox/mlxsw/pci.c
> @@ -1784,6 +1784,7 @@ static int mlxsw_pci_reset_at_pci_disable(struct mlxsw_pci *mlxsw_pci,
>   {
>   	struct pci_dev *pdev = mlxsw_pci->pdev;
>   	char mrsr_pl[MLXSW_REG_MRSR_LEN];
> +	struct pci_dev *bridge;
>   	int err;
>   
>   	if (!pci_reset_sbr_supported) {
> @@ -1800,6 +1801,9 @@ static int mlxsw_pci_reset_at_pci_disable(struct mlxsw_pci *mlxsw_pci,
>   sbr:
>   	device_lock_assert(&pdev->dev);
>   
> +	bridge = pci_upstream_bridge(pdev);
> +	if (bridge)
> +		pci_cfg_access_lock(bridge);
>   	pci_cfg_access_lock(pdev);
>   	pci_save_state(pdev);
>   
> @@ -1809,6 +1813,8 @@ static int mlxsw_pci_reset_at_pci_disable(struct mlxsw_pci *mlxsw_pci,
>   
>   	pci_restore_state(pdev);
>   	pci_cfg_access_unlock(pdev);
> +	if (bridge)
> +		pci_cfg_access_unlock(bridge);
>   
>   	return err;
>   }
Ido Schimmel July 3, 2024, 2:42 p.m. UTC | #2
On Tue, Jul 02, 2024 at 09:35:50AM +0200, Przemek Kitszel wrote:
> On 7/1/24 18:41, Petr Machata wrote:
> > From: Ido Schimmel <idosch@nvidia.com>
> > 
> > The driver triggers a "Secondary Bus Reset" (SBR) by calling
> > __pci_reset_function_locked() which asserts the SBR bit in the "Bridge
> > Control Register" in the configuration space of the upstream bridge for
> > 2ms. This is done without locking the configuration space of the
> > upstream bridge port, allowing user space to access it concurrently.
> 
> This means your patch is a bugfix.
> 
> > 
> > Linux 6.11 will start warning about such unlocked resets [1][2]:
> > 
> > pcieport 0000:00:01.0: unlocked secondary bus reset via: pci_reset_bus_function+0x51c/0x6a0
> > 
> > Avoid the warning by locking the configuration space of the upstream
> > bridge prior to the reset and unlocking it afterwards.
> 
> You are not avoiding the warning but protecting against concurrent
> access; please add a Fixes tag.

The patch that added the missing lock in PCI core was posted without a
Fixes tag and merged as part of the 6.10 PR. See commit 7e89efc6e9e4
("PCI: Lock upstream bridge for pci_reset_function()").

I don't see a good reason for root to poke in the configuration space of
the upstream bridge during SBR, but AFAICT the worst that can happen is
that the reset will fail. While that is a bug, it is not a regression.

Bjorn, do you see a reason to post this as a fix?

Thanks

> 
> > 
> > [1] https://lore.kernel.org/all/171711746953.1628941.4692125082286867825.stgit@dwillia2-xfh.jf.intel.com/
> > [2] https://lore.kernel.org/all/20240531213150.GA610983@bhelgaas/
> > 
> > Cc: linux-pci@vger.kernel.org
> > Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> > Signed-off-by: Petr Machata <petrm@nvidia.com>
Patch

diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci.c b/drivers/net/ethernet/mellanox/mlxsw/pci.c
index 0320dabd1380..060e5b939211 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/pci.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/pci.c
@@ -1784,6 +1784,7 @@  static int mlxsw_pci_reset_at_pci_disable(struct mlxsw_pci *mlxsw_pci,
 {
 	struct pci_dev *pdev = mlxsw_pci->pdev;
 	char mrsr_pl[MLXSW_REG_MRSR_LEN];
+	struct pci_dev *bridge;
 	int err;
 
 	if (!pci_reset_sbr_supported) {
@@ -1800,6 +1801,9 @@  static int mlxsw_pci_reset_at_pci_disable(struct mlxsw_pci *mlxsw_pci,
 sbr:
 	device_lock_assert(&pdev->dev);
 
+	bridge = pci_upstream_bridge(pdev);
+	if (bridge)
+		pci_cfg_access_lock(bridge);
 	pci_cfg_access_lock(pdev);
 	pci_save_state(pdev);
 
@@ -1809,6 +1813,8 @@  static int mlxsw_pci_reset_at_pci_disable(struct mlxsw_pci *mlxsw_pci,
 
 	pci_restore_state(pdev);
 	pci_cfg_access_unlock(pdev);
+	if (bridge)
+		pci_cfg_access_unlock(bridge);
 
 	return err;
 }
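
The lock ordering in the hunks above can be modelled in userspace as a rough sketch. Everything prefixed with mock_ is a hypothetical stand-in and not a kernel API: the real pci_cfg_access_lock() blocks user-space config accesses, whereas here it only flips a flag, and the mock reset returns -1 for an unlocked bridge only so that the condition is observable (the kernel emits a warning instead). The sketch shows the bridge lock being taken first, when a bridge exists, and released last.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev; not the kernel structure. */
struct mock_pci_dev {
	struct mock_pci_dev *upstream;	/* NULL when there is no upstream bridge */
	bool cfg_locked;		/* models the config-space access lock */
	int resets_seen;		/* bus resets issued through this bridge */
};

/* Models pci_upstream_bridge(): may return NULL. */
static struct mock_pci_dev *mock_upstream_bridge(struct mock_pci_dev *pdev)
{
	return pdev->upstream;
}

/* Models pci_cfg_access_lock(); asserts that the lock is not taken twice. */
static void mock_cfg_access_lock(struct mock_pci_dev *dev)
{
	assert(!dev->cfg_locked);
	dev->cfg_locked = true;
}

/* Models pci_cfg_access_unlock(); asserts that the lock is held. */
static void mock_cfg_access_unlock(struct mock_pci_dev *dev)
{
	assert(dev->cfg_locked);
	dev->cfg_locked = false;
}

/*
 * Models the reset step. Assumption for illustration only: an unlocked
 * device or bridge is reported as -1 so a caller can detect it.
 */
static int mock_secondary_bus_reset(struct mock_pci_dev *pdev)
{
	struct mock_pci_dev *bridge = mock_upstream_bridge(pdev);

	if (!pdev->cfg_locked)
		return -1;
	if (bridge) {
		if (!bridge->cfg_locked)
			return -1;
		bridge->resets_seen++;
	}
	return 0;
}

/* Same ordering as the patch: bridge first, then device; unlock in reverse. */
static int mock_reset_with_bridge_locked(struct mock_pci_dev *pdev)
{
	struct mock_pci_dev *bridge = mock_upstream_bridge(pdev);
	int err;

	if (bridge)
		mock_cfg_access_lock(bridge);
	mock_cfg_access_lock(pdev);

	err = mock_secondary_bus_reset(pdev);

	mock_cfg_access_unlock(pdev);
	if (bridge)
		mock_cfg_access_unlock(bridge);

	return err;
}
```

Releasing the locks in the reverse order of acquisition mirrors the patch, and the NULL check covers the case where pci_upstream_bridge() finds no bridge above the device.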