diff mbox series

[1/1] PCI: armada8k: Fix bar assignment failure upon rescan

Message ID 20241112073227.769814-1-jpatel2@marvell.com (mailing list archive)
State New

Commit Message

Jenishkumar Maheshbhai Patel Nov. 12, 2024, 7:32 a.m. UTC
When the attached device recovers the link from
an external reset, the following error might be
seen upon pci rescan.

On link-down event, it's not necessary to remove
the root bus. Only the child buses or devices
should be wiped off. However, the rescan operation
should be performed only when the link could be
retained. Otherwise, it should be done by a user
manually after the link is finally recovered.

~# echo 1 > /sys/bus/pci/rescan
[  322.857504] pci 0000:01:00.0: [177d:b200] type 00 class 0x028000
[  322.863682] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x007fffff 64bit pref]
[  322.871031] pci 0000:01:00.0: reg 0x18: [mem 0x00000000-0x0fffffff 64bit pref]
[  322.878362] pci 0000:01:00.0: reg 0x20: [mem 0x00000000-0x03ffffff 64bit pref]
[  322.886845] pci 0000:01:00.0: reg 0x244: [mem 0x00000000-0x000fffff 64bit pref]
[  322.894193] pci 0000:01:00.0: VF(n) BAR0 space: [mem 0x00000000-0x007fffff 64bit pref] (contains BAR0 for 8 VFs)
[  322.905154] pci 0000:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s PCIe x2 link at 0000:00:00.0 (capable of 63.008 Gb/s with 8.0 GT/s PCIe x8 link)
[  322.921371] pcieport 0000:00:00.0: BAR 15: no space for [mem size 0x18000000 64bit pref]
[  322.929507] pcieport 0000:00:00.0: BAR 15: failed to assign [mem size 0x18000000 64bit pref]
[  322.937999] pcieport 0000:00:00.0: BAR 15: no space for [mem size 0x18000000 64bit pref]
[  322.946131] pcieport 0000:00:00.0: BAR 15: failed to assign [mem size 0x18000000 64bit pref]
[  322.954614] pci 0000:01:00.0: BAR 2: no space for [mem size 0x10000000 64bit pref]
[  322.962225] pci 0000:01:00.0: BAR 2: failed to assign [mem size 0x10000000 64bit pref]
[  322.970193] pci 0000:01:00.0: BAR 4: no space for [mem size 0x04000000 64bit pref]
[  322.977804] pci 0000:01:00.0: BAR 4: failed to assign [mem size 0x04000000 64bit pref]
[  322.985766] pci 0000:01:00.0: BAR 0: no space for [mem size 0x00800000 64bit pref]
[  322.993373] pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x00800000 64bit pref]
[  323.001331] pci 0000:01:00.0: BAR 7: no space for [mem size 0x00800000 64bit pref]
[  323.008938] pci 0000:01:00.0: BAR 7: failed to assign [mem size 0x00800000 64bit pref]
[  323.016903] pci 0000:01:00.0: BAR 2: no space for [mem size 0x10000000 64bit pref]
[  323.024511] pci 0000:01:00.0: BAR 2: failed to assign [mem size 0x10000000 64bit pref]
[  323.032469] pci 0000:01:00.0: BAR 4: no space for [mem size 0x04000000 64bit pref]
[  323.040079] pci 0000:01:00.0: BAR 4: failed to assign [mem size 0x04000000 64bit pref]
[  323.048037] pci 0000:01:00.0: BAR 0: no space for [mem size 0x00800000 64bit pref]
[  323.055644] pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x00800000 64bit pref]
[  323.063601] pci 0000:01:00.0: BAR 7: no space for [mem size 0x00800000 64bit pref]
[  323.071211] pci 0000:01:00.0: BAR 7: failed to assign [mem size 0x00800000 64bit pref]
[  323.081914] pcieport 0002:02:03.0: devices behind bridge are unusable because [bus 03] cannot be assigned for them
[  323.092384] pcieport 0002:02:07.0: devices behind bridge are unusable because [bus 04] cannot be assigned for them
[  323.102857] pcieport 0002:01:00.0: bridge has subordinate 02 but max busn 04

Signed-off-by: Jenishkumar Maheshbhai Patel <jpatel2@marvell.com>
---
 drivers/pci/controller/dwc/pcie-armada8k.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

Comments

Bjorn Helgaas Nov. 12, 2024, 9:39 p.m. UTC | #1
On Mon, Nov 11, 2024 at 11:32:27PM -0800, Jenishkumar Maheshbhai Patel wrote:
> When the attached device recovers the link from
> an external reset, the following error might be
> seen upon pci rescan.
> 
> On link-down event, it's not necessary to remove
> the root bus. Only the child buses or devices
> should be wiped off. However, the rescan operation
> should be performed only when the link could be
> retained. Otherwise, it should be done by a user
> manually after the link is finally recovered.

Wrap to fill 75 columns.

s/pci/PCI/
s/bar/BAR/ (subject)

> ~# echo 1 > /sys/bus/pci/rescan
> [  322.857504] pci 0000:01:00.0: [177d:b200] type 00 class 0x028000
> [  322.863682] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x007fffff 64bit pref]
> [  322.871031] pci 0000:01:00.0: reg 0x18: [mem 0x00000000-0x0fffffff 64bit pref]
> [  322.878362] pci 0000:01:00.0: reg 0x20: [mem 0x00000000-0x03ffffff 64bit pref]
> [  322.886845] pci 0000:01:00.0: reg 0x244: [mem 0x00000000-0x000fffff 64bit pref]
> [  322.894193] pci 0000:01:00.0: VF(n) BAR0 space: [mem 0x00000000-0x007fffff 64bit pref] (contains BAR0 for 8 VFs)
> [  322.905154] pci 0000:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s PCIe x2 link at 0000:00:00.0 (capable of 63.008 Gb/s with 8.0 GT/s PCIe x8 link)
> [  322.921371] pcieport 0000:00:00.0: BAR 15: no space for [mem size 0x18000000 64bit pref]
> [  322.929507] pcieport 0000:00:00.0: BAR 15: failed to assign [mem size 0x18000000 64bit pref]
> [  322.937999] pcieport 0000:00:00.0: BAR 15: no space for [mem size 0x18000000 64bit pref]
> [  322.946131] pcieport 0000:00:00.0: BAR 15: failed to assign [mem size 0x18000000 64bit pref]
> [  322.954614] pci 0000:01:00.0: BAR 2: no space for [mem size 0x10000000 64bit pref]
> [  322.962225] pci 0000:01:00.0: BAR 2: failed to assign [mem size 0x10000000 64bit pref]
> [  322.970193] pci 0000:01:00.0: BAR 4: no space for [mem size 0x04000000 64bit pref]
> [  322.977804] pci 0000:01:00.0: BAR 4: failed to assign [mem size 0x04000000 64bit pref]
> [  322.985766] pci 0000:01:00.0: BAR 0: no space for [mem size 0x00800000 64bit pref]
> [  322.993373] pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x00800000 64bit pref]
> [  323.001331] pci 0000:01:00.0: BAR 7: no space for [mem size 0x00800000 64bit pref]
> [  323.008938] pci 0000:01:00.0: BAR 7: failed to assign [mem size 0x00800000 64bit pref]
> [  323.016903] pci 0000:01:00.0: BAR 2: no space for [mem size 0x10000000 64bit pref]
> [  323.024511] pci 0000:01:00.0: BAR 2: failed to assign [mem size 0x10000000 64bit pref]
> [  323.032469] pci 0000:01:00.0: BAR 4: no space for [mem size 0x04000000 64bit pref]
> [  323.040079] pci 0000:01:00.0: BAR 4: failed to assign [mem size 0x04000000 64bit pref]
> [  323.048037] pci 0000:01:00.0: BAR 0: no space for [mem size 0x00800000 64bit pref]
> [  323.055644] pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x00800000 64bit pref]
> [  323.063601] pci 0000:01:00.0: BAR 7: no space for [mem size 0x00800000 64bit pref]
> [  323.071211] pci 0000:01:00.0: BAR 7: failed to assign [mem size 0x00800000 64bit pref]
> [  323.081914] pcieport 0002:02:03.0: devices behind bridge are unusable because [bus 03] cannot be assigned for them
> [  323.092384] pcieport 0002:02:07.0: devices behind bridge are unusable because [bus 04] cannot be assigned for them
> [  323.102857] pcieport 0002:01:00.0: bridge has subordinate 02 but max busn 04

Remove timestamps; they don't help us understand.  We probably don't
need *all* the lines here to understand the problem.

Collect output from current kernel, which should use more useful
labels than "reg 0x10", "BAR 15", etc.

> Signed-off-by: Jenishkumar Maheshbhai Patel <jpatel2@marvell.com>
> ---
>  drivers/pci/controller/dwc/pcie-armada8k.c | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-armada8k.c b/drivers/pci/controller/dwc/pcie-armada8k.c
> index f9d6907900d1..ca2dedaa69a4 100644
> --- a/drivers/pci/controller/dwc/pcie-armada8k.c
> +++ b/drivers/pci/controller/dwc/pcie-armada8k.c
> @@ -231,6 +231,7 @@ static void armada8k_pcie_recover_link(struct work_struct *ws)
>  	struct dw_pcie_rp *pp = &pcie->pci->pp;
>  	struct pci_bus *bus = pp->bridge->bus;
>  	struct pci_dev *root_port;
> +	struct pci_dev *child, *tmp;
>  	int ret;
>  
>  	root_port = pci_get_slot(bus, 0);
> @@ -239,7 +240,14 @@ static void armada8k_pcie_recover_link(struct work_struct *ws)
>  		return;
>  	}
>  	pci_lock_rescan_remove();
> -	pci_stop_and_remove_bus_device(root_port);
> +
> +	/* Remove all devices under root bus */
> +	list_for_each_entry_safe(child, tmp,
> +				 &root_port->subordinate->devices, bus_list) {
> +		dev_dbg(&child->dev, "removing\n");
> +		pci_stop_and_remove_bus_device(child);
> +	}
> +
>  	/* Reset device if reset gpio is set */
>  	if (pcie->reset_gpio) {
>  		/* assert and then deassert the reset signal */
> @@ -279,11 +287,12 @@ static void armada8k_pcie_recover_link(struct work_struct *ws)
>  
>  	/* Wait until the link becomes active again */
>  	if (dw_pcie_wait_for_link(pcie->pci))
> -		dev_err(pcie->pci->dev, "Link not up after reconfiguration\n");
> +		goto fail;
> +
> +	dev_dbg(pcie->pci->dev, "%s: link has been recovered\n", __func__);
>  
> -	bus = NULL;
> -	while ((bus = pci_find_next_bus(bus)) != NULL)
> -		pci_rescan_bus(bus);
> +	/* Rescan the root bus only if link is retained */
> +	pci_rescan_bus(bus);
>  
>  fail:
>  	pci_unlock_rescan_remove();
> -- 
> 2.25.1
>

Patch

diff --git a/drivers/pci/controller/dwc/pcie-armada8k.c b/drivers/pci/controller/dwc/pcie-armada8k.c
index f9d6907900d1..ca2dedaa69a4 100644
--- a/drivers/pci/controller/dwc/pcie-armada8k.c
+++ b/drivers/pci/controller/dwc/pcie-armada8k.c
@@ -231,6 +231,7 @@  static void armada8k_pcie_recover_link(struct work_struct *ws)
 	struct dw_pcie_rp *pp = &pcie->pci->pp;
 	struct pci_bus *bus = pp->bridge->bus;
 	struct pci_dev *root_port;
+	struct pci_dev *child, *tmp;
 	int ret;
 
 	root_port = pci_get_slot(bus, 0);
@@ -239,7 +240,14 @@  static void armada8k_pcie_recover_link(struct work_struct *ws)
 		return;
 	}
 	pci_lock_rescan_remove();
-	pci_stop_and_remove_bus_device(root_port);
+
+	/* Remove all devices under root bus */
+	list_for_each_entry_safe(child, tmp,
+				 &root_port->subordinate->devices, bus_list) {
+		dev_dbg(&child->dev, "removing\n");
+		pci_stop_and_remove_bus_device(child);
+	}
+
 	/* Reset device if reset gpio is set */
 	if (pcie->reset_gpio) {
 		/* assert and then deassert the reset signal */
@@ -279,11 +287,12 @@  static void armada8k_pcie_recover_link(struct work_struct *ws)
 
 	/* Wait until the link becomes active again */
 	if (dw_pcie_wait_for_link(pcie->pci))
-		dev_err(pcie->pci->dev, "Link not up after reconfiguration\n");
+		goto fail;
+
+	dev_dbg(pcie->pci->dev, "%s: link has been recovered\n", __func__);
 
-	bus = NULL;
-	while ((bus = pci_find_next_bus(bus)) != NULL)
-		pci_rescan_bus(bus);
+	/* Rescan the root bus only if link is retained */
+	pci_rescan_bus(bus);
 
 fail:
 	pci_unlock_rescan_remove();
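
The removal loop in this patch depends on the "safe" list iterator, which saves the next entry before the current one is removed. The following is a minimal userspace C sketch of that pattern, not driver code: struct fake_dev, struct fake_bus, fake_bus_add() and fake_bus_remove_all() are hypothetical stand-ins for the kernel's pci_dev, pci_bus and list helpers, and free() stands in for device removal. The sketch logs each entry before releasing it and tolerates a missing bus. Whether the real pci_dev is actually freed inside pci_stop_and_remove_bus_device() depends on outstanding references, so treat the ordering here as a conservative assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct pci_dev (not the kernel type) */
struct fake_dev {
	int devfn;
	struct fake_dev *next;	/* stand-in for the bus_list linkage */
};

/* Hypothetical stand-in for struct pci_bus */
struct fake_bus {
	struct fake_dev *devices;
};

/* Add a device at the head of the bus list; returns 0 on success */
static int fake_bus_add(struct fake_bus *bus, int devfn)
{
	struct fake_dev *dev;

	assert(bus != NULL);

	dev = malloc(sizeof(*dev));
	if (!dev)
		return -1;

	dev->devfn = devfn;
	dev->next = bus->devices;
	bus->devices = dev;
	return 0;
}

/*
 * Remove every device on the bus and return how many were removed.
 * A NULL bus is treated as "nothing to remove", the analogue of
 * checking a subordinate bus pointer before walking its device list.
 */
static int fake_bus_remove_all(struct fake_bus *bus)
{
	struct fake_dev *child, *tmp;
	int removed = 0;

	if (!bus)
		return 0;

	for (child = bus->devices; child; child = tmp) {
		/* Save the successor first; child is invalid after free() */
		tmp = child->next;

		/* Log before releasing, never after */
		printf("removing device %d\n", child->devfn);
		free(child);
		removed++;
	}

	bus->devices = NULL;
	return removed;
}
```

Calling fake_bus_remove_all() on a bus holding two devices walks both entries without reading freed memory and leaves the list empty.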