diff mbox series

PCI: xgene: Revert "PCI: xgene: Use inbound resources for setup"

Message ID 20220314144429.1947610-1-maz@kernel.org (mailing list archive)
State Accepted
Commit 1874b6d7ab1bdc900e8398026350313ac29caddb
Series PCI: xgene: Revert "PCI: xgene: Use inbound resources for setup"

Commit Message

Marc Zyngier March 14, 2022, 2:44 p.m. UTC
Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
killed PCIe on my XGene-1 box (a Mustang board). The machine itself
is still alive, but half of its storage (over NVMe) is gone, and the
NVMe driver just times out.

Note that this machine boots with a device tree provided by the
UEFI firmware (2016 vintage), which could well be non conformant
with the spec, hence the breakage.

With the patch reverted, the box boots 5.17-rc8 with flying colors.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Toan Le <toan@os.amperecomputing.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Krzysztof Wilczyński <kw@linux.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stéphane Graber <stgraber@ubuntu.com>
Cc: dann frazier <dann.frazier@canonical.com>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: stable@vger.kernel.org>
---
 drivers/pci/controller/pci-xgene.c | 33 ++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 11 deletions(-)

Comments

Thorsten Leemhuis March 14, 2022, 3:22 p.m. UTC | #1
On 14.03.22 15:44, Marc Zyngier wrote:
> Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
> killed PCIe on my XGene-1 box (a Mustang board). The machine itself
> is still alive, but half of its storage (over NVMe) is gone, and the
> NVMe driver just times out.
> 
> Note that this machine boots with a device tree provided by the
> UEFI firmware (2016 vintage), which could well be non conformant
> with the spec, hence the breakage.
> 
> With the patch reverted, the box boots 5.17-rc8 with flying colors.
> 
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Toan Le <toan@os.amperecomputing.com>
> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> Cc: Krzysztof Wilczyński <kw@linux.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Stéphane Graber <stgraber@ubuntu.com>
> Cc: dann frazier <dann.frazier@canonical.com>
> Cc: Thorsten Leemhuis <regressions@leemhuis.info>

Feel free to drop me there. But could you please instead add a 'Link:'
tag pointing to the report for anyone wanting to look into the backstory
in the future, as explained in
'Documentation/process/submitting-patches.rst' and
'Documentation/process/5.Posting.rst'? E.g. like this:

"Link: https://lore.kernel.org/r/Yf2wTLjmcRj%2BAbDv@xps13.dannf/"

FWIW, I care for another reason: I'm tracking this regression with
regzbot, my regression tracking bot. Proper "Link:" tags allow the bot
to connect regression reports with fixes being posted or applied to
resolve the regression -- which makes regression tracking a whole lot
easier.

While at it, let me tell regzbot about this thread:
#regzbot ^backmonitor:
https://lore.kernel.org/r/Yf2wTLjmcRj%2BAbDv@xps13.dannf/

> Cc: stable@vger.kernel.org>

Typo, missing a "<"

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I'm getting a lot of
reports on my table. I can only look briefly into most of them and lack
knowledge about most of the areas they concern. I thus unfortunately
will sometimes get things wrong or miss something important. I hope
that's not the case here; if you think it is, don't hesitate to tell me
in a public reply, it's in everyone's interest to set the public record
straight.

Lorenzo Pieralisi March 17, 2022, 9:15 a.m. UTC | #2
[removed CC stable]

On Mon, Mar 14, 2022 at 02:44:29PM +0000, Marc Zyngier wrote:
> Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
> killed PCIe on my XGene-1 box (a Mustang board). The machine itself
> is still alive, but half of its storage (over NVMe) is gone, and the
> NVMe driver just times out.
> 
> Note that this machine boots with a device tree provided by the
> UEFI firmware (2016 vintage), which could well be non conformant
> with the spec, hence the breakage.
> 
> With the patch reverted, the box boots 5.17-rc8 with flying colors.
> 
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Toan Le <toan@os.amperecomputing.com>
> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> Cc: Krzysztof Wilczyński <kw@linux.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Stéphane Graber <stgraber@ubuntu.com>
> Cc: dann frazier <dann.frazier@canonical.com>
> Cc: Thorsten Leemhuis <regressions@leemhuis.info>
> Cc: stable@vger.kernel.org>
> ---
>  drivers/pci/controller/pci-xgene.c | 33 ++++++++++++++++++++----------
>  1 file changed, 22 insertions(+), 11 deletions(-)

Dann, Rob,

does this fix the regression debated here:

https://lore.kernel.org/all/Yf2wTLjmcRj+AbDv@xps13.dannf

It is unclear in that thread what the conclusion reached was.

Thanks,
Lorenzo

> diff --git a/drivers/pci/controller/pci-xgene.c b/drivers/pci/controller/pci-xgene.c
> index 0d5acbfc7143..aa41ceaf031f 100644
> --- a/drivers/pci/controller/pci-xgene.c
> +++ b/drivers/pci/controller/pci-xgene.c
> @@ -479,28 +479,27 @@ static int xgene_pcie_select_ib_reg(u8 *ib_reg_mask, u64 size)
>  }
>  
>  static void xgene_pcie_setup_ib_reg(struct xgene_pcie *port,
> -				    struct resource_entry *entry,
> -				    u8 *ib_reg_mask)
> +				    struct of_pci_range *range, u8 *ib_reg_mask)
>  {
>  	void __iomem *cfg_base = port->cfg_base;
>  	struct device *dev = port->dev;
>  	void __iomem *bar_addr;
>  	u32 pim_reg;
> -	u64 cpu_addr = entry->res->start;
> -	u64 pci_addr = cpu_addr - entry->offset;
> -	u64 size = resource_size(entry->res);
> +	u64 cpu_addr = range->cpu_addr;
> +	u64 pci_addr = range->pci_addr;
> +	u64 size = range->size;
>  	u64 mask = ~(size - 1) | EN_REG;
>  	u32 flags = PCI_BASE_ADDRESS_MEM_TYPE_64;
>  	u32 bar_low;
>  	int region;
>  
> -	region = xgene_pcie_select_ib_reg(ib_reg_mask, size);
> +	region = xgene_pcie_select_ib_reg(ib_reg_mask, range->size);
>  	if (region < 0) {
>  		dev_warn(dev, "invalid pcie dma-range config\n");
>  		return;
>  	}
>  
> -	if (entry->res->flags & IORESOURCE_PREFETCH)
> +	if (range->flags & IORESOURCE_PREFETCH)
>  		flags |= PCI_BASE_ADDRESS_MEM_PREFETCH;
>  
>  	bar_low = pcie_bar_low_val((u32)cpu_addr, flags);
> @@ -531,13 +530,25 @@ static void xgene_pcie_setup_ib_reg(struct xgene_pcie *port,
>  
>  static int xgene_pcie_parse_map_dma_ranges(struct xgene_pcie *port)
>  {
> -	struct pci_host_bridge *bridge = pci_host_bridge_from_priv(port);
> -	struct resource_entry *entry;
> +	struct device_node *np = port->node;
> +	struct of_pci_range range;
> +	struct of_pci_range_parser parser;
> +	struct device *dev = port->dev;
>  	u8 ib_reg_mask = 0;
>  
> -	resource_list_for_each_entry(entry, &bridge->dma_ranges)
> -		xgene_pcie_setup_ib_reg(port, entry, &ib_reg_mask);
> +	if (of_pci_dma_range_parser_init(&parser, np)) {
> +		dev_err(dev, "missing dma-ranges property\n");
> +		return -EINVAL;
> +	}
> +
> +	/* Get the dma-ranges from DT */
> +	for_each_of_pci_range(&parser, &range) {
> +		u64 end = range.cpu_addr + range.size - 1;
>  
> +		dev_dbg(dev, "0x%08x 0x%016llx..0x%016llx -> 0x%016llx\n",
> +			range.flags, range.cpu_addr, end, range.pci_addr);
> +		xgene_pcie_setup_ib_reg(port, &range, &ib_reg_mask);
> +	}
>  	return 0;
>  }
>  
> -- 
> 2.34.1
>

dann frazier March 17, 2022, 6:23 p.m. UTC | #3
On Thu, Mar 17, 2022 at 09:15:43AM +0000, Lorenzo Pieralisi wrote:
> [removed CC stable]
> 
> On Mon, Mar 14, 2022 at 02:44:29PM +0000, Marc Zyngier wrote:
> > Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
> > killed PCIe on my XGene-1 box (a Mustang board). The machine itself
> > is still alive, but half of its storage (over NVMe) is gone, and the
> > NVMe driver just times out.
> > 
> > Note that this machine boots with a device tree provided by the
> > UEFI firmware (2016 vintage), which could well be non conformant
> > with the spec, hence the breakage.
> > 
> > With the patch reverted, the box boots 5.17-rc8 with flying colors.
> > 
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > Cc: Rob Herring <robh@kernel.org>
> > Cc: Toan Le <toan@os.amperecomputing.com>
> > Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > Cc: Krzysztof Wilczyński <kw@linux.com>
> > Cc: Bjorn Helgaas <bhelgaas@google.com>
> > Cc: Stéphane Graber <stgraber@ubuntu.com>
> > Cc: dann frazier <dann.frazier@canonical.com>
> > Cc: Thorsten Leemhuis <regressions@leemhuis.info>
> > Cc: stable@vger.kernel.org>
> > ---
> >  drivers/pci/controller/pci-xgene.c | 33 ++++++++++++++++++++----------
> >  1 file changed, 22 insertions(+), 11 deletions(-)
> 
> Dann, Rob,
> 
> does this fix the regression debated here:
> 
> https://lore.kernel.org/all/Yf2wTLjmcRj+AbDv@xps13.dannf
> 
> It is unclear in that thread what the conclusion reached was.

Thanks for checking in, Lorenzo! Reverting that patch is required but
not sufficient to get our m400s working. In addition, we'd also need
to revert commit c7a75d07827a ("PCI: xgene: Fix IB window setup").

I believe if we revert both then it should return us to a state where
Marc's Mustang, Stéphane's Merlins and our m400s all work again.

  -dann

> Thanks,
> Lorenzo
> 
> > diff --git a/drivers/pci/controller/pci-xgene.c b/drivers/pci/controller/pci-xgene.c
> > index 0d5acbfc7143..aa41ceaf031f 100644
> > --- a/drivers/pci/controller/pci-xgene.c
> > +++ b/drivers/pci/controller/pci-xgene.c
> > @@ -479,28 +479,27 @@ static int xgene_pcie_select_ib_reg(u8 *ib_reg_mask, u64 size)
> >  }
> >  
> >  static void xgene_pcie_setup_ib_reg(struct xgene_pcie *port,
> > -				    struct resource_entry *entry,
> > -				    u8 *ib_reg_mask)
> > +				    struct of_pci_range *range, u8 *ib_reg_mask)
> >  {
> >  	void __iomem *cfg_base = port->cfg_base;
> >  	struct device *dev = port->dev;
> >  	void __iomem *bar_addr;
> >  	u32 pim_reg;
> > -	u64 cpu_addr = entry->res->start;
> > -	u64 pci_addr = cpu_addr - entry->offset;
> > -	u64 size = resource_size(entry->res);
> > +	u64 cpu_addr = range->cpu_addr;
> > +	u64 pci_addr = range->pci_addr;
> > +	u64 size = range->size;
> >  	u64 mask = ~(size - 1) | EN_REG;
> >  	u32 flags = PCI_BASE_ADDRESS_MEM_TYPE_64;
> >  	u32 bar_low;
> >  	int region;
> >  
> > -	region = xgene_pcie_select_ib_reg(ib_reg_mask, size);
> > +	region = xgene_pcie_select_ib_reg(ib_reg_mask, range->size);
> >  	if (region < 0) {
> >  		dev_warn(dev, "invalid pcie dma-range config\n");
> >  		return;
> >  	}
> >  
> > -	if (entry->res->flags & IORESOURCE_PREFETCH)
> > +	if (range->flags & IORESOURCE_PREFETCH)
> >  		flags |= PCI_BASE_ADDRESS_MEM_PREFETCH;
> >  
> >  	bar_low = pcie_bar_low_val((u32)cpu_addr, flags);
> > @@ -531,13 +530,25 @@ static void xgene_pcie_setup_ib_reg(struct xgene_pcie *port,
> >  
> >  static int xgene_pcie_parse_map_dma_ranges(struct xgene_pcie *port)
> >  {
> > -	struct pci_host_bridge *bridge = pci_host_bridge_from_priv(port);
> > -	struct resource_entry *entry;
> > +	struct device_node *np = port->node;
> > +	struct of_pci_range range;
> > +	struct of_pci_range_parser parser;
> > +	struct device *dev = port->dev;
> >  	u8 ib_reg_mask = 0;
> >  
> > -	resource_list_for_each_entry(entry, &bridge->dma_ranges)
> > -		xgene_pcie_setup_ib_reg(port, entry, &ib_reg_mask);
> > +	if (of_pci_dma_range_parser_init(&parser, np)) {
> > +		dev_err(dev, "missing dma-ranges property\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	/* Get the dma-ranges from DT */
> > +	for_each_of_pci_range(&parser, &range) {
> > +		u64 end = range.cpu_addr + range.size - 1;
> >  
> > +		dev_dbg(dev, "0x%08x 0x%016llx..0x%016llx -> 0x%016llx\n",
> > +			range.flags, range.cpu_addr, end, range.pci_addr);
> > +		xgene_pcie_setup_ib_reg(port, &range, &ib_reg_mask);
> > +	}
> >  	return 0;
> >  }
> >

Marc Zyngier March 21, 2022, 9:40 a.m. UTC | #4
On Thu, 17 Mar 2022 18:23:33 +0000,
dann frazier <dann.frazier@canonical.com> wrote:
> 
> On Thu, Mar 17, 2022 at 09:15:43AM +0000, Lorenzo Pieralisi wrote:
> > [removed CC stable]
> > 
> > On Mon, Mar 14, 2022 at 02:44:29PM +0000, Marc Zyngier wrote:
> > > Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
> > > killed PCIe on my XGene-1 box (a Mustang board). The machine itself
> > > is still alive, but half of its storage (over NVMe) is gone, and the
> > > NVMe driver just times out.
> > > 
> > > Note that this machine boots with a device tree provided by the
> > > UEFI firmware (2016 vintage), which could well be non conformant
> > > with the spec, hence the breakage.
> > > 
> > > With the patch reverted, the box boots 5.17-rc8 with flying colors.
> > > 
> > > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > > Cc: Rob Herring <robh@kernel.org>
> > > Cc: Toan Le <toan@os.amperecomputing.com>
> > > Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > > Cc: Krzysztof Wilczyński <kw@linux.com>
> > > Cc: Bjorn Helgaas <bhelgaas@google.com>
> > > Cc: Stéphane Graber <stgraber@ubuntu.com>
> > > Cc: dann frazier <dann.frazier@canonical.com>
> > > Cc: Thorsten Leemhuis <regressions@leemhuis.info>
> > > Cc: stable@vger.kernel.org>
> > > ---
> > >  drivers/pci/controller/pci-xgene.c | 33 ++++++++++++++++++++----------
> > >  1 file changed, 22 insertions(+), 11 deletions(-)
> > 
> > Dann, Rob,
> > 
> > does this fix the regression debated here:
> > 
> > https://lore.kernel.org/all/Yf2wTLjmcRj+AbDv@xps13.dannf
> > 
> > It is unclear in that thread what the conclusion reached was.
> 
> Thanks for checking in Lorenzo! Reverting that patch is required but
> not sufficient to get our m400s working. In addition, we'd also need
> to revert commit c7a75d07827a ("PCI: xgene: Fix IB window setup").
> 
> I believe if we revert both then it should return us to a state where
> Marc's Mustang, Stéphane's Merlins and our m400s all work again.

Right. I'll post a series reverting both patches, which hopefully
Lorenzo and Bjorn can merge shortly.

Thanks,

	M.

Patch

diff --git a/drivers/pci/controller/pci-xgene.c b/drivers/pci/controller/pci-xgene.c
index 0d5acbfc7143..aa41ceaf031f 100644
--- a/drivers/pci/controller/pci-xgene.c
+++ b/drivers/pci/controller/pci-xgene.c
@@ -479,28 +479,27 @@ static int xgene_pcie_select_ib_reg(u8 *ib_reg_mask, u64 size)
 }
 
 static void xgene_pcie_setup_ib_reg(struct xgene_pcie *port,
-				    struct resource_entry *entry,
-				    u8 *ib_reg_mask)
+				    struct of_pci_range *range, u8 *ib_reg_mask)
 {
 	void __iomem *cfg_base = port->cfg_base;
 	struct device *dev = port->dev;
 	void __iomem *bar_addr;
 	u32 pim_reg;
-	u64 cpu_addr = entry->res->start;
-	u64 pci_addr = cpu_addr - entry->offset;
-	u64 size = resource_size(entry->res);
+	u64 cpu_addr = range->cpu_addr;
+	u64 pci_addr = range->pci_addr;
+	u64 size = range->size;
 	u64 mask = ~(size - 1) | EN_REG;
 	u32 flags = PCI_BASE_ADDRESS_MEM_TYPE_64;
 	u32 bar_low;
 	int region;
 
-	region = xgene_pcie_select_ib_reg(ib_reg_mask, size);
+	region = xgene_pcie_select_ib_reg(ib_reg_mask, range->size);
 	if (region < 0) {
 		dev_warn(dev, "invalid pcie dma-range config\n");
 		return;
 	}
 
-	if (entry->res->flags & IORESOURCE_PREFETCH)
+	if (range->flags & IORESOURCE_PREFETCH)
 		flags |= PCI_BASE_ADDRESS_MEM_PREFETCH;
 
 	bar_low = pcie_bar_low_val((u32)cpu_addr, flags);
@@ -531,13 +530,25 @@ static void xgene_pcie_setup_ib_reg(struct xgene_pcie *port,
 
 static int xgene_pcie_parse_map_dma_ranges(struct xgene_pcie *port)
 {
-	struct pci_host_bridge *bridge = pci_host_bridge_from_priv(port);
-	struct resource_entry *entry;
+	struct device_node *np = port->node;
+	struct of_pci_range range;
+	struct of_pci_range_parser parser;
+	struct device *dev = port->dev;
 	u8 ib_reg_mask = 0;
 
-	resource_list_for_each_entry(entry, &bridge->dma_ranges)
-		xgene_pcie_setup_ib_reg(port, entry, &ib_reg_mask);
+	if (of_pci_dma_range_parser_init(&parser, np)) {
+		dev_err(dev, "missing dma-ranges property\n");
+		return -EINVAL;
+	}
+
+	/* Get the dma-ranges from DT */
+	for_each_of_pci_range(&parser, &range) {
+		u64 end = range.cpu_addr + range.size - 1;
 
+		dev_dbg(dev, "0x%08x 0x%016llx..0x%016llx -> 0x%016llx\n",
+			range.flags, range.cpu_addr, end, range.pci_addr);
+		xgene_pcie_setup_ib_reg(port, &range, &ib_reg_mask);
+	}
 	return 0;
 }
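
For readers comparing the two code paths in the patch above, here is a
minimal userspace sketch of the arithmetic involved. It is not kernel
code: struct demo_entry and struct demo_range are hypothetical stand-ins
that only mirror the fields the diff reads from struct resource_entry
and struct of_pci_range, and DEMO_EN_REG is an assumed enable-bit value
chosen for illustration. It shows how the reverted code derived
cpu_addr, pci_addr and size from a resource entry, how the restored code
takes the same three values straight from a parsed dma-ranges entry, and
how the window mask and the end address printed by dev_dbg() follow from
them. It does not attempt to explain why the affected firmware device
trees break the resource-based path.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: the fields the patch reads from struct of_pci_range. */
struct demo_range {
	uint64_t pci_addr;
	uint64_t cpu_addr;
	uint64_t size;
};

/* Hypothetical stand-in: res->start, res->end (inclusive) and entry->offset. */
struct demo_entry {
	uint64_t start;
	uint64_t end;
	uint64_t offset;	/* CPU address minus PCI address */
};

/* Assumed enable bit, for illustration only. */
#define DEMO_EN_REG 0x1ULL

/* The three assignments removed by the revert, expressed on the stand-in types. */
static struct demo_range demo_entry_to_range(const struct demo_entry *e)
{
	struct demo_range r;

	r.cpu_addr = e->start;
	r.pci_addr = e->start - e->offset;
	r.size = e->end - e->start + 1;	/* same arithmetic as resource_size() */
	return r;
}

/* mask = ~(size - 1) | EN_REG; only meaningful for a power-of-two size. */
static uint64_t demo_ib_mask(uint64_t size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	return ~(size - 1) | DEMO_EN_REG;
}

/* Inclusive end address, as computed for the dev_dbg() line in the patch. */
static uint64_t demo_range_end(const struct demo_range *r)
{
	return r->cpu_addr + r->size - 1;
}
```

With an identity-mapped 256 GiB window starting at 0x4000000000, both
representations yield the same cpu_addr, pci_addr and size. That is the
case in which the two code paths are interchangeable; any difference in
behaviour has to come from how the dma-ranges property ends up in the
bridge's resource list, which this sketch does not model.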