Message ID | 20220314144429.1947610-1-maz@kernel.org (mailing list archive) |
---|---|
State | Accepted |
Commit | 1874b6d7ab1bdc900e8398026350313ac29caddb |
Series | PCI: xgene: Revert "PCI: xgene: Use inbound resources for setup" |
On 14.03.22 15:44, Marc Zyngier wrote:
> Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
> killed PCIe on my XGene-1 box (a Mustang board). The machine itself
> is still alive, but half of its storage (over NVMe) is gone, and the
> NVMe driver just times out.
>
> Note that this machine boots with a device tree provided by the
> UEFI firmware (2016 vintage), which could well be non conformant
> with the spec, hence the breakage.
>
> With the patch reverted, the box boots 5.17-rc8 with flying colors.
>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Toan Le <toan@os.amperecomputing.com>
> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> Cc: Krzysztof Wilczyński <kw@linux.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Stéphane Graber <stgraber@ubuntu.com>
> Cc: dann frazier <dann.frazier@canonical.com>
> Cc: Thorsten Leemhuis <regressions@leemhuis.info>

Feel free to drop me there. But could you please instead add a 'Link:'
tag pointing to the report for anyone wanting to look into the
backstory in the future, as explained in
'Documentation/process/submitting-patches.rst' and
'Documentation/process/5.Posting.rst'? E.g. like this:

"Link: https://lore.kernel.org/r/Yf2wTLjmcRj%2BAbDv@xps13.dannf/"

FWIW, I care for another reason: I'm tracking this regression with
regzbot, my regression tracking bot. Proper "Link:" tags allow the bot
to connect regression reports with fixes being posted or applied to
resolve the regression -- which makes regression tracking a whole lot
easier.

While at it, let me tell regzbot about this thread:

#regzbot ^backmonitor: https://lore.kernel.org/r/Yf2wTLjmcRj%2BAbDv@xps13.dannf/

> Cc: stable@vger.kernel.org>

Typo, missing a "<"

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I'm getting a lot of
reports on my table. I can only look briefly into most of them and lack
knowledge about most of the areas they concern. I thus unfortunately
will sometimes get things wrong or miss something important. I hope
that's not the case here; if you think it is, don't hesitate to tell me
in a public reply, it's in everyone's interest to set the public record
straight.
[removed CC stable]

On Mon, Mar 14, 2022 at 02:44:29PM +0000, Marc Zyngier wrote:
> Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
> killed PCIe on my XGene-1 box (a Mustang board). The machine itself
> is still alive, but half of its storage (over NVMe) is gone, and the
> NVMe driver just times out.
>
> Note that this machine boots with a device tree provided by the
> UEFI firmware (2016 vintage), which could well be non conformant
> with the spec, hence the breakage.
>
> With the patch reverted, the box boots 5.17-rc8 with flying colors.
>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Toan Le <toan@os.amperecomputing.com>
> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> Cc: Krzysztof Wilczyński <kw@linux.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Stéphane Graber <stgraber@ubuntu.com>
> Cc: dann frazier <dann.frazier@canonical.com>
> Cc: Thorsten Leemhuis <regressions@leemhuis.info>
> Cc: stable@vger.kernel.org>
> ---
>  drivers/pci/controller/pci-xgene.c | 33 ++++++++++++++++++++----------
>  1 file changed, 22 insertions(+), 11 deletions(-)

Dann, Rob,

does this fix the regression debated here:

https://lore.kernel.org/all/Yf2wTLjmcRj+AbDv@xps13.dannf

It is unclear in that thread what the conclusion reached was.
Thanks,
Lorenzo
On Thu, Mar 17, 2022 at 09:15:43AM +0000, Lorenzo Pieralisi wrote:
> [removed CC stable]
>
> On Mon, Mar 14, 2022 at 02:44:29PM +0000, Marc Zyngier wrote:
> > Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
> > killed PCIe on my XGene-1 box (a Mustang board). The machine itself
> > is still alive, but half of its storage (over NVMe) is gone, and the
> > NVMe driver just times out.
> >
> > Note that this machine boots with a device tree provided by the
> > UEFI firmware (2016 vintage), which could well be non conformant
> > with the spec, hence the breakage.
> >
> > With the patch reverted, the box boots 5.17-rc8 with flying colors.
> >
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > Cc: Rob Herring <robh@kernel.org>
> > Cc: Toan Le <toan@os.amperecomputing.com>
> > Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > Cc: Krzysztof Wilczyński <kw@linux.com>
> > Cc: Bjorn Helgaas <bhelgaas@google.com>
> > Cc: Stéphane Graber <stgraber@ubuntu.com>
> > Cc: dann frazier <dann.frazier@canonical.com>
> > Cc: Thorsten Leemhuis <regressions@leemhuis.info>
> > Cc: stable@vger.kernel.org>
> > ---
> >  drivers/pci/controller/pci-xgene.c | 33 ++++++++++++++++++++----------
> >  1 file changed, 22 insertions(+), 11 deletions(-)
>
> Dann, Rob,
>
> does this fix the regression debated here:
>
> https://lore.kernel.org/all/Yf2wTLjmcRj+AbDv@xps13.dannf
>
> It is unclear in that thread what the conclusion reached was.

Thanks for checking in Lorenzo! Reverting that patch is required but
not sufficient to get our m400s working. In addition, we'd also need
to revert commit c7a75d07827a ("PCI: xgene: Fix IB window setup").

I believe if we revert both then it should return us to a state where
Marc's Mustang, Stéphane's Merlins and our m400s all work again.
-dann

> Thanks,
> Lorenzo
On Thu, 17 Mar 2022 18:23:33 +0000, dann frazier <dann.frazier@canonical.com> wrote:
>
> On Thu, Mar 17, 2022 at 09:15:43AM +0000, Lorenzo Pieralisi wrote:
> > [removed CC stable]
> >
> > On Mon, Mar 14, 2022 at 02:44:29PM +0000, Marc Zyngier wrote:
> > > Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
> > > killed PCIe on my XGene-1 box (a Mustang board). The machine itself
> > > is still alive, but half of its storage (over NVMe) is gone, and the
> > > NVMe driver just times out.
> > >
> > > Note that this machine boots with a device tree provided by the
> > > UEFI firmware (2016 vintage), which could well be non conformant
> > > with the spec, hence the breakage.
> > >
> > > With the patch reverted, the box boots 5.17-rc8 with flying colors.
> > >
> > > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > > Cc: Rob Herring <robh@kernel.org>
> > > Cc: Toan Le <toan@os.amperecomputing.com>
> > > Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > > Cc: Krzysztof Wilczyński <kw@linux.com>
> > > Cc: Bjorn Helgaas <bhelgaas@google.com>
> > > Cc: Stéphane Graber <stgraber@ubuntu.com>
> > > Cc: dann frazier <dann.frazier@canonical.com>
> > > Cc: Thorsten Leemhuis <regressions@leemhuis.info>
> > > Cc: stable@vger.kernel.org>
> > > ---
> > >  drivers/pci/controller/pci-xgene.c | 33 ++++++++++++++++++++----------
> > >  1 file changed, 22 insertions(+), 11 deletions(-)
> >
> > Dann, Rob,
> >
> > does this fix the regression debated here:
> >
> > https://lore.kernel.org/all/Yf2wTLjmcRj+AbDv@xps13.dannf
> >
> > It is unclear in that thread what the conclusion reached was.
>
> Thanks for checking in Lorenzo! Reverting that patch is required but
> not sufficient to get our m400s working. In addition, we'd also need
> to revert commit c7a75d07827a ("PCI: xgene: Fix IB window setup").
>
> I believe if we revert both then it should return us to a state where
> Marc's Mustang, Stéphane's Merlins and our m400s all work again.

Right. I'll post a series reverting both patches, which hopefully
Lorenzo and Bjorn can merge shortly.

Thanks,

M.
diff --git a/drivers/pci/controller/pci-xgene.c b/drivers/pci/controller/pci-xgene.c
index 0d5acbfc7143..aa41ceaf031f 100644
--- a/drivers/pci/controller/pci-xgene.c
+++ b/drivers/pci/controller/pci-xgene.c
@@ -479,28 +479,27 @@ static int xgene_pcie_select_ib_reg(u8 *ib_reg_mask, u64 size)
 }

 static void xgene_pcie_setup_ib_reg(struct xgene_pcie *port,
-				    struct resource_entry *entry,
-				    u8 *ib_reg_mask)
+				    struct of_pci_range *range, u8 *ib_reg_mask)
 {
 	void __iomem *cfg_base = port->cfg_base;
 	struct device *dev = port->dev;
 	void __iomem *bar_addr;
 	u32 pim_reg;
-	u64 cpu_addr = entry->res->start;
-	u64 pci_addr = cpu_addr - entry->offset;
-	u64 size = resource_size(entry->res);
+	u64 cpu_addr = range->cpu_addr;
+	u64 pci_addr = range->pci_addr;
+	u64 size = range->size;
 	u64 mask = ~(size - 1) | EN_REG;
 	u32 flags = PCI_BASE_ADDRESS_MEM_TYPE_64;
 	u32 bar_low;
 	int region;

-	region = xgene_pcie_select_ib_reg(ib_reg_mask, size);
+	region = xgene_pcie_select_ib_reg(ib_reg_mask, range->size);
 	if (region < 0) {
 		dev_warn(dev, "invalid pcie dma-range config\n");
 		return;
 	}

-	if (entry->res->flags & IORESOURCE_PREFETCH)
+	if (range->flags & IORESOURCE_PREFETCH)
 		flags |= PCI_BASE_ADDRESS_MEM_PREFETCH;

 	bar_low = pcie_bar_low_val((u32)cpu_addr, flags);
@@ -531,13 +530,25 @@ static void xgene_pcie_setup_ib_reg(struct xgene_pcie *port,

 static int xgene_pcie_parse_map_dma_ranges(struct xgene_pcie *port)
 {
-	struct pci_host_bridge *bridge = pci_host_bridge_from_priv(port);
-	struct resource_entry *entry;
+	struct device_node *np = port->node;
+	struct of_pci_range range;
+	struct of_pci_range_parser parser;
+	struct device *dev = port->dev;
 	u8 ib_reg_mask = 0;

-	resource_list_for_each_entry(entry, &bridge->dma_ranges)
-		xgene_pcie_setup_ib_reg(port, entry, &ib_reg_mask);
+	if (of_pci_dma_range_parser_init(&parser, np)) {
+		dev_err(dev, "missing dma-ranges property\n");
+		return -EINVAL;
+	}
+
+	/* Get the dma-ranges from DT */
+	for_each_of_pci_range(&parser, &range) {
+		u64 end = range.cpu_addr + range.size - 1;

+		dev_dbg(dev, "0x%08x 0x%016llx..0x%016llx -> 0x%016llx\n",
+			range.flags, range.cpu_addr, end, range.pci_addr);
+		xgene_pcie_setup_ib_reg(port, &range, &ib_reg_mask);
+	}
 	return 0;
 }
Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
killed PCIe on my XGene-1 box (a Mustang board). The machine itself
is still alive, but half of its storage (over NVMe) is gone, and the
NVMe driver just times out.

Note that this machine boots with a device tree provided by the
UEFI firmware (2016 vintage), which could well be non conformant
with the spec, hence the breakage.

With the patch reverted, the box boots 5.17-rc8 with flying colors.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Toan Le <toan@os.amperecomputing.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Krzysztof Wilczyński <kw@linux.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stéphane Graber <stgraber@ubuntu.com>
Cc: dann frazier <dann.frazier@canonical.com>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: stable@vger.kernel.org>
---
 drivers/pci/controller/pci-xgene.c | 33 ++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 11 deletions(-)
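
For readers following the arithmetic in xgene_pcie_setup_ib_reg(), here is a minimal standalone C sketch of the values derived from each dma-ranges entry: the window mask `~(size - 1) | EN_REG`, the inclusive end address printed by the dev_dbg() call, and the CPU-to-PCI offset that the reverted code subtracted from the CPU address. This is not the driver's code. `struct ib_range`, `ib_mask()`, `ib_end()`, `ib_offset()` and `is_pow2()` are hypothetical names, and the `EN_REG` value used here is an assumption made for illustration. The sketch only shows the arithmetic; it does not attempt to explain why the reverted commit broke the affected machines.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stand-in for the driver's EN_REG enable bit (value chosen for illustration). */
#define EN_REG 0x00000001ULL

/* Hypothetical stand-in for the struct of_pci_range fields the patch reads. */
struct ib_range {
	uint64_t cpu_addr;	/* CPU (physical) start address of the window */
	uint64_t pci_addr;	/* PCI bus address the window maps to */
	uint64_t size;		/* window size in bytes */
};

/* The mask below only describes one contiguous window for power-of-two sizes. */
static bool is_pow2(uint64_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* Same expression as in the diff: clear the in-window offset bits, set the enable bit. */
static uint64_t ib_mask(uint64_t size)
{
	return ~(size - 1) | EN_REG;
}

/* Inclusive end address, as computed for the dev_dbg() line in the diff. */
static uint64_t ib_end(const struct ib_range *r)
{
	return r->cpu_addr + r->size - 1;
}

/*
 * Offset form used by the reverted code (pci_addr = cpu_addr - offset).
 * Assumes the offset was recorded as cpu_addr - pci_addr.
 */
static uint64_t ib_offset(const struct ib_range *r)
{
	return r->cpu_addr - r->pci_addr;
}
```

With a made-up entry of cpu_addr 0x80000000, pci_addr 0x0 and size 0x80000000, ib_mask() yields 0xffffffff80000001, ib_end() yields 0xffffffff, and subtracting ib_offset() from cpu_addr gives back pci_addr.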