
PCI: imx6: fix downstream bus scanning

Message ID 20170510175752.6519-1-l.stach@pengutronix.de (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas

Commit Message

Lucas Stach May 10, 2017, 5:57 p.m. UTC
The change in Linux 4.12 to make PCI configuartion requests non-posted
means that we are now getting a synchronous abort when the CFG space
read to probe for downstream devices times out.

Synchronous aborts need to be handled differently from the async aborts
we were getting before, in particular the PC needs to be advanced when
resolving the abort. This is mostly a copy of what other PCI drivers do
on ARM to handle those aborts.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
---
This is a fix that needs to go in for 4.12, but I would hope to get
some thorough testing before.
---
 drivers/pci/dwc/pci-imx6.c | 33 ++++++++++++++++++++++++++++++---
 1 file changed, 30 insertions(+), 3 deletions(-)
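
To make the fixup described above concrete, here is a minimal user-space sketch of the same decode-and-skip logic. It is a model under stated assumptions, not the driver code: `struct model_regs`, `MODEL_PC` and `model_abort_fixup()` are hypothetical names standing in for the kernel's `struct pt_regs` and the real handler, and the faulting instruction is passed in directly instead of being read from the PC. The masks are the ones used in the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct pt_regs: r0..r15, with r15 as the PC. */
struct model_regs {
	uint32_t uregs[16];
};

#define MODEL_PC 15

/*
 * Model of the synchronous abort fixup. 'instr' is the 32-bit ARM
 * instruction that faulted. Returns 0 if the abort was handled,
 * 1 if the instruction is not a load this model knows how to emulate.
 */
static int model_abort_fixup(uint32_t instr, struct model_regs *regs)
{
	int reg = (instr >> 12) & 15;	/* destination register field */

	/* Word or byte load (single data transfer with the load bit set). */
	if ((instr & 0x0c100000) == 0x04100000) {
		/* Bit 22 marks a byte load, so only 0xff is returned. */
		regs->uregs[reg] = (instr & 0x00400000) ? 0xffu : 0xffffffffu;
		regs->uregs[MODEL_PC] += 4;	/* skip the faulting load */
		return 0;
	}

	/* Halfword and signed loads. */
	if ((instr & 0x0e100090) == 0x00100090) {
		regs->uregs[reg] = 0xffffffffu;
		regs->uregs[MODEL_PC] += 4;
		return 0;
	}

	return 1;
}
```

For example, passing 0xe5903000 (the encoding of `ldr r3, [r0]`) fills r3 with all-ones and moves the PC forward by one instruction, while a store such as 0xe5803000 (`str r3, [r0]`) is left unhandled.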

Comments

Fabio Estevam May 10, 2017, 6:35 p.m. UTC | #1
Hi Lucas,

On Wed, May 10, 2017 at 2:57 PM, Lucas Stach <l.stach@pengutronix.de> wrote:
> The change in Linux 4.12 to make PCI configuartion requests non-posted
> means that we are now getting a synchronous abort when the CFG space
> read to probe for downstream devices times out.
>
> Synchronous aborts need to be handled differently from the async aborts
> we were getting before, in particular the PC needs to be advanced when
> resolving the abort. This is mostly a copy of what other PCI drivers do
> on ARM to handle those aborts.
>
> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
> ---
> This is a fix that needs to go in for 4.12, but I would hope to get
> some thorough testing before.

This fixes the kernel crash on my mx6q board with a PCI switch, thanks!

PCI Wifi card is also correctly detected.
Fabio Estevam May 10, 2017, 7:04 p.m. UTC | #2
On Wed, May 10, 2017 at 3:35 PM, Fabio Estevam <festevam@gmail.com> wrote:
> Hi Lucas,
>
> On Wed, May 10, 2017 at 2:57 PM, Lucas Stach <l.stach@pengutronix.de> wrote:
>> The change in Linux 4.12 to make PCI configuartion requests non-posted
>> means that we are now getting a synchronous abort when the CFG space
>> read to probe for downstream devices times out.
>>
>> Synchronous aborts need to be handled differently from the async aborts
>> we were getting before, in particular the PC needs to be advanced when
>> resolving the abort. This is mostly a copy of what other PCI drivers do
>> on ARM to handle those aborts.
>>
>> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
>> ---
>> This is a fix that needs to go in for 4.12, but I would hope to get
>> some thorough testing before.
>
> This fixes the kernel crash on my mx6q board with a PCI switch, thanks!
>
> PCI Wifi card is also correctly detected.

Forgot to add:

Tested-by: Fabio Estevam <fabio.estevam@nxp.com>
Peter Senna Tschudin May 10, 2017, 7:51 p.m. UTC | #3
On Wed, May 10, 2017 at 07:57:52PM +0200, Lucas Stach wrote:
> The change in Linux 4.12 to make PCI configuartion requests non-posted
> means that we are now getting a synchronous abort when the CFG space
> read to probe for downstream devices times out.
> 
> Synchronous aborts need to be handled differently from the async aborts
> we were getting before, in particular the PC needs to be advanced when
> resolving the abort. This is mostly a copy of what other PCI drivers do
> on ARM to handle those aborts.

It solves the issues I was having with the latest linux-next when my U-Boot
does not initialize PCI. Before this patch the system does not boot; after
this patch the system boots and the two e1000 devices that are under the PLX
PCI bridge are properly detected. Thank you Lucas!

> 
> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Peter Senna Tschudin <peter.senna@collabora.com>
> ---
> This is a fix that needs to go in for 4.12, but I would hope to get
> some thorough testing before.
> ---
>  drivers/pci/dwc/pci-imx6.c | 33 ++++++++++++++++++++++++++++++---
>  1 file changed, 30 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/dwc/pci-imx6.c b/drivers/pci/dwc/pci-imx6.c
> index a98cba55c7f0..19a289b8cc94 100644
> --- a/drivers/pci/dwc/pci-imx6.c
> +++ b/drivers/pci/dwc/pci-imx6.c
> @@ -252,7 +252,34 @@ static void imx6_pcie_reset_phy(struct imx6_pcie *imx6_pcie)
>  static int imx6q_pcie_abort_handler(unsigned long addr,
>  		unsigned int fsr, struct pt_regs *regs)
>  {
> -	return 0;
> +	unsigned long pc = instruction_pointer(regs);
> +	unsigned long instr = *(unsigned long *)pc;
> +	int reg = (instr >> 12) & 15;
> +
> +	/*
> +	 * If the instruction being executed was a read,
> +	 * make it look like it read all-ones.
> +	 */
> +	if ((instr & 0x0c100000) == 0x04100000) {
> +		unsigned long val;
> +
> +		if (instr & 0x00400000)
> +			val = 255;
> +		else
> +			val = -1;
> +
> +		regs->uregs[reg] = val;
> +		regs->ARM_pc += 4;
> +		return 0;
> +	}
> +
> +	if ((instr & 0x0e100090) == 0x00100090) {
> +		regs->uregs[reg] = -1;
> +		regs->ARM_pc += 4;
> +		return 0;
> +	}
> +
> +	return 1;
>  }
>  
>  static void imx6_pcie_assert_core_reset(struct imx6_pcie *imx6_pcie)
> @@ -819,8 +846,8 @@ static int __init imx6_pcie_init(void)
>  	 * we can install the handler here without risking it
>  	 * accessing some uninitialized driver state.
>  	 */
> -	hook_fault_code(16 + 6, imx6q_pcie_abort_handler, SIGBUS, 0,
> -			"imprecise external abort");
> +	hook_fault_code(8, imx6q_pcie_abort_handler, SIGBUS, 0,
> +			"external abort on non-linefetch");
>  
>  	return platform_driver_register(&imx6_pcie_driver);
>  }
> -- 
> 2.11.0
>
Hongxing Zhu May 11, 2017, 2 a.m. UTC | #4
> -----Original Message-----
> From: Lucas Stach [mailto:l.stach@pengutronix.de]
> Sent: Thursday, May 11, 2017 1:58 AM
> To: linux-pci@vger.kernel.org
> Cc: Peter Senna Tschudin <peter.senna@collabora.com>; Richard Zhu
> <hongxing.zhu@nxp.com>; festevam@gmail.com; bhelgaas@google.com;
> tharvey@gateworks.com; lorenzo.pieralisi@arm.com; linux-arm-
> kernel@lists.infradead.org; kernel@pengutronix.de; patchwork-
> lst@pengutronix.de
> Subject: [PATCH] PCI: imx6: fix downstream bus scanning
> 
> The change in Linux 4.12 to make PCI configuartion requests non-posted
> means that we are now getting a synchronous abort when the CFG space
> read to probe for downstream devices times out.
> 
> Synchronous aborts need to be handled differently from the async aborts we
> were getting before, in particular the PC needs to be advanced when
> resolving the abort. This is mostly a copy of what other PCI drivers do on ARM
> to handle those aborts.
> 
> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>

First of all, thanks a lot for your great help.
Only one minor spelling comment: "configuartion" in the commit message should be "configuration";
 the rest is okay.
Acked-by: Richard Zhu <hongxing.zhu@nxp.com>

> ---
> This is a fix that needs to go in for 4.12, but I would hope to get some
> thorough testing before.
> ---
>  drivers/pci/dwc/pci-imx6.c | 33 ++++++++++++++++++++++++++++++---
>  1 file changed, 30 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/dwc/pci-imx6.c b/drivers/pci/dwc/pci-imx6.c
> index a98cba55c7f0..19a289b8cc94 100644
> --- a/drivers/pci/dwc/pci-imx6.c
> +++ b/drivers/pci/dwc/pci-imx6.c
> @@ -252,7 +252,34 @@ static void imx6_pcie_reset_phy(struct imx6_pcie *imx6_pcie)
>  static int imx6q_pcie_abort_handler(unsigned long addr,
>  		unsigned int fsr, struct pt_regs *regs)
>  {
> -	return 0;
> +	unsigned long pc = instruction_pointer(regs);
> +	unsigned long instr = *(unsigned long *)pc;
> +	int reg = (instr >> 12) & 15;
> +
> +	/*
> +	 * If the instruction being executed was a read,
> +	 * make it look like it read all-ones.
> +	 */
> +	if ((instr & 0x0c100000) == 0x04100000) {
> +		unsigned long val;
> +
> +		if (instr & 0x00400000)
> +			val = 255;
> +		else
> +			val = -1;
> +
> +		regs->uregs[reg] = val;
> +		regs->ARM_pc += 4;
> +		return 0;
> +	}
> +
> +	if ((instr & 0x0e100090) == 0x00100090) {
> +		regs->uregs[reg] = -1;
> +		regs->ARM_pc += 4;
> +		return 0;
> +	}
> +
> +	return 1;
>  }
> 
>  static void imx6_pcie_assert_core_reset(struct imx6_pcie *imx6_pcie)
> @@ -819,8 +846,8 @@ static int __init imx6_pcie_init(void)
>  	 * we can install the handler here without risking it
>  	 * accessing some uninitialized driver state.
>  	 */
> -	hook_fault_code(16 + 6, imx6q_pcie_abort_handler, SIGBUS, 0,
> -			"imprecise external abort");
> +	hook_fault_code(8, imx6q_pcie_abort_handler, SIGBUS, 0,
> +			"external abort on non-linefetch");
> 
>  	return platform_driver_register(&imx6_pcie_driver);
>  }
> --
> 2.11.0
Bjorn Helgaas May 22, 2017, 10:13 p.m. UTC | #5
On Wed, May 10, 2017 at 07:57:52PM +0200, Lucas Stach wrote:
> The change in Linux 4.12 to make PCI configuartion requests non-posted
> means that we are now getting a synchronous abort when the CFG space
> read to probe for downstream devices times out.
> 
> Synchronous aborts need to be handled differently from the async aborts
> we were getting before, in particular the PC needs to be advanced when
> resolving the abort. This is mostly a copy of what other PCI drivers do
> on ARM to handle those aborts.
> 
> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>

Applied to for-linus for v4.12 with tested-by and acks from Fabio,
Peter, and Richard.  I updated the subject line to:

  PCI: imx6: Fix config read timeout handling

because I don't think this change is specific to bus scanning and I
wanted a hint that it's related to the config space mapping changes.

In fact, I'd really like to include the specific commit that caused
this problem so that if the original commit is backported, there's a
hint that we should backport this one, too.  I *think* it might be
cc7b0d495589, so I updated the changelog to this:

  Commit cc7b0d495589 ("PCI: designware: Update PCI config space remap 
  function") made PCI configuration requests non-posted, which means we now
  get a synchronous abort when the CFG space read to probe for downstream
  devices times out. 

  Synchronous aborts need to be handled differently from the async aborts we
  were getting before, in particular the PC needs to be advanced when
  resolving the abort.  This is mostly a copy of what other PCI drivers do on
  ARM to handle those aborts.

  [bhelgaas: changelog, "Fixes"]
  Fixes: cc7b0d495589 ("PCI: designware: Update PCI config space remap function")

Please let me know if this is the wrong commit.

> ---
> This is a fix that needs to go in for 4.12, but I would hope to get
> some thorough testing before.
> ---
>  drivers/pci/dwc/pci-imx6.c | 33 ++++++++++++++++++++++++++++++---
>  1 file changed, 30 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/dwc/pci-imx6.c b/drivers/pci/dwc/pci-imx6.c
> index a98cba55c7f0..19a289b8cc94 100644
> --- a/drivers/pci/dwc/pci-imx6.c
> +++ b/drivers/pci/dwc/pci-imx6.c
> @@ -252,7 +252,34 @@ static void imx6_pcie_reset_phy(struct imx6_pcie *imx6_pcie)
>  static int imx6q_pcie_abort_handler(unsigned long addr,
>  		unsigned int fsr, struct pt_regs *regs)
>  {
> -	return 0;
> +	unsigned long pc = instruction_pointer(regs);
> +	unsigned long instr = *(unsigned long *)pc;
> +	int reg = (instr >> 12) & 15;
> +
> +	/*
> +	 * If the instruction being executed was a read,
> +	 * make it look like it read all-ones.
> +	 */
> +	if ((instr & 0x0c100000) == 0x04100000) {
> +		unsigned long val;
> +
> +		if (instr & 0x00400000)
> +			val = 255;
> +		else
> +			val = -1;
> +
> +		regs->uregs[reg] = val;
> +		regs->ARM_pc += 4;
> +		return 0;
> +	}
> +
> +	if ((instr & 0x0e100090) == 0x00100090) {
> +		regs->uregs[reg] = -1;
> +		regs->ARM_pc += 4;
> +		return 0;
> +	}
> +
> +	return 1;
>  }
>  
>  static void imx6_pcie_assert_core_reset(struct imx6_pcie *imx6_pcie)
> @@ -819,8 +846,8 @@ static int __init imx6_pcie_init(void)
>  	 * we can install the handler here without risking it
>  	 * accessing some uninitialized driver state.
>  	 */
> -	hook_fault_code(16 + 6, imx6q_pcie_abort_handler, SIGBUS, 0,
> -			"imprecise external abort");
> +	hook_fault_code(8, imx6q_pcie_abort_handler, SIGBUS, 0,
> +			"external abort on non-linefetch");
>  
>  	return platform_driver_register(&imx6_pcie_driver);
>  }
> -- 
> 2.11.0
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Lorenzo Pieralisi May 23, 2017, 9:07 a.m. UTC | #6
On Mon, May 22, 2017 at 05:13:46PM -0500, Bjorn Helgaas wrote:
> On Wed, May 10, 2017 at 07:57:52PM +0200, Lucas Stach wrote:
> > The change in Linux 4.12 to make PCI configuartion requests non-posted
> > means that we are now getting a synchronous abort when the CFG space
> > read to probe for downstream devices times out.
> > 
> > Synchronous aborts need to be handled differently from the async aborts
> > we were getting before, in particular the PC needs to be advanced when
> > resolving the abort. This is mostly a copy of what other PCI drivers do
> > on ARM to handle those aborts.
> > 
> > Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
> 
> Applied to for-linus for v4.12 with tested-by and acks from Fabio,
> Peter, and Richard.  I updated the subject line to:
> 
>   PCI: imx6: Fix config read timeout handling
> 
> because I don't think this change is specific to bus scanning and I
> wanted a hint that it's related to the config space mapping changes.
> 
> In fact, I'd really like to include the specific commit that caused
> this problem so that if the original commit is backported, there's a
> hint that we should backport this one, too.  I *think* it might be
> cc7b0d495589, so I updated the changelog to this:
> 
>   Commit cc7b0d495589 ("PCI: designware: Update PCI config space remap 
>   function") made PCI configuration requests non-posted, which means we now
>   get a synchronous abort when the CFG space read to probe for downstream
>   devices times out. 
> 
>   Synchronous aborts need to be handled differently from the async aborts we
>   were getting before, in particular the PC needs to be advanced when
>   resolving the abort.  This is mostly a copy of what other PCI drivers do on
>   ARM to handle those aborts.
> 
>   [bhelgaas: changelog, "Fixes"]
>   Fixes: cc7b0d495589 ("PCI: designware: Update PCI config space remap function")
> 
> Please let me know if this is the wrong commit.

I think it is the right commit but it is better if Lucas can confirm,
apologies for the breakage caused, I just could not test on iMX6.

Thanks,
Lorenzo

> > This is a fix that needs to go in for 4.12, but I would hope to get
> > some thorough testing before.
> > ---
> >  drivers/pci/dwc/pci-imx6.c | 33 ++++++++++++++++++++++++++++++---
> >  1 file changed, 30 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/pci/dwc/pci-imx6.c b/drivers/pci/dwc/pci-imx6.c
> > index a98cba55c7f0..19a289b8cc94 100644
> > --- a/drivers/pci/dwc/pci-imx6.c
> > +++ b/drivers/pci/dwc/pci-imx6.c
> > @@ -252,7 +252,34 @@ static void imx6_pcie_reset_phy(struct imx6_pcie *imx6_pcie)
> >  static int imx6q_pcie_abort_handler(unsigned long addr,
> >  		unsigned int fsr, struct pt_regs *regs)
> >  {
> > -	return 0;
> > +	unsigned long pc = instruction_pointer(regs);
> > +	unsigned long instr = *(unsigned long *)pc;
> > +	int reg = (instr >> 12) & 15;
> > +
> > +	/*
> > +	 * If the instruction being executed was a read,
> > +	 * make it look like it read all-ones.
> > +	 */
> > +	if ((instr & 0x0c100000) == 0x04100000) {
> > +		unsigned long val;
> > +
> > +		if (instr & 0x00400000)
> > +			val = 255;
> > +		else
> > +			val = -1;
> > +
> > +		regs->uregs[reg] = val;
> > +		regs->ARM_pc += 4;
> > +		return 0;
> > +	}
> > +
> > +	if ((instr & 0x0e100090) == 0x00100090) {
> > +		regs->uregs[reg] = -1;
> > +		regs->ARM_pc += 4;
> > +		return 0;
> > +	}
> > +
> > +	return 1;
> >  }
> >  
> >  static void imx6_pcie_assert_core_reset(struct imx6_pcie *imx6_pcie)
> > @@ -819,8 +846,8 @@ static int __init imx6_pcie_init(void)
> >  	 * we can install the handler here without risking it
> >  	 * accessing some uninitialized driver state.
> >  	 */
> > -	hook_fault_code(16 + 6, imx6q_pcie_abort_handler, SIGBUS, 0,
> > -			"imprecise external abort");
> > +	hook_fault_code(8, imx6q_pcie_abort_handler, SIGBUS, 0,
> > +			"external abort on non-linefetch");
> >  
> >  	return platform_driver_register(&imx6_pcie_driver);
> >  }
> > -- 
> > 2.11.0
Fabio Estevam May 23, 2017, 10:11 p.m. UTC | #7
Hi Bjorn,

On Mon, May 22, 2017 at 7:13 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:

> Applied to for-linus for v4.12 with tested-by and acks from Fabio,
> Peter, and Richard.  I updated the subject line to:
>
>   PCI: imx6: Fix config read timeout handling
>
> because I don't think this change is specific to bus scanning and I
> wanted a hint that it's related to the config space mapping changes.
>
> In fact, I'd really like to include the specific commit that caused
> this problem so that if the original commit is backported, there's a
> hint that we should backport this one, too.  I *think* it might be
> cc7b0d495589, so I updated the changelog to this:
>
>   Commit cc7b0d495589 ("PCI: designware: Update PCI config space remap
>   function") made PCI configuration requests non-posted, which means we now
>   get a synchronous abort when the CFG space read to probe for downstream
>   devices times out.
>
>   Synchronous aborts need to be handled differently from the async aborts we
>   were getting before, in particular the PC needs to be advanced when
>   resolving the abort.  This is mostly a copy of what other PCI drivers do on
>   ARM to handle those aborts.
>
>   [bhelgaas: changelog, "Fixes"]
>   Fixes: cc7b0d495589 ("PCI: designware: Update PCI config space remap function")
>
> Please let me know if this is the wrong commit.

I just tested on a mx6q board and can confirm that this is the correct
offending commit, thanks.
Bjorn Helgaas May 24, 2017, 5:16 p.m. UTC | #8
On Tue, May 23, 2017 at 07:11:26PM -0300, Fabio Estevam wrote:
> Hi Bjorn,
> 
> On Mon, May 22, 2017 at 7:13 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> 
> > Applied to for-linus for v4.12 with tested-by and acks from Fabio,
> > Peter, and Richard.  I updated the subject line to:
> >
> >   PCI: imx6: Fix config read timeout handling
> >
> > because I don't think this change is specific to bus scanning and I
> > wanted a hint that it's related to the config space mapping changes.
> >
> > In fact, I'd really like to include the specific commit that caused
> > this problem so that if the original commit is backported, there's a
> > hint that we should backport this one, too.  I *think* it might be
> > cc7b0d495589, so I updated the changelog to this:
> >
> >   Commit cc7b0d495589 ("PCI: designware: Update PCI config space remap
> >   function") made PCI configuration requests non-posted, which means we now
> >   get a synchronous abort when the CFG space read to probe for downstream
> >   devices times out.
> >
> >   Synchronous aborts need to be handled differently from the async aborts we
> >   were getting before, in particular the PC needs to be advanced when
> >   resolving the abort.  This is mostly a copy of what other PCI drivers do on
> >   ARM to handle those aborts.
> >
> >   [bhelgaas: changelog, "Fixes"]
> >   Fixes: cc7b0d495589 ("PCI: designware: Update PCI config space remap function")
> >
> > Please let me know if this is the wrong commit.
> 
> I just tested on a mx6q board and can confirm that this is the correct
> offending commit, thanks.

Thanks for checking this out, Fabio!

Patch

diff --git a/drivers/pci/dwc/pci-imx6.c b/drivers/pci/dwc/pci-imx6.c
index a98cba55c7f0..19a289b8cc94 100644
--- a/drivers/pci/dwc/pci-imx6.c
+++ b/drivers/pci/dwc/pci-imx6.c
@@ -252,7 +252,34 @@  static void imx6_pcie_reset_phy(struct imx6_pcie *imx6_pcie)
 static int imx6q_pcie_abort_handler(unsigned long addr,
 		unsigned int fsr, struct pt_regs *regs)
 {
-	return 0;
+	unsigned long pc = instruction_pointer(regs);
+	unsigned long instr = *(unsigned long *)pc;
+	int reg = (instr >> 12) & 15;
+
+	/*
+	 * If the instruction being executed was a read,
+	 * make it look like it read all-ones.
+	 */
+	if ((instr & 0x0c100000) == 0x04100000) {
+		unsigned long val;
+
+		if (instr & 0x00400000)
+			val = 255;
+		else
+			val = -1;
+
+		regs->uregs[reg] = val;
+		regs->ARM_pc += 4;
+		return 0;
+	}
+
+	if ((instr & 0x0e100090) == 0x00100090) {
+		regs->uregs[reg] = -1;
+		regs->ARM_pc += 4;
+		return 0;
+	}
+
+	return 1;
 }
 
 static void imx6_pcie_assert_core_reset(struct imx6_pcie *imx6_pcie)
@@ -819,8 +846,8 @@  static int __init imx6_pcie_init(void)
 	 * we can install the handler here without risking it
 	 * accessing some uninitialized driver state.
 	 */
-	hook_fault_code(16 + 6, imx6q_pcie_abort_handler, SIGBUS, 0,
-			"imprecise external abort");
+	hook_fault_code(8, imx6q_pcie_abort_handler, SIGBUS, 0,
+			"external abort on non-linefetch");
 
 	return platform_driver_register(&imx6_pcie_driver);
 }
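
The second hunk changes the hooked fault code from 16 + 6 to 8. As a rough illustration of where those numbers come from, the sketch below computes a fault table index from a raw fault status value, assuming the ARMv7 short-descriptor (non-LPAE) layout in which FS[3:0] sits in bits 3:0 and FS[4] sits in bit 10. The names `MODEL_FSR_FS3_0`, `MODEL_FSR_FS4` and `model_fsr_index()` are hypothetical and chosen for this example only.

```c
#include <stdint.h>

/* Assumed short-descriptor layout: FS[3:0] in bits 3:0, FS[4] in bit 10. */
#define MODEL_FSR_FS3_0	0x0fu
#define MODEL_FSR_FS4	(1u << 10)

/* Fold the split fault status field into a 5-bit table index. */
static unsigned int model_fsr_index(uint32_t fsr)
{
	return (fsr & MODEL_FSR_FS3_0) | ((fsr & MODEL_FSR_FS4) >> 6);
}
```

Under that assumption, a status value of 0x406 yields index 22 (16 + 6, the imprecise external abort the driver hooked before), while 0x008 yields index 8, the synchronous "external abort on non-linefetch" entry that the patch hooks instead.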