[1/1] PCI/ASPM: Add a fix for an erratum of the PI7C9X111SLB PCI-to-PCIe bridge

Message ID 20181101192229.48352-2-stefan.maetje@esd.eu (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Series PCI/ASPM: Proposal to add a fix for an erratum of the PI7C9X111SLB PCI-to-PCIe bridge

Commit Message

Stefan Mätje Nov. 1, 2018, 7:22 p.m. UTC
Due to an erratum in the Pericom PI7C9X111SLB bridge in reverse mode the
retrain link bit needs to be cleared again manually to allow the link
training to succeed.

If it is not cleared manually the link training is continuously restarted
and all devices below the PCI-to-PCIe bridge can't be accessed any more.
That means drivers for devices below the bridge will be loaded but won't
work or even crash because the driver is only reading 0xffff.

See also the Pericom Errata Sheet PI7C9X111SLB_errata_rev1.2_102711.pdf.

Signed-off-by: Stefan Mätje <stefan.maetje@esd.eu>
---
 drivers/pci/pcie/aspm.c | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

Sinan Kaya Nov. 1, 2018, 8:06 p.m. UTC | #1
On 11/1/2018 3:22 PM, Stefan Mätje wrote:
> + b/drivers/pci/pcie/aspm.c
> @@ -268,6 +268,15 @@ static void pcie_aspm_configure_common_clock(struct pcie_link_state *link)
>   	/* Retrain link */
>   	reg16 |= PCI_EXP_LNKCTL_RL;
>   	pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
> +	if (0x12d8 == parent->vendor && 0xe111 == parent->device) {
> +		/*
> +		 * Due to an erratum in the Pericom PI7C9X111SLB bridge in
> +		 * reverse mode the retrain link bit needs to be cleared
> +		 * again manually to allow the link training to succeed.
> +		 */
> +		reg16 &= ~PCI_EXP_LNKCTL_RL;
> +		pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
> +	}

The typical model is to abstract quirk work into quirks.c and add some
callbacks from the actual code.
Stefan Mätje Nov. 2, 2018, 11:08 a.m. UTC | #2
On 01.11.18 at 21:06, Sinan Kaya wrote:
> On 11/1/2018 3:22 PM, Stefan Mätje wrote:
>> + b/drivers/pci/pcie/aspm.c
>> @@ -268,6 +268,15 @@ static void pcie_aspm_configure_common_clock(struct pcie_link_state *link)
>>   	/* Retrain link */
>>   	reg16 |= PCI_EXP_LNKCTL_RL;
>>   	pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
>> +	if (0x12d8 == parent->vendor && 0xe111 == parent->device) {
>> +		/*
>> +		 * Due to an erratum in the Pericom PI7C9X111SLB bridge in
>> +		 * reverse mode the retrain link bit needs to be cleared
>> +		 * again manually to allow the link training to succeed.
>> +		 */
>> +		reg16 &= ~PCI_EXP_LNKCTL_RL;
>> +		pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
>> +	}
> 
> The typical model is to abstract quirk work into quirks.c and add some
> callbacks from the actual code.

Yes, I'm aware of the quirks.c code. But I don't believe the problem can be
solved by a quirk function that pci_fixup_device() runs at certain points of
the PCI scan (i.e. a pci_fixup_pass such as pci_fixup_early or
pci_fixup_header) after pcie_aspm_cap_init() has run.

Let's look at pcie_aspm_cap_init(), which calls
pcie_aspm_configure_common_clock() (the function my patch changes). Without
the patch, the downstream PCI Express link is broken when
pcie_aspm_configure_common_clock() returns. pcie_aspm_cap_init() then
reloads the ASPM registers from the child device, so from that point on it
works with bogus ASPM register contents.

Therefore I think the rest of pcie_aspm_cap_init() does nothing sensible for
the downstream PCIe tree. pcie_aspm_configure_common_clock() must be fixed
so that the downstream PCIe link still works when the function returns.
That is what my patch does.

Best regards,
    Stefan
Bjorn Helgaas Jan. 30, 2019, 11:26 p.m. UTC | #3
Hi Stefan,

On Thu, Nov 01, 2018 at 08:22:29PM +0100, Stefan Mätje wrote:
> Due to an erratum in the Pericom PI7C9X111SLB bridge in reverse mode the
> retrain link bit needs to be cleared again manually to allow the link
> training to succeed.
> 
> If it is not cleared manually the link training is continuously restarted
> and all devices below the PCI-to-PCIe bridge can't be accessed any more.
> That means drivers for devices below the bridge will be loaded but won't
> work or even crash because the driver is only reading 0xffff.
> 
> See also the Pericom Errata Sheet PI7C9X111SLB_errata_rev1.2_102711.pdf.

Is there a public URL for this?

Are there any bug reports for which you could include URLs?

> Signed-off-by: Stefan Mätje <stefan.maetje@esd.eu>
> ---
>  drivers/pci/pcie/aspm.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 5326916715d2..89a245023aa9 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -268,6 +268,15 @@ static void pcie_aspm_configure_common_clock(struct pcie_link_state *link)
>  	/* Retrain link */
>  	reg16 |= PCI_EXP_LNKCTL_RL;
>  	pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
> +	if (0x12d8 == parent->vendor && 0xe111 == parent->device) {
> +		/*
> +		 * Due to an erratum in the Pericom PI7C9X111SLB bridge in
> +		 * reverse mode the retrain link bit needs to be cleared
> +		 * again manually to allow the link training to succeed.
> +		 */
> +		reg16 &= ~PCI_EXP_LNKCTL_RL;
> +		pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);

There's no timing constraint, e.g., PCI_EXP_LNKCTL_RL doesn't have to be
maintained for some minimum time before being cleared?

> +	}

Sinan suggested a quirk, which I think is a good idea.  Possible
implementation:

  - add a pcie_retrain_link() interface (internal to PCI core, maybe even
    internal to aspm.c)
  - call pcie_retrain_link() from pcie_aspm_configure_common_clock()
  - add a pci_dev.clear_retrain_link:1 bit
  - set the bit in a quirk
  - test the bit in pcie_retrain_link()
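
The steps in the list above could look roughly like the following. This is
a minimal user-space sketch, not kernel code. struct mock_pci_dev, its
lnkctl and lnkctl_writes fields, and the mock_* helpers are hypothetical
stand-ins for struct pci_dev and the pcie_capability_*() accessors. Only the
clear_retrain_link bit, the pcie_retrain_link() name, and the vendor/device
IDs come from this thread; PCI_EXP_LNKCTL_RL uses the value from the kernel
headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_EXP_LNKCTL_RL 0x0020	/* Retrain Link, bit 5 of Link Control */

/* Hypothetical stand-in for struct pci_dev; not the kernel structure. */
struct mock_pci_dev {
	uint16_t vendor;
	uint16_t device;
	uint16_t lnkctl;			/* last value written to Link Control */
	unsigned int lnkctl_writes;		/* number of config writes issued */
	unsigned int clear_retrain_link:1;	/* set by the quirk below */
};

/* Stand-in for pcie_capability_write_word(dev, PCI_EXP_LNKCTL, val). */
static void mock_write_lnkctl(struct mock_pci_dev *dev, uint16_t val)
{
	dev->lnkctl = val;
	dev->lnkctl_writes++;
}

/* Quirk: flag the affected bridge so that RL gets cleared by hand. */
static void mock_quirk_clear_retrain_link(struct mock_pci_dev *dev)
{
	if (dev->vendor == 0x12d8 && dev->device == 0xe111)
		dev->clear_retrain_link = 1;
}

/* Retrain helper: the device ID test is replaced by a test of the bit. */
static void pcie_retrain_link(struct mock_pci_dev *parent)
{
	uint16_t reg16;

	assert(parent != NULL);

	reg16 = parent->lnkctl;
	reg16 |= PCI_EXP_LNKCTL_RL;
	mock_write_lnkctl(parent, reg16);

	if (parent->clear_retrain_link) {
		/* Erratum workaround: second write with RL cleared again */
		reg16 &= (uint16_t)~PCI_EXP_LNKCTL_RL;
		mock_write_lnkctl(parent, reg16);
	}
}
```

With this split the retrain path no longer carries vendor/device IDs; it
only tests a per-device bit, and the IDs live in the quirk.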

>  	/* Wait for link training end. Break out after waiting for timeout */
>  	start_jiffies = jiffies;
> -- 
> 2.15.0
>
Stefan Mätje Feb. 7, 2019, 3:16 p.m. UTC | #4
Hello Björn,

I'm glad that you came back to my problem report.

On 31.01.19 at 00:26, Bjorn Helgaas wrote:
> Hi Stefan,
> 
> On Thu, Nov 01, 2018 at 08:22:29PM +0100, Stefan Mätje wrote:
>> Due to an erratum in the Pericom PI7C9X111SLB bridge in reverse mode the
>> retrain link bit needs to be cleared again manually to allow the link
>> training to succeed.
>>
>> If it is not cleared manually the link training is continuously restarted
>> and all devices below the PCI-to-PCIe bridge can't be accessed any more.
>> That means drivers for devices below the bridge will be loaded but won't
>> work or even crash because the driver is only reading 0xffff.
>>
>> See also the Pericom Errata Sheet PI7C9X111SLB_errata_rev1.2_102711.pdf.
> 
> Is there a public URL for this?

There is no public URL to download that errata sheet. Because Pericom has been
acquired by Diodes Inc., all information has to be downloaded from their web site.
Following the link below you can find a datasheet, and there is a button to
request additional documents such as the errata sheet.

https://www.diodes.com/products/connectivity-and-timing/pcie-packet-switchbridges/pcie-pci-bridges/part/PI7C9X111SL#tab-details

> Are there any bug reports for which you could include URLs?

I'm sorry. Last November, when I found the bug, my Internet searches turned up at least one
other victim of that bug, but I can't find it again now.

>> Signed-off-by: Stefan Mätje <stefan.maetje@esd.eu>
>> ---
>>  drivers/pci/pcie/aspm.c | 9 +++++++++
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
>> index 5326916715d2..89a245023aa9 100644
>> --- a/drivers/pci/pcie/aspm.c
>> +++ b/drivers/pci/pcie/aspm.c
>> @@ -268,6 +268,15 @@ static void pcie_aspm_configure_common_clock(struct pcie_link_state *link)
>>  	/* Retrain link */
>>  	reg16 |= PCI_EXP_LNKCTL_RL;
>>  	pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
>> +	if (0x12d8 == parent->vendor && 0xe111 == parent->device) {
>> +		/*
>> +		 * Due to an erratum in the Pericom PI7C9X111SLB bridge in
>> +		 * reverse mode the retrain link bit needs to be cleared
>> +		 * again manually to allow the link training to succeed.
>> +		 */
>> +		reg16 &= ~PCI_EXP_LNKCTL_RL;
>> +		pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
> 
> There's no timing constraint, e.g., PCI_EXP_LNKCTL_RL doesn't have to be
> maintained for some minimum time before being cleared?

There is no timing constraint. I will quote here the errata sheet. It says:

E6: 	In Reverse Mode, retrain Link bit is not cleared automatically; this bit
	needs to be cleared manually by configuration write after it is set.

Problem:
	In Reverse mode, after setting Retrain Link (bit 5 of register C0h), this bit will stay on
	and PI7C9x111SL will continuously retrain until this bit is cleared by another
	Configuration Write to register C0h.
Workaround:
	Issue another configuration write to clear Retrain Link bit after setting this bit. No delay
	is required between these two configuration write.
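
To illustrate the behaviour described in E6, here is a small user-space
simulation, not driver code. struct sim_bridge and the sim_* functions are
hypothetical, and the "three polls per training pass" figure is an arbitrary
assumption of the model, not a value from the errata sheet. The bit values
for PCI_EXP_LNKCTL_RL and PCI_EXP_LNKSTA_LT are the ones from the kernel
headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_EXP_LNKCTL_RL 0x0020	/* Link Control: Retrain Link (bit 5) */
#define PCI_EXP_LNKSTA_LT 0x0800	/* Link Status: Link Training */

#define SIM_POLLS_PER_TRAINING 3	/* arbitrary model assumption */

/* Hypothetical model of the bridge in reverse mode. */
struct sim_bridge {
	uint16_t lnkctl;
	int polls_left;		/* polls until the current training pass ends */
};

static void sim_write_lnkctl(struct sim_bridge *b, uint16_t val)
{
	if (val & PCI_EXP_LNKCTL_RL)
		b->polls_left = SIM_POLLS_PER_TRAINING;
	b->lnkctl = val;	/* erratum: hardware does not clear RL itself */
}

static uint16_t sim_read_lnksta(struct sim_bridge *b)
{
	if (b->lnkctl & PCI_EXP_LNKCTL_RL) {
		/* RL still set: training is restarted over and over */
		b->polls_left = SIM_POLLS_PER_TRAINING;
		return PCI_EXP_LNKSTA_LT;
	}
	if (b->polls_left > 0) {
		b->polls_left--;
		return PCI_EXP_LNKSTA_LT;
	}
	return 0;		/* training finished */
}

/* Returns true if link training ends within max_polls status reads. */
static bool sim_retrain(struct sim_bridge *b, bool workaround, int max_polls)
{
	uint16_t reg16;
	int i;

	assert(b != NULL && max_polls > 0);

	reg16 = b->lnkctl | PCI_EXP_LNKCTL_RL;
	sim_write_lnkctl(b, reg16);
	if (workaround) {
		/* Second configuration write, no delay in between */
		reg16 &= (uint16_t)~PCI_EXP_LNKCTL_RL;
		sim_write_lnkctl(b, reg16);
	}

	for (i = 0; i < max_polls; i++)
		if (!(sim_read_lnksta(b) & PCI_EXP_LNKSTA_LT))
			return true;
	return false;		/* timed out, link never came up */
}
```

In this model the wait loop times out when only the first write is issued,
and it completes once the second write has cleared the Retrain Link bit.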

>> +	}
> 
> Sinan suggested a quirk, which I think is a good idea.  Possible
> implementation:
> 
>   - add a pcie_retrain_link() interface (internal to PCI core, maybe even
>     internal to aspm.c)
>   - call pcie_retrain_link() from pcie_aspm_configure_common_clock()
>   - add a pci_dev.clear_retrain_link:1 bit
>   - set the bit in a quirk
>   - test the bit in pcie_retrain_link()

Thank you for describing in more detail a possible way to implement that. This makes it much
clearer to me.

But it will take some time for me to come up with a patch implemented in that style because I'm
very busy with a completely different project.

>>  	/* Wait for link training end. Break out after waiting for timeout */
>>  	start_jiffies = jiffies;
>> -- 
>> 2.15.0

Best regards,
	Stefan
Patch

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 5326916715d2..89a245023aa9 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -268,6 +268,15 @@  static void pcie_aspm_configure_common_clock(struct pcie_link_state *link)
 	/* Retrain link */
 	reg16 |= PCI_EXP_LNKCTL_RL;
 	pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
+	if (0x12d8 == parent->vendor && 0xe111 == parent->device) {
+		/*
+		 * Due to an erratum in the Pericom PI7C9X111SLB bridge in
+		 * reverse mode the retrain link bit needs to be cleared
+		 * again manually to allow the link training to succeed.
+		 */
+		reg16 &= ~PCI_EXP_LNKCTL_RL;
+		pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
+	}
 
 	/* Wait for link training end. Break out after waiting for timeout */
 	start_jiffies = jiffies;