diff mbox series

[v2,1/2] PCI: Use the correct bit in Link Training not active check

Message ID 20240423130820.43824-1-ilpo.jarvinen@linux.intel.com (mailing list archive)
State Accepted
Delegated to: Bjorn Helgaas
Headers show
Series [v2,1/2] PCI: Use the correct bit in Link Training not active check | expand

Commit Message

Ilpo Järvinen April 23, 2024, 1:08 p.m. UTC
Two changes were made into link retraining logic independent of each
other.

The commit e7e39756363a ("PCI/ASPM: Avoid link retraining race") added
check to ensure no Link Training is currently active into
pcie_retrain_link() to address the Implementation Note in PCIe r6.1 sec
7.5.3.7. At that time pcie_wait_for_retrain() only checked for Link
Training (LT) bit being cleared.

The commit 680e9c47a229 ("PCI: Add support for polling DLLLA to
pcie_retrain_link()") generalized pcie_wait_for_retrain() into
pcie_wait_for_link_status() which can wait either for LT or Data Link
Layer Link Active (DLLLA) bit with 'use_lt' argument and supporting
waiting for either cleared or set using 'active' argument.

In the merge commit commit 1abb47390350 ("Merge branch
'pci/enumeration'"), those two divergent branches converged. The merge
changed LT bit checking added in the commit e7e39756363a ("PCI/ASPM:
Avoid link retraining race") to now wait for completion of any ongoing
Link Training using DLLLA bit being set if 'use_lt' is false.

When 'use_lt' is false, the pseudo-code steps of what occurs in
pcie_retrain_link():

	1. Wait for DLLLA=1
	2. Trigger link to retrain
	3. Wait for DLLLA=1

Step 3 waits for the link to come up from the retraining triggered by
Step 2. As Step 1 is supposed to wait for any ongoing retraining to
end, using DLLLA also for it does not make sense because link training
being active is still indicated using LT bit, not with DLLLA.

Correct the pcie_wait_for_link_status() parameters in Step 1 to only
wait for LT=0 to ensure there is no ongoing Link Training.

This only impacts the Target Speed quirk, which is the only case where
waiting for DLLLA bit is used. It currently works in the problematic
case by means of link training getting initiated by hardware repeatedly
and respecting the new link parameters set by the caller, which then
make training succeed and bring the link up, setting DLLLA and causing
pcie_wait_for_link_status() to return success. We are not supposed to
rely on luck and need to make sure that LT transitioned through the
inactive state though before we initiate link training by hand via RL
(Retrain Link) bit.

Fixes: 1abb47390350 ("Merge branch 'pci/enumeration'")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---

v2:
- Improve commit message

NOTE: Maciej NAK'ed the v1 of this patch but has since retracted his
NAK.

Maciej, if possible, could you please test this with your HW?

---
 drivers/pci/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Maciej W. Rozycki April 24, 2024, 11:09 a.m. UTC | #1
On Tue, 23 Apr 2024, Ilpo Järvinen wrote:

> v2:
> - Improve commit message
> 
> NOTE: Maciej NAK'ed the v1 of this patch but has since retracted his
> NAK.
> 
> Maciej, if possible, could you please test this with your HW?

 I'll try before the end of this week, thanks for your work in this area.

  Maciej
Bjorn Helgaas April 25, 2024, 5:56 p.m. UTC | #2
On Tue, Apr 23, 2024 at 04:08:19PM +0300, Ilpo Järvinen wrote:
> Two changes were made into link retraining logic independent of each
> other.
> 
> The commit e7e39756363a ("PCI/ASPM: Avoid link retraining race") added
> check to ensure no Link Training is currently active into
> pcie_retrain_link() to address the Implementation Note in PCIe r6.1 sec
> 7.5.3.7. At that time pcie_wait_for_retrain() only checked for Link
> Training (LT) bit being cleared.
> 
> The commit 680e9c47a229 ("PCI: Add support for polling DLLLA to
> pcie_retrain_link()") generalized pcie_wait_for_retrain() into
> pcie_wait_for_link_status() which can wait either for LT or Data Link
> Layer Link Active (DLLLA) bit with 'use_lt' argument and supporting
> waiting for either cleared or set using 'active' argument.
> 
> In the merge commit commit 1abb47390350 ("Merge branch
> 'pci/enumeration'"), those two divergent branches converged. The merge
> changed LT bit checking added in the commit e7e39756363a ("PCI/ASPM:
> Avoid link retraining race") to now wait for completion of any ongoing
> Link Training using DLLLA bit being set if 'use_lt' is false.
> 
> When 'use_lt' is false, the pseudo-code steps of what occurs in
> pcie_retrain_link():
> 
> 	1. Wait for DLLLA=1
> 	2. Trigger link to retrain
> 	3. Wait for DLLLA=1
> 
> Step 3 waits for the link to come up from the retraining triggered by
> Step 2. As Step 1 is supposed to wait for any ongoing retraining to
> end, using DLLLA also for it does not make sense because link training
> being active is still indicated using LT bit, not with DLLLA.
> 
> Correct the pcie_wait_for_link_status() parameters in Step 1 to only
> wait for LT=0 to ensure there is no ongoing Link Training.
> 
> This only impacts the Target Speed quirk, which is the only case where
> waiting for DLLLA bit is used. It currently works in the problematic
> case by means of link training getting initiated by hardware repeatedly
> and respecting the new link parameters set by the caller, which then
> make training succeed and bring the link up, setting DLLLA and causing
> pcie_wait_for_link_status() to return success. We are not supposed to
> rely on luck and need to make sure that LT transitioned through the
> inactive state though before we initiate link training by hand via RL
> (Retrain Link) bit.
> 
> Fixes: 1abb47390350 ("Merge branch 'pci/enumeration'")
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

I applied both of these with minor typo fixes to pci/enumeration for
v6.10, thanks!

  73cb3a35f94d ("PCI: Wait for Link Training==0 before starting Link retrain")
  cdc6c4abcb31 ("PCI: Clarify intent of LT wait")

We can update if needed based on feedback from Maciej.

> ---
> 
> v2:
> - Improve commit message
> 
> NOTE: Maciej NAK'ed the v1 of this patch but has since retracted his
> NAK.
> 
> Maciej, if possible, could you please test this with your HW?
> 
> ---
>  drivers/pci/pci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index e5f243dd4288..70b8c87055cb 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4629,7 +4629,7 @@ int pcie_retrain_link(struct pci_dev *pdev, bool use_lt)
>  	 * avoid LTSSM race as recommended in Implementation Note at the
>  	 * end of PCIe r6.0.1 sec 7.5.3.7.
>  	 */
> -	rc = pcie_wait_for_link_status(pdev, use_lt, !use_lt);
> +	rc = pcie_wait_for_link_status(pdev, true, false);
>  	if (rc)
>  		return rc;
>  
> -- 
> 2.39.2
>
diff mbox series

Patch

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e5f243dd4288..70b8c87055cb 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4629,7 +4629,7 @@  int pcie_retrain_link(struct pci_dev *pdev, bool use_lt)
 	 * avoid LTSSM race as recommended in Implementation Note at the
 	 * end of PCIe r6.0.1 sec 7.5.3.7.
 	 */
-	rc = pcie_wait_for_link_status(pdev, use_lt, !use_lt);
+	rc = pcie_wait_for_link_status(pdev, true, false);
 	if (rc)
 		return rc;