diff mbox series

[v10,3/3] PCI/ASPM: Make pci_save_aspm_l1ss_state save both child and parent's L1SS configuration

Message ID 20240927103723.24622-2-jhp@endlessos.org (mailing list archive)
State Superseded
Delegated to: Krzysztof Wilczyński
Headers show
Series PCI: vmd: Enable PCI PM's L1 substates of remapped PCIe Root Port and NVMe | expand

Commit Message

Jian-Hong Pan Sept. 27, 2024, 10:37 a.m. UTC
PCI devices' parameters on the VMD bus have been programmed properly
originally. But, cleared after pci_reset_bus() and have not been restored
correctly. This leads the link's L1.2 between PCIe Root Port and child
device gets wrong configs.

Here is a failed example on ASUS B1400CEAE with enabled VMD. Both PCIe
bridge and NVMe device should have the same LTR1.2_Threshold value.
However, they are configured as different values in this case:

10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal decode])
  ...
  Capabilities: [200 v1] L1 PM Substates
    L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
      PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
    L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
      T_CommonMode=0us LTR1.2_Threshold=0ns
    L1SubCtl2: T_PwrOn=0us

10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
  ...
  Capabilities: [900 v1] L1 PM Substates
    L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
      PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
    L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
      T_CommonMode=0us LTR1.2_Threshold=101376ns
    L1SubCtl2: T_PwrOn=50us

Here is VMD mapped PCI device tree:

-+-[0000:00]-+-00.0  Intel Corporation Device 9a04
 | ...
 \-[10000:e0]-+-06.0-[e1]----00.0  Sandisk Corp WD Blue SN550 NVMe SSD
              \-17.0  Intel Corporation Tiger Lake-LP SATA Controller

When pci_reset_bus() resets the bus [e1] of the NVMe, it only saves and
restores NVMe's state before and after reset. Because bus [e1] has only one
device: 10000:e1:00.0 NVMe. The PCIe bridge is missed. However, when it
restores the NVMe's state, it also restores the ASPM L1SS between the PCIe
bridge and the NVMe by pci_restore_aspm_l1ss_state(). The NVMe's L1SS is
restored correctly. But, the PCIe bridge's L1SS is restored with the wrong
value 0x0 [1]. Because, the parent device (PCIe bridge)'s L1SS is not saved
by pci_save_aspm_l1ss_state() before reset. That is why
pci_restore_aspm_l1ss_state() gets the parent device (PCIe bridge)'s saved
state capability data and restores L1SS with value 0.

So, if the PCI device has a parent, make pci_save_aspm_l1ss_state() save
the parent's L1SS configuration, too. This is symmetric on
pci_restore_aspm_l1ss_state().

[1]: https://lore.kernel.org/linux-pci/CAPpJ_eexU0gCHMbXw_z924WxXw0+B6SdS4eG9oGpEX1wmnMLkQ@mail.gmail.com/

Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394
Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
---
v9:
- Drop the v8 fix about drivers/pci/pcie/aspm.c. Use this in VMD instead.

v10:
- Drop the v9 fix about drivers/pci/controller/vmd.c
- Fix in PCIe ASPM to make it symmetric between pci_save_aspm_l1ss_state()
  and pci_restore_aspm_l1ss_state()

 drivers/pci/pcie/aspm.c | 39 +++++++++++++++++++++++++++++++--------
 1 file changed, 31 insertions(+), 8 deletions(-)

Comments

Ilpo Järvinen Sept. 27, 2024, 11:16 a.m. UTC | #1
On Fri, 27 Sep 2024, Jian-Hong Pan wrote:

> PCI devices' parameters on the VMD bus have been programmed properly
> originally. But, cleared after pci_reset_bus() and have not been restored
> correctly. This leads the link's L1.2 between PCIe Root Port and child
> device gets wrong configs.
> 
> Here is a failed example on ASUS B1400CEAE with enabled VMD. Both PCIe
> bridge and NVMe device should have the same LTR1.2_Threshold value.
> However, they are configured as different values in this case:
> 
> 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal decode])
>   ...
>   Capabilities: [200 v1] L1 PM Substates
>     L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
>       PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
>     L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
>       T_CommonMode=0us LTR1.2_Threshold=0ns
>     L1SubCtl2: T_PwrOn=0us
> 
> 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
>   ...
>   Capabilities: [900 v1] L1 PM Substates
>     L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
>       PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
>     L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
>       T_CommonMode=0us LTR1.2_Threshold=101376ns
>     L1SubCtl2: T_PwrOn=50us
> 
> Here is VMD mapped PCI device tree:
> 
> -+-[0000:00]-+-00.0  Intel Corporation Device 9a04
>  | ...
>  \-[10000:e0]-+-06.0-[e1]----00.0  Sandisk Corp WD Blue SN550 NVMe SSD
>               \-17.0  Intel Corporation Tiger Lake-LP SATA Controller
> 
> When pci_reset_bus() resets the bus [e1] of the NVMe, it only saves and
> restores NVMe's state before and after reset.

> Because bus [e1] has only one
> device: 10000:e1:00.0 NVMe.

This is incomplete sentence. And I don't understand why the number of 
devices is relevant. Perhaps just drop it?

> The PCIe bridge is missed.

Unclear what this refers to (I know what you mean but please write it 
so that other reading this 10 years from now will also get what is
missed).

> However, when it
> restores the NVMe's state, it also restores the ASPM L1SS between the PCIe
> bridge and the NVMe by pci_restore_aspm_l1ss_state(). 

"it also restores ..." -> "ASPM code restores L1SS for both the parent 
bridge and the NVMe in pci_restore_aspm_l1ss_state()."

> The NVMe's L1SS is
> restored correctly. But, the PCIe bridge's L1SS is restored with the wrong
> value 0x0 [1]. Because, the parent device (PCIe bridge)'s L1SS is not saved

Join these sentences without . and drop the comma.

I'd say "was not saved" because at that point it occurred clearly before 
the restore.

> by pci_save_aspm_l1ss_state() before reset. That is why
> pci_restore_aspm_l1ss_state() gets the parent device (PCIe bridge)'s saved
> state capability data and restores L1SS with value 0.

This last sentence seems duplicated.
 
> So, if the PCI device has a parent, make pci_save_aspm_l1ss_state() save
> the parent's L1SS configuration, too. This is symmetric on
> pci_restore_aspm_l1ss_state().
> 
> [1]: https://lore.kernel.org/linux-pci/CAPpJ_eexU0gCHMbXw_z924WxXw0+B6SdS4eG9oGpEX1wmnMLkQ@mail.gmail.com/

Just put this into Link tag.

> Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394

Closes: ?

Fixes tag is also missing.

Add:

Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

> Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
> ---
> v9:
> - Drop the v8 fix about drivers/pci/pcie/aspm.c. Use this in VMD instead.
> 
> v10:
> - Drop the v9 fix about drivers/pci/controller/vmd.c
> - Fix in PCIe ASPM to make it symmetric between pci_save_aspm_l1ss_state()
>   and pci_restore_aspm_l1ss_state()
> 
>  drivers/pci/pcie/aspm.c | 39 +++++++++++++++++++++++++++++++--------
>  1 file changed, 31 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index bd0a8a05647e..823aaf813978 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -81,24 +81,47 @@ void pci_configure_aspm_l1ss(struct pci_dev *pdev)
>  
>  void pci_save_aspm_l1ss_state(struct pci_dev *pdev)
>  {
> -	struct pci_cap_saved_state *save_state;
> -	u16 l1ss = pdev->l1ss;
> +	struct pci_cap_saved_state *pl_save_state, *cl_save_state;
> +	struct pci_dev *parent;
>  	u32 *cap;
>  
>  	/*
>  	 * Save L1 substate configuration. The ASPM L0s/L1 configuration
>  	 * in PCI_EXP_LNKCTL_ASPMC is saved by pci_save_pcie_state().
>  	 */
> -	if (!l1ss)
> +	if (!pdev->l1ss)
>  		return;
>  
> -	save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_L1SS);
> -	if (!save_state)
> +	cl_save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_L1SS);
> +	if (!cl_save_state)
>  		return;
>  
> -	cap = &save_state->cap.data[0];
> -	pci_read_config_dword(pdev, l1ss + PCI_L1SS_CTL2, cap++);
> -	pci_read_config_dword(pdev, l1ss + PCI_L1SS_CTL1, cap++);
> +	cap = &cl_save_state->cap.data[0];
> +	pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL2, cap++);
> +	pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL1, cap++);
> +
> +	/*
> +	 * If here is a parent device and it has not saved state, save parent's
> +	 * L1 substate configuration, too. This is symmetric on
> +	 * pci_restore_aspm_l1ss_state().
> +	 */
> +	if (!pdev->bus || !pdev->bus->self)
> +		return;
> +
> +	parent = pdev->bus->self;
> +	if (!parent->l1ss)
> +		return;
> +
> +	pl_save_state = pci_find_saved_ext_cap(parent, PCI_EXT_CAP_ID_L1SS);
> +	if (!pl_save_state)
> +		return;
> +
> +	if (parent->state_saved)
> +		return;

This decision can be made before even reading the pl_saved_state.

> +
> +	cap = &pl_save_state->cap.data[0];
> +	pci_read_config_dword(parent, parent->l1ss + PCI_L1SS_CTL2, cap++);
> +	pci_read_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1, cap++);

This approach duplicates code and that seems unnecessary to me.

Could you instead leave the current function (nearly?) as is and use it as 
a helper for a new pci_save_aspm_l1ss_state() which calls the helper first 
for the pdev and then for its parent (if needed). Or do I miss something 
why that is not possible?
Jian-Hong Pan Sept. 30, 2024, 9:30 a.m. UTC | #2
Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> 於 2024年9月27日 週五 下午7:17寫道:
>
> On Fri, 27 Sep 2024, Jian-Hong Pan wrote:
>
> > PCI devices' parameters on the VMD bus have been programmed properly
> > originally. But, cleared after pci_reset_bus() and have not been restored
> > correctly. This leads the link's L1.2 between PCIe Root Port and child
> > device gets wrong configs.
> >
> > Here is a failed example on ASUS B1400CEAE with enabled VMD. Both PCIe
> > bridge and NVMe device should have the same LTR1.2_Threshold value.
> > However, they are configured as different values in this case:
> >
> > 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal decode])
> >   ...
> >   Capabilities: [200 v1] L1 PM Substates
> >     L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
> >       PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
> >     L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> >       T_CommonMode=0us LTR1.2_Threshold=0ns
> >     L1SubCtl2: T_PwrOn=0us
> >
> > 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
> >   ...
> >   Capabilities: [900 v1] L1 PM Substates
> >     L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
> >       PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
> >     L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> >       T_CommonMode=0us LTR1.2_Threshold=101376ns
> >     L1SubCtl2: T_PwrOn=50us
> >
> > Here is VMD mapped PCI device tree:
> >
> > -+-[0000:00]-+-00.0  Intel Corporation Device 9a04
> >  | ...
> >  \-[10000:e0]-+-06.0-[e1]----00.0  Sandisk Corp WD Blue SN550 NVMe SSD
> >               \-17.0  Intel Corporation Tiger Lake-LP SATA Controller
> >
> > When pci_reset_bus() resets the bus [e1] of the NVMe, it only saves and
> > restores NVMe's state before and after reset.
>
> > Because bus [e1] has only one
> > device: 10000:e1:00.0 NVMe.
>
> This is incomplete sentence. And I don't understand why the number of
> devices is relevant. Perhaps just drop it?
>
> > The PCIe bridge is missed.
>
> Unclear what this refers to (I know what you mean but please write it
> so that other reading this 10 years from now will also get what is
> missed).
>
> > However, when it
> > restores the NVMe's state, it also restores the ASPM L1SS between the PCIe
> > bridge and the NVMe by pci_restore_aspm_l1ss_state().
>
> "it also restores ..." -> "ASPM code restores L1SS for both the parent
> bridge and the NVMe in pci_restore_aspm_l1ss_state()."
>
> > The NVMe's L1SS is
> > restored correctly. But, the PCIe bridge's L1SS is restored with the wrong
> > value 0x0 [1]. Because, the parent device (PCIe bridge)'s L1SS is not saved
>
> Join these sentences without . and drop the comma.
>
> I'd say "was not saved" because at that point it occurred clearly before
> the restore.
>
> > by pci_save_aspm_l1ss_state() before reset. That is why
> > pci_restore_aspm_l1ss_state() gets the parent device (PCIe bridge)'s saved
> > state capability data and restores L1SS with value 0.
>
> This last sentence seems duplicated.
>
> > So, if the PCI device has a parent, make pci_save_aspm_l1ss_state() save
> > the parent's L1SS configuration, too. This is symmetric on
> > pci_restore_aspm_l1ss_state().
> >
> > [1]: https://lore.kernel.org/linux-pci/CAPpJ_eexU0gCHMbXw_z924WxXw0+B6SdS4eG9oGpEX1wmnMLkQ@mail.gmail.com/
>
> Just put this into Link tag.
>
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394
>
> Closes: ?
>
> Fixes tag is also missing.
>
> Add:
>
> Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>
> > Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
> > ---
> > v9:
> > - Drop the v8 fix about drivers/pci/pcie/aspm.c. Use this in VMD instead.
> >
> > v10:
> > - Drop the v9 fix about drivers/pci/controller/vmd.c
> > - Fix in PCIe ASPM to make it symmetric between pci_save_aspm_l1ss_state()
> >   and pci_restore_aspm_l1ss_state()
> >
> >  drivers/pci/pcie/aspm.c | 39 +++++++++++++++++++++++++++++++--------
> >  1 file changed, 31 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index bd0a8a05647e..823aaf813978 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -81,24 +81,47 @@ void pci_configure_aspm_l1ss(struct pci_dev *pdev)
> >
> >  void pci_save_aspm_l1ss_state(struct pci_dev *pdev)
> >  {
> > -     struct pci_cap_saved_state *save_state;
> > -     u16 l1ss = pdev->l1ss;
> > +     struct pci_cap_saved_state *pl_save_state, *cl_save_state;
> > +     struct pci_dev *parent;
> >       u32 *cap;
> >
> >       /*
> >        * Save L1 substate configuration. The ASPM L0s/L1 configuration
> >        * in PCI_EXP_LNKCTL_ASPMC is saved by pci_save_pcie_state().
> >        */
> > -     if (!l1ss)
> > +     if (!pdev->l1ss)
> >               return;
> >
> > -     save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_L1SS);
> > -     if (!save_state)
> > +     cl_save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_L1SS);
> > +     if (!cl_save_state)
> >               return;
> >
> > -     cap = &save_state->cap.data[0];
> > -     pci_read_config_dword(pdev, l1ss + PCI_L1SS_CTL2, cap++);
> > -     pci_read_config_dword(pdev, l1ss + PCI_L1SS_CTL1, cap++);
> > +     cap = &cl_save_state->cap.data[0];
> > +     pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL2, cap++);
> > +     pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL1, cap++);
> > +
> > +     /*
> > +      * If here is a parent device and it has not saved state, save parent's
> > +      * L1 substate configuration, too. This is symmetric on
> > +      * pci_restore_aspm_l1ss_state().
> > +      */
> > +     if (!pdev->bus || !pdev->bus->self)
> > +             return;
> > +
> > +     parent = pdev->bus->self;
> > +     if (!parent->l1ss)
> > +             return;
> > +
> > +     pl_save_state = pci_find_saved_ext_cap(parent, PCI_EXT_CAP_ID_L1SS);
> > +     if (!pl_save_state)
> > +             return;
> > +
> > +     if (parent->state_saved)
> > +             return;
>
> This decision can be made before even reading the pl_saved_state.
>
> > +
> > +     cap = &pl_save_state->cap.data[0];
> > +     pci_read_config_dword(parent, parent->l1ss + PCI_L1SS_CTL2, cap++);
> > +     pci_read_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1, cap++);
>
> This approach duplicates code and that seems unnecessary to me.
>
> Could you instead leave the current function (nearly?) as is and use it as
> a helper for a new pci_save_aspm_l1ss_state() which calls the helper first
> for the pdev and then for its parent (if needed). Or do I miss something
> why that is not possible?

I was thinking of invoking it recursively the first time. But, that
will go to parent's parent and so on ...
So, I flattened the code in pci_save_aspm_l1ss_state().

Having a helper doing what original pci_save_aspm_l1ss_state()'s work
sounds good!

Sent the v11 patch, including the modified commit message.

Jian-Hong Pan
diff mbox series

Patch

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index bd0a8a05647e..823aaf813978 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -81,24 +81,47 @@  void pci_configure_aspm_l1ss(struct pci_dev *pdev)
 
 void pci_save_aspm_l1ss_state(struct pci_dev *pdev)
 {
-	struct pci_cap_saved_state *save_state;
-	u16 l1ss = pdev->l1ss;
+	struct pci_cap_saved_state *pl_save_state, *cl_save_state;
+	struct pci_dev *parent;
 	u32 *cap;
 
 	/*
 	 * Save L1 substate configuration. The ASPM L0s/L1 configuration
 	 * in PCI_EXP_LNKCTL_ASPMC is saved by pci_save_pcie_state().
 	 */
-	if (!l1ss)
+	if (!pdev->l1ss)
 		return;
 
-	save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_L1SS);
-	if (!save_state)
+	cl_save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_L1SS);
+	if (!cl_save_state)
 		return;
 
-	cap = &save_state->cap.data[0];
-	pci_read_config_dword(pdev, l1ss + PCI_L1SS_CTL2, cap++);
-	pci_read_config_dword(pdev, l1ss + PCI_L1SS_CTL1, cap++);
+	cap = &cl_save_state->cap.data[0];
+	pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL2, cap++);
+	pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL1, cap++);
+
+	/*
+	 * If here is a parent device and it has not saved state, save parent's
+	 * L1 substate configuration, too. This is symmetric on
+	 * pci_restore_aspm_l1ss_state().
+	 */
+	if (!pdev->bus || !pdev->bus->self)
+		return;
+
+	parent = pdev->bus->self;
+	if (!parent->l1ss)
+		return;
+
+	pl_save_state = pci_find_saved_ext_cap(parent, PCI_EXT_CAP_ID_L1SS);
+	if (!pl_save_state)
+		return;
+
+	if (parent->state_saved)
+		return;
+
+	cap = &pl_save_state->cap.data[0];
+	pci_read_config_dword(parent, parent->l1ss + PCI_L1SS_CTL2, cap++);
+	pci_read_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1, cap++);
 }
 
 void pci_restore_aspm_l1ss_state(struct pci_dev *pdev)