Message ID | 20250131152913.2507-1-ilpo.jarvinen@linux.intel.com (mailing list archive) |
---|---|
State | Accepted |
Commit | be6358acefc1f4261c6fa43721e0d2d94ca4266c |
Delegated to: | Bjorn Helgaas |
Headers | show |
Series | [1/1] PCI/ASPM: Fix L1SS saving | expand |
[+cc Rafael] On Fri, Jan 31, 2025 at 05:29:13PM +0200, Ilpo Järvinen wrote: > The commit 1db806ec06b7 ("PCI/ASPM: Save parent L1SS config in > pci_save_aspm_l1ss_state()") aimed to perform L1SS config save for both > the Upstream Port and its upstream bridge when handling an Upstream > Port, which matches what the L1SS restore side does. However, > parent->state_saved can be set true at an earlier time when the > upstream bridge saved other parts of its state. So I guess the scenario is that we got here because some driver called pci_save_state(pdev): pci_save_state dev->state_saved = true <-- pci_save_pcie_state pci_save_aspm_l1ss_state if (pcie_downstream_port(pdev)) return # save pdev L1SS state here if (parent->state_saved) <-- return # save parent L1SS state here and the problem is that we previously called pci_save_state(parent), which set "parent->state_saved = true" but did not save its L1SS state because pci_save_aspm_l1ss_state() is a no-op for Downstream Ports, right? But I would think this would be a very common situation because pcie_portdrv_probe() calls pci_save_state() for Downstream Ports when pciehp, AER, PME, etc, are enabled, and this would happen before the pci_save_state() calls from Endpoint drivers. So I'm a little surprised that this didn't blow up for everybody immediately. Is there something that makes this unusual? > Then later when > attempting to save the L1SS config while handling the Upstream Port, > parent->state_saved is true in pci_save_aspm_l1ss_state() resulting in > early return and skipping saving bridge's L1SS config because it is > assumed to be already saved. Later on restore, junk is written into > L1SS config which causes issues with some devices. > > Remove parent->state_saved check and unconditionally save L1SS config > also for the upstream bridge from an Upstream Port which ought to be > harmless from correctness point of view. With the Upstream Port check > now present, saving the L1SS config more than once for the bridge is no > longer a problem (unlike when the parent->state_saved check got > introduced into the fix during its development). > > Fixes: 1db806ec06b7 ("PCI/ASPM: Save parent L1SS config in pci_save_aspm_l1ss_state()") > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219731 > Reported-by: Niklāvs Koļesņikovs <pinkflames.linux@gmail.com> > Tested-by: Niklāvs Koļesņikovs <pinkflames.linux@gmail.com> > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> > --- > drivers/pci/pcie/aspm.c | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > index e0bc90597dca..da3e7edcf49d 100644 > --- a/drivers/pci/pcie/aspm.c > +++ b/drivers/pci/pcie/aspm.c > @@ -108,9 +108,6 @@ void pci_save_aspm_l1ss_state(struct pci_dev *pdev) > pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL2, cap++); > pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL1, cap++); > > - if (parent->state_saved) > - return; > - > /* > * Save parent's L1 substate configuration so we have it for > * pci_restore_aspm_l1ss_state(pdev) to restore. > > base-commit: 72deda0abee6e705ae71a93f69f55e33be5bca5c > -- > 2.39.5 >
Hi, I'm the person who first reported this issue. The most obvious reason to a layman such as myself would be the simple fact that desktop platforms by default do not enable L1SS in their EFI/BIOS configurations. Meaning only weirdoes such as myself who tinker with EFI and kernel settings [while looking at a power meter readings] would ever encounter this on a desktop system. I don't know about portable devices but perhaps they rely on their embedded controllers instead of exposing L1SS to the OS, too? Just guessing. That being said, it's possible my layman intuition is very wrong. If so, sorry about the noise. Cheers, Niklāvs Koļesņikovs P.S. Sorry, if reply to all was the wrong GMail button to press. otrd., 2025. g. 4. febr., plkst. 22:39 — lietotājs Bjorn Helgaas (<helgaas@kernel.org>) rakstīja: > > [+cc Rafael] > > On Fri, Jan 31, 2025 at 05:29:13PM +0200, Ilpo Järvinen wrote: > > The commit 1db806ec06b7 ("PCI/ASPM: Save parent L1SS config in > > pci_save_aspm_l1ss_state()") aimed to perform L1SS config save for both > > the Upstream Port and its upstream bridge when handling an Upstream > > Port, which matches what the L1SS restore side does. However, > > parent->state_saved can be set true at an earlier time when the > > upstream bridge saved other parts of its state. > > So I guess the scenario is that we got here because some driver called > pci_save_state(pdev): > > pci_save_state > dev->state_saved = true <-- > pci_save_pcie_state > pci_save_aspm_l1ss_state > if (pcie_downstream_port(pdev)) > return > # save pdev L1SS state here > if (parent->state_saved) <-- > return > # save parent L1SS state here > > and the problem is that we previously called pci_save_state(parent), > which set "parent->state_saved = true" but did not save its L1SS state > because pci_save_aspm_l1ss_state() is a no-op for Downstream Ports, > right? > > But I would think this would be a very common situation because > pcie_portdrv_probe() calls pci_save_state() for Downstream Ports when > pciehp, AER, PME, etc, are enabled, and this would happen before the > pci_save_state() calls from Endpoint drivers. > > So I'm a little surprised that this didn't blow up for everybody > immediately. Is there something that makes this unusual? > > > Then later when > > attempting to save the L1SS config while handling the Upstream Port, > > parent->state_saved is true in pci_save_aspm_l1ss_state() resulting in > > early return and skipping saving bridge's L1SS config because it is > > assumed to be already saved. Later on restore, junk is written into > > L1SS config which causes issues with some devices. > > > > Remove parent->state_saved check and unconditionally save L1SS config > > also for the upstream bridge from an Upstream Port which ought to be > > harmless from correctness point of view. With the Upstream Port check > > now present, saving the L1SS config more than once for the bridge is no > > longer a problem (unlike when the parent->state_saved check got > > introduced into the fix during its development). > > > > Fixes: 1db806ec06b7 ("PCI/ASPM: Save parent L1SS config in pci_save_aspm_l1ss_state()") > > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219731 > > Reported-by: Niklāvs Koļesņikovs <pinkflames.linux@gmail.com> > > Tested-by: Niklāvs Koļesņikovs <pinkflames.linux@gmail.com> > > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> > > --- > > drivers/pci/pcie/aspm.c | 3 --- > > 1 file changed, 3 deletions(-) > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > > index e0bc90597dca..da3e7edcf49d 100644 > > --- a/drivers/pci/pcie/aspm.c > > +++ b/drivers/pci/pcie/aspm.c > > @@ -108,9 +108,6 @@ void pci_save_aspm_l1ss_state(struct pci_dev *pdev) > > pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL2, cap++); > > pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL1, cap++); > > > > - if (parent->state_saved) > > - return; > > - > > /* > > * Save parent's L1 substate configuration so we have it for > > * pci_restore_aspm_l1ss_state(pdev) to restore. > > > > base-commit: 72deda0abee6e705ae71a93f69f55e33be5bca5c > > -- > > 2.39.5 > >
On Tue, 4 Feb 2025, Bjorn Helgaas wrote: > [+cc Rafael] > > On Fri, Jan 31, 2025 at 05:29:13PM +0200, Ilpo Järvinen wrote: > > The commit 1db806ec06b7 ("PCI/ASPM: Save parent L1SS config in > > pci_save_aspm_l1ss_state()") aimed to perform L1SS config save for both > > the Upstream Port and its upstream bridge when handling an Upstream > > Port, which matches what the L1SS restore side does. However, > > parent->state_saved can be set true at an earlier time when the > > upstream bridge saved other parts of its state. > > So I guess the scenario is that we got here because some driver called > pci_save_state(pdev): > > pci_save_state > dev->state_saved = true <-- > pci_save_pcie_state > pci_save_aspm_l1ss_state > if (pcie_downstream_port(pdev)) > return > # save pdev L1SS state here > if (parent->state_saved) <-- > return > # save parent L1SS state here > > and the problem is that we previously called pci_save_state(parent), > which set "parent->state_saved = true" but did not save its L1SS state > because pci_save_aspm_l1ss_state() is a no-op for Downstream Ports, > right? Yes! An unfortunate interaction between those two checks. > But I would think this would be a very common situation because > pcie_portdrv_probe() calls pci_save_state() for Downstream Ports when > pciehp, AER, PME, etc, are enabled, and this would happen before the > pci_save_state() calls from Endpoint drivers. > > So I'm a little surprised that this didn't blow up for everybody > immediately. Is there something that makes this unusual? I agree it should be very common and was quite surprised that -next did not catch it. What I recall though is you modified the patch while applying it by adding those Downstream Port checks so the fix patch's Tested-by was for different code from what got applied (and it would have been caught would the original author have tested also the modified commit). Unfortunately, I cannot think of anything that would be so unusual to warrant not detecting it earlier. Maybe it was just the holiday period causing less testing and lower level of awareness in general? The machine doesn't always hang because of the problem as was the case with Niklāvs, so it might have occurred but went unnoticed if it occurred for a device that is not critical. > > Then later when > > attempting to save the L1SS config while handling the Upstream Port, > > parent->state_saved is true in pci_save_aspm_l1ss_state() resulting in > > early return and skipping saving bridge's L1SS config because it is > > assumed to be already saved. Later on restore, junk is written into > > L1SS config which causes issues with some devices. > > > > Remove parent->state_saved check and unconditionally save L1SS config > > also for the upstream bridge from an Upstream Port which ought to be > > harmless from correctness point of view. With the Upstream Port check > > now present, saving the L1SS config more than once for the bridge is no > > longer a problem (unlike when the parent->state_saved check got > > introduced into the fix during its development). > > > > Fixes: 1db806ec06b7 ("PCI/ASPM: Save parent L1SS config in pci_save_aspm_l1ss_state()") > > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219731 > > Reported-by: Niklāvs Koļesņikovs <pinkflames.linux@gmail.com> > > Tested-by: Niklāvs Koļesņikovs <pinkflames.linux@gmail.com> > > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> > > --- > > drivers/pci/pcie/aspm.c | 3 --- > > 1 file changed, 3 deletions(-) > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > > index e0bc90597dca..da3e7edcf49d 100644 > > --- a/drivers/pci/pcie/aspm.c > > +++ b/drivers/pci/pcie/aspm.c > > @@ -108,9 +108,6 @@ void pci_save_aspm_l1ss_state(struct pci_dev *pdev) > > pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL2, cap++); > > pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL1, cap++); > > > > - if (parent->state_saved) > > - return; > > - > > /* > > * Save parent's L1 substate configuration so we have it for > > * pci_restore_aspm_l1ss_state(pdev) to restore. > > > > base-commit: 72deda0abee6e705ae71a93f69f55e33be5bca5c > > -- > > 2.39.5 > > >
On Wed, Feb 05, 2025 at 10:38:24AM +0200, Ilpo Järvinen wrote: > On Tue, 4 Feb 2025, Bjorn Helgaas wrote: > > > [+cc Rafael] > > > > On Fri, Jan 31, 2025 at 05:29:13PM +0200, Ilpo Järvinen wrote: > > > The commit 1db806ec06b7 ("PCI/ASPM: Save parent L1SS config in > > > pci_save_aspm_l1ss_state()") aimed to perform L1SS config save for both > > > the Upstream Port and its upstream bridge when handling an Upstream > > > Port, which matches what the L1SS restore side does. However, > > > parent->state_saved can be set true at an earlier time when the > > > upstream bridge saved other parts of its state. > > > > So I guess the scenario is that we got here because some driver called > > pci_save_state(pdev): > > > > pci_save_state > > dev->state_saved = true <-- > > pci_save_pcie_state > > pci_save_aspm_l1ss_state > > if (pcie_downstream_port(pdev)) > > return > > # save pdev L1SS state here > > if (parent->state_saved) <-- > > return > > # save parent L1SS state here > > > > and the problem is that we previously called pci_save_state(parent), > > which set "parent->state_saved = true" but did not save its L1SS state > > because pci_save_aspm_l1ss_state() is a no-op for Downstream Ports, > > right? > > Yes! An unfortunate interaction between those two checks. > > > But I would think this would be a very common situation because > > pcie_portdrv_probe() calls pci_save_state() for Downstream Ports when > > pciehp, AER, PME, etc, are enabled, and this would happen before the > > pci_save_state() calls from Endpoint drivers. > > > > So I'm a little surprised that this didn't blow up for everybody > > immediately. Is there something that makes this unusual? > > I agree it should be very common and was quite surprised that -next > did not catch it. What I recall though is you modified the patch while > applying it by adding those Downstream Port checks so the fix > patch's Tested-by was for different code from what got applied (and it > would have been caught would the original author have tested also the > modified commit). Sigh, that makes sense, it's probably my fault, sorry. I apologize for messing this up and causing all this extra work. I applied this to pci/for-linus yesterday, so it should make it for v6.14-rc2. > Unfortunately, I cannot think of anything that would be so unusual to > warrant not detecting it earlier. Maybe it was just the holiday period > causing less testing and lower level of awareness in general? The machine > doesn't always hang because of the problem as was the case with Niklāvs, > so it might have occurred but went unnoticed if it occurred for a device > that is not critical. > > > > Then later when > > > attempting to save the L1SS config while handling the Upstream Port, > > > parent->state_saved is true in pci_save_aspm_l1ss_state() resulting in > > > early return and skipping saving bridge's L1SS config because it is > > > assumed to be already saved. Later on restore, junk is written into > > > L1SS config which causes issues with some devices. > > > > > > Remove parent->state_saved check and unconditionally save L1SS config > > > also for the upstream bridge from an Upstream Port which ought to be > > > harmless from correctness point of view. With the Upstream Port check > > > now present, saving the L1SS config more than once for the bridge is no > > > longer a problem (unlike when the parent->state_saved check got > > > introduced into the fix during its development). > > > > > > Fixes: 1db806ec06b7 ("PCI/ASPM: Save parent L1SS config in pci_save_aspm_l1ss_state()") > > > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219731 > > > Reported-by: Niklāvs Koļesņikovs <pinkflames.linux@gmail.com> > > > Tested-by: Niklāvs Koļesņikovs <pinkflames.linux@gmail.com> > > > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> > > > --- > > > drivers/pci/pcie/aspm.c | 3 --- > > > 1 file changed, 3 deletions(-) > > > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > > > index e0bc90597dca..da3e7edcf49d 100644 > > > --- a/drivers/pci/pcie/aspm.c > > > +++ b/drivers/pci/pcie/aspm.c > > > @@ -108,9 +108,6 @@ void pci_save_aspm_l1ss_state(struct pci_dev *pdev) > > > pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL2, cap++); > > > pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL1, cap++); > > > > > > - if (parent->state_saved) > > > - return; > > > - > > > /* > > > * Save parent's L1 substate configuration so we have it for > > > * pci_restore_aspm_l1ss_state(pdev) to restore. > > > > > > base-commit: 72deda0abee6e705ae71a93f69f55e33be5bca5c > > > -- > > > 2.39.5 > > > > > > > -- > i.
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index e0bc90597dca..da3e7edcf49d 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -108,9 +108,6 @@ void pci_save_aspm_l1ss_state(struct pci_dev *pdev) pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL2, cap++); pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL1, cap++); - if (parent->state_saved) - return; - /* * Save parent's L1 substate configuration so we have it for * pci_restore_aspm_l1ss_state(pdev) to restore.