diff mbox series

[v2,03/13] PCI/ASPM: Disable ASPM when driver requests it

Message ID 20230918131103.24119-4-ilpo.jarvinen@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series PCI/ASPM: Make ASPM in core robust and remove driver workarounds | expand

Commit Message

Ilpo Järvinen Sept. 18, 2023, 1:10 p.m. UTC
PCI core/ASPM service driver allows controlling ASPM state through
pci_disable_link_state() and pci_enable_link_state() API. It was
decided earlier (see the Link below), to not allow ASPM changes when OS
does not have control over it but only log a warning about the problem
(commit 2add0ec14c25 ("PCI/ASPM: Warn when driver asks to disable ASPM,
but we can't do it")). Similarly, if ASPM is not enabled through
config, ASPM cannot be disabled.

A number of drivers have added workarounds to force ASPM off with own
writes into the Link Control Register (some even with comments
explaining why PCI core does not disable it under some circumstances).
According to the comments, some drivers require ASPM to be off for
reliable operation.

Having custom ASPM handling in drivers is problematic because the state
kept in the ASPM service driver is not updated by the changes made
outside the link state management API.

As the first step to address this issue, make pci_disable_link_state()
to unconditionally disable ASPM so the motivation for drivers to come
up with custom ASPM handling code is eliminated.

Place the minimal ASPM disable handling into own file as it is too
complicated to fit into a header as static inline and it has almost no
overlap with the existing, more complicated ASPM code in
drivers/pci/pce/aspm.c.

Make pci_disable_link_state() function comment to comply kerneldoc
formatting while changing the description.

Link: https://lore.kernel.org/all/CANUX_P3F5YhbZX3WGU-j1AGpbXb_T9Bis2ErhvKkFMtDvzatVQ@mail.gmail.com/
Link: https://lore.kernel.org/all/20230511131441.45704-1-ilpo.jarvinen@linux.intel.com/
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/pci/pcie/Makefile       |  1 +
 drivers/pci/pcie/aspm.c         | 33 ++++++++++-------
 drivers/pci/pcie/aspm_minimal.c | 66 +++++++++++++++++++++++++++++++++
 include/linux/pci.h             |  6 +--
 4 files changed, 88 insertions(+), 18 deletions(-)
 create mode 100644 drivers/pci/pcie/aspm_minimal.c

Comments

Bjorn Helgaas Oct. 11, 2023, 8:04 p.m. UTC | #1
On Mon, Sep 18, 2023 at 04:10:53PM +0300, Ilpo Järvinen wrote:
> PCI core/ASPM service driver allows controlling ASPM state through
> pci_disable_link_state() and pci_enable_link_state() API. It was
> decided earlier (see the Link below), to not allow ASPM changes when OS
> does not have control over it but only log a warning about the problem
> (commit 2add0ec14c25 ("PCI/ASPM: Warn when driver asks to disable ASPM,
> but we can't do it")). Similarly, if ASPM is not enabled through
> config, ASPM cannot be disabled.
> 
> A number of drivers have added workarounds to force ASPM off with own
> writes into the Link Control Register (some even with comments
> explaining why PCI core does not disable it under some circumstances).
> According to the comments, some drivers require ASPM to be off for
> reliable operation.
> 
> Having custom ASPM handling in drivers is problematic because the state
> kept in the ASPM service driver is not updated by the changes made
> outside the link state management API.
> 
> As the first step to address this issue, make pci_disable_link_state()
> to unconditionally disable ASPM so the motivation for drivers to come
> up with custom ASPM handling code is eliminated.
> 
> Place the minimal ASPM disable handling into own file as it is too
> complicated to fit into a header as static inline and it has almost no
> overlap with the existing, more complicated ASPM code in
> drivers/pci/pce/aspm.c.
> 
> Make pci_disable_link_state() function comment to comply kerneldoc
> formatting while changing the description.
> 
> Link: https://lore.kernel.org/all/CANUX_P3F5YhbZX3WGU-j1AGpbXb_T9Bis2ErhvKkFMtDvzatVQ@mail.gmail.com/
> Link: https://lore.kernel.org/all/20230511131441.45704-1-ilpo.jarvinen@linux.intel.com/
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> ---
>  drivers/pci/pcie/Makefile       |  1 +
>  drivers/pci/pcie/aspm.c         | 33 ++++++++++-------
>  drivers/pci/pcie/aspm_minimal.c | 66 +++++++++++++++++++++++++++++++++
>  include/linux/pci.h             |  6 +--
>  4 files changed, 88 insertions(+), 18 deletions(-)
>  create mode 100644 drivers/pci/pcie/aspm_minimal.c
> 
> diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
> index 8de4ed5f98f1..ec7f04037b01 100644
> --- a/drivers/pci/pcie/Makefile
> +++ b/drivers/pci/pcie/Makefile
> @@ -6,6 +6,7 @@ pcieportdrv-y			:= portdrv.o rcec.o
>  
>  obj-$(CONFIG_PCIEPORTBUS)	+= pcieportdrv.o
>  
> +obj-y				+= aspm_minimal.o

Can we put this code in drivers/pci/pci.c instead of creating a new
file for it?  pci.c is kind of a dumping ground and isn't ideal
either, but we do have a few other things there that we *always* want
even though they're related to a separate Kconfig feature, e.g.,
pci_bridge_reconfigure_ltr(), pcie_clear_device_status(),
pcie_clear_root_pme_status().

>  obj-$(CONFIG_PCIEASPM)		+= aspm.o

Or maybe it would be better to just put it in aspm.c, drop this
compilation guard, and wrap the rest of the file in #ifdef
CONFIG_PCIEASPM.  Then everything would be in one file, which is a
major boon for code readers.

What do you think?

>  obj-$(CONFIG_PCIEAER)		+= aer.o err.o
>  obj-$(CONFIG_PCIEAER_INJECT)	+= aer_inject.o
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 860bc94974ec..ec6d7a092ac1 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -1042,16 +1042,23 @@ static int __pci_disable_link_state(struct pci_dev *pdev, int state, bool sem)
>  		return -EINVAL;
>  	/*
>  	 * A driver requested that ASPM be disabled on this device, but
> -	 * if we don't have permission to manage ASPM (e.g., on ACPI
> +	 * if we might not have permission to manage ASPM (e.g., on ACPI
>  	 * systems we have to observe the FADT ACPI_FADT_NO_ASPM bit and
> -	 * the _OSC method), we can't honor that request.  Windows has
> -	 * a similar mechanism using "PciASPMOptOut", which is also
> -	 * ignored in this situation.
> +	 * the _OSC method), previously we chose to not honor disable
> +	 * request in that case. Windows has a similar mechanism using
> +	 * "PciASPMOptOut", which is also ignored in this situation.
> +	 *
> +	 * Not honoring the requests to disable ASPM, however, led to
> +	 * drivers forcing ASPM off on their own. As such changes of ASPM
> +	 * state are not tracked by this service driver, the state kept here
> +	 * became out of sync.
> +	 *
> +	 * Therefore, honor ASPM disable requests even when OS does not have
> +	 * ASPM control. Plain disable for ASPM is assumed to be slightly
> +	 * safer than fully managing it.
>  	 */
> -	if (aspm_disabled) {
> -		pci_warn(pdev, "can't disable ASPM; OS doesn't have ASPM control\n");
> -		return -EPERM;
> -	}
> +	if (aspm_disabled)
> +		pci_warn(pdev, "OS doesn't have ASPM control, disabling ASPM anyway\n");

I think this is better than the previous situation, but I think we
should taint the kernel here because it's possible the firmware had a
reason for retaining ASPM control, so we might be stepping on
something.  Arguably the message is already enough of a signal, but
checking for a taint is potentially a little more automatable.

> +int pci_disable_link_state_locked(struct pci_dev *pdev, int state)
> +{
> +	struct pci_dev *parent = pdev->bus->self;
> +	struct pci_bus *linkbus = pdev->bus;
> +	struct pci_dev *child;
> +	u16 aspm_enabled, linkctl;
> +	int ret;
> +
> +	if (!parent)
> +		return -ENODEV;
> +
> +	ret = pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &linkctl);
> +	if (ret != PCIBIOS_SUCCESSFUL)
> +		return pcibios_err_to_errno(ret);
> +	aspm_enabled = linkctl & PCI_EXP_LNKCTL_ASPMC;

In this case, we don't care about the shift offset of the
PCI_EXP_LNKCTL_ASPMC bitfield, but if we use FIELD_GET() in most/all
other cases where we look at PCI_EXP_LNKCTL, maybe it would be worth
using it here as well?

Tangent, but I'm always dubious about the idea that e1000e is so
special that only there do we need the "_locked" variant of this
function.  No suggestion though; no need to do anything about it in
this series ;)

> +	ret = pcie_capability_read_word(pdev, PCI_EXP_LNKCTL, &linkctl);
> +	if (ret != PCIBIOS_SUCCESSFUL)
> +		return pcibios_err_to_errno(ret);
> +	aspm_enabled |= linkctl & PCI_EXP_LNKCTL_ASPMC;
> +
> +	/* If no states need to be disabled, don't touch LNKCTL */
> +	if (state & aspm_enabled)
> +		return 0;
> +
> +	ret = pcie_capability_clear_word(parent, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_ASPMC);
> +	if (ret != PCIBIOS_SUCCESSFUL)
> +		return pcibios_err_to_errno(ret);
> +	list_for_each_entry(child, &linkbus->devices, bus_list)
> +		pcie_capability_clear_word(child, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_ASPMC);
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL(pci_disable_link_state_locked);
Bjorn Helgaas Oct. 11, 2023, 9:22 p.m. UTC | #2
On Mon, Sep 18, 2023 at 04:10:53PM +0300, Ilpo Järvinen wrote:
> PCI core/ASPM service driver allows controlling ASPM state through
> pci_disable_link_state() and pci_enable_link_state() API. It was
> decided earlier (see the Link below), to not allow ASPM changes when OS
> does not have control over it but only log a warning about the problem
> (commit 2add0ec14c25 ("PCI/ASPM: Warn when driver asks to disable ASPM,
> but we can't do it")). Similarly, if ASPM is not enabled through
> config, ASPM cannot be disabled.
> ...

> +#ifndef CONFIG_PCIEASPM
> +/*
> + * Always disable ASPM when requested, even when CONFIG_PCIEASPM is
> + * not build to avoid drivers adding code to do it on their own
> + * which caused issues when core does not know about the out-of-band
> + * ASPM state changes.
> + */
> +int pci_disable_link_state_locked(struct pci_dev *pdev, int state)
> +{
> +	struct pci_dev *parent = pdev->bus->self;
> +	struct pci_bus *linkbus = pdev->bus;
> +	struct pci_dev *child;
> +	u16 aspm_enabled, linkctl;
> +	int ret;
> +
> +	if (!parent)
> +		return -ENODEV;

P.S. I think this should look the same to the user (same dmesg log and
same taint, if we do that) as the CONFIG_PCIEASPM=y case.

> +	ret = pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &linkctl);
> +	if (ret != PCIBIOS_SUCCESSFUL)
> +		return pcibios_err_to_errno(ret);
> +	aspm_enabled = linkctl & PCI_EXP_LNKCTL_ASPMC;
> +
> +	ret = pcie_capability_read_word(pdev, PCI_EXP_LNKCTL, &linkctl);
> +	if (ret != PCIBIOS_SUCCESSFUL)
> +		return pcibios_err_to_errno(ret);
> +	aspm_enabled |= linkctl & PCI_EXP_LNKCTL_ASPMC;
> +
> +	/* If no states need to be disabled, don't touch LNKCTL */
> +	if (state & aspm_enabled)
> +		return 0;
> +
> +	ret = pcie_capability_clear_word(parent, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_ASPMC);
> +	if (ret != PCIBIOS_SUCCESSFUL)
> +		return pcibios_err_to_errno(ret);
> +	list_for_each_entry(child, &linkbus->devices, bus_list)
> +		pcie_capability_clear_word(child, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_ASPMC);

This disables *all* ASPM states, unlike the version when
CONFIG_PCIEASPM is enabled.  I suppose there's a reason, and maybe a
comment could elaborate on it?

When CONFIG_PCIEASPM is not enabled, I don't think we actively
*disable* ASPM in the hardware; we just leave it as-is, so firmware
might have left it enabled.

> +
> +	return 0;
> +}

Conceptually it seems like the LNKCTL updates here should be the same
whether CONFIG_PCIEASPM is enabled or not (subject to the question
above).

When CONFIG_PCIEASPM is enabled, we might need to do more stuff, but
it seems like the core should be the same.

Bjorn
Ilpo Järvinen Oct. 12, 2023, 10:47 a.m. UTC | #3
On Wed, 11 Oct 2023, Bjorn Helgaas wrote:

> On Mon, Sep 18, 2023 at 04:10:53PM +0300, Ilpo Järvinen wrote:
> > PCI core/ASPM service driver allows controlling ASPM state through
> > pci_disable_link_state() and pci_enable_link_state() API. It was
> > decided earlier (see the Link below), to not allow ASPM changes when OS
> > does not have control over it but only log a warning about the problem
> > (commit 2add0ec14c25 ("PCI/ASPM: Warn when driver asks to disable ASPM,
> > but we can't do it")). Similarly, if ASPM is not enabled through
> > config, ASPM cannot be disabled.
> > 
> > A number of drivers have added workarounds to force ASPM off with own
> > writes into the Link Control Register (some even with comments
> > explaining why PCI core does not disable it under some circumstances).
> > According to the comments, some drivers require ASPM to be off for
> > reliable operation.
> > 
> > Having custom ASPM handling in drivers is problematic because the state
> > kept in the ASPM service driver is not updated by the changes made
> > outside the link state management API.
> > 
> > As the first step to address this issue, make pci_disable_link_state()
> > to unconditionally disable ASPM so the motivation for drivers to come
> > up with custom ASPM handling code is eliminated.
> > 
> > Place the minimal ASPM disable handling into own file as it is too
> > complicated to fit into a header as static inline and it has almost no
> > overlap with the existing, more complicated ASPM code in
> > drivers/pci/pce/aspm.c.
> > 
> > Make pci_disable_link_state() function comment to comply kerneldoc
> > formatting while changing the description.
> > 
> > Link: https://lore.kernel.org/all/CANUX_P3F5YhbZX3WGU-j1AGpbXb_T9Bis2ErhvKkFMtDvzatVQ@mail.gmail.com/
> > Link: https://lore.kernel.org/all/20230511131441.45704-1-ilpo.jarvinen@linux.intel.com/
> > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> > ---
> >  drivers/pci/pcie/Makefile       |  1 +
> >  drivers/pci/pcie/aspm.c         | 33 ++++++++++-------
> >  drivers/pci/pcie/aspm_minimal.c | 66 +++++++++++++++++++++++++++++++++
> >  include/linux/pci.h             |  6 +--
> >  4 files changed, 88 insertions(+), 18 deletions(-)
> >  create mode 100644 drivers/pci/pcie/aspm_minimal.c
> > 
> > diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
> > index 8de4ed5f98f1..ec7f04037b01 100644
> > --- a/drivers/pci/pcie/Makefile
> > +++ b/drivers/pci/pcie/Makefile
> > @@ -6,6 +6,7 @@ pcieportdrv-y			:= portdrv.o rcec.o
> >  
> >  obj-$(CONFIG_PCIEPORTBUS)	+= pcieportdrv.o
> >  
> > +obj-y				+= aspm_minimal.o
> 
> Can we put this code in drivers/pci/pci.c instead of creating a new
> file for it?  pci.c is kind of a dumping ground and isn't ideal
> either, but we do have a few other things there that we *always* want
> even though they're related to a separate Kconfig feature, e.g.,
> pci_bridge_reconfigure_ltr(), pcie_clear_device_status(),
> pcie_clear_root_pme_status().
> 
> >  obj-$(CONFIG_PCIEASPM)		+= aspm.o
> 
> Or maybe it would be better to just put it in aspm.c, drop this
> compilation guard, and wrap the rest of the file in #ifdef
> CONFIG_PCIEASPM.  Then everything would be in one file, which is a
> major boon for code readers.
> 
> What do you think?

I was not sure which was the best place for such "reverse config trickery"  
so I just picked one of the possible ones (it's easy to tweak it anyway).

I think I'll now go with aspm.c but then I'll have to change aspm.o to 
obj-y which is really CONFIG_PCI because of the dir.

> >  obj-$(CONFIG_PCIEAER)		+= aer.o err.o
> >  obj-$(CONFIG_PCIEAER_INJECT)	+= aer_inject.o
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index 860bc94974ec..ec6d7a092ac1 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -1042,16 +1042,23 @@ static int __pci_disable_link_state(struct pci_dev *pdev, int state, bool sem)
> >  		return -EINVAL;
> >  	/*
> >  	 * A driver requested that ASPM be disabled on this device, but
> > -	 * if we don't have permission to manage ASPM (e.g., on ACPI
> > +	 * if we might not have permission to manage ASPM (e.g., on ACPI
> >  	 * systems we have to observe the FADT ACPI_FADT_NO_ASPM bit and
> > -	 * the _OSC method), we can't honor that request.  Windows has
> > -	 * a similar mechanism using "PciASPMOptOut", which is also
> > -	 * ignored in this situation.
> > +	 * the _OSC method), previously we chose to not honor disable
> > +	 * request in that case. Windows has a similar mechanism using
> > +	 * "PciASPMOptOut", which is also ignored in this situation.
> > +	 *
> > +	 * Not honoring the requests to disable ASPM, however, led to
> > +	 * drivers forcing ASPM off on their own. As such changes of ASPM
> > +	 * state are not tracked by this service driver, the state kept here
> > +	 * became out of sync.
> > +	 *
> > +	 * Therefore, honor ASPM disable requests even when OS does not have
> > +	 * ASPM control. Plain disable for ASPM is assumed to be slightly
> > +	 * safer than fully managing it.
> >  	 */
> > -	if (aspm_disabled) {
> > -		pci_warn(pdev, "can't disable ASPM; OS doesn't have ASPM control\n");
> > -		return -EPERM;
> > -	}
> > +	if (aspm_disabled)
> > +		pci_warn(pdev, "OS doesn't have ASPM control, disabling ASPM anyway\n");
> 
> I think this is better than the previous situation, but I think we
> should taint the kernel here because it's possible the firmware had a
> reason for retaining ASPM control, so we might be stepping on
> something.  Arguably the message is already enough of a signal, but
> checking for a taint is potentially a little more automatable.

That's probably a good idea, yes.

> > +int pci_disable_link_state_locked(struct pci_dev *pdev, int state)
> > +{
> > +	struct pci_dev *parent = pdev->bus->self;
> > +	struct pci_bus *linkbus = pdev->bus;
> > +	struct pci_dev *child;
> > +	u16 aspm_enabled, linkctl;
> > +	int ret;
> > +
> > +	if (!parent)
> > +		return -ENODEV;
> > +
> > +	ret = pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &linkctl);
> > +	if (ret != PCIBIOS_SUCCESSFUL)
> > +		return pcibios_err_to_errno(ret);
> > +	aspm_enabled = linkctl & PCI_EXP_LNKCTL_ASPMC;
> 
> In this case, we don't care about the shift offset of the
> PCI_EXP_LNKCTL_ASPMC bitfield, but if we use FIELD_GET() in most/all
> other cases where we look at PCI_EXP_LNKCTL, maybe it would be worth
> using it here as well?

I can take a look at that.

> Tangent, but I'm always dubious about the idea that e1000e is so
> special that only there do we need the "_locked" variant of this
> function.  No suggestion though; no need to do anything about it in
> this series ;)

There was some case where it was needed based on the history search
but perhaps e1000e could do something to avoid calling it while still 
under the lock, it doesn't seem something that would immediately blow up
if that state adjustment is delayed slightly.
Ilpo Järvinen Oct. 12, 2023, 10:56 a.m. UTC | #4
On Wed, 11 Oct 2023, Bjorn Helgaas wrote:

> On Mon, Sep 18, 2023 at 04:10:53PM +0300, Ilpo Järvinen wrote:
> > PCI core/ASPM service driver allows controlling ASPM state through
> > pci_disable_link_state() and pci_enable_link_state() API. It was
> > decided earlier (see the Link below), to not allow ASPM changes when OS
> > does not have control over it but only log a warning about the problem
> > (commit 2add0ec14c25 ("PCI/ASPM: Warn when driver asks to disable ASPM,
> > but we can't do it")). Similarly, if ASPM is not enabled through
> > config, ASPM cannot be disabled.
> > ...
> 
> > +#ifndef CONFIG_PCIEASPM
> > +/*
> > + * Always disable ASPM when requested, even when CONFIG_PCIEASPM is
> > + * not build to avoid drivers adding code to do it on their own
> > + * which caused issues when core does not know about the out-of-band
> > + * ASPM state changes.
> > + */
> > +int pci_disable_link_state_locked(struct pci_dev *pdev, int state)
> > +{
> > +	struct pci_dev *parent = pdev->bus->self;
> > +	struct pci_bus *linkbus = pdev->bus;
> > +	struct pci_dev *child;
> > +	u16 aspm_enabled, linkctl;
> > +	int ret;
> > +
> > +	if (!parent)
> > +		return -ENODEV;
> 
> P.S. I think this should look the same to the user (same dmesg log and
> same taint, if we do that) as the CONFIG_PCIEASPM=y case.

Okay.

> > +	ret = pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &linkctl);
> > +	if (ret != PCIBIOS_SUCCESSFUL)
> > +		return pcibios_err_to_errno(ret);
> > +	aspm_enabled = linkctl & PCI_EXP_LNKCTL_ASPMC;
> > +
> > +	ret = pcie_capability_read_word(pdev, PCI_EXP_LNKCTL, &linkctl);
> > +	if (ret != PCIBIOS_SUCCESSFUL)
> > +		return pcibios_err_to_errno(ret);
> > +	aspm_enabled |= linkctl & PCI_EXP_LNKCTL_ASPMC;
> > +
> > +	/* If no states need to be disabled, don't touch LNKCTL */
> > +	if (state & aspm_enabled)
> > +		return 0;
> > +
> > +	ret = pcie_capability_clear_word(parent, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_ASPMC);
> > +	if (ret != PCIBIOS_SUCCESSFUL)
> > +		return pcibios_err_to_errno(ret);
> > +	list_for_each_entry(child, &linkbus->devices, bus_list)
> > +		pcie_capability_clear_word(child, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_ASPMC);
> 
> This disables *all* ASPM states, unlike the version when
> CONFIG_PCIEASPM is enabled.  I suppose there's a reason, and maybe a
> comment could elaborate on it?
>
> When CONFIG_PCIEASPM is not enabled, I don't think we actively
> *disable* ASPM in the hardware; we just leave it as-is, so firmware
> might have left it enabled.

This whole trickery is intended for drivers that do not want to have ASPM 
because the devices are broken with it. So leaving it as-is is not really 
an option (as demonstrated by the custom workarounds).

> > +
> > +	return 0;
> > +}
> 
> Conceptually it seems like the LNKCTL updates here should be the same
> whether CONFIG_PCIEASPM is enabled or not (subject to the question
> above).
> 
> When CONFIG_PCIEASPM is enabled, we might need to do more stuff, but
> it seems like the core should be the same.

So you think it's safer to partially disable ASPM (as per driver's 
request) rather than disable it completely? I got the impression that the 
latter might be safer from what Rafael said earlier but I suppose I might 
have misinterpreted him since he didn't exactly say that it might be safer 
to _completely_ disable it.
Bjorn Helgaas Oct. 13, 2023, 4:42 p.m. UTC | #5
On Thu, Oct 12, 2023 at 01:56:16PM +0300, Ilpo Järvinen wrote:
> On Wed, 11 Oct 2023, Bjorn Helgaas wrote:
> > On Mon, Sep 18, 2023 at 04:10:53PM +0300, Ilpo Järvinen wrote:
> > > PCI core/ASPM service driver allows controlling ASPM state through
> > > pci_disable_link_state() and pci_enable_link_state() API. It was
> > > decided earlier (see the Link below), to not allow ASPM changes when OS
> > > does not have control over it but only log a warning about the problem
> > > (commit 2add0ec14c25 ("PCI/ASPM: Warn when driver asks to disable ASPM,
> > > but we can't do it")). Similarly, if ASPM is not enabled through
> > > config, ASPM cannot be disabled.
> ...

> > This disables *all* ASPM states, unlike the version when
> > CONFIG_PCIEASPM is enabled.  I suppose there's a reason, and maybe a
> > comment could elaborate on it?
> >
> > When CONFIG_PCIEASPM is not enabled, I don't think we actively
> > *disable* ASPM in the hardware; we just leave it as-is, so firmware
> > might have left it enabled.
> 
> This whole trickery is intended for drivers that do not want to have ASPM 
> because the devices are broken with it. So leaving it as-is is not really 
> an option (as demonstrated by the custom workarounds).

Right.

> > Conceptually it seems like the LNKCTL updates here should be the same
> > whether CONFIG_PCIEASPM is enabled or not (subject to the question
> > above).
> > 
> > When CONFIG_PCIEASPM is enabled, we might need to do more stuff, but
> > it seems like the core should be the same.
> 
> So you think it's safer to partially disable ASPM (as per driver's 
> request) rather than disable it completely? I got the impression that the 
> latter might be safer from what Rafael said earlier but I suppose I might 
> have misinterpreted him since he didn't exactly say that it might be safer 
> to _completely_ disable it.

My question is whether the state of the device should depend on
CONFIG_PCIEASPM.  If the driver does this:

  pci_disable_link_state(PCIE_LINK_STATE_L0S)

do we want to leave L1 enabled when CONFIG_PCIEASPM=y but disable L1
when CONFIG_PCIEASPM is unset?

I can see arguments both ways.  My thought was that it would be nice
to end up with a single implementation of pci_disable_link_state()
with an #ifdef around the CONFIG_PCIEASPM-enabled stuff because it
makes the code easier to read.

Bjorn
Ilpo Järvinen Oct. 16, 2023, 2:27 p.m. UTC | #6
On Fri, 13 Oct 2023, Bjorn Helgaas wrote:
> On Thu, Oct 12, 2023 at 01:56:16PM +0300, Ilpo Järvinen wrote:
> > On Wed, 11 Oct 2023, Bjorn Helgaas wrote:
> > > On Mon, Sep 18, 2023 at 04:10:53PM +0300, Ilpo Järvinen wrote:
> > > > PCI core/ASPM service driver allows controlling ASPM state through
> > > > pci_disable_link_state() and pci_enable_link_state() API. It was
> > > > decided earlier (see the Link below), to not allow ASPM changes when OS
> > > > does not have control over it but only log a warning about the problem
> > > > (commit 2add0ec14c25 ("PCI/ASPM: Warn when driver asks to disable ASPM,
> > > > but we can't do it")). Similarly, if ASPM is not enabled through
> > > > config, ASPM cannot be disabled.
> > ...
> 
> > > This disables *all* ASPM states, unlike the version when
> > > CONFIG_PCIEASPM is enabled.  I suppose there's a reason, and maybe a
> > > comment could elaborate on it?
> > >
> > > When CONFIG_PCIEASPM is not enabled, I don't think we actively
> > > *disable* ASPM in the hardware; we just leave it as-is, so firmware
> > > might have left it enabled.
> > 
> > This whole trickery is intended for drivers that do not want to have ASPM 
> > because the devices are broken with it. So leaving it as-is is not really 
> > an option (as demonstrated by the custom workarounds).
> 
> Right.
> 
> > > Conceptually it seems like the LNKCTL updates here should be the same
> > > whether CONFIG_PCIEASPM is enabled or not (subject to the question
> > > above).
> > > 
> > > When CONFIG_PCIEASPM is enabled, we might need to do more stuff, but
> > > it seems like the core should be the same.
> > 
> > So you think it's safer to partially disable ASPM (as per driver's 
> > request) rather than disable it completely? I got the impression that the 
> > latter might be safer from what Rafael said earlier but I suppose I might 
> > have misinterpreted him since he didn't exactly say that it might be safer 
> > to _completely_ disable it.
> 
> My question is whether the state of the device should depend on
> CONFIG_PCIEASPM.  If the driver does this:
> 
>   pci_disable_link_state(PCIE_LINK_STATE_L0S)
> 
> do we want to leave L1 enabled when CONFIG_PCIEASPM=y but disable L1
> when CONFIG_PCIEASPM is unset?
> 
> I can see arguments both ways.  My thought was that it would be nice
> to end up with a single implementation of pci_disable_link_state()
> with an #ifdef around the CONFIG_PCIEASPM-enabled stuff because it
> makes the code easier to read.

Hi Bjorn,

Thanks a lot for all your feedback so far, it has been very helpful.

I think there's still one important thing to discuss and none of the 
comments have covered that area so far.

The drivers that have workaround are not going to turn more dangerous than 
they're already without this change, so we're mostly within charted waters 
there even with what you propose. However, I think the bigger catch and 
potential source of problems, with both this v2 and your alternative, are 
the drivers that do not have the workarounds around CONFIG_PCIEASPM=n 
and/or _OSC permissions. Those code paths just call 
pci_disable_link_state() and do nothing else.

Do you think it's okay to alter the behavior for those drivers too 
(disable ASPM where it previously was a no-op)?

I'm okay with going the direction you indicated but I just wanted to ask
this in advance before reworking the behavior so I can take that detail 
also into account.
Bjorn Helgaas Oct. 26, 2023, 10:02 p.m. UTC | #7
On Mon, Oct 16, 2023 at 05:27:37PM +0300, Ilpo Järvinen wrote:
> On Fri, 13 Oct 2023, Bjorn Helgaas wrote:
> > On Thu, Oct 12, 2023 at 01:56:16PM +0300, Ilpo Järvinen wrote:
> > > On Wed, 11 Oct 2023, Bjorn Helgaas wrote:
> > > > On Mon, Sep 18, 2023 at 04:10:53PM +0300, Ilpo Järvinen wrote:
> > > > > PCI core/ASPM service driver allows controlling ASPM state
> > > > > through pci_disable_link_state() and pci_enable_link_state()
> > > > > API. It was decided earlier (see the Link below), to not
> > > > > allow ASPM changes when OS does not have control over it but
> > > > > only log a warning about the problem (commit 2add0ec14c25
> > > > > ("PCI/ASPM: Warn when driver asks to disable ASPM, but we
> > > > > can't do it")). Similarly, if ASPM is not enabled through
> > > > > config, ASPM cannot be disabled.
> > > ...
> > 
> > > > This disables *all* ASPM states, unlike the version when
> > > > CONFIG_PCIEASPM is enabled.  I suppose there's a reason, and
> > > > maybe a comment could elaborate on it?
> > > >
> > > > When CONFIG_PCIEASPM is not enabled, I don't think we actively
> > > > *disable* ASPM in the hardware; we just leave it as-is, so
> > > > firmware might have left it enabled.
> > > 
> > > This whole trickery is intended for drivers that do not want to
> > > have ASPM because the devices are broken with it. So leaving it
> > > as-is is not really an option (as demonstrated by the custom
> > > workarounds).
> > 
> > Right.
> > 
> > > > Conceptually it seems like the LNKCTL updates here should be
> > > > the same whether CONFIG_PCIEASPM is enabled or not (subject to
> > > > the question above).
> > > > 
> > > > When CONFIG_PCIEASPM is enabled, we might need to do more
> > > > stuff, but it seems like the core should be the same.
> > > 
> > > So you think it's safer to partially disable ASPM (as per
> > > driver's request) rather than disable it completely? I got the
> > > impression that the latter might be safer from what Rafael said
> > > earlier but I suppose I might have misinterpreted him since he
> > > didn't exactly say that it might be safer to _completely_
> > > disable it.
> > 
> > My question is whether the state of the device should depend on
> > CONFIG_PCIEASPM.  If the driver does this:
> > 
> >   pci_disable_link_state(PCIE_LINK_STATE_L0S)
> > 
> > do we want to leave L1 enabled when CONFIG_PCIEASPM=y but disable L1
> > when CONFIG_PCIEASPM is unset?
> > 
> > I can see arguments both ways.  My thought was that it would be nice
> > to end up with a single implementation of pci_disable_link_state()
> > with an #ifdef around the CONFIG_PCIEASPM-enabled stuff because it
> > makes the code easier to read.

Responding to myself here, I think we should do the partial disables
because it matches what the drivers did previously by hand, we can
reduce the number of code paths, and the resulting device state will
be the same regardless of CONFIG_PCIEASPM.

> I think there's still one important thing to discuss and none of the
> comments have covered that area so far.
> 
> The drivers that have workaround are not going to turn more
> dangerous than they're already without this change, so we're mostly
> within charted waters there even with what you propose. However, I
> think the bigger catch and potential source of problems, with both
> this v2 and your alternative, are the drivers that do not have the
> workarounds around CONFIG_PCIEASPM=n and/or _OSC permissions. Those
> code paths just call pci_disable_link_state() and do nothing else.
> 
> Do you think it's okay to alter the behavior for those drivers too
> (disable ASPM where it previously was a no-op)?

Yes.  I assume the reason those drivers call pci_disable_link_state()
is because some hardware defect means ASPM doesn't work correctly.

This change means pci_disable_link_state() will disable ASPM even when
the OS doesn't own ASPM or CONFIG_PCIEASPM is unset.  I think those
cases are unusual and probably not well tested, and I suspect that if
we *did* test them, we'd find that ASPM doesn't work with the current
kernel.

So I think this is more likely to *fix* something than to break it.

Bjorn
diff mbox series

Patch

diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
index 8de4ed5f98f1..ec7f04037b01 100644
--- a/drivers/pci/pcie/Makefile
+++ b/drivers/pci/pcie/Makefile
@@ -6,6 +6,7 @@  pcieportdrv-y			:= portdrv.o rcec.o
 
 obj-$(CONFIG_PCIEPORTBUS)	+= pcieportdrv.o
 
+obj-y				+= aspm_minimal.o
 obj-$(CONFIG_PCIEASPM)		+= aspm.o
 obj-$(CONFIG_PCIEAER)		+= aer.o err.o
 obj-$(CONFIG_PCIEAER_INJECT)	+= aer_inject.o
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 860bc94974ec..ec6d7a092ac1 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -1042,16 +1042,23 @@  static int __pci_disable_link_state(struct pci_dev *pdev, int state, bool sem)
 		return -EINVAL;
 	/*
 	 * A driver requested that ASPM be disabled on this device, but
-	 * if we don't have permission to manage ASPM (e.g., on ACPI
+	 * if we might not have permission to manage ASPM (e.g., on ACPI
 	 * systems we have to observe the FADT ACPI_FADT_NO_ASPM bit and
-	 * the _OSC method), we can't honor that request.  Windows has
-	 * a similar mechanism using "PciASPMOptOut", which is also
-	 * ignored in this situation.
+	 * the _OSC method), previously we chose to not honor disable
+	 * request in that case. Windows has a similar mechanism using
+	 * "PciASPMOptOut", which is also ignored in this situation.
+	 *
+	 * Not honoring the requests to disable ASPM, however, led to
+	 * drivers forcing ASPM off on their own. As such changes of ASPM
+	 * state are not tracked by this service driver, the state kept here
+	 * became out of sync.
+	 *
+	 * Therefore, honor ASPM disable requests even when OS does not have
+	 * ASPM control. Plain disable for ASPM is assumed to be slightly
+	 * safer than fully managing it.
 	 */
-	if (aspm_disabled) {
-		pci_warn(pdev, "can't disable ASPM; OS doesn't have ASPM control\n");
-		return -EPERM;
-	}
+	if (aspm_disabled)
+		pci_warn(pdev, "OS doesn't have ASPM control, disabling ASPM anyway\n");
 
 	if (sem)
 		down_read(&pci_bus_sem);
@@ -1087,13 +1094,13 @@  int pci_disable_link_state_locked(struct pci_dev *pdev, int state)
 EXPORT_SYMBOL(pci_disable_link_state_locked);
 
 /**
- * pci_disable_link_state - Disable device's link state, so the link will
- * never enter specific states.  Note that if the BIOS didn't grant ASPM
- * control to the OS, this does nothing because we can't touch the LNKCTL
- * register. Returns 0 or a negative errno.
- *
+ * pci_disable_link_state - Disable device's link state
  * @pdev: PCI device
  * @state: ASPM link state to disable
+ *
+ * Disable device's link state so the link will never enter specific states.
+ *
+ * Return: 0 or a negative errno
  */
 int pci_disable_link_state(struct pci_dev *pdev, int state)
 {
diff --git a/drivers/pci/pcie/aspm_minimal.c b/drivers/pci/pcie/aspm_minimal.c
new file mode 100644
index 000000000000..4e4f63e51b21
--- /dev/null
+++ b/drivers/pci/pcie/aspm_minimal.c
@@ -0,0 +1,66 @@ 
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Minimal PCIe ASPM handling when CONFIG_PCIEASPM is not set.
+ *
+ * Copyright (C) 2023 Intel Corporation.
+ */
+
+#include <linux/pci.h>
+
+#include "../pci.h"
+
+#ifndef CONFIG_PCIEASPM
+/*
+ * Always disable ASPM when requested, even when CONFIG_PCIEASPM is
+ * not build to avoid drivers adding code to do it on their own
+ * which caused issues when core does not know about the out-of-band
+ * ASPM state changes.
+ */
+int pci_disable_link_state_locked(struct pci_dev *pdev, int state)
+{
+	struct pci_dev *parent = pdev->bus->self;
+	struct pci_bus *linkbus = pdev->bus;
+	struct pci_dev *child;
+	u16 aspm_enabled, linkctl;
+	int ret;
+
+	if (!parent)
+		return -ENODEV;
+
+	ret = pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &linkctl);
+	if (ret != PCIBIOS_SUCCESSFUL)
+		return pcibios_err_to_errno(ret);
+	aspm_enabled = linkctl & PCI_EXP_LNKCTL_ASPMC;
+
+	ret = pcie_capability_read_word(pdev, PCI_EXP_LNKCTL, &linkctl);
+	if (ret != PCIBIOS_SUCCESSFUL)
+		return pcibios_err_to_errno(ret);
+	aspm_enabled |= linkctl & PCI_EXP_LNKCTL_ASPMC;
+
+	/* If no states need to be disabled, don't touch LNKCTL */
+	if (state & aspm_enabled)
+		return 0;
+
+	ret = pcie_capability_clear_word(parent, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_ASPMC);
+	if (ret != PCIBIOS_SUCCESSFUL)
+		return pcibios_err_to_errno(ret);
+	list_for_each_entry(child, &linkbus->devices, bus_list)
+		pcie_capability_clear_word(child, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_ASPMC);
+
+	return 0;
+}
+EXPORT_SYMBOL(pci_disable_link_state_locked);
+
+int pci_disable_link_state(struct pci_dev *pdev, int state)
+{
+	int ret;
+
+	down_read(&pci_bus_sem);
+	ret = pci_disable_link_state_locked(pdev, state);
+	up_read(&pci_bus_sem);
+
+	return ret;
+}
+EXPORT_SYMBOL(pci_disable_link_state);
+
+#endif
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 7df56988ff48..3c24ca164104 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1773,18 +1773,14 @@  extern bool pcie_ports_native;
 					 PCIE_LINK_STATE_L1_2 | PCIE_LINK_STATE_L1_1_PCIPM |\
 					 PCIE_LINK_STATE_L1_2_PCIPM)
 
-#ifdef CONFIG_PCIEASPM
 int pci_disable_link_state(struct pci_dev *pdev, int state);
 int pci_disable_link_state_locked(struct pci_dev *pdev, int state);
+#ifdef CONFIG_PCIEASPM
 int pci_set_default_link_state(struct pci_dev *pdev, int state);
 void pcie_no_aspm(void);
 bool pcie_aspm_support_enabled(void);
 bool pcie_aspm_enabled(struct pci_dev *pdev);
 #else
-static inline int pci_disable_link_state(struct pci_dev *pdev, int state)
-{ return 0; }
-static inline int pci_disable_link_state_locked(struct pci_dev *pdev, int state)
-{ return 0; }
 static inline int pci_set_default_link_state(struct pci_dev *pdev, int state)
 { return 0; }
 static inline void pcie_no_aspm(void) { }