PCI: PM: Do not read power state in pci_enable_device_flags()

Message ID: 3219454.74lMxhSOWB@kreacher
State: Not Applicable
Delegated to: Bjorn Helgaas
Series: PCI: PM: Do not read power state in pci_enable_device_flags()

Commit Message

Rafael J. Wysocki March 16, 2021, 3:51 p.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

It should not be necessary to update the current_state field of
struct pci_dev in pci_enable_device_flags() before calling
do_pci_enable_device() for the device, because none of the
code between that point and the pci_set_power_state() call in
do_pci_enable_device() invoked later depends on it.

Moreover, doing that is actively harmful in some cases.  For example,
if the given PCI device depends on an ACPI power resource whose _STA
method initially returns 0 ("off"), but the config space of the PCI
device is accessible and the power state retrieved from the
PCI_PM_CTRL register is D0, the current_state field in the struct
pci_dev representing that device will get out of sync with the
power.state of its ACPI companion object and that will lead to
power management issues going forward.

To avoid such issues it is better to leave the current_state value
as is until it is changed to PCI_D0 by do_pci_enable_device() as
appropriate.  However, the power state of the device is not changed
to PCI_D0 if it is already enabled when pci_enable_device_flags()
gets called for it, so update its current_state in that case, but
use pci_update_current_state() covering platform PM too for that.

Link: https://lore.kernel.org/lkml/20210314000439.3138941-1-luzmaximilian@gmail.com/
Reported-by: Maximilian Luz <luzmaximilian@gmail.com>
Tested-by: Maximilian Luz <luzmaximilian@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

Max, I've added a T-by from you even though the patch is slightly different
from what you have tested, but the difference shouldn't matter for your case.

---
 drivers/pci/pci.c |   16 +++-------------
 1 file changed, 3 insertions(+), 13 deletions(-)
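
The mismatch described in the commit message can be sketched in userspace
C. This is only an illustration under stated assumptions, not kernel code:
struct model_dev and both helpers are hypothetical stand-ins, and the ACPI
companion's power.state is modelled with the same D-state numbering as the
PCI side for simplicity. Only the mask and the D-state values mirror the
kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors PCI_PM_CTRL_STATE_MASK in include/uapi/linux/pci_regs.h. */
#define PCI_PM_CTRL_STATE_MASK 0x0003

/* Mirrors the pci_power_t values in include/linux/pci.h. */
enum {
	PCI_D0 = 0, PCI_D1 = 1, PCI_D2 = 2,
	PCI_D3hot = 3, PCI_D3cold = 4, PCI_UNKNOWN = 5
};

/* Hypothetical model of a device; NOT struct pci_dev. */
struct model_dev {
	uint16_t pmcsr;     /* value a PCI_PM_CTRL config read would return */
	int acpi_state;     /* assumed stand-in for the companion's power.state */
	int current_state;  /* the PCI core's cached view of the power state */
};

/* What the removed hunk did: overwrite the cached state from PMCSR. */
static inline void model_read_pmcsr_state(struct model_dev *d)
{
	d->current_state = d->pmcsr & PCI_PM_CTRL_STATE_MASK;
}

/* True when the PCI side believes D0 while the platform side does not. */
static inline int model_claims_d0_while_platform_off(const struct model_dev *d)
{
	return d->current_state == PCI_D0 && d->acpi_state != PCI_D0;
}
```

With pmcsr = 0x0008 (state bits 00, i.e. D0) and acpi_state = PCI_D3cold,
the check is false while current_state is still PCI_UNKNOWN and becomes
true once model_read_pmcsr_state() has run, which is the disagreement the
patch tries to avoid.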

Comments

Maximilian Luz March 16, 2021, 10:28 p.m. UTC | #1
On 3/16/21 4:51 PM, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> It should not be necessary to update the current_state field of
> struct pci_dev in pci_enable_device_flags() before calling
> do_pci_enable_device() for the device, because none of the
> code between that point and the pci_set_power_state() call in
> do_pci_enable_device() invoked later depends on it.
> 
> Moreover, doing that is actively harmful in some cases.  For example,
> if the given PCI device depends on an ACPI power resource whose _STA
> method initially returns 0 ("off"), but the config space of the PCI
> device is accessible and the power state retrieved from the
> PCI_PM_CTRL register is D0, the current_state field in the struct
> pci_dev representing that device will get out of sync with the
> power.state of its ACPI companion object and that will lead to
> power management issues going forward.
> 
> To avoid such issues it is better to leave the current_state value
> as is until it is changed to PCI_D0 by do_pci_enable_device() as
> appropriate.  However, the power state of the device is not changed
> to PCI_D0 if it is already enabled when pci_enable_device_flags()
> gets called for it, so update its current_state in that case, but
> use pci_update_current_state() covering platform PM too for that.
> 
> Link: https://lore.kernel.org/lkml/20210314000439.3138941-1-luzmaximilian@gmail.com/
> Reported-by: Maximilian Luz <luzmaximilian@gmail.com>
> Tested-by: Maximilian Luz <luzmaximilian@gmail.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> Max, I've added a T-by from you even though the patch is slightly different
> from what you have tested, but the difference shouldn't matter for your case.

Thanks! I've tested this now as well, all looks good.

Regards,
Max

> 
> ---
>   drivers/pci/pci.c |   16 +++-------------
>   1 file changed, 3 insertions(+), 13 deletions(-)
> 
> Index: linux-pm/drivers/pci/pci.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci.c
> +++ linux-pm/drivers/pci/pci.c
> @@ -1870,20 +1870,10 @@ static int pci_enable_device_flags(struc
>   	int err;
>   	int i, bars = 0;
>   
> -	/*
> -	 * Power state could be unknown at this point, either due to a fresh
> -	 * boot or a device removal call.  So get the current power state
> -	 * so that things like MSI message writing will behave as expected
> -	 * (e.g. if the device really is in D0 at enable time).
> -	 */
> -	if (dev->pm_cap) {
> -		u16 pmcsr;
> -		pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
> -		dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK);
> -	}
> -
> -	if (atomic_inc_return(&dev->enable_cnt) > 1)
> +	if (atomic_inc_return(&dev->enable_cnt) > 1) {
> +		pci_update_current_state(dev, dev->current_state);
>   		return 0;		/* already enabled */
> +	}
>   
>   	bridge = pci_upstream_bridge(dev);
>   	if (bridge)
> 
> 
>
Mika Westerberg March 17, 2021, 10:02 a.m. UTC | #2
On Tue, Mar 16, 2021 at 04:51:40PM +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> It should not be necessary to update the current_state field of
> struct pci_dev in pci_enable_device_flags() before calling
> do_pci_enable_device() for the device, because none of the
> code between that point and the pci_set_power_state() call in
> do_pci_enable_device() invoked later depends on it.
> 
> Moreover, doing that is actively harmful in some cases.  For example,
> if the given PCI device depends on an ACPI power resource whose _STA
> method initially returns 0 ("off"), but the config space of the PCI
> device is accessible and the power state retrieved from the
> PCI_PM_CTRL register is D0, the current_state field in the struct
> pci_dev representing that device will get out of sync with the
> power.state of its ACPI companion object and that will lead to
> power management issues going forward.
> 
> To avoid such issues it is better to leave the current_state value
> as is until it is changed to PCI_D0 by do_pci_enable_device() as
> appropriate.  However, the power state of the device is not changed
> to PCI_D0 if it is already enabled when pci_enable_device_flags()
> gets called for it, so update its current_state in that case, but
> use pci_update_current_state() covering platform PM too for that.
> 
> Link: https://lore.kernel.org/lkml/20210314000439.3138941-1-luzmaximilian@gmail.com/
> Reported-by: Maximilian Luz <luzmaximilian@gmail.com>
> Tested-by: Maximilian Luz <luzmaximilian@gmail.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Rafael J. Wysocki March 22, 2021, 2:32 p.m. UTC | #3
On Tue, Mar 16, 2021 at 4:52 PM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> It should not be necessary to update the current_state field of
> struct pci_dev in pci_enable_device_flags() before calling
> do_pci_enable_device() for the device, because none of the
> code between that point and the pci_set_power_state() call in
> do_pci_enable_device() invoked later depends on it.
>
> Moreover, doing that is actively harmful in some cases.  For example,
> if the given PCI device depends on an ACPI power resource whose _STA
> method initially returns 0 ("off"), but the config space of the PCI
> device is accessible and the power state retrieved from the
> PCI_PM_CTRL register is D0, the current_state field in the struct
> pci_dev representing that device will get out of sync with the
> power.state of its ACPI companion object and that will lead to
> power management issues going forward.
>
> To avoid such issues it is better to leave the current_state value
> as is until it is changed to PCI_D0 by do_pci_enable_device() as
> appropriate.  However, the power state of the device is not changed
> to PCI_D0 if it is already enabled when pci_enable_device_flags()
> gets called for it, so update its current_state in that case, but
> use pci_update_current_state() covering platform PM too for that.
>
> Link: https://lore.kernel.org/lkml/20210314000439.3138941-1-luzmaximilian@gmail.com/
> Reported-by: Maximilian Luz <luzmaximilian@gmail.com>
> Tested-by: Maximilian Luz <luzmaximilian@gmail.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Bjorn, can I take this, or do you want to take care of it yourself?

> ---
>
> Max, I've added a T-by from you even though the patch is slightly different
> from what you have tested, but the difference shouldn't matter for your case.
>
> ---
>  drivers/pci/pci.c |   16 +++-------------
>  1 file changed, 3 insertions(+), 13 deletions(-)
>
> Index: linux-pm/drivers/pci/pci.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci.c
> +++ linux-pm/drivers/pci/pci.c
> @@ -1870,20 +1870,10 @@ static int pci_enable_device_flags(struc
>         int err;
>         int i, bars = 0;
>
> -       /*
> -        * Power state could be unknown at this point, either due to a fresh
> -        * boot or a device removal call.  So get the current power state
> -        * so that things like MSI message writing will behave as expected
> -        * (e.g. if the device really is in D0 at enable time).
> -        */
> -       if (dev->pm_cap) {
> -               u16 pmcsr;
> -               pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
> -               dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK);
> -       }
> -
> -       if (atomic_inc_return(&dev->enable_cnt) > 1)
> +       if (atomic_inc_return(&dev->enable_cnt) > 1) {
> +               pci_update_current_state(dev, dev->current_state);
>                 return 0;               /* already enabled */
> +       }
>
>         bridge = pci_upstream_bridge(dev);
>         if (bridge)
>
>
>
Rafael J. Wysocki March 24, 2021, 3:43 p.m. UTC | #4
On Mon, Mar 22, 2021 at 3:32 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Tue, Mar 16, 2021 at 4:52 PM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > It should not be necessary to update the current_state field of
> > struct pci_dev in pci_enable_device_flags() before calling
> > do_pci_enable_device() for the device, because none of the
> > code between that point and the pci_set_power_state() call in
> > do_pci_enable_device() invoked later depends on it.
> >
> > Moreover, doing that is actively harmful in some cases.  For example,
> > if the given PCI device depends on an ACPI power resource whose _STA
> > method initially returns 0 ("off"), but the config space of the PCI
> > device is accessible and the power state retrieved from the
> > PCI_PM_CTRL register is D0, the current_state field in the struct
> > pci_dev representing that device will get out of sync with the
> > power.state of its ACPI companion object and that will lead to
> > power management issues going forward.
> >
> > To avoid such issues it is better to leave the current_state value
> > as is until it is changed to PCI_D0 by do_pci_enable_device() as
> > appropriate.  However, the power state of the device is not changed
> > to PCI_D0 if it is already enabled when pci_enable_device_flags()
> > gets called for it, so update its current_state in that case, but
> > use pci_update_current_state() covering platform PM too for that.
> >
> > Link: https://lore.kernel.org/lkml/20210314000439.3138941-1-luzmaximilian@gmail.com/
> > Reported-by: Maximilian Luz <luzmaximilian@gmail.com>
> > Tested-by: Maximilian Luz <luzmaximilian@gmail.com>
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Bjorn, can I take this, or do you want to take care of it yourself?

I'm taking the silence as consent, so the patch has been applied as
5.13 material with the R-by from Mika.

> > ---
> >
> > Max, I've added a T-by from you even though the patch is slightly different
> > from what you have tested, but the difference shouldn't matter for your case.
> >
> > ---
> >  drivers/pci/pci.c |   16 +++-------------
> >  1 file changed, 3 insertions(+), 13 deletions(-)
> >
> > Index: linux-pm/drivers/pci/pci.c
> > ===================================================================
> > --- linux-pm.orig/drivers/pci/pci.c
> > +++ linux-pm/drivers/pci/pci.c
> > @@ -1870,20 +1870,10 @@ static int pci_enable_device_flags(struc
> >         int err;
> >         int i, bars = 0;
> >
> > -       /*
> > -        * Power state could be unknown at this point, either due to a fresh
> > -        * boot or a device removal call.  So get the current power state
> > -        * so that things like MSI message writing will behave as expected
> > -        * (e.g. if the device really is in D0 at enable time).
> > -        */
> > -       if (dev->pm_cap) {
> > -               u16 pmcsr;
> > -               pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
> > -               dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK);
> > -       }
> > -
> > -       if (atomic_inc_return(&dev->enable_cnt) > 1)
> > +       if (atomic_inc_return(&dev->enable_cnt) > 1) {
> > +               pci_update_current_state(dev, dev->current_state);
> >                 return 0;               /* already enabled */
> > +       }
> >
> >         bridge = pci_upstream_bridge(dev);
> >         if (bridge)
> >
> >
> >
Salvatore Bonaccorso June 21, 2021, 7:27 p.m. UTC | #5
Hi,

On Tue, Mar 16, 2021 at 04:51:40PM +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> It should not be necessary to update the current_state field of
> struct pci_dev in pci_enable_device_flags() before calling
> do_pci_enable_device() for the device, because none of the
> code between that point and the pci_set_power_state() call in
> do_pci_enable_device() invoked later depends on it.
> 
> Moreover, doing that is actively harmful in some cases.  For example,
> if the given PCI device depends on an ACPI power resource whose _STA
> method initially returns 0 ("off"), but the config space of the PCI
> device is accessible and the power state retrieved from the
> PCI_PM_CTRL register is D0, the current_state field in the struct
> pci_dev representing that device will get out of sync with the
> power.state of its ACPI companion object and that will lead to
> power management issues going forward.
> 
> To avoid such issues it is better to leave the current_state value
> as is until it is changed to PCI_D0 by do_pci_enable_device() as
> appropriate.  However, the power state of the device is not changed
> to PCI_D0 if it is already enabled when pci_enable_device_flags()
> gets called for it, so update its current_state in that case, but
> use pci_update_current_state() covering platform PM too for that.
> 
> Link: https://lore.kernel.org/lkml/20210314000439.3138941-1-luzmaximilian@gmail.com/
> Reported-by: Maximilian Luz <luzmaximilian@gmail.com>
> Tested-by: Maximilian Luz <luzmaximilian@gmail.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> Max, I've added a T-by from you even though the patch is slightly different
> from what you have tested, but the difference shouldn't matter for your case.
> 
> ---
>  drivers/pci/pci.c |   16 +++-------------
>  1 file changed, 3 insertions(+), 13 deletions(-)
> 
> Index: linux-pm/drivers/pci/pci.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci.c
> +++ linux-pm/drivers/pci/pci.c
> @@ -1870,20 +1870,10 @@ static int pci_enable_device_flags(struc
>  	int err;
>  	int i, bars = 0;
>  
> -	/*
> -	 * Power state could be unknown at this point, either due to a fresh
> -	 * boot or a device removal call.  So get the current power state
> -	 * so that things like MSI message writing will behave as expected
> -	 * (e.g. if the device really is in D0 at enable time).
> -	 */
> -	if (dev->pm_cap) {
> -		u16 pmcsr;
> -		pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
> -		dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK);
> -	}
> -
> -	if (atomic_inc_return(&dev->enable_cnt) > 1)
> +	if (atomic_inc_return(&dev->enable_cnt) > 1) {
> +		pci_update_current_state(dev, dev->current_state);
>  		return 0;		/* already enabled */
> +	}
>  
>  	bridge = pci_upstream_bridge(dev);
>  	if (bridge)

A user in Debian reported that this commit caused an issue, cf.
https://bugs.debian.org/990008#10 with the e1000e driver failing to
probe the device. It was reported as well to
https://bugzilla.kernel.org/show_bug.cgi?id=213481

According to the above and
https://bugzilla.kernel.org/show_bug.cgi?id=213481#c2 reverting
4514d991d992 ("PCI: PM: Do not read power state in
pci_enable_device_flags()") fixes the issue.

Any idea what is going on here?

Regards,
Salvatore
Rafael J. Wysocki June 23, 2021, 5:52 p.m. UTC | #6
On Mon, Jun 21, 2021 at 9:27 PM Salvatore Bonaccorso <carnil@debian.org> wrote:
>
> Hi,
>
> On Tue, Mar 16, 2021 at 04:51:40PM +0100, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > It should not be necessary to update the current_state field of
> > struct pci_dev in pci_enable_device_flags() before calling
> > do_pci_enable_device() for the device, because none of the
> > code between that point and the pci_set_power_state() call in
> > do_pci_enable_device() invoked later depends on it.
> >
> > Moreover, doing that is actively harmful in some cases.  For example,
> > if the given PCI device depends on an ACPI power resource whose _STA
> > method initially returns 0 ("off"), but the config space of the PCI
> > device is accessible and the power state retrieved from the
> > PCI_PM_CTRL register is D0, the current_state field in the struct
> > pci_dev representing that device will get out of sync with the
> > power.state of its ACPI companion object and that will lead to
> > power management issues going forward.
> >
> > To avoid such issues it is better to leave the current_state value
> > as is until it is changed to PCI_D0 by do_pci_enable_device() as
> > appropriate.  However, the power state of the device is not changed
> > to PCI_D0 if it is already enabled when pci_enable_device_flags()
> > gets called for it, so update its current_state in that case, but
> > use pci_update_current_state() covering platform PM too for that.
> >
> > Link: https://lore.kernel.org/lkml/20210314000439.3138941-1-luzmaximilian@gmail.com/
> > Reported-by: Maximilian Luz <luzmaximilian@gmail.com>
> > Tested-by: Maximilian Luz <luzmaximilian@gmail.com>
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >
> > Max, I've added a T-by from you even though the patch is slightly different
> > from what you have tested, but the difference shouldn't matter for your case.
> >
> > ---
> >  drivers/pci/pci.c |   16 +++-------------
> >  1 file changed, 3 insertions(+), 13 deletions(-)
> >
> > Index: linux-pm/drivers/pci/pci.c
> > ===================================================================
> > --- linux-pm.orig/drivers/pci/pci.c
> > +++ linux-pm/drivers/pci/pci.c
> > @@ -1870,20 +1870,10 @@ static int pci_enable_device_flags(struc
> >       int err;
> >       int i, bars = 0;
> >
> > -     /*
> > -      * Power state could be unknown at this point, either due to a fresh
> > -      * boot or a device removal call.  So get the current power state
> > -      * so that things like MSI message writing will behave as expected
> > -      * (e.g. if the device really is in D0 at enable time).
> > -      */
> > -     if (dev->pm_cap) {
> > -             u16 pmcsr;
> > -             pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
> > -             dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK);
> > -     }
> > -
> > -     if (atomic_inc_return(&dev->enable_cnt) > 1)
> > +     if (atomic_inc_return(&dev->enable_cnt) > 1) {
> > +             pci_update_current_state(dev, dev->current_state);
> >               return 0;               /* already enabled */
> > +     }
> >
> >       bridge = pci_upstream_bridge(dev);
> >       if (bridge)
>
> A user in Debian reported that this commit caused an issue, cf.
> https://bugs.debian.org/990008#10 with the e1000e driver failing to
> probe the device. It was reported as well to
> https://bugzilla.kernel.org/show_bug.cgi?id=213481
>
> According to the above and
> https://bugzilla.kernel.org/show_bug.cgi?id=213481#c2 reverting
> 4514d991d992 ("PCI: PM: Do not read power state in
> pci_enable_device_flags()") fixes the issue.

This commit has just been reverted.

We will try to address the original issue addressed by it in a different way.

Thanks!

Patch

Index: linux-pm/drivers/pci/pci.c
===================================================================
--- linux-pm.orig/drivers/pci/pci.c
+++ linux-pm/drivers/pci/pci.c
@@ -1870,20 +1870,10 @@  static int pci_enable_device_flags(struc
 	int err;
 	int i, bars = 0;
 
-	/*
-	 * Power state could be unknown at this point, either due to a fresh
-	 * boot or a device removal call.  So get the current power state
-	 * so that things like MSI message writing will behave as expected
-	 * (e.g. if the device really is in D0 at enable time).
-	 */
-	if (dev->pm_cap) {
-		u16 pmcsr;
-		pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
-		dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK);
-	}
-
-	if (atomic_inc_return(&dev->enable_cnt) > 1)
+	if (atomic_inc_return(&dev->enable_cnt) > 1) {
+		pci_update_current_state(dev, dev->current_state);
 		return 0;		/* already enabled */
+	}
 
 	bridge = pci_upstream_bridge(dev);
 	if (bridge)
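
The control flow that the patch introduces around enable_cnt can be
modelled in userspace roughly as follows. This is a minimal sketch, not the
kernel implementation: struct enable_model and its counters are
hypothetical, C11 atomic_fetch_add() plus one stands in for the kernel's
atomic_inc_return(), and the two counters merely mark where
pci_update_current_state() and do_pci_enable_device() would be called.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical model of the enable path; NOT struct pci_dev. */
struct enable_model {
	atomic_int enable_cnt;
	int refreshes;     /* times the cached state would be re-queried */
	int full_enables;  /* times the full enable path would run */
};

static inline void enable_model_init(struct enable_model *m)
{
	atomic_init(&m->enable_cnt, 0);
	m->refreshes = 0;
	m->full_enables = 0;
}

static inline int enable_model_enable(struct enable_model *m)
{
	/* atomic_fetch_add() returns the old value, so add 1 for the new one. */
	if (atomic_fetch_add(&m->enable_cnt, 1) + 1 > 1) {
		m->refreshes++;   /* stands in for pci_update_current_state() */
		return 0;         /* already enabled */
	}

	m->full_enables++;        /* stands in for do_pci_enable_device() */
	return 0;
}
```

Calling enable_model_enable() three times on a freshly initialised model
takes the full enable path once and the refresh path twice. With the patch
applied, the first caller no longer refreshes the cached state before the
full enable path runs.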