Message ID | 20ccb6cc38f872c3d56b44cc2b9776a4a39c7dc5.1463134232.git.lukas@wunner.de (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
On Fri, May 13, 2016 at 01:15:31PM +0200, Lukas Wunner wrote: > There are devices wich are not power-managed by the platform, yet can be s/wich/which/ > runtime suspended to D3cold with some other mechanism. When putting the > system to sleep, we currently handle such devices improperly by trying > to transition them from D3cold to D3hot (the default power state defined > at the beginning of pci_target_state()). Avoid that. > > An example for devices affected by this are Thunderbolt controllers > built into Macs which can be put into D3cold with nonstandard ACPI > methods. > > Signed-off-by: Lukas Wunner <lukas@wunner.de> This needs an ack from Rafael. Naive question: why is the default target_state PCI_D3hot? > --- > drivers/pci/pci.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 791dfe7..6af9911 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -1943,6 +1943,8 @@ static pci_power_t pci_target_state(struct pci_dev *dev) > && !(dev->pme_support & (1 << target_state))) > target_state--; > } > + } else if (dev->current_state == PCI_D3cold) { > + target_state = PCI_D3cold; > } This only covers the case of !device_may_wakeup(). So I guess device_may_wakeup() is false for these Thunderbolt controllers. Is there a reason you don't want to do this check for devices that may wakeup? Sorry, more naive questions. I don't know anything about power management, and it all looks like black magic to me. > return target_state; > -- > 2.8.1 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jun 17, 2016 at 04:09:24PM -0500, Bjorn Helgaas wrote: > On Fri, May 13, 2016 at 01:15:31PM +0200, Lukas Wunner wrote: > > There are devices wich are not power-managed by the platform, yet can be > > s/wich/which/ Oops. > > > runtime suspended to D3cold with some other mechanism. When putting the > > system to sleep, we currently handle such devices improperly by trying > > to transition them from D3cold to D3hot (the default power state defined > > at the beginning of pci_target_state()). Avoid that. > > > > An example for devices affected by this are Thunderbolt controllers > > built into Macs which can be put into D3cold with nonstandard ACPI > > methods. > > > > Signed-off-by: Lukas Wunner <lukas@wunner.de> > > This needs an ack from Rafael. > > Naive question: why is the default target_state PCI_D3hot? No idea. :-) > > > --- > > drivers/pci/pci.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > > index 791dfe7..6af9911 100644 > > --- a/drivers/pci/pci.c > > +++ b/drivers/pci/pci.c > > @@ -1943,6 +1943,8 @@ static pci_power_t pci_target_state(struct pci_dev *dev) > > && !(dev->pme_support & (1 << target_state))) > > target_state--; > > } > > + } else if (dev->current_state == PCI_D3cold) { > > + target_state = PCI_D3cold; > > } > > This only covers the case of !device_may_wakeup(). So I guess > device_may_wakeup() is false for these Thunderbolt controllers. Correct. device_may_wakeup() is defined in include/linux/pm_wakeup.h as: dev->power.can_wakeup && !!dev->power.wakeup The first one, dev->power.can_wakeup is true because PME is claimed to be supported for all power states in the PMC register, so pci_pm_init() calls device_set_wakeup_capable(&dev->dev, true): Capabilities: [80] Power Management version 3 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+) Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME- The second one, dev->power.wakeup is false because device_wakeup_enable() is never called. > Is there a reason you don't want to do this check for devices that > may wakeup? Fear of breaking things. It would mean that a device would be left in D3cold even though it may not be able to signal wakeup from that power state. That's a change of behaviour the consequences of which I cannot estimate. Intuitively, I would expect breakage from such a change. Thanks, Lukas -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Saturday, June 18, 2016 12:14:07 AM Lukas Wunner wrote: > On Fri, Jun 17, 2016 at 04:09:24PM -0500, Bjorn Helgaas wrote: > > On Fri, May 13, 2016 at 01:15:31PM +0200, Lukas Wunner wrote: > > > There are devices wich are not power-managed by the platform, yet can be > > > > s/wich/which/ > > Oops. > > > > > > runtime suspended to D3cold with some other mechanism. When putting the > > > system to sleep, we currently handle such devices improperly by trying > > > to transition them from D3cold to D3hot (the default power state defined > > > at the beginning of pci_target_state()). Avoid that. > > > > > > An example for devices affected by this are Thunderbolt controllers > > > built into Macs which can be put into D3cold with nonstandard ACPI > > > methods. > > > > > > Signed-off-by: Lukas Wunner <lukas@wunner.de> > > > > This needs an ack from Rafael. > > > > Naive question: why is the default target_state PCI_D3hot? > > No idea. :-) Because D3_hot is the deepest state you can *program* the device to go into unless the platform can cut off power from it. > > > > > --- > > > drivers/pci/pci.c | 2 ++ > > > 1 file changed, 2 insertions(+) > > > > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > > > index 791dfe7..6af9911 100644 > > > --- a/drivers/pci/pci.c > > > +++ b/drivers/pci/pci.c > > > @@ -1943,6 +1943,8 @@ static pci_power_t pci_target_state(struct pci_dev *dev) > > > && !(dev->pme_support & (1 << target_state))) > > > target_state--; > > > } > > > + } else if (dev->current_state == PCI_D3cold) { > > > + target_state = PCI_D3cold; > > > } > > > > This only covers the case of !device_may_wakeup(). So I guess > > device_may_wakeup() is false for these Thunderbolt controllers. > > Correct. device_may_wakeup() is defined in include/linux/pm_wakeup.h as: > dev->power.can_wakeup && !!dev->power.wakeup > > The first one, dev->power.can_wakeup is true because PME is claimed to be > supported for all power states in the PMC register, so pci_pm_init() calls > device_set_wakeup_capable(&dev->dev, true): > Capabilities: [80] Power Management version 3 > Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+) > Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME- > > The second one, dev->power.wakeup is false because device_wakeup_enable() > is never called. > > > > Is there a reason you don't want to do this check for devices that > > may wakeup? > > Fear of breaking things. It would mean that a device would be left in > D3cold even though it may not be able to signal wakeup from that power > state. Then it should not be put into D3_cold at run time too if it is wakeup-capable. > That's a change of behaviour the consequences of which I cannot > estimate. Intuitively, I would expect breakage from such a change. That would have been the case if the device had been capable of signaling wakeup from D3_cold at run time, but not from system sleep. However, that can only happen when platform_pci_power_manageable() is true AFAICS. So I'd change the switch () under the platform_pci_power_manageable() check to return "state" in the default case and then do return dev->current_state < target_state ? target_state : dev->current_state; at the end of the function. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jul 18, 2016 at 03:39:15PM +0200, Rafael J. Wysocki wrote: > On Saturday, June 18, 2016 12:14:07 AM Lukas Wunner wrote: > > On Fri, Jun 17, 2016 at 04:09:24PM -0500, Bjorn Helgaas wrote: > > > On Fri, May 13, 2016 at 01:15:31PM +0200, Lukas Wunner wrote: > > > > There are devices wich are not power-managed by the platform, yet can be > > > > runtime suspended to D3cold with some other mechanism. When putting the > > > > system to sleep, we currently handle such devices improperly by trying > > > > to transition them from D3cold to D3hot (the default power state defined > > > > at the beginning of pci_target_state()). Avoid that. > > > > > > > > An example for devices affected by this are Thunderbolt controllers > > > > built into Macs which can be put into D3cold with nonstandard ACPI > > > > methods. > > > > > > > > Signed-off-by: Lukas Wunner <lukas@wunner.de> > > > > > > This needs an ack from Rafael. > > > > > > > --- > > > > drivers/pci/pci.c | 2 ++ > > > > 1 file changed, 2 insertions(+) > > > > > > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > > > > index 791dfe7..6af9911 100644 > > > > --- a/drivers/pci/pci.c > > > > +++ b/drivers/pci/pci.c > > > > @@ -1943,6 +1943,8 @@ static pci_power_t pci_target_state(struct pci_dev *dev) > > > > && !(dev->pme_support & (1 << target_state))) > > > > target_state--; > > > > } > > > > + } else if (dev->current_state == PCI_D3cold) { > > > > + target_state = PCI_D3cold; > > > > } > > > > > > This only covers the case of !device_may_wakeup(). So I guess > > > device_may_wakeup() is false for these Thunderbolt controllers. > > > Is there a reason you don't want to do this check for devices that > > > may wakeup? > > > > Fear of breaking things. It would mean that a device would be left in > > D3cold even though it may not be able to signal wakeup from that power > > state. > > Then it should not be put into D3_cold at run time too if it is wakeup- > capable. > > > That's a change of behaviour the consequences of which I cannot > > estimate. Intuitively, I would expect breakage from such a change. > > That would have been the case if the device had been capable of signaling > wakeup from D3_cold at run time, but not from system sleep. However, that > can only happen when platform_pci_power_manageable() is true AFAICS. > > So I'd change the switch () under the platform_pci_power_manageable() check > to return "state" in the default case and then do > > return dev->current_state < target_state ? target_state : dev->current_state; > > at the end of the function. That suggestion doesn't seem to be correct because there's another value besides PCI_D3cold which is also greater than PCI_D3hot, namely PCI_UNKNOWN. (If the device is in that state, e.g. after pci_device_remove() has been called, and the system goes to sleep, we'd leave the device as is and not put it into D3hot as we do now.) I will update this patch with Bjorn's suggestion to also leave the device in D3cold if it is wakeup-capable. The idea is to just change the default state in the first line of the function like this: - pci_power_t target_state = PCI_D3hot; + pci_power_t target_state = + dev->current_state == PCI_D3cold ? PCI_D3cold : PCI_D3hot; Thanks, Lukas -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Aug 3, 2016 at 2:28 PM, Lukas Wunner <lukas@wunner.de> wrote: > On Mon, Jul 18, 2016 at 03:39:15PM +0200, Rafael J. Wysocki wrote: >> On Saturday, June 18, 2016 12:14:07 AM Lukas Wunner wrote: >> > On Fri, Jun 17, 2016 at 04:09:24PM -0500, Bjorn Helgaas wrote: >> > > On Fri, May 13, 2016 at 01:15:31PM +0200, Lukas Wunner wrote: >> > > > There are devices wich are not power-managed by the platform, yet can be >> > > > runtime suspended to D3cold with some other mechanism. When putting the >> > > > system to sleep, we currently handle such devices improperly by trying >> > > > to transition them from D3cold to D3hot (the default power state defined >> > > > at the beginning of pci_target_state()). Avoid that. >> > > > >> > > > An example for devices affected by this are Thunderbolt controllers >> > > > built into Macs which can be put into D3cold with nonstandard ACPI >> > > > methods. >> > > > >> > > > Signed-off-by: Lukas Wunner <lukas@wunner.de> >> > > >> > > This needs an ack from Rafael. >> > > >> > > > --- >> > > > drivers/pci/pci.c | 2 ++ >> > > > 1 file changed, 2 insertions(+) >> > > > >> > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c >> > > > index 791dfe7..6af9911 100644 >> > > > --- a/drivers/pci/pci.c >> > > > +++ b/drivers/pci/pci.c >> > > > @@ -1943,6 +1943,8 @@ static pci_power_t pci_target_state(struct pci_dev *dev) >> > > > && !(dev->pme_support & (1 << target_state))) >> > > > target_state--; >> > > > } >> > > > + } else if (dev->current_state == PCI_D3cold) { >> > > > + target_state = PCI_D3cold; >> > > > } >> > > >> > > This only covers the case of !device_may_wakeup(). So I guess >> > > device_may_wakeup() is false for these Thunderbolt controllers. >> > > Is there a reason you don't want to do this check for devices that >> > > may wakeup? >> > >> > Fear of breaking things. It would mean that a device would be left in >> > D3cold even though it may not be able to signal wakeup from that power >> > state. >> >> Then it should not be put into D3_cold at run time too if it is wakeup- >> capable. >> >> > That's a change of behaviour the consequences of which I cannot >> > estimate. Intuitively, I would expect breakage from such a change. >> >> That would have been the case if the device had been capable of signaling >> wakeup from D3_cold at run time, but not from system sleep. However, that >> can only happen when platform_pci_power_manageable() is true AFAICS. >> >> So I'd change the switch () under the platform_pci_power_manageable() check >> to return "state" in the default case and then do >> >> return dev->current_state < target_state ? target_state : dev->current_state; >> >> at the end of the function. > > That suggestion doesn't seem to be correct because there's another > value besides PCI_D3cold which is also greater than PCI_D3hot, > namely PCI_UNKNOWN. (If the device is in that state, e.g. after > pci_device_remove() has been called, and the system goes to sleep, > we'd leave the device as is and not put it into D3hot as we do now.) Right, I obviously forgot about PCI_UNKNOWN. > I will update this patch with Bjorn's suggestion to also leave the > device in D3cold if it is wakeup-capable. The idea is to just change > the default state in the first line of the function like this: > > - pci_power_t target_state = PCI_D3hot; > + pci_power_t target_state = > + dev->current_state == PCI_D3cold ? PCI_D3cold : PCI_D3hot; That should work (even though it is a little clumsy IMO). Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 04, 2016 at 01:50:39AM +0200, Rafael J. Wysocki wrote: > On Wed, Aug 3, 2016 at 2:28 PM, Lukas Wunner <lukas@wunner.de> wrote: > > On Mon, Jul 18, 2016 at 03:39:15PM +0200, Rafael J. Wysocki wrote: > >> On Saturday, June 18, 2016 12:14:07 AM Lukas Wunner wrote: > >> > On Fri, Jun 17, 2016 at 04:09:24PM -0500, Bjorn Helgaas wrote: > >> > > On Fri, May 13, 2016 at 01:15:31PM +0200, Lukas Wunner wrote: > >> > > > There are devices wich are not power-managed by the platform, yet can be > >> > > > runtime suspended to D3cold with some other mechanism. When putting the > >> > > > system to sleep, we currently handle such devices improperly by trying > >> > > > to transition them from D3cold to D3hot (the default power state defined > >> > > > at the beginning of pci_target_state()). Avoid that. > >> > > > > >> > > > An example for devices affected by this are Thunderbolt controllers > >> > > > built into Macs which can be put into D3cold with nonstandard ACPI > >> > > > methods. > >> > > > > >> > > > Signed-off-by: Lukas Wunner <lukas@wunner.de> > >> > > > >> > > This needs an ack from Rafael. > >> > > > >> > > > --- > >> > > > drivers/pci/pci.c | 2 ++ > >> > > > 1 file changed, 2 insertions(+) > >> > > > > >> > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > >> > > > index 791dfe7..6af9911 100644 > >> > > > --- a/drivers/pci/pci.c > >> > > > +++ b/drivers/pci/pci.c > >> > > > @@ -1943,6 +1943,8 @@ static pci_power_t pci_target_state(struct pci_dev *dev) > >> > > > && !(dev->pme_support & (1 << target_state))) > >> > > > target_state--; > >> > > > } > >> > > > + } else if (dev->current_state == PCI_D3cold) { > >> > > > + target_state = PCI_D3cold; > >> > > > } > >> > > > >> > > This only covers the case of !device_may_wakeup(). So I guess > >> > > device_may_wakeup() is false for these Thunderbolt controllers. > >> > > Is there a reason you don't want to do this check for devices that > >> > > may wakeup? > >> > > >> > Fear of breaking things. It would mean that a device would be left in > >> > D3cold even though it may not be able to signal wakeup from that power > >> > state. > >> > >> Then it should not be put into D3_cold at run time too if it is wakeup- > >> capable. > >> > >> > That's a change of behaviour the consequences of which I cannot > >> > estimate. Intuitively, I would expect breakage from such a change. > >> > >> That would have been the case if the device had been capable of signaling > >> wakeup from D3_cold at run time, but not from system sleep. However, that > >> can only happen when platform_pci_power_manageable() is true AFAICS. > >> > >> So I'd change the switch () under the platform_pci_power_manageable() check > >> to return "state" in the default case and then do > >> > >> return dev->current_state < target_state ? target_state : dev->current_state; > >> > >> at the end of the function. > > > > That suggestion doesn't seem to be correct because there's another > > value besides PCI_D3cold which is also greater than PCI_D3hot, > > namely PCI_UNKNOWN. (If the device is in that state, e.g. after > > pci_device_remove() has been called, and the system goes to sleep, > > we'd leave the device as is and not put it into D3hot as we do now.) > > Right, I obviously forgot about PCI_UNKNOWN. > > > I will update this patch with Bjorn's suggestion to also leave the > > device in D3cold if it is wakeup-capable. The idea is to just change > > the default state in the first line of the function like this: > > > > - pci_power_t target_state = PCI_D3hot; > > + pci_power_t target_state = > > + dev->current_state == PCI_D3cold ? PCI_D3cold : PCI_D3hot; > > That should work (even though it is a little clumsy IMO). Not sure why that is clumsy but happy to use something else if you have a suggestion? Thanks, Lukas -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 4, 2016 at 2:45 AM, Lukas Wunner <lukas@wunner.de> wrote: > On Thu, Aug 04, 2016 at 01:50:39AM +0200, Rafael J. Wysocki wrote: >> On Wed, Aug 3, 2016 at 2:28 PM, Lukas Wunner <lukas@wunner.de> wrote: >> > On Mon, Jul 18, 2016 at 03:39:15PM +0200, Rafael J. Wysocki wrote: >> >> On Saturday, June 18, 2016 12:14:07 AM Lukas Wunner wrote: >> >> > On Fri, Jun 17, 2016 at 04:09:24PM -0500, Bjorn Helgaas wrote: >> >> > > On Fri, May 13, 2016 at 01:15:31PM +0200, Lukas Wunner wrote: >> >> > > > There are devices wich are not power-managed by the platform, yet can be >> >> > > > runtime suspended to D3cold with some other mechanism. When putting the >> >> > > > system to sleep, we currently handle such devices improperly by trying >> >> > > > to transition them from D3cold to D3hot (the default power state defined >> >> > > > at the beginning of pci_target_state()). Avoid that. >> >> > > > >> >> > > > An example for devices affected by this are Thunderbolt controllers >> >> > > > built into Macs which can be put into D3cold with nonstandard ACPI >> >> > > > methods. >> >> > > > >> >> > > > Signed-off-by: Lukas Wunner <lukas@wunner.de> >> >> > > >> >> > > This needs an ack from Rafael. >> >> > > >> >> > > > --- >> >> > > > drivers/pci/pci.c | 2 ++ >> >> > > > 1 file changed, 2 insertions(+) >> >> > > > >> >> > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c >> >> > > > index 791dfe7..6af9911 100644 >> >> > > > --- a/drivers/pci/pci.c >> >> > > > +++ b/drivers/pci/pci.c >> >> > > > @@ -1943,6 +1943,8 @@ static pci_power_t pci_target_state(struct pci_dev *dev) >> >> > > > && !(dev->pme_support & (1 << target_state))) >> >> > > > target_state--; >> >> > > > } >> >> > > > + } else if (dev->current_state == PCI_D3cold) { >> >> > > > + target_state = PCI_D3cold; >> >> > > > } >> >> > > >> >> > > This only covers the case of !device_may_wakeup(). So I guess >> >> > > device_may_wakeup() is false for these Thunderbolt controllers. >> >> > > Is there a reason you don't want to do this check for devices that >> >> > > may wakeup? >> >> > >> >> > Fear of breaking things. It would mean that a device would be left in >> >> > D3cold even though it may not be able to signal wakeup from that power >> >> > state. >> >> >> >> Then it should not be put into D3_cold at run time too if it is wakeup- >> >> capable. >> >> >> >> > That's a change of behaviour the consequences of which I cannot >> >> > estimate. Intuitively, I would expect breakage from such a change. >> >> >> >> That would have been the case if the device had been capable of signaling >> >> wakeup from D3_cold at run time, but not from system sleep. However, that >> >> can only happen when platform_pci_power_manageable() is true AFAICS. >> >> >> >> So I'd change the switch () under the platform_pci_power_manageable() check >> >> to return "state" in the default case and then do >> >> >> >> return dev->current_state < target_state ? target_state : dev->current_state; >> >> >> >> at the end of the function. >> > >> > That suggestion doesn't seem to be correct because there's another >> > value besides PCI_D3cold which is also greater than PCI_D3hot, >> > namely PCI_UNKNOWN. (If the device is in that state, e.g. after >> > pci_device_remove() has been called, and the system goes to sleep, >> > we'd leave the device as is and not put it into D3hot as we do now.) >> >> Right, I obviously forgot about PCI_UNKNOWN. >> >> > I will update this patch with Bjorn's suggestion to also leave the >> > device in D3cold if it is wakeup-capable. The idea is to just change >> > the default state in the first line of the function like this: >> > >> > - pci_power_t target_state = PCI_D3hot; >> > + pci_power_t target_state = >> > + dev->current_state == PCI_D3cold ? PCI_D3cold : PCI_D3hot; >> >> That should work (even though it is a little clumsy IMO). > > Not sure why that is clumsy but happy to use something else if you > have a suggestion? The clumsy thing is that we'd take the target_state as D3cold only if the device already was in that state. Otherwise, we'd take D3hot as the target state for the same device, which doesn't seem particularly consistent to me. Not that I have better ideas ATM, but then the current code works for my use cases. :-) Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 04, 2016 at 03:07:56AM +0200, Rafael J. Wysocki wrote: > On Thu, Aug 4, 2016 at 2:45 AM, Lukas Wunner <lukas@wunner.de> wrote: > > On Thu, Aug 04, 2016 at 01:50:39AM +0200, Rafael J. Wysocki wrote: > >> On Wed, Aug 3, 2016 at 2:28 PM, Lukas Wunner <lukas@wunner.de> wrote: > >> > On Mon, Jul 18, 2016 at 03:39:15PM +0200, Rafael J. Wysocki wrote: > >> >> On Saturday, June 18, 2016 12:14:07 AM Lukas Wunner wrote: > >> >> > On Fri, Jun 17, 2016 at 04:09:24PM -0500, Bjorn Helgaas wrote: > >> >> > > On Fri, May 13, 2016 at 01:15:31PM +0200, Lukas Wunner wrote: > >> >> > > > There are devices wich are not power-managed by the platform, yet can be > >> >> > > > runtime suspended to D3cold with some other mechanism. When putting the > >> >> > > > system to sleep, we currently handle such devices improperly by trying > >> >> > > > to transition them from D3cold to D3hot (the default power state defined > >> >> > > > at the beginning of pci_target_state()). Avoid that. > >> >> > > > > >> >> > > > An example for devices affected by this are Thunderbolt controllers > >> >> > > > built into Macs which can be put into D3cold with nonstandard ACPI > >> >> > > > methods. > >> >> > > > > >> >> > > > Signed-off-by: Lukas Wunner <lukas@wunner.de> > >> >> > > > --- > >> >> > > > drivers/pci/pci.c | 2 ++ > >> >> > > > 1 file changed, 2 insertions(+) > >> >> > > > > >> >> > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > >> >> > > > index 791dfe7..6af9911 100644 > >> >> > > > --- a/drivers/pci/pci.c > >> >> > > > +++ b/drivers/pci/pci.c > >> >> > > > @@ -1943,6 +1943,8 @@ static pci_power_t pci_target_state(struct pci_dev *dev) > >> >> > > > && !(dev->pme_support & (1 << target_state))) > >> >> > > > target_state--; > >> >> > > > } > >> >> > > > + } else if (dev->current_state == PCI_D3cold) { > >> >> > > > + target_state = PCI_D3cold; > >> >> > > > } > >> >> > > > >> > I will update this patch with Bjorn's suggestion to also leave the > >> > device in D3cold if it is wakeup-capable. The idea is to just change > >> > the default state in the first line of the function like this: > >> > > >> > - pci_power_t target_state = PCI_D3hot; > >> > + pci_power_t target_state = > >> > + dev->current_state == PCI_D3cold ? PCI_D3cold : PCI_D3hot; > >> > >> That should work (even though it is a little clumsy IMO). > > > > Not sure why that is clumsy but happy to use something else if you > > have a suggestion? > > The clumsy thing is that we'd take the target_state as D3cold only if > the device already was in that state. > > Otherwise, we'd take D3hot as the target state for the same device, > which doesn't seem particularly consistent to me. > > Not that I have better ideas ATM, but then the current code works for > my use cases. :-) The goal is to afford direct-complete to devices which are not power- manageable by the platform but can still be runtime suspended to D3cold. Right now we wake those devices up from D3cold to D3hot before going to sleep, which is a waste of energy and prolongs the suspend sequence (waking up the Thunderbolt controller takes 2 seconds). The de facto standard to power manage such devices seems to be with dev_pm_domain_set(). That's what vga_switcheroo does and I'll move to that as well for v3 of this series. I could add a "bool can_power_off" to struct dev_pm_domain. Then I could change pci_target_state() like this: pci_power_t target_state = PCI_D3hot; if (platform_pci_power_manageable(dev)) { [...] + } else if (dev->dev.pm_domain && dev->dev.pm_domain.can_power_off) { + target_state = PCI_D3cold; } else if [...] Another idea would be to add a ->choose_state hook to dev_pm_domain, but that would have to return a PCI-specific power state, so we'd be in clumsy territory again. Thoughts? Lukas -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 4, 2016 at 10:14 AM, Lukas Wunner <lukas@wunner.de> wrote: > On Thu, Aug 04, 2016 at 03:07:56AM +0200, Rafael J. Wysocki wrote: >> On Thu, Aug 4, 2016 at 2:45 AM, Lukas Wunner <lukas@wunner.de> wrote: >> > On Thu, Aug 04, 2016 at 01:50:39AM +0200, Rafael J. Wysocki wrote: >> >> On Wed, Aug 3, 2016 at 2:28 PM, Lukas Wunner <lukas@wunner.de> wrote: >> >> > On Mon, Jul 18, 2016 at 03:39:15PM +0200, Rafael J. Wysocki wrote: >> >> >> On Saturday, June 18, 2016 12:14:07 AM Lukas Wunner wrote: >> >> >> > On Fri, Jun 17, 2016 at 04:09:24PM -0500, Bjorn Helgaas wrote: >> >> >> > > On Fri, May 13, 2016 at 01:15:31PM +0200, Lukas Wunner wrote: >> >> >> > > > There are devices wich are not power-managed by the platform, yet can be >> >> >> > > > runtime suspended to D3cold with some other mechanism. When putting the >> >> >> > > > system to sleep, we currently handle such devices improperly by trying >> >> >> > > > to transition them from D3cold to D3hot (the default power state defined >> >> >> > > > at the beginning of pci_target_state()). Avoid that. >> >> >> > > > >> >> >> > > > An example for devices affected by this are Thunderbolt controllers >> >> >> > > > built into Macs which can be put into D3cold with nonstandard ACPI >> >> >> > > > methods. >> >> >> > > > >> >> >> > > > Signed-off-by: Lukas Wunner <lukas@wunner.de> >> >> >> > > > --- >> >> >> > > > drivers/pci/pci.c | 2 ++ >> >> >> > > > 1 file changed, 2 insertions(+) >> >> >> > > > >> >> >> > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c >> >> >> > > > index 791dfe7..6af9911 100644 >> >> >> > > > --- a/drivers/pci/pci.c >> >> >> > > > +++ b/drivers/pci/pci.c >> >> >> > > > @@ -1943,6 +1943,8 @@ static pci_power_t pci_target_state(struct pci_dev *dev) >> >> >> > > > && !(dev->pme_support & (1 << target_state))) >> >> >> > > > target_state--; >> >> >> > > > } >> >> >> > > > + } else if (dev->current_state == PCI_D3cold) { >> >> >> > > > + target_state = PCI_D3cold; >> >> >> > > > } >> >> >> > > >> >> > I will update this patch with Bjorn's suggestion to also leave the >> >> > device in D3cold if it is wakeup-capable. The idea is to just change >> >> > the default state in the first line of the function like this: >> >> > >> >> > - pci_power_t target_state = PCI_D3hot; >> >> > + pci_power_t target_state = >> >> > + dev->current_state == PCI_D3cold ? PCI_D3cold : PCI_D3hot; >> >> >> >> That should work (even though it is a little clumsy IMO). >> > >> > Not sure why that is clumsy but happy to use something else if you >> > have a suggestion? >> >> The clumsy thing is that we'd take the target_state as D3cold only if >> the device already was in that state. >> >> Otherwise, we'd take D3hot as the target state for the same device, >> which doesn't seem particularly consistent to me. >> >> Not that I have better ideas ATM, but then the current code works for >> my use cases. :-) > > The goal is to afford direct-complete to devices which are not power- > manageable by the platform but can still be runtime suspended to D3cold. Well, this is a bit misleading. According to the PCI spec there are two ways to put a device into D3cold: either by putting its bus into B3 (which for PCIe means turning the link off IIRC) which happens when the bridge goes into D3hot, or through the platform. You aren't talking about any of those cases, though, so we go outside of the spec here. > Right now we wake those devices up from D3cold to D3hot before going to > sleep, which is a waste of energy and prolongs the suspend sequence > (waking up the Thunderbolt controller takes 2 seconds). Understood. > The de facto standard to power manage such devices seems to be with > dev_pm_domain_set(). That's what vga_switcheroo does and I'll move > to that as well for v3 of this series. OK > I could add a "bool can_power_off" to struct dev_pm_domain. I'm not sure if dev_pm_domain is the right level. The "can_power_off" thing would be sort of specific to your particular use case. Say you have something like struct pci_pm_domain { struct dev_pm_domain pd; ... }; > Then I could change pci_target_state() like this: > > pci_power_t target_state = PCI_D3hot; > > if (platform_pci_power_manageable(dev)) { > [...] > + } else if (dev->dev.pm_domain && dev->dev.pm_domain.can_power_off) { so you can do something like } else if (dev->dev.pm_domain) { struct pci_pm_domain *pci_pd = to_pci_pm_domain(dev->dev.pm_domain); .... } else if [...] and it may be a bit more PCI-oriented without expanding generic data types. > + target_state = PCI_D3cold; > } else if [...] > > Another idea would be to add a ->choose_state hook to dev_pm_domain, > but that would have to return a PCI-specific power state, so we'd be > in clumsy territory again. Right. > Thoughts? Essentially, the PCI PM code needs to be told that there is a way to put the device into D3cold by non-standard means. There are a couple of ways to do that (a new flag in struct pci_dev, the above, probably more), but in any case it needs to be clear that this is non-standard IMO. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 04, 2016 at 05:30:47PM +0200, Rafael J. Wysocki wrote: > On Thu, Aug 4, 2016 at 10:14 AM, Lukas Wunner <lukas@wunner.de> wrote: > > On Thu, Aug 04, 2016 at 03:07:56AM +0200, Rafael J. Wysocki wrote: > >> On Thu, Aug 4, 2016 at 2:45 AM, Lukas Wunner <lukas@wunner.de> wrote: > >> > On Thu, Aug 04, 2016 at 01:50:39AM +0200, Rafael J. Wysocki wrote: > >> >> On Wed, Aug 3, 2016 at 2:28 PM, Lukas Wunner <lukas@wunner.de> wrote: > >> >> > I will update this patch with Bjorn's suggestion to also leave the > >> >> > device in D3cold if it is wakeup-capable. The idea is to just change > >> >> > the default state in the first line of the function like this: > >> >> > > >> >> > - pci_power_t target_state = PCI_D3hot; > >> >> > + pci_power_t target_state = > >> >> > + dev->current_state == PCI_D3cold ? PCI_D3cold : PCI_D3hot; > >> >> > >> >> That should work (even though it is a little clumsy IMO). > >> > > >> > Not sure why that is clumsy but happy to use something else if you > >> > have a suggestion? > >> > >> The clumsy thing is that we'd take the target_state as D3cold only if > >> the device already was in that state. > >> > >> Otherwise, we'd take D3hot as the target state for the same device, > >> which doesn't seem particularly consistent to me. > >> > >> Not that I have better ideas ATM, but then the current code works for > >> my use cases. :-) > > > > The goal is to afford direct-complete to devices which are not power- > > manageable by the platform but can still be runtime suspended to D3cold. > > Well, this is a bit misleading. > > According to the PCI spec there are two ways to put a device into > D3cold: either by putting its bus into B3 (which for PCIe means > turning the link off IIRC) which happens when the bridge goes into > D3hot, or through the platform. > > You aren't talking about any of those cases, though, so we go outside > of the spec here. Yes. With Nvidia Optimus / AMD PowerXpress hybrid graphics on non-Macs and Thunderbolt on Macs, it could still be argued that D3cold is facilitated by the platform, albeit with custom methods instead of _PS3. With hybrid graphics on Macs, the discrete GPU is turned off by a gmux controller on the LPC bus which is controlled via I/O ports. So the ACPI platform isn't involved at all and at least then we're in completely nonstandard territory. > > Right now we wake those devices up from D3cold to D3hot before going to > > sleep, which is a waste of energy and prolongs the suspend sequence > > (waking up the Thunderbolt controller takes 2 seconds). > > Understood. > > > The de facto standard to power manage such devices seems to be with > > dev_pm_domain_set(). That's what vga_switcheroo does and I'll move > > to that as well for v3 of this series. > > OK > > > I could add a "bool can_power_off" to struct dev_pm_domain. > > I'm not sure if dev_pm_domain is the right level. The "can_power_off" > thing would be sort of specific to your particular use case. > > Say you have something like > > struct pci_pm_domain { > struct dev_pm_domain pd; > ... > }; > > > Then I could change pci_target_state() like this: > > > > pci_power_t target_state = PCI_D3hot; > > > > if (platform_pci_power_manageable(dev)) { > > [...] > > + } else if (dev->dev.pm_domain && dev->dev.pm_domain.can_power_off) { > > so you can do something like > > } else if (dev->dev.pm_domain) { > struct pci_pm_domain *pci_pd = > to_pci_pm_domain(dev->dev.pm_domain); > > .... > } else if [...] > > and it may be a bit more PCI-oriented without expanding generic data types. > > > + target_state = PCI_D3cold; > > } else if [...] > > > > Another idea would be to add a ->choose_state hook to dev_pm_domain, > > but that would have to return a PCI-specific power state, so we'd be > > in clumsy territory again. > > Right. > > > Thoughts? > > Essentially, the PCI PM code needs to be told that there is a way to > put the device into D3cold by non-standard means. There are a couple > of ways to do that (a new flag in struct pci_dev, the above, probably > more), but in any case it needs to be clear that this is non-standard > IMO. The more I think about it, the more I lean towards the one-line change at the top of this e-mail, even though you found it clumsy. It's small and simple and fixes the problem without overengineering things. The reasoning is that going from D3cold to D3hot before system sleep just never makes sense, no matter if the device got there by standard or nonstandard means. So the default target state should be D3cold if the device is already there, and D3hot otherwise. I could perhaps try to make this clearer by adding a comment. Thanks, Lukas -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sunday, August 07, 2016 11:03:47 AM Lukas Wunner wrote: > On Thu, Aug 04, 2016 at 05:30:47PM +0200, Rafael J. Wysocki wrote: > > On Thu, Aug 4, 2016 at 10:14 AM, Lukas Wunner <lukas@wunner.de> wrote: > > > On Thu, Aug 04, 2016 at 03:07:56AM +0200, Rafael J. Wysocki wrote: > > >> On Thu, Aug 4, 2016 at 2:45 AM, Lukas Wunner <lukas@wunner.de> wrote: > > >> > On Thu, Aug 04, 2016 at 01:50:39AM +0200, Rafael J. Wysocki wrote: > > >> >> On Wed, Aug 3, 2016 at 2:28 PM, Lukas Wunner <lukas@wunner.de> wrote: > > >> >> > I will update this patch with Bjorn's suggestion to also leave the > > >> >> > device in D3cold if it is wakeup-capable. The idea is to just change > > >> >> > the default state in the first line of the function like this: > > >> >> > > > >> >> > - pci_power_t target_state = PCI_D3hot; > > >> >> > + pci_power_t target_state = > > >> >> > + dev->current_state == PCI_D3cold ? PCI_D3cold : PCI_D3hot; > > >> >> > > >> >> That should work (even though it is a little clumsy IMO). > > >> > > > >> > Not sure why that is clumsy but happy to use something else if you > > >> > have a suggestion? > > >> > > >> The clumsy thing is that we'd take the target_state as D3cold only if > > >> the device already was in that state. > > >> > > >> Otherwise, we'd take D3hot as the target state for the same device, > > >> which doesn't seem particularly consistent to me. > > >> > > >> Not that I have better ideas ATM, but then the current code works for > > >> my use cases. :-) > > > > > > The goal is to afford direct-complete to devices which are not power- > > > manageable by the platform but can still be runtime suspended to D3cold. > > > > Well, this is a bit misleading. > > > > According to the PCI spec there are two ways to put a device into > > D3cold: either by putting its bus into B3 (which for PCIe means > > turning the link off IIRC) which happens when the bridge goes into > > D3hot, or through the platform. > > > > You aren't talking about any of those cases, though, so we go outside > > of the spec here. > > Yes. With Nvidia Optimus / AMD PowerXpress hybrid graphics on non-Macs > and Thunderbolt on Macs, it could still be argued that D3cold is > facilitated by the platform, albeit with custom methods instead of _PS3. So you'd need a custom set of callbacks for that "platform", but that's only a few devices in the system, so you would also need normal ACPI callbacks for the rest. Conceivably, that could be addressed with per-device platform callbacks, but that is conceptually equivalent to adding a pm_domain pointer to the devices in question. > With hybrid graphics on Macs, the discrete GPU is turned off by > a gmux controller on the LPC bus which is controlled via I/O ports. > So the ACPI platform isn't involved at all and at least then we're > in completely nonstandard territory. > > > > Right now we wake those devices up from D3cold to D3hot before going to > > > sleep, which is a waste of energy and prolongs the suspend sequence > > > (waking up the Thunderbolt controller takes 2 seconds). > > > > Understood. > > > > > The de facto standard to power manage such devices seems to be with > > > dev_pm_domain_set(). That's what vga_switcheroo does and I'll move > > > to that as well for v3 of this series. > > > > OK > > > > > I could add a "bool can_power_off" to struct dev_pm_domain. > > > > I'm not sure if dev_pm_domain is the right level. The "can_power_off" > > thing would be sort of specific to your particular use case. > > > > Say you have something like > > > > struct pci_pm_domain { > > struct dev_pm_domain pd; > > ... > > }; > > > > > Then I could change pci_target_state() like this: > > > > > > pci_power_t target_state = PCI_D3hot; > > > > > > if (platform_pci_power_manageable(dev)) { > > > [...] > > > + } else if (dev->dev.pm_domain && dev->dev.pm_domain.can_power_off) { > > > > so you can do something like > > > > } else if (dev->dev.pm_domain) { > > struct pci_pm_domain *pci_pd = > > to_pci_pm_domain(dev->dev.pm_domain); > > > > .... > > } else if [...] > > > > and it may be a bit more PCI-oriented without expanding generic data types. > > > > > + target_state = PCI_D3cold; > > > } else if [...] > > > > > > Another idea would be to add a ->choose_state hook to dev_pm_domain, > > > but that would have to return a PCI-specific power state, so we'd be > > > in clumsy territory again. > > > > Right. > > > > > Thoughts? > > > > Essentially, the PCI PM code needs to be told that there is a way to > > put the device into D3cold by non-standard means. There are a couple > > of ways to do that (a new flag in struct pci_dev, the above, probably > > more), but in any case it needs to be clear that this is non-standard > > IMO. > > The more I think about it, the more I lean towards the one-line change > at the top of this e-mail, even though you found it clumsy. It's small > and simple and fixes the problem without overengineering things. I still would like the default D3cold to only apply if the device has no platform support, though. > The reasoning is that going from D3cold to D3hot before system sleep > just never makes sense, no matter if the device got there by standard > or nonstandard means. That may not be true in theory. If this is a wakeup device, it may not be able to generate wakeup signals from D3cold while the system is in the target system state, although it might be able to generate those signals when the system is in S0 (in the ACPI case). That's why I'd leave the platform support case as is. > So the default target state should be D3cold if the device is already there, > and D3hot otherwise. I could perhaps try to make this clearer by adding a > comment. A comment would certainly be useful. In particular, about how the device can get into D3cold at all if the default is D3cold only when the device is already there. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 08, 2016 at 01:32:54AM +0200, Rafael J. Wysocki wrote: > On Sunday, August 07, 2016 11:03:47 AM Lukas Wunner wrote: > > On Thu, Aug 04, 2016 at 05:30:47PM +0200, Rafael J. Wysocki wrote: > > > On Thu, Aug 4, 2016 at 10:14 AM, Lukas Wunner <lukas@wunner.de> wrote: > > > > On Thu, Aug 04, 2016 at 03:07:56AM +0200, Rafael J. Wysocki wrote: > > > >> On Thu, Aug 4, 2016 at 2:45 AM, Lukas Wunner <lukas@wunner.de> wrote: > > > >> > On Thu, Aug 04, 2016 at 01:50:39AM +0200, Rafael J. Wysocki wrote: > > > >> >> On Wed, Aug 3, 2016 at 2:28 PM, Lukas Wunner <lukas@wunner.de> wrote: > > > >> >> > I will update this patch with Bjorn's suggestion to also leave the > > > >> >> > device in D3cold if it is wakeup-capable. The idea is to just change > > > >> >> > the default state in the first line of the function like this: > > > >> >> > > > > >> >> > - pci_power_t target_state = PCI_D3hot; > > > >> >> > + pci_power_t target_state = > > > >> >> > + dev->current_state == PCI_D3cold ? PCI_D3cold : PCI_D3hot; > > > >> >> > > > >> >> That should work (even though it is a little clumsy IMO). > > > >> > > > > >> > Not sure why that is clumsy but happy to use something else if you > > > >> > have a suggestion? > > > >> > > > >> The clumsy thing is that we'd take the target_state as D3cold only if > > > >> the device already was in that state. > > > >> > > > >> Otherwise, we'd take D3hot as the target state for the same device, > > > >> which doesn't seem particularly consistent to me. > > > >> > > > >> Not that I have better ideas ATM, but then the current code works for > > > >> my use cases. :-) > > > > > > > > The goal is to afford direct-complete to devices which are not power- > > > > manageable by the platform but can still be runtime suspended to D3cold. > > > > > > Well, this is a bit misleading. > > > > > > According to the PCI spec there are two ways to put a device into > > > D3cold: either by putting its bus into B3 (which for PCIe means > > > turning the link off IIRC) which happens when the bridge goes into > > > D3hot, or through the platform. > > > > > > You aren't talking about any of those cases, though, so we go outside > > > of the spec here. > > > > Yes. With Nvidia Optimus / AMD PowerXpress hybrid graphics on non-Macs > > and Thunderbolt on Macs, it could still be argued that D3cold is > > facilitated by the platform, albeit with custom methods instead of _PS3. > > So you'd need a custom set of callbacks for that "platform", but that's > only a few devices in the system, so you would also need normal ACPI callbacks > for the rest. > > Conceivably, that could be addressed with per-device platform callbacks, > but that is conceptually equivalent to adding a pm_domain pointer to the > devices in question. Precisely. > > > > The de facto standard to power manage such devices seems to be with > > > > dev_pm_domain_set(). That's what vga_switcheroo does and I'll move > > > > to that as well for v3 of this series. > > > > > > OK > > > > > > > I could add a "bool can_power_off" to struct dev_pm_domain. > > > > > > I'm not sure if dev_pm_domain is the right level. The "can_power_off" > > > thing would be sort of specific to your particular use case. > > > > > > Say you have something like > > > > > > struct pci_pm_domain { > > > struct dev_pm_domain pd; > > > ... > > > }; So I would like to find a common ground and something you feel comfortable to ack. The problem I see with your suggested approach of subclassing struct dev_pm_domain in a struct pci_pm_domain is that I can easily envision Apple putting some custom methods in the DSDT to power a non-PCI device up and down. They're starting to use SPI and UART to attach devices in newer machines. Hence my suggestion to add a flag to struct dev_pm_domain, even though at the moment that flag would only be queried by the PCI core. I don't care if this is called can_power_off or power_manageable or whatever. Thanks, Lukas -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 11, 2016 at 3:20 PM, Lukas Wunner <lukas@wunner.de> wrote: > On Mon, Aug 08, 2016 at 01:32:54AM +0200, Rafael J. Wysocki wrote: >> On Sunday, August 07, 2016 11:03:47 AM Lukas Wunner wrote: >> > On Thu, Aug 04, 2016 at 05:30:47PM +0200, Rafael J. Wysocki wrote: >> > > On Thu, Aug 4, 2016 at 10:14 AM, Lukas Wunner <lukas@wunner.de> wrote: >> > > > On Thu, Aug 04, 2016 at 03:07:56AM +0200, Rafael J. Wysocki wrote: >> > > >> On Thu, Aug 4, 2016 at 2:45 AM, Lukas Wunner <lukas@wunner.de> wrote: >> > > >> > On Thu, Aug 04, 2016 at 01:50:39AM +0200, Rafael J. Wysocki wrote: >> > > >> >> On Wed, Aug 3, 2016 at 2:28 PM, Lukas Wunner <lukas@wunner.de> wrote: >> > > >> >> > I will update this patch with Bjorn's suggestion to also leave the >> > > >> >> > device in D3cold if it is wakeup-capable. The idea is to just change >> > > >> >> > the default state in the first line of the function like this: >> > > >> >> > >> > > >> >> > - pci_power_t target_state = PCI_D3hot; >> > > >> >> > + pci_power_t target_state = >> > > >> >> > + dev->current_state == PCI_D3cold ? PCI_D3cold : PCI_D3hot; >> > > >> >> >> > > >> >> That should work (even though it is a little clumsy IMO). >> > > >> > >> > > >> > Not sure why that is clumsy but happy to use something else if you >> > > >> > have a suggestion? >> > > >> >> > > >> The clumsy thing is that we'd take the target_state as D3cold only if >> > > >> the device already was in that state. >> > > >> >> > > >> Otherwise, we'd take D3hot as the target state for the same device, >> > > >> which doesn't seem particularly consistent to me. >> > > >> >> > > >> Not that I have better ideas ATM, but then the current code works for >> > > >> my use cases. :-) >> > > > >> > > > The goal is to afford direct-complete to devices which are not power- >> > > > manageable by the platform but can still be runtime suspended to D3cold. >> > > >> > > Well, this is a bit misleading. >> > > >> > > According to the PCI spec there are two ways to put a device into >> > > D3cold: either by putting its bus into B3 (which for PCIe means >> > > turning the link off IIRC) which happens when the bridge goes into >> > > D3hot, or through the platform. >> > > >> > > You aren't talking about any of those cases, though, so we go outside >> > > of the spec here. >> > >> > Yes. With Nvidia Optimus / AMD PowerXpress hybrid graphics on non-Macs >> > and Thunderbolt on Macs, it could still be argued that D3cold is >> > facilitated by the platform, albeit with custom methods instead of _PS3. >> >> So you'd need a custom set of callbacks for that "platform", but that's >> only a few devices in the system, so you would also need normal ACPI callbacks >> for the rest. >> >> Conceivably, that could be addressed with per-device platform callbacks, >> but that is conceptually equivalent to adding a pm_domain pointer to the >> devices in question. > > Precisely. > >> > > > The de facto standard to power manage such devices seems to be with >> > > > dev_pm_domain_set(). That's what vga_switcheroo does and I'll move >> > > > to that as well for v3 of this series. >> > > >> > > OK >> > > >> > > > I could add a "bool can_power_off" to struct dev_pm_domain. >> > > >> > > I'm not sure if dev_pm_domain is the right level. The "can_power_off" >> > > thing would be sort of specific to your particular use case. >> > > >> > > Say you have something like >> > > >> > > struct pci_pm_domain { >> > > struct dev_pm_domain pd; >> > > ... >> > > }; > > So I would like to find a common ground and something you feel > comfortable to ack. The problem I see with your suggested approach > of subclassing struct dev_pm_domain in a struct pci_pm_domain is > that I can easily envision Apple putting some custom methods in the > DSDT to power a non-PCI device up and down. They're starting to use > SPI and UART to attach devices in newer machines. Those devices have no standard power state definitions. The problem you have here really is PCI-specific, because you want to use PCI PM along with the non-standard methods. > Hence my suggestion to add a flag to struct dev_pm_domain, even > though at the moment that flag would only be queried by the PCI core. > I don't care if this is called can_power_off or power_manageable or > whatever. struct dev_pm_domain is way too generic for that though, as I'm sure there are users of it where the can_power_off thing wouldn't make any sense whatever. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Aug 12, 2016 at 02:50:04AM +0200, Rafael J. Wysocki wrote: > On Thu, Aug 11, 2016 at 3:20 PM, Lukas Wunner <lukas@wunner.de> wrote: > > So I would like to find a common ground and something you feel > > comfortable to ack. The problem I see with your suggested approach > > of subclassing struct dev_pm_domain in a struct pci_pm_domain is > > that I can easily envision Apple putting some custom methods in the > > DSDT to power a non-PCI device up and down. They're starting to use > > SPI and UART to attach devices in newer machines. > > Those devices have no standard power state definitions. > > The problem you have here really is PCI-specific, because you want to > use PCI PM along with the non-standard methods. If I introduce a struct pci_pm_domain like you suggested, it would mean that *all* PCI devices using dev_pm_domain_set() have to be changed, else the container_of() wouldn't work. The resulting code bloat alone inhibits me from implementing this. Plus, it's a tripwire for anyone wishing to assign a dev_pm_domain to their PCI device. > > Hence my suggestion to add a flag to struct dev_pm_domain, even > > though at the moment that flag would only be queried by the PCI core. > > I don't care if this is called can_power_off or power_manageable or > > whatever. > > struct dev_pm_domain is way too generic for that though, as I'm sure > there are users of it where the can_power_off thing wouldn't make any > sense whatever. That seems like a small tradeoff compared to introducing a struct pci_pm_domain. If you dislike a can_power_off flag in struct dev_pm_domain, that only leaves the option to add a one-liner to pci_target_state(), unless I'm missing something. BTW there seems to be a contradiction in your statements on wakeup devices: On Mon, Aug 08, 2016 at 01:32:54AM +0200, Rafael J. Wysocki wrote: > On Sunday, August 07, 2016 11:03:47 AM Lukas Wunner wrote: > > The reasoning is that going from D3cold to D3hot before system sleep > > just never makes sense, no matter if the device got there by standard > > or nonstandard means. > > That may not be true in theory. > > If this is a wakeup device, it may not be able to generate wakeup signals > from D3cold while the system is in the target system state, although it might > be able to generate those signals when the system is in S0 (in the ACPI case). However earlier you wrote: On Mon, Jul 18, 2016 at 03:39:15PM +0200, Rafael J. Wysocki wrote: > On Saturday, June 18, 2016 12:14:07 AM Lukas Wunner wrote: > > On Fri, Jun 17, 2016 at 04:09:24PM -0500, Bjorn Helgaas wrote: > > > Is there a reason you don't want to do this check for devices that > > > may wakeup? > > > > Fear of breaking things. It would mean that a device would be left in > > D3cold even though it may not be able to signal wakeup from that power > > state. > > Then it should not be put into D3_cold at run time too if it is wakeup-capable. So on the one hand, you warn that a wakeup-capable device may have been put into D3cold at runtime but needs to be woken before system sleep because it might otherwise not be able to signal wakeup. On the other hand you say that such devices should not be put into D3cold at runtime at all. Which one is it? Thanks, Lukas -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Friday, August 12, 2016 06:16:09 PM Lukas Wunner wrote: > On Fri, Aug 12, 2016 at 02:50:04AM +0200, Rafael J. Wysocki wrote: > > On Thu, Aug 11, 2016 at 3:20 PM, Lukas Wunner <lukas@wunner.de> wrote: > > > So I would like to find a common ground and something you feel > > > comfortable to ack. The problem I see with your suggested approach > > > of subclassing struct dev_pm_domain in a struct pci_pm_domain is > > > that I can easily envision Apple putting some custom methods in the > > > DSDT to power a non-PCI device up and down. They're starting to use > > > SPI and UART to attach devices in newer machines. > > > > Those devices have no standard power state definitions. > > > > The problem you have here really is PCI-specific, because you want to > > use PCI PM along with the non-standard methods. > > If I introduce a struct pci_pm_domain like you suggested, it would mean > that *all* PCI devices using dev_pm_domain_set() have to be changed, > else the container_of() wouldn't work. The resulting code bloat alone > inhibits me from implementing this. Plus, it's a tripwire for anyone > wishing to assign a dev_pm_domain to their PCI device. > > > > Hence my suggestion to add a flag to struct dev_pm_domain, even > > > though at the moment that flag would only be queried by the PCI core. > > > I don't care if this is called can_power_off or power_manageable or > > > whatever. > > > > struct dev_pm_domain is way too generic for that though, as I'm sure > > there are users of it where the can_power_off thing wouldn't make any > > sense whatever. > > That seems like a small tradeoff compared to introducing a struct > pci_pm_domain. I'm not going to apply any patches addding can_power_off or similar flags to struct dev_pm_domain. > If you dislike a can_power_off flag in struct dev_pm_domain, that only > leaves the option to add a one-liner to pci_target_state(), unless I'm > missing something. I'm not sure why you are insisting on setting target_state to D3cold before taking the platform_pci_power_manageable() branch. Why don't you simply rearrange the routine like pci_power_t target_state = PCI_D3hot; if (platform_pci_power_manageable(dev)) { ... return target_state; } if (!dev->pm_cap) return PCI_D0; if (dev->current_state == PCI_D3cold) target_state = PCI_D3cold; if (device_may_wakeup(&dev->dev)) { ... } return target_state; And that would be fine by me. That said I'm not sure why you want to use pci_target_state() so badly? If you are going to use a PM domain, why do you still need that function? > BTW there seems to be a contradiction in your statements on wakeup devices: > > On Mon, Aug 08, 2016 at 01:32:54AM +0200, Rafael J. Wysocki wrote: > > On Sunday, August 07, 2016 11:03:47 AM Lukas Wunner wrote: > > > The reasoning is that going from D3cold to D3hot before system sleep > > > just never makes sense, no matter if the device got there by standard > > > or nonstandard means. > > > > That may not be true in theory. > > > > If this is a wakeup device, it may not be able to generate wakeup signals > > from D3cold while the system is in the target system state, although it might > > be able to generate those signals when the system is in S0 (in the ACPI case). > > However earlier you wrote: > > On Mon, Jul 18, 2016 at 03:39:15PM +0200, Rafael J. Wysocki wrote: > > On Saturday, June 18, 2016 12:14:07 AM Lukas Wunner wrote: > > > On Fri, Jun 17, 2016 at 04:09:24PM -0500, Bjorn Helgaas wrote: > > > > Is there a reason you don't want to do this check for devices that > > > > may wakeup? > > > > > > Fear of breaking things. It would mean that a device would be left in > > > D3cold even though it may not be able to signal wakeup from that power > > > state. > > > > Then it should not be put into D3_cold at run time too if it is wakeup-capable. > > So on the one hand, you warn that a wakeup-capable device may have been > put into D3cold at runtime but needs to be woken before system sleep > because it might otherwise not be able to signal wakeup. Yes, so specifically I'm concerned about the pci_target_state() invocation in pci_dev_keep_suspended() which is done exactly for this purpose. If you apply the "keep it in D3cold if already there" logic to that case, it may lead to a wrong decision in theory. Say the device is in D3cold and platform_pci_choose_state() returns D1, but pci_no_d1d2() returns true, the device will end up in D3cold, but it may not be able to signal wakeup from that state after the system has been suspended. > On the other hand you say that such devices should not be put into D3cold > at runtime at all. > > Which one is it? I said the latter under the assumption that the device would not be able to signal wakeup from D3cold at all (including at run time). I may have not understand the context of your conversation with Bjorn correctly. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Saturday, August 13, 2016 12:18:26 AM Rafael J. Wysocki wrote: > On Friday, August 12, 2016 06:16:09 PM Lukas Wunner wrote: > > On Fri, Aug 12, 2016 at 02:50:04AM +0200, Rafael J. Wysocki wrote: > > > On Thu, Aug 11, 2016 at 3:20 PM, Lukas Wunner <lukas@wunner.de> wrote: > > > > So I would like to find a common ground and something you feel > > > > comfortable to ack. The problem I see with your suggested approach > > > > of subclassing struct dev_pm_domain in a struct pci_pm_domain is > > > > that I can easily envision Apple putting some custom methods in the > > > > DSDT to power a non-PCI device up and down. They're starting to use > > > > SPI and UART to attach devices in newer machines. > > > > > > Those devices have no standard power state definitions. > > > > > > The problem you have here really is PCI-specific, because you want to > > > use PCI PM along with the non-standard methods. > > > > If I introduce a struct pci_pm_domain like you suggested, it would mean > > that *all* PCI devices using dev_pm_domain_set() have to be changed, > > else the container_of() wouldn't work. The resulting code bloat alone > > inhibits me from implementing this. Plus, it's a tripwire for anyone > > wishing to assign a dev_pm_domain to their PCI device. > > > > > > Hence my suggestion to add a flag to struct dev_pm_domain, even > > > > though at the moment that flag would only be queried by the PCI core. > > > > I don't care if this is called can_power_off or power_manageable or > > > > whatever. > > > > > > struct dev_pm_domain is way too generic for that though, as I'm sure > > > there are users of it where the can_power_off thing wouldn't make any > > > sense whatever. > > > > That seems like a small tradeoff compared to introducing a struct > > pci_pm_domain. > > I'm not going to apply any patches addding can_power_off or similar flags to > struct dev_pm_domain. > > > If you dislike a can_power_off flag in struct dev_pm_domain, that only > > leaves the option to add a one-liner to pci_target_state(), unless I'm > > missing something. > > I'm not sure why you are insisting on setting target_state to D3cold > before taking the platform_pci_power_manageable() branch. Why don't > you simply rearrange the routine like > > pci_power_t target_state = PCI_D3hot; > > if (platform_pci_power_manageable(dev)) { > ... > return target_state; > } > > if (!dev->pm_cap) > return PCI_D0; > > if (dev->current_state == PCI_D3cold) > target_state = PCI_D3cold; > > if (device_may_wakeup(&dev->dev)) { > ... > } > > return target_state; > > And that would be fine by me. > > That said I'm not sure why you want to use pci_target_state() so badly? > > If you are going to use a PM domain, why do you still need that function? > > > BTW there seems to be a contradiction in your statements on wakeup devices: > > > > On Mon, Aug 08, 2016 at 01:32:54AM +0200, Rafael J. Wysocki wrote: > > > On Sunday, August 07, 2016 11:03:47 AM Lukas Wunner wrote: > > > > The reasoning is that going from D3cold to D3hot before system sleep > > > > just never makes sense, no matter if the device got there by standard > > > > or nonstandard means. > > > > > > That may not be true in theory. > > > > > > If this is a wakeup device, it may not be able to generate wakeup signals > > > from D3cold while the system is in the target system state, although it might > > > be able to generate those signals when the system is in S0 (in the ACPI case). > > > > However earlier you wrote: > > > > On Mon, Jul 18, 2016 at 03:39:15PM +0200, Rafael J. Wysocki wrote: > > > On Saturday, June 18, 2016 12:14:07 AM Lukas Wunner wrote: > > > > On Fri, Jun 17, 2016 at 04:09:24PM -0500, Bjorn Helgaas wrote: > > > > > Is there a reason you don't want to do this check for devices that > > > > > may wakeup? > > > > > > > > Fear of breaking things. It would mean that a device would be left in > > > > D3cold even though it may not be able to signal wakeup from that power > > > > state. > > > > > > Then it should not be put into D3_cold at run time too if it is wakeup-capable. > > > > So on the one hand, you warn that a wakeup-capable device may have been > > put into D3cold at runtime but needs to be woken before system sleep > > because it might otherwise not be able to signal wakeup. > > Yes, so specifically I'm concerned about the pci_target_state() invocation in > pci_dev_keep_suspended() which is done exactly for this purpose. > > If you apply the "keep it in D3cold if already there" logic to that case, it > may lead to a wrong decision in theory. Say the device is in D3cold and > platform_pci_choose_state() returns D1, but pci_no_d1d2() returns true, > the device will end up in D3cold, but it may not be able to signal wakeup > from that state after the system has been suspended. Of course, I guess you'll say that it may not be able to signal wakeup from D3hot as well in that case, which is correct. :-) So I guess the one-liner change in pci_target_state() would be fine if Bjorn likes it. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, Aug 13, 2016 at 12:18:26AM +0200, Rafael J. Wysocki wrote: > Yes, so specifically I'm concerned about the pci_target_state() invocation > in pci_dev_keep_suspended() which is done exactly for this purpose. > > If you apply the "keep it in D3cold if already there" logic to that case, > it may lead to a wrong decision in theory. Say the device is in D3cold and > platform_pci_choose_state() returns D1, but pci_no_d1d2() returns true, > the device will end up in D3cold, but it may not be able to signal wakeup > from that state after the system has been suspended. Ugh, I had missed those break statements in the platform-case. I must be blind. You're right of course, that wouldn't be correct. > Of course, I guess you'll say that it may not be able to signal wakeup from > D3hot as well in that case, which is correct. :-) Hm, what would be the correct power state in that case then? PCI_D0? > Why don't you simply rearrange the routine like > > pci_power_t target_state = PCI_D3hot; > > if (platform_pci_power_manageable(dev)) { > ... > return target_state; > } > > if (!dev->pm_cap) > return PCI_D0; > > if (dev->current_state == PCI_D3cold) > target_state = PCI_D3cold; > > if (device_may_wakeup(&dev->dev)) { > ... > } > > return target_state; > > And that would be fine by me. Looks good, I'll give that a try. If the correct power state in the pci_no_d1d2() case is PCI_D0, I could fix that up as well. > That said I'm not sure why you want to use pci_target_state() so badly? > > If you are going to use a PM domain, why do you still need that function? The dev_pm_domain is only assigned to the topmost device exposed by the Thunderbolt controller (the upstream bridge). I would like to avoid having to assign separate dev_pm_domains to the downstream bridges. So I let the NHI and downstream bridges go to D3hot. And when the upstream bridge cuts power, it iterates over all child devices and changes their current_state to D3cold to reflect reality. When the system is later put to sleep, this patch ensures that the NHI and downstream bridges are not unnecessarily resumed to D3hot. So why change the current_state of the children at all? I could just leave the (incorrect) PCI_D3hot and everything would be peachy, right? Well, there's another problem: The first few Thunderbolt chips had broken MSI, they have to use INTx to signal hotplug. Unfortunately on some Macs built 2011/2012, the IRQ is shared with multiple other devices, most importantly the wireless card which can generate thousands of interrupts on a crowded WLAN. If power is cut to the Thunderbolt controller, reading from the hotplug ports' config space in pcie_isr() fails and results in a "no response from device" message logged with KERN_INFO. Getting thousands of such messages is annoying, not to mention the giant waste of CPU cycles to read from the config space of a device which we *know* is powered down. The solution I came up with is to add a tiny two-liner to pcie_isr() with commit ed91de7e14fb ("PCI: pciehp: Ignore interrupts during D3cold"). But that requires that I update the children's current_state to D3cold, and necessitates that pci_target_state() doesn't resume them to D3hot for system sleep. Hence the need for this patch. The approach has the additional benefit that hybrid graphics devices are implicitly also afforded direct-complete without having to add a ->prepare hook that returns a positive int. They only need to set their current_state to D3cold, which they already do, see azx_vs_set_state(), nouveau_pmops_runtime_suspend(), radeon_pmops_runtime_suspend(), amdgpu_pmops_runtime_suspend(). However this also means that adding a can_power_off flag to struct dev_pm_domain wouldn't be a viable solution because then I'd have to assign a dev_pm_domain to the downstream bridges. Another thing I've missed. Ugh. This is so complicated it's easy to get tangled up in all these intricate little details. Thanks for your patience in dealing with these issues, Lukas -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sunday, August 14, 2016 12:27:25 PM Lukas Wunner wrote: > On Sat, Aug 13, 2016 at 12:18:26AM +0200, Rafael J. Wysocki wrote: > > Yes, so specifically I'm concerned about the pci_target_state() invocation > > in pci_dev_keep_suspended() which is done exactly for this purpose. > > > > If you apply the "keep it in D3cold if already there" logic to that case, > > it may lead to a wrong decision in theory. Say the device is in D3cold and > > platform_pci_choose_state() returns D1, but pci_no_d1d2() returns true, > > the device will end up in D3cold, but it may not be able to signal wakeup > > from that state after the system has been suspended. > > Ugh, I had missed those break statements in the platform-case. > I must be blind. You're right of course, that wouldn't be correct. > > > Of course, I guess you'll say that it may not be able to signal wakeup from > > D3hot as well in that case, which is correct. :-) > > Hm, what would be the correct power state in that case then? PCI_D0? D0 may not be a good choice here too. The problem in this case is the discrepancy between what the platform firmware tells us and what we know from other sources, so this way or another, something may be broken. I guess the safest option is to just keep the current behavior. :-) > > Why don't you simply rearrange the routine like > > > > pci_power_t target_state = PCI_D3hot; > > > > if (platform_pci_power_manageable(dev)) { > > ... > > return target_state; > > } > > > > if (!dev->pm_cap) > > return PCI_D0; > > > > if (dev->current_state == PCI_D3cold) > > target_state = PCI_D3cold; > > > > if (device_may_wakeup(&dev->dev)) { > > ... > > } > > > > return target_state; > > > > And that would be fine by me. > > Looks good, I'll give that a try. > > If the correct power state in the pci_no_d1d2() case is PCI_D0, > I could fix that up as well. > > > That said I'm not sure why you want to use pci_target_state() so badly? > > > > If you are going to use a PM domain, why do you still need that function? > > The dev_pm_domain is only assigned to the topmost device exposed by > the Thunderbolt controller (the upstream bridge). I would like to avoid > having to assign separate dev_pm_domains to the downstream bridges. > > So I let the NHI and downstream bridges go to D3hot. And when the > upstream bridge cuts power, it iterates over all child devices > and changes their current_state to D3cold to reflect reality. > > When the system is later put to sleep, this patch ensures that the > NHI and downstream bridges are not unnecessarily resumed to D3hot. > > So why change the current_state of the children at all? I could just > leave the (incorrect) PCI_D3hot and everything would be peachy, right? > Well, there's another problem: The first few Thunderbolt chips had > broken MSI, they have to use INTx to signal hotplug. Unfortunately on > some Macs built 2011/2012, the IRQ is shared with multiple other devices, > most importantly the wireless card which can generate thousands of > interrupts on a crowded WLAN. If power is cut to the Thunderbolt > controller, reading from the hotplug ports' config space in pcie_isr() > fails and results in a "no response from device" message logged with > KERN_INFO. Getting thousands of such messages is annoying, not to > mention the giant waste of CPU cycles to read from the config space > of a device which we *know* is powered down. > > The solution I came up with is to add a tiny two-liner to pcie_isr() > with commit ed91de7e14fb ("PCI: pciehp: Ignore interrupts during D3cold"). > But that requires that I update the children's current_state to D3cold, > and necessitates that pci_target_state() doesn't resume them to D3hot > for system sleep. Hence the need for this patch. > > The approach has the additional benefit that hybrid graphics devices > are implicitly also afforded direct-complete without having to add a > ->prepare hook that returns a positive int. They only need to set their > current_state to D3cold, which they already do, see azx_vs_set_state(), > nouveau_pmops_runtime_suspend(), radeon_pmops_runtime_suspend(), > amdgpu_pmops_runtime_suspend(). Sounds reasonable to me. > However this also means that adding a can_power_off flag to struct > dev_pm_domain wouldn't be a viable solution because then I'd have to > assign a dev_pm_domain to the downstream bridges. Another thing I've > missed. Ugh. This is so complicated it's easy to get tangled up in > all these intricate little details. > > Thanks for your patience in dealing with these issues, No problem. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 791dfe7..6af9911 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -1943,6 +1943,8 @@ static pci_power_t pci_target_state(struct pci_dev *dev) && !(dev->pme_support & (1 << target_state))) target_state--; } + } else if (dev->current_state == PCI_D3cold) { + target_state = PCI_D3cold; } return target_state;
There are devices wich are not power-managed by the platform, yet can be runtime suspended to D3cold with some other mechanism. When putting the system to sleep, we currently handle such devices improperly by trying to transition them from D3cold to D3hot (the default power state defined at the beginning of pci_target_state()). Avoid that. An example for devices affected by this are Thunderbolt controllers built into Macs which can be put into D3cold with nonstandard ACPI methods. Signed-off-by: Lukas Wunner <lukas@wunner.de> --- drivers/pci/pci.c | 2 ++ 1 file changed, 2 insertions(+)