Message ID | 2981101.tKVHzisTAg@kreacher (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | PM: PCI/ACPI: Hibernation handling fixes | expand |
Hi Rafael, On 29-06-19 11:50, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > If the resume_from_noirq flag is set in dev_desc, the ->suspend_late > callback provided by the device driver will be invoked at the "noirq" > stage of system suspend, via acpi_lpss_do_suspend_late(), which is > incorrect. > > To fix that, drop acpi_lpss_do_suspend_late() and rearrange > acpi_lpss_suspend_late() to call pm_generic_suspend_late() > directly, before calling acpi_lpss_suspend(), in analogy with > acpi_subsys_suspend_late(). Ah now I see the logic in your previous test-patch. I'm afraid that this is going to break things though, the calling of the device-driver's suspend-late method at noirq time is *intentional* ! The resume_from_noirq flag is only set for i2c controllers which use: drivers/i2c/busses/i2c-designware-platdrv.c as driver. This driver's suspend late method looks like this: static int dw_i2c_plat_suspend(struct device *dev) { struct dw_i2c_dev *i_dev = dev_get_drvdata(dev); i_dev->suspended = true; if (i_dev->shared_with_punit) return 0; i_dev->disable(i_dev); i2c_dw_prepare_clk(i_dev, false); return 0; } The i_dev->disable(i_dev) and i2c_dw_prepare_clk(i_dev, false) calls here will make the i2c controller non functional. But (some of) these i2c controllers are used by code in the _PS0 / _PS3 methods of some PCI devices and the PCI core calls _PS0 / _PS3 at *noirq* time, so as explained in the commit message which introduced acpi_lpss_do_suspend_late(): https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=48402cee6889fb3fce58e95fea1471626286dc63 We must not only make sure that the suspending of the i2c controller is ordered so that it happens after these PCI devices are suspended, we must also make sure that the i2c controller stays functional until the i2c-controller is put in the suspend-noirq state. If you really want to go this route, we must duplicate the resume_from_noirq flag inside drivers/i2c/busses/i2c-designware-platdrv.c, setting it only for acpi_lpss enumerated devices (the driver handles a whole lot more devices) ans then make the driver's suspend_late method a no-op and instead to the suspend from its suspend_noirq callback. Since pm_generic_suspend_late() is just a wrapper to call dev->driver->pm->suspend_late duplicating the resume_from_noirq flag inside i2c-designware-platdrv.c seems unproductive. Note that we have the same thing going on in acpi_lpss.c with resume_early vs resume_noirq, we call the resume_early callback from acpi_lpss_resume_noirq if the resume_from_noirq flag is set. TL;DR: the behavior you are trying to fix here is intentional and IMHO this patch should be dropped. I guess we could / should do a patch adding a comment that the calling the drivers' suspend_late / resume_early callback at noirq time is intentional to avoid this confusing people in the future. ### Patches 1-3 and 6 look good to me and you may add my: Reviewed-by: Hans de Goede <hdegoede@redhat.com> to those. The comments from this patch also impact patch 5 "ACPI: PM: Introduce "poweroff" callbacks for ACPI PM domain and LPSS" that should be rewritten to call the device's poweroff_late method at noirq time if the resume_from_noirq flag is set. I wonder if patch 5 is necessary though, the acpi_lpss code only deals with a small set of devices, at least i2c-designware-platdrv.c uses SET_LATE_SYSTEM_SLEEP_PM_OPS so the poweroff and suspend methods are 1 and the same. If this is true for all involved drivers (I did not check) then patch 5 is not really necessary. I guess if fixed to deal with resume_from_noirq the same way the suspend methods do this patch cannot hurt and if we every add separate power_off methods to the device drivers later we need this. Regards, Hans > Also notice that acpi_subsys_suspend_late() is not used outside > of the file where it is defined, so make it static and drop its > header for the header file containing it. > > Fixes: 48402cee6889 (ACPI / LPSS: Resume BYT/CHT I2C controllers from resume_noirq) > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > --- > drivers/acpi/acpi_lpss.c | 23 ++++++++++------------- > drivers/acpi/device_pm.c | 3 +-- > include/linux/acpi.h | 2 -- > 3 files changed, 11 insertions(+), 17 deletions(-) > > Index: linux-pm/drivers/acpi/acpi_lpss.c > =================================================================== > --- linux-pm.orig/drivers/acpi/acpi_lpss.c > +++ linux-pm/drivers/acpi/acpi_lpss.c > @@ -1034,34 +1034,31 @@ static int acpi_lpss_resume(struct devic > } > > #ifdef CONFIG_PM_SLEEP > -static int acpi_lpss_do_suspend_late(struct device *dev) > +static int acpi_lpss_suspend_late(struct device *dev) > { > + struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev)); > int ret; > > if (dev_pm_smart_suspend_and_suspended(dev)) > return 0; > > ret = pm_generic_suspend_late(dev); > - return ret ? ret : acpi_lpss_suspend(dev, device_may_wakeup(dev)); > -} > + if (ret) > + return ret; > > -static int acpi_lpss_suspend_late(struct device *dev) > -{ > - struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev)); > - > - if (pdata->dev_desc->resume_from_noirq) > - return 0; > + if (!pdata->dev_desc->resume_from_noirq) > + return acpi_lpss_suspend(dev, device_may_wakeup(dev)); > > - return acpi_lpss_do_suspend_late(dev); > + return 0; > } > > static int acpi_lpss_suspend_noirq(struct device *dev) > { > struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev)); > - int ret; > > - if (pdata->dev_desc->resume_from_noirq) { > - ret = acpi_lpss_do_suspend_late(dev); > + if (!dev_pm_smart_suspend_and_suspended(dev) && > + pdata->dev_desc->resume_from_noirq) { > + int ret = acpi_lpss_suspend(dev, device_may_wakeup(dev)); > if (ret) > return ret; > } > Index: linux-pm/drivers/acpi/device_pm.c > =================================================================== > --- linux-pm.orig/drivers/acpi/device_pm.c > +++ linux-pm/drivers/acpi/device_pm.c > @@ -1069,7 +1069,7 @@ EXPORT_SYMBOL_GPL(acpi_subsys_suspend); > * Carry out the generic late suspend procedure for @dev and use ACPI to put > * it into a low-power state during system transition into a sleep state. > */ > -int acpi_subsys_suspend_late(struct device *dev) > +static int acpi_subsys_suspend_late(struct device *dev) > { > int ret; > > @@ -1079,7 +1079,6 @@ int acpi_subsys_suspend_late(struct devi > ret = pm_generic_suspend_late(dev); > return ret ? ret : acpi_dev_suspend(dev, device_may_wakeup(dev)); > } > -EXPORT_SYMBOL_GPL(acpi_subsys_suspend_late); > > /** > * acpi_subsys_suspend_noirq - Run the device driver's "noirq" suspend callback. > Index: linux-pm/include/linux/acpi.h > =================================================================== > --- linux-pm.orig/include/linux/acpi.h > +++ linux-pm/include/linux/acpi.h > @@ -916,7 +916,6 @@ static inline int acpi_dev_pm_attach(str > int acpi_dev_suspend_late(struct device *dev); > int acpi_subsys_prepare(struct device *dev); > void acpi_subsys_complete(struct device *dev); > -int acpi_subsys_suspend_late(struct device *dev); > int acpi_subsys_suspend_noirq(struct device *dev); > int acpi_subsys_suspend(struct device *dev); > int acpi_subsys_freeze(struct device *dev); > @@ -924,7 +923,6 @@ int acpi_subsys_freeze(struct device *de > static inline int acpi_dev_resume_early(struct device *dev) { return 0; } > static inline int acpi_subsys_prepare(struct device *dev) { return 0; } > static inline void acpi_subsys_complete(struct device *dev) {} > -static inline int acpi_subsys_suspend_late(struct device *dev) { return 0; } > static inline int acpi_subsys_suspend_noirq(struct device *dev) { return 0; } > static inline int acpi_subsys_suspend(struct device *dev) { return 0; } > static inline int acpi_subsys_freeze(struct device *dev) { return 0; } > > > >
On Sat, Jun 29, 2019 at 1:34 PM Hans de Goede <hdegoede@redhat.com> wrote: > > Hi Rafael, > > On 29-06-19 11:50, Rafael J. Wysocki wrote: > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > If the resume_from_noirq flag is set in dev_desc, the ->suspend_late > > callback provided by the device driver will be invoked at the "noirq" > > stage of system suspend, via acpi_lpss_do_suspend_late(), which is > > incorrect. > > > > To fix that, drop acpi_lpss_do_suspend_late() and rearrange > > acpi_lpss_suspend_late() to call pm_generic_suspend_late() > > directly, before calling acpi_lpss_suspend(), in analogy with > > acpi_subsys_suspend_late(). > > Ah now I see the logic in your previous test-patch. > > I'm afraid that this is going to break things though, the calling > of the device-driver's suspend-late method at noirq time is > *intentional* ! But it is a bug too. > The resume_from_noirq flag is only set for i2c controllers which > use: drivers/i2c/busses/i2c-designware-platdrv.c as driver. > > This driver's suspend late method looks like this: > > static int dw_i2c_plat_suspend(struct device *dev) > { > struct dw_i2c_dev *i_dev = dev_get_drvdata(dev); > > i_dev->suspended = true; > > if (i_dev->shared_with_punit) > return 0; > > i_dev->disable(i_dev); > i2c_dw_prepare_clk(i_dev, false); > > return 0; > } > > The i_dev->disable(i_dev) and i2c_dw_prepare_clk(i_dev, false) calls here > will make the i2c controller non functional. But (some of) these i2c > controllers are used by code in the _PS0 / _PS3 methods of some PCI > devices and the PCI core calls _PS0 / _PS3 at *noirq* time, so as explained > in the commit message which introduced acpi_lpss_do_suspend_late(): > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=48402cee6889fb3fce58e95fea1471626286dc63 > > We must not only make sure that the suspending of the i2c controller is > ordered so that it happens after these PCI devices are suspended, we must > also make sure that the i2c controller stays functional until the > i2c-controller is put in the suspend-noirq state. > > If you really want to go this route, we must duplicate the resume_from_noirq > flag inside drivers/i2c/busses/i2c-designware-platdrv.c, setting it > only for acpi_lpss enumerated devices (the driver handles a whole lot more > devices) ans then make the driver's suspend_late method a no-op and instead > to the suspend from its suspend_noirq callback. > > Since pm_generic_suspend_late() is just a wrapper to call dev->driver->pm->suspend_late > duplicating the resume_from_noirq flag inside i2c-designware-platdrv.c seems > unproductive. > > Note that we have the same thing going on in acpi_lpss.c with resume_early vs > resume_noirq, we call the resume_early callback from acpi_lpss_resume_noirq > if the resume_from_noirq flag is set. > > TL;DR: the behavior you are trying to fix here is intentional and > IMHO this patch should be dropped. I can drop the patch, but the current code is simply incorrect. If the driver provided a ->suspend_late callback, it wanted that callback to be invoked during the "late" stage of suspend. Calling it later simply papers over a driver bug. If invoking that callback during the "late" stage doesn't work, the driver should have provided a "noirq" callback instead. > I guess we could / should do a patch adding a comment that the calling > the drivers' suspend_late / resume_early callback at noirq time is intentional > to avoid this confusing people in the future. No. We need to fix drivers doing wrong things. Thanks!
On Sun, Jun 30, 2019 at 12:02 AM Rafael J. Wysocki <rafael@kernel.org> wrote: > > On Sat, Jun 29, 2019 at 1:34 PM Hans de Goede <hdegoede@redhat.com> wrote: > > > > Hi Rafael, > > > > On 29-06-19 11:50, Rafael J. Wysocki wrote: > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > > > If the resume_from_noirq flag is set in dev_desc, the ->suspend_late > > > callback provided by the device driver will be invoked at the "noirq" > > > stage of system suspend, via acpi_lpss_do_suspend_late(), which is > > > incorrect. > > > > > > To fix that, drop acpi_lpss_do_suspend_late() and rearrange > > > acpi_lpss_suspend_late() to call pm_generic_suspend_late() > > > directly, before calling acpi_lpss_suspend(), in analogy with > > > acpi_subsys_suspend_late(). > > > > Ah now I see the logic in your previous test-patch. > > > > I'm afraid that this is going to break things though, the calling > > of the device-driver's suspend-late method at noirq time is > > *intentional* ! > > But it is a bug too. Well, not strictly a bug, but a departure from a clearly established convention. What happens is that the driver provides ->suspend_late and ->resume_early, because the upper layer normally would remove power and power up the device at those stages, respectively. However, due to dependencies between devices, the upper layer kind of works around its own limitation. That is not straightforward at all. I will retain this setup in the current patch series, but going forward it would be good to clean it up. I wonder if using non-PM-runtime device links to represent the dependencies would work in this case.
Index: linux-pm/drivers/acpi/acpi_lpss.c =================================================================== --- linux-pm.orig/drivers/acpi/acpi_lpss.c +++ linux-pm/drivers/acpi/acpi_lpss.c @@ -1034,34 +1034,31 @@ static int acpi_lpss_resume(struct devic } #ifdef CONFIG_PM_SLEEP -static int acpi_lpss_do_suspend_late(struct device *dev) +static int acpi_lpss_suspend_late(struct device *dev) { + struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev)); int ret; if (dev_pm_smart_suspend_and_suspended(dev)) return 0; ret = pm_generic_suspend_late(dev); - return ret ? ret : acpi_lpss_suspend(dev, device_may_wakeup(dev)); -} + if (ret) + return ret; -static int acpi_lpss_suspend_late(struct device *dev) -{ - struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev)); - - if (pdata->dev_desc->resume_from_noirq) - return 0; + if (!pdata->dev_desc->resume_from_noirq) + return acpi_lpss_suspend(dev, device_may_wakeup(dev)); - return acpi_lpss_do_suspend_late(dev); + return 0; } static int acpi_lpss_suspend_noirq(struct device *dev) { struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev)); - int ret; - if (pdata->dev_desc->resume_from_noirq) { - ret = acpi_lpss_do_suspend_late(dev); + if (!dev_pm_smart_suspend_and_suspended(dev) && + pdata->dev_desc->resume_from_noirq) { + int ret = acpi_lpss_suspend(dev, device_may_wakeup(dev)); if (ret) return ret; } Index: linux-pm/drivers/acpi/device_pm.c =================================================================== --- linux-pm.orig/drivers/acpi/device_pm.c +++ linux-pm/drivers/acpi/device_pm.c @@ -1069,7 +1069,7 @@ EXPORT_SYMBOL_GPL(acpi_subsys_suspend); * Carry out the generic late suspend procedure for @dev and use ACPI to put * it into a low-power state during system transition into a sleep state. */ -int acpi_subsys_suspend_late(struct device *dev) +static int acpi_subsys_suspend_late(struct device *dev) { int ret; @@ -1079,7 +1079,6 @@ int acpi_subsys_suspend_late(struct devi ret = pm_generic_suspend_late(dev); return ret ? ret : acpi_dev_suspend(dev, device_may_wakeup(dev)); } -EXPORT_SYMBOL_GPL(acpi_subsys_suspend_late); /** * acpi_subsys_suspend_noirq - Run the device driver's "noirq" suspend callback. Index: linux-pm/include/linux/acpi.h =================================================================== --- linux-pm.orig/include/linux/acpi.h +++ linux-pm/include/linux/acpi.h @@ -916,7 +916,6 @@ static inline int acpi_dev_pm_attach(str int acpi_dev_suspend_late(struct device *dev); int acpi_subsys_prepare(struct device *dev); void acpi_subsys_complete(struct device *dev); -int acpi_subsys_suspend_late(struct device *dev); int acpi_subsys_suspend_noirq(struct device *dev); int acpi_subsys_suspend(struct device *dev); int acpi_subsys_freeze(struct device *dev); @@ -924,7 +923,6 @@ int acpi_subsys_freeze(struct device *de static inline int acpi_dev_resume_early(struct device *dev) { return 0; } static inline int acpi_subsys_prepare(struct device *dev) { return 0; } static inline void acpi_subsys_complete(struct device *dev) {} -static inline int acpi_subsys_suspend_late(struct device *dev) { return 0; } static inline int acpi_subsys_suspend_noirq(struct device *dev) { return 0; } static inline int acpi_subsys_suspend(struct device *dev) { return 0; } static inline int acpi_subsys_freeze(struct device *dev) { return 0; }