diff mbox

drm/i915: fix failure to power off after hibernate

Message ID 1424794340.15554.3.camel@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Imre Deak Feb. 24, 2015, 4:12 p.m. UTC
On ti, 2015-02-24 at 15:58 +0100, Bjørn Mork wrote:
> This fixes a bug where hibernation completes, but the system
> fails to power off after the image has been saved.
> 
> Bisection lead to commit da2bc1b9db33 ("drm/i915: add poweroff_late
> handler") which added a .poweroff_late hook pointing to the same
> function as the .freeze_late hook, without any justification or
> explanation why this would be correct, except "for consistency".
> 
> The assumption that this "makes the power off handling identical to
> the S3 suspend and S4 freeze handling" is completely bogus. As clearly
> documented in Documentation/power/devices.txt, the poweroff* hooks
> are part of the hibernation API along with the freeze* hooks.  The
> phases involved in hibernation are:
> 
>    prepare, freeze, freeze_late, freeze_noirq, thaw_noirq, thaw_early,
>    thaw, complete, prepare, poweroff, poweroff_late, poweroff_noirq
> 
> With the above sequence in mind, it should be fairly obvious that you
> cannot save registers and disable the device in both the freeze_late
> and poweroff_late phases.  Or as Documentation/power/devices.txt
> explains it:
> 
>  The poweroff, poweroff_late and poweroff_noirq callbacks should do essentially
>  the same things as the suspend, suspend_late and suspend_noirq callbacks,
>  respectively.  The only notable difference is that they need not store the
>  device register values, because the registers should already have been stored
>  during the freeze, freeze_late or freeze_noirq phases.
> 
> The "consistency" between suspend and hibernate was already in
> place, with freeze_late pointing to the same function as suspend_late.
> There is no need for any poweroff_late hook in this driver.

The poweroff handlers undo the actions of the thaw handlers. As the
original commit stated saving the registers is not needed there, but
it's also not a big overhead and there should be no problem doing it. We
are planning to optimize the hibernation sequence by for example not
shutting down the display between freeze and thaw, and also getting rid
of unnecessary steps at the power off phase. But before that we want to
actually unify things rather than having special cases, as maintaining
the special code paths caused already quite a lot of problems for us so
far.

Reverting the commit may hide some other issue, so before doing that
could you try the following patch:
http://lists.freedesktop.org/archives/intel-gfx/2015-February/060529.html

If with that you still get the hang could you try on top of that the
patch below, first having only pci_set_power_state uncommented, then
both pci_set_power_state and pci_disable_device uncommented?

Thanks,
Imre

Comments

Bjørn Mork Feb. 24, 2015, 7 p.m. UTC | #1
Imre Deak <imre.deak@intel.com> writes:

> The poweroff handlers undo the actions of the thaw handlers. As the
> original commit stated saving the registers is not needed there, but
> it's also not a big overhead and there should be no problem doing it. We
> are planning to optimize the hibernation sequence by for example not
> shutting down the display between freeze and thaw, and also getting rid
> of unnecessary steps at the power off phase. But before that we want to
> actually unify things rather than having special cases, as maintaining
> the special code paths caused already quite a lot of problems for us so
> far.

That sounds like a worthy goal.  I don't understand what you hope to
achieve by having a poweroff_late hook, since there aren't really
anything useful left to do at the point it is called, but if you want a
dummy callback there for code structure reasons then fine.

But you cannot just run around breaking stuff while slowly moving
towards this goal over multiple releases... v3.19 is currently broken
and that seems totally unnecessary.

In any case: You should have noticed this problem while testing your
patches.  The breakage is 100% reproducible. Unfortunately I had to do a
bisect to realize what you had done to the i915 driver, something I
unfortunately didn't find time to do before v3.19 was released.  But I
do find it unnecessary to release with such bugs.  Any attempt to
exercise the code path you modified would have revealed the bug.


> Reverting the commit may hide some other issue, so before doing that
> could you try the following patch:
> http://lists.freedesktop.org/archives/intel-gfx/2015-February/060529.html

Makes no difference.  I assume that patch fixes an unrelated bug? The
age and reported symptoms indicates so.  Note that I am reporting a
regression introduced after v3.18, while that seems to fix a bug
introduced in v3.17. Both v3.17 and v3.18 (including v3.18.6), as
well as earlier releases, work fine for me.

> If with that you still get the hang could you try on top of that the
> patch below, first having only pci_set_power_state uncommented, then
> both pci_set_power_state and pci_disable_device uncommented?

That patch fixes the problem, with only pci_set_power_state commented
out.  Do you still want me to try with pci_disable_device() commented
out as well?



Bjørn
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.c
b/drivers/gpu/drm/i915/i915_drv.c
index 4badb23..dc91d4b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -968,6 +968,23 @@  static int i915_pm_suspend_late(struct device *dev)
 	return i915_drm_suspend_late(drm_dev);
 }
 
+static int i915_pm_poweroff_late(struct device *dev)
+{
+	struct drm_device *drm_dev = dev_to_i915(dev)->dev;
+	struct drm_i915_private *dev_priv = drm_dev->dev_private;
+	int ret;
+
+	ret = intel_suspend_complete(dev_priv);
+
+	if (ret)
+		return ret;
+
+	pci_disable_device(drm_dev->pdev);
+//	pci_set_power_state(drm_dev->pdev, PCI_D3hot);
+
+	return 0;
+}
+
 static int i915_pm_resume_early(struct device *dev)
 {
 	struct drm_device *drm_dev = dev_to_i915(dev)->dev;
@@ -1535,7 +1552,7 @@  static const struct dev_pm_ops i915_pm_ops = {
 	.thaw_early = i915_pm_resume_early,
 	.thaw = i915_pm_resume,
 	.poweroff = i915_pm_suspend,
-	.poweroff_late = i915_pm_suspend_late,
+	.poweroff_late = i915_pm_poweroff_late,
 	.restore_early = i915_pm_resume_early,
 	.restore = i915_pm_resume,