diff mbox

drm/i915: enable BIOS hang workaround for Lenovo T60 too

Message ID 1434694676-2039-1-git-send-email-mikko.rapeli@iki.fi (mailing list archive)
State New, archived
Headers show

Commit Message

Mikko Rapeli June 19, 2015, 6:17 a.m. UTC
When trying to hibernate a Lenovo T60 the half moon led keeps blinking and
devices does not power off since commit da2bc1b9db3.

T60 chip details:

00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, 943/940GM
L Express Integrated Graphics Controller (rev 03) (prog-if 00 [VGA controller])
        Subsystem: Lenovo ThinkPad R60/T60/X60 series
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Step
ping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at ee100000 (32-bit, non-prefetchable) [size=512K]
        Region 1: I/O ports at 1800 [size=8]
        Region 2: Memory at d0000000 (32-bit, prefetchable) [size=256M]
        Region 3: Memory at ee200000 (32-bit, non-prefetchable) [size=256K]
        Expansion ROM at <unassigned> [disabled]
        Capabilities: [90] MSI: Enable- Count=1/1 Maskable- 64bit-
                Address: 00000000  Data: 0000
        Capabilities: [d0] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Kernel driver in use: i915

Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
---
 drivers/gpu/drm/i915/i915_drv.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Daniel Vetter June 19, 2015, 3:44 p.m. UTC | #1
On Fri, Jun 19, 2015 at 08:17:55AM +0200, Mikko Rapeli wrote:
> When trying to hibernate a Lenovo T60 the half moon led keeps blinking and
> devices does not power off since commit da2bc1b9db3.
> 
> T60 chip details:
> 
> 00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, 943/940GM
> L Express Integrated Graphics Controller (rev 03) (prog-if 00 [VGA controller])
>         Subsystem: Lenovo ThinkPad R60/T60/X60 series
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Step
> ping- SERR- FastB2B- DisINTx-
>         Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort-
> <MAbort- >SERR- <PERR- INTx-
>         Latency: 0
>         Interrupt: pin A routed to IRQ 16
>         Region 0: Memory at ee100000 (32-bit, non-prefetchable) [size=512K]
>         Region 1: I/O ports at 1800 [size=8]
>         Region 2: Memory at d0000000 (32-bit, prefetchable) [size=256M]
>         Region 3: Memory at ee200000 (32-bit, non-prefetchable) [size=256K]
>         Expansion ROM at <unassigned> [disabled]
>         Capabilities: [90] MSI: Enable- Count=1/1 Maskable- 64bit-
>                 Address: 00000000  Data: 0000
>         Capabilities: [d0] Power Management version 2
>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>         Kernel driver in use: i915
> 
> Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
> ---
>  drivers/gpu/drm/i915/i915_drv.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index ec4d932..36e311e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -641,11 +641,12 @@ static int i915_drm_suspend_late(struct drm_device *drm_dev, bool hibernation)
>  	 * the device even though it's already in D3 and hang the machine. So
>  	 * leave the device in D0 on those platforms and hope the BIOS will
>  	 * power down the device properly. Platforms where this was seen:
> -	 * Lenovo Thinkpad X301, X61s
> +	 * Lenovo Thinkpad X301, X61s, T60
>  	 */
>  	if (!(hibernation &&
>  	      drm_dev->pdev->subsystem_vendor == PCI_VENDOR_ID_LENOVO &&
> -	      INTEL_INFO(dev_priv)->gen == 4))
> +	      ((INTEL_INFO(dev_priv)->gen == 3) ||
> +	       (INTEL_INFO(dev_priv)->gen == 4)))

I wonder whether we shouldn't do this unconditionally for gen4 and earlier
for Lenovo ... Anyway this needs Cc: stable@vger.kernel.org and is for
Jani to pick up.

Thanks for figuring out what's been broken here.
-Daniel

>  		pci_set_power_state(drm_dev->pdev, PCI_D3hot);
>  
>  	return 0;
> -- 
> 2.1.4
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
Paul Bolle June 19, 2015, 4:36 p.m. UTC | #2
On Fri, 2015-06-19 at 17:44 +0200, Daniel Vetter wrote:
> I wonder whether we shouldn't do this unconditionally for gen4 and earlier
> for Lenovo ... Anyway this needs Cc: stable@vger.kernel.org and is for
> Jani to pick up.
> 
> Thanks for figuring out what's been broken here.
> -Daniel
> 
> >  		pci_set_power_state(drm_dev->pdev, PCI_D3hot);
> >  
> >  	return 0;

Just two days ago we discussed this bug too, see
https://lkml.org/lkml/2015/6/17/327 . That message contains a link to a
patch I cobbled together for yet another ThinkPad hitting this, but with
PCI_SUBVENDOR_ID_IBM involved.


Paul Bolle
Mikko Rapeli June 19, 2015, 5:17 p.m. UTC | #3
On Fri Jun 19 17:44:31 2015 GMT+0200, Daniel Vetter wrote:
> On Fri, Jun 19, 2015 at 08:17:55AM +0200, Mikko Rapeli wrote:

> > When trying to hibernate a Lenovo T60 the half moon led keeps blinking and

> > devices does not power off since commit da2bc1b9db3.

> > 

> > T60 chip details:

> > 

> > 00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, 943/940GM

> > L Express Integrated Graphics Controller (rev 03) (prog-if 00 [VGA controller])

> >         Subsystem: Lenovo ThinkPad R60/T60/X60 series

> >         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Step

> > ping- SERR- FastB2B- DisINTx-

> >         Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort-

> > <MAbort- >SERR- <PERR- INTx-

> >         Latency: 0

> >         Interrupt: pin A routed to IRQ 16

> >         Region 0: Memory at ee100000 (32-bit, non-prefetchable) [size=512K]

> >         Region 1: I/O ports at 1800 [size=8]

> >         Region 2: Memory at d0000000 (32-bit, prefetchable) [size=256M]

> >         Region 3: Memory at ee200000 (32-bit, non-prefetchable) [size=256K]

> >         Expansion ROM at <unassigned> [disabled]

> >         Capabilities: [90] MSI: Enable- Count=1/1 Maskable- 64bit-

> >                 Address: 00000000  Data: 0000

> >         Capabilities: [d0] Power Management version 2

> >                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)

> >                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-

> >         Kernel driver in use: i915

> > 

> > Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>

> > ---

> >  drivers/gpu/drm/i915/i915_drv.c | 5 +++--

> >  1 file changed, 3 insertions(+), 2 deletions(-)

> > 

> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c

> > index ec4d932..36e311e 100644

> > --- a/drivers/gpu/drm/i915/i915_drv.c

> > +++ b/drivers/gpu/drm/i915/i915_drv.c

> > @@ -641,11 +641,12 @@ static int i915_drm_suspend_late(struct drm_device *drm_dev, bool hibernation)

> >  	 * the device even though it's already in D3 and hang the machine. So

> >  	 * leave the device in D0 on those platforms and hope the BIOS will

> >  	 * power down the device properly. Platforms where this was seen:

> > -	 * Lenovo Thinkpad X301, X61s

> > +	 * Lenovo Thinkpad X301, X61s, T60

> >  	 */

> >  	if (!(hibernation &&

> >  	      drm_dev->pdev->subsystem_vendor == PCI_VENDOR_ID_LENOVO &&

> > -	      INTEL_INFO(dev_priv)->gen == 4))

> > +	      ((INTEL_INFO(dev_priv)->gen == 3) ||

> > +	       (INTEL_INFO(dev_priv)->gen == 4)))

> 

> I wonder whether we shouldn't do this unconditionally for gen4 and earlier

> for Lenovo ... Anyway this needs Cc: stable@vger.kernel.org and is for

> Jani to pick up.



Yes, I also tought about enabling this for all older Lenovo devices but I don't know the hardware or chips and can't test so I let it be. Also reverting the original bug causing patch could be an option since this is affecting so many devices and users.
 
> Thanks for figuring out what's been broken here.


You're welcome.

-Mikko

> -Daniel

> 

> >  		pci_set_power_state(drm_dev->pdev, PCI_D3hot);

> >  

> >  	return 0;

> > -- 

> > 2.1.4

> > 

> > _______________________________________________

> > dri-devel mailing list

> > dri-devel@lists.freedesktop.org

> > http://lists.freedesktop.org/mailman/listinfo/dri-devel

> 

> -- 

> Daniel Vetter

> Software Engineer, Intel Corporation

> http://blog.ffwll.ch

>


-- 
Sent from my Jolla
Imre Deak June 22, 2015, 1:43 p.m. UTC | #4
On pe, 2015-06-19 at 18:36 +0200, Paul Bolle wrote:
> On Fri, 2015-06-19 at 17:44 +0200, Daniel Vetter wrote:
> > I wonder whether we shouldn't do this unconditionally for gen4 and earlier
> > for Lenovo ... Anyway this needs Cc: stable@vger.kernel.org and is for
> > Jani to pick up.
> > 
> > Thanks for figuring out what's been broken here.
> > -Daniel
> > 
> > >  		pci_set_power_state(drm_dev->pdev, PCI_D3hot);
> > >  
> > >  	return 0;
> 
> Just two days ago we discussed this bug too, see
> https://lkml.org/lkml/2015/6/17/327 . That message contains a link to a
> patch I cobbled together for yet another ThinkPad hitting this, but with
> PCI_SUBVENDOR_ID_IBM involved.

To summarize, since we extended the range of platforms to apply the
workaround in
commit ab3be73fa7b43f4c3648ce29b5fd649ea54d3adb
Author: Imre Deak <imre.deak@intel.com>
Date:   Mon Mar 2 13:04:41 2015 +0200

    drm/i915: gen4: work around hang during hibernation

we had the following reports I know of with the same issue:
- Acer Aspire 1830T by Ilya (Gen5) [1]
- Fujitsu FSC S7110 by Dirk (Gen4.5) [2]
- ThinkPad X60 by Pavel (Gen4.5) [3]
- ThinkPad T60 by Mikko (Gen4.5) [4]
- ThinkPad X41 by Paul (Gen3) [5]

Based on this I would give up on a vendor specific blacklist and apply
the workaround for anything < GEN6.

About completely reverting the original
commit da2bc1b9db3351addd293e5b82757efe1f77ed1d
Author: Imre Deak <imre.deak@intel.com>
Date:   Thu Oct 23 19:23:26 2014 +0300

    drm/i915: add poweroff_late handler

I still believe that the normal thing to do is to power off the device
during S4. This is the default action taken by the kernel's PCI core for
every device. S4 is not a state where it'd be guaranteed that all
devices are powered off, there may be wake-up devices that are still
powered for example, so powering off any devices explicitly that are not
wake-up sources makes sense to me. I think we need a point where we stop
applying this workaround and GEN6 seems like a good point for that,
since I haven't seen any report past GEN5.

--Imre

[1] https://bugzilla.kernel.org/show_bug.cgi?id=94241#c8
[2] https://bugzilla.kernel.org/show_bug.cgi?id=95061
[3] https://lkml.org/lkml/2015/6/17/404
[4] http://lists.freedesktop.org/archives/intel-gfx/2015-June/069305.html
[5] https://lkml.org/lkml/2015/3/18/133
Mikko Rapeli June 23, 2015, 10:42 a.m. UTC | #5
Hi Imre,

On Mon, Jun 22, 2015 at 04:43:50PM +0300, Imre Deak wrote:
>
> To summarize, since we extended the range of platforms to apply the
> workaround in
> commit ab3be73fa7b43f4c3648ce29b5fd649ea54d3adb
> Author: Imre Deak <imre.deak@intel.com>
> Date:   Mon Mar 2 13:04:41 2015 +0200
> 
>     drm/i915: gen4: work around hang during hibernation
> 
> we had the following reports I know of with the same issue:
> - Acer Aspire 1830T by Ilya (Gen5) [1]
> - Fujitsu FSC S7110 by Dirk (Gen4.5) [2]
> - ThinkPad X60 by Pavel (Gen4.5) [3]
> - ThinkPad T60 by Mikko (Gen4.5) [4]
> - ThinkPad X41 by Paul (Gen3) [5]
> 
> Based on this I would give up on a vendor specific blacklist and apply
> the workaround for anything < GEN6.

I would not trust just user feedback on this. Do you have some HW knowledge
from chips, reference desings on mother boards, and BIOS knowledge
which supports enabling this poweroff_late and related workaround for all
devices older than GEN6? Also, how does it work on Windows and on OS X
(if that's relevant)?

> About completely reverting the original
> commit da2bc1b9db3351addd293e5b82757efe1f77ed1d
> Author: Imre Deak <imre.deak@intel.com>
> Date:   Thu Oct 23 19:23:26 2014 +0300
> 
>     drm/i915: add poweroff_late handler
> 
> I still believe that the normal thing to do is to power off the device
> during S4. This is the default action taken by the kernel's PCI core for
> every device. S4 is not a state where it'd be guaranteed that all
> devices are powered off, there may be wake-up devices that are still
> powered for example, so powering off any devices explicitly that are not
> wake-up sources makes sense to me. I think we need a point where we stop
> applying this workaround and GEN6 seems like a good point for that,
> since I haven't seen any report past GEN5.

As a Linux user, I'd just like to be more confident that da2bc1b9db3351add and
workaround for all chips older than GEN6 is the right thing to do. User
testing shouldn't be the only criteria.

-Mikko

> --Imre
> 
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=94241#c8
> [2] https://bugzilla.kernel.org/show_bug.cgi?id=95061
> [3] https://lkml.org/lkml/2015/6/17/404
> [4] http://lists.freedesktop.org/archives/intel-gfx/2015-June/069305.html
> [5] https://lkml.org/lkml/2015/3/18/133
> 
>
Imre Deak June 23, 2015, 12:23 p.m. UTC | #6
On ti, 2015-06-23 at 13:42 +0300, Mikko Rapeli wrote:
> Hi Imre,
> 
> On Mon, Jun 22, 2015 at 04:43:50PM +0300, Imre Deak wrote:
> >
> > To summarize, since we extended the range of platforms to apply the
> > workaround in
> > commit ab3be73fa7b43f4c3648ce29b5fd649ea54d3adb
> > Author: Imre Deak <imre.deak@intel.com>
> > Date:   Mon Mar 2 13:04:41 2015 +0200
> > 
> >     drm/i915: gen4: work around hang during hibernation
> > 
> > we had the following reports I know of with the same issue:
> > - Acer Aspire 1830T by Ilya (Gen5) [1]
> > - Fujitsu FSC S7110 by Dirk (Gen4.5) [2]
> > - ThinkPad X60 by Pavel (Gen4.5) [3]
> > - ThinkPad T60 by Mikko (Gen4.5) [4]
> > - ThinkPad X41 by Paul (Gen3) [5]
> > 
> > Based on this I would give up on a vendor specific blacklist and apply
> > the workaround for anything < GEN6.
> 
> I would not trust just user feedback on this. Do you have some HW knowledge
> from chips, reference desings on mother boards, and BIOS knowledge
> which supports enabling this poweroff_late and related workaround for all
> devices older than GEN6?

I don't have knowledge about the BIOS internals and I'm not sure how
reasonable it is to know about each vendor's implementation. What we
know is that ACPI mandates that the OS puts non-wakeup devices into PCI
D3 state during S4.

> Also, how does it work on Windows and on OS X (if that's relevant)?

I don't have the info on this. It would be interesting to know, but I
wouldn't feel completely confident either to just do the same as a given
version of a given OS.

> > About completely reverting the original
> > commit da2bc1b9db3351addd293e5b82757efe1f77ed1d
> > Author: Imre Deak <imre.deak@intel.com>
> > Date:   Thu Oct 23 19:23:26 2014 +0300
> > 
> >     drm/i915: add poweroff_late handler
> > 
> > I still believe that the normal thing to do is to power off the device
> > during S4. This is the default action taken by the kernel's PCI core for
> > every device. S4 is not a state where it'd be guaranteed that all
> > devices are powered off, there may be wake-up devices that are still
> > powered for example, so powering off any devices explicitly that are not
> > wake-up sources makes sense to me. I think we need a point where we stop
> > applying this workaround and GEN6 seems like a good point for that,
> > since I haven't seen any report past GEN5.
> 
> As a Linux user, I'd just like to be more confident that da2bc1b9db3351add and
> workaround for all chips older than GEN6 is the right thing to do. User
> testing shouldn't be the only criteria.

We had the workaround in place before da2bc1b9db335 for all platforms,
meaning we kept the device in D0 state during hibernation. What we are
aiming at now is to remove this workaround to align with the ACPI spec
and to prepare for a future state where we can simply leave the device
in runtime suspended state if it was already runtime suspended when
hibernation starts.

Removing the workaround turned out to be a bad idea on old platforms,
where we have BIOSes that depend on the GFX device being in D0 state
during hibernation (contradicting the spec).

I don't have a better idea now to address the above issues than
restoring the workaround for those old platforms.

--Imre
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index ec4d932..36e311e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -641,11 +641,12 @@  static int i915_drm_suspend_late(struct drm_device *drm_dev, bool hibernation)
 	 * the device even though it's already in D3 and hang the machine. So
 	 * leave the device in D0 on those platforms and hope the BIOS will
 	 * power down the device properly. Platforms where this was seen:
-	 * Lenovo Thinkpad X301, X61s
+	 * Lenovo Thinkpad X301, X61s, T60
 	 */
 	if (!(hibernation &&
 	      drm_dev->pdev->subsystem_vendor == PCI_VENDOR_ID_LENOVO &&
-	      INTEL_INFO(dev_priv)->gen == 4))
+	      ((INTEL_INFO(dev_priv)->gen == 3) ||
+	       (INTEL_INFO(dev_priv)->gen == 4)))
 		pci_set_power_state(drm_dev->pdev, PCI_D3hot);
 
 	return 0;