
x86, suspend: Save/restore THERM_CONTROL register for suspend

Message ID 1439800192-3034-1-git-send-email-yu.c.chen@intel.com (mailing list archive)
State Not Applicable, archived

Commit Message

Chen Yu Aug. 17, 2015, 8:29 a.m. UTC
A bug was reported (https://bugzilla.redhat.com/show_bug.cgi?id=1227208):
after resuming from S3, the CPU runs at a low speed.
Investigation showed that the BIOS modifies the THERM_CONTROL register
during S3, changing it from 0 to 0x10. The latter value means the CPU
only gets 25% of the duty cycle, which causes the slowdown.

Simple scenario to reproduce:
1. Boot up the system
2. Read the MSR at address 0x19a; it should output 0
3. Put the system to sleep, then wake it up
4. Read the MSR at address 0x19a again; it should output 0 (it actually outputs 0x10)
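
The two MSR reads in the reproduction steps can be scripted from user space. The following is a minimal sketch, assuming the `msr` kernel module is loaded and the program runs as root, so that `/dev/cpu/0/msr` exists and the file offset selects the register. The helper names `read_msr_at` and `print_therm_control` are made up for this illustration and are not part of the patch.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for pread() under strict C modes */
#endif
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define MSR_IA32_THERM_CONTROL 0x19a

/*
 * Read one 64-bit value from an msr-style device node, where the file
 * offset is the MSR address. Returns 0 on success, -1 on any failure.
 */
int read_msr_at(const char *path, uint32_t reg, uint64_t *value)
{
	int fd = open(path, O_RDONLY);
	ssize_t n;

	if (fd < 0)
		return -1;

	n = pread(fd, value, sizeof(*value), (off_t)reg);
	close(fd);
	return n == (ssize_t)sizeof(*value) ? 0 : -1;
}

/* Print THERM_CONTROL for CPU 0; meant to be run before and after S3. */
int print_therm_control(void)
{
	uint64_t value;

	if (read_msr_at("/dev/cpu/0/msr", MSR_IA32_THERM_CONTROL, &value) != 0) {
		fprintf(stderr, "cannot read MSR 0x19a (msr module loaded? root?)\n");
		return -1;
	}
	printf("MSR 0x19a = 0x%llx\n", (unsigned long long)value);
	return 0;
}
```

Calling `print_therm_control()` once after boot and once after resume gives the two values to compare. Only CPU 0 is read here; other CPUs would need their own `/dev/cpu/N/msr` node.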

Although this is a BIOS issue, it would be more robust for Linux to handle
this situation. This patch fixes the issue by saving and restoring the
THERM_CONTROL (now called CLOCK_MODULATION) register across suspend/resume.

Tested-by: Marcin Kaszewski <marcin.kaszewski@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
---
 arch/x86/include/asm/suspend_64.h | 1 +
 arch/x86/power/cpu.c              | 2 ++
 2 files changed, 3 insertions(+)

Comments

Ingo Molnar Aug. 17, 2015, 10:11 a.m. UTC | #1
* Chen Yu <yu.c.chen@intel.com> wrote:

> A bug is reported(https://bugzilla.redhat.com/show_bug.cgi?id=1227208)
> that, after resuming from S3, CPU is working at a low speed.
> After investigation, it is found that, BIOS has modified the value
> of THERM_CONTROL register during S3, changes it from 0 to 0x10,
> while the latter means CPU can only get 25% of the Duty Cycle,
> and this caused the problem.
> 
> Simple scenario to reproduce:
> 1.Boot up system
> 2.Get MSR with address 0x19a, it should output 0
> 3.Put system into sleep, then wake up
> 4.Get MSR with address 0x19a, it should output 0(actual it outputs 0x10)
> 
> Although this is a BIOS issue, it would be more robust for linux to deal
> with this situation. This patch fixes this issue by saving/restoring
> THERM_CONTROL(now called CLOCK_MODULATION) register on suspend/resume.
> 
> Tested-by: Marcin Kaszewski <marcin.kaszewski@intel.com>
> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> ---
>  arch/x86/include/asm/suspend_64.h | 1 +
>  arch/x86/power/cpu.c              | 2 ++
>  2 files changed, 3 insertions(+)
> 
> diff --git a/arch/x86/include/asm/suspend_64.h b/arch/x86/include/asm/suspend_64.h
> index 7ebf0eb..b9f5591 100644
> --- a/arch/x86/include/asm/suspend_64.h
> +++ b/arch/x86/include/asm/suspend_64.h
> @@ -25,6 +25,7 @@ struct saved_context {
>  	u64 misc_enable;
>  	bool misc_enable_saved;
>  	unsigned long efer;
> +	unsigned long clock_modulation;
>  	u16 gdt_pad; /* Unused */
>  	struct desc_ptr gdt_desc;
>  	u16 idt_pad;
> diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
> index 9ab5279..f82577b 100644
> --- a/arch/x86/power/cpu.c
> +++ b/arch/x86/power/cpu.c
> @@ -97,6 +97,7 @@ static void __save_processor_state(struct saved_context *ctxt)
>  	mtrr_save_fixed_ranges(NULL);
>  
>  	rdmsrl(MSR_EFER, ctxt->efer);
> +	rdmsrl(MSR_IA32_THERM_CONTROL, ctxt->clock_modulation);

So what your changelog fails to mention:

 - You only add this code to the 64-bit kernel. Are 32-bit kernels not affected?

 - the MSR read is done unconditionally. Is MSR_IA32_THERM_CONTROL available
   architecturally and readable (and has sensible values) on all 64-bit capable
   x86 CPUs that run this code path?

Thanks,

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chen Yu Aug. 17, 2015, 11:43 a.m. UTC | #2
Hi, Ingo, thanks for your review,
On Mon, 2015-08-17 at 12:11 +0200, Ingo Molnar wrote:
> So what your changelog fails to mention:
> 
>  - You only add this code to the 64-bit kernel. Are 32-bit kernels not affected?
Yes, 32-bit kernel should also do the save/restore operation.
I'll adjust them to 64/32-bit common path.
> 
>  - the MSR read is done unconditionally. Is MSR_IA32_THERM_CONTROL available
>    architecturally and readable (and has sensible values) on all 64-bit capable
>    x86 CPUs that run this code path?
MSR_IA32_THERM_CONTROL is available on Intel Pentium 4, Xeon, Pentium M and later
processors, so I think not all the 64/32-bit capable x86 CPUs have this
register. Maybe codes like the following would be more reasonable?

save:
ctxt->clock_modulation_saved = !rdmsrl_safe(MSR_IA32_THERM_CONTROL,
	&ctxt->clock_modulation);

restore:
if (ctxt->clock_modulation_saved)
	wrmsrl(MSR_IA32_THERM_CONTROL, ctxt->clock_modulation);

Thanks a lot.

Best Regards,
Yu
Pavel Machek Aug. 17, 2015, 1:27 p.m. UTC | #3
On Mon 2015-08-17 12:11:15, Ingo Molnar wrote:
> 
> * Chen Yu <yu.c.chen@intel.com> wrote:
> 
> > A bug is reported(https://bugzilla.redhat.com/show_bug.cgi?id=1227208)

Access denied :-(

> > that, after resuming from S3, CPU is working at a low speed.
> > After investigation, it is found that, BIOS has modified the value
> > of THERM_CONTROL register during S3, changes it from 0 to 0x10,
> > while the latter means CPU can only get 25% of the Duty Cycle,
> > and this caused the problem.

What HW is this on?

> > --- a/arch/x86/power/cpu.c
> > +++ b/arch/x86/power/cpu.c
> > @@ -97,6 +97,7 @@ static void __save_processor_state(struct saved_context *ctxt)
> >  	mtrr_save_fixed_ranges(NULL);
> >  
> >  	rdmsrl(MSR_EFER, ctxt->efer);
> > +	rdmsrl(MSR_IA32_THERM_CONTROL, ctxt->clock_modulation);
> 
> So what your changelog fails to mention:
> 
>  - You only add this code to the 64-bit kernel. Are 32-bit kernels not affected?
> 
>  - the MSR read is done unconditionally. Is MSR_IA32_THERM_CONTROL available
>    architecturally and readable (and has sensible values) on all 64-bit capable
>    x86 CPUs that run this code path?

- So BIOS expects to control MSR_IA32_THERM_CONTROL. Now you suspend
  in a hot environment but resume in a cool one. BIOS sets up
  MSR_IA32_THERM_CONTROL the right way, but you override it.

  As BIOS expects to control MSR_IA32_THERM_CONTROL and machine is
  kept cool, BIOS will not write new value to it, and machine will
  keep running slowly.

Doing this unconditionally is asking for trouble. Blacklist entry with
affected BIOS info might be acceptable, but...
									Pavel
Chen Yu Aug. 18, 2015, 2:02 a.m. UTC | #4
(not sure if the previous reply has been sent out)
Hi, Ingo, thanks for your review,
On 08/17/2015 06:11 PM, Ingo Molnar wrote:
>
> * Chen Yu <yu.c.chen@intel.com> wrote:
>
>
> So what your changelog fails to mention:
>
>   - You only add this code to the 64-bit kernel. Are 32-bit kernels not affected?
>
I missed the 32-bit case, I'll adjust them to 32/64-bit common path.
>   - the MSR read is done unconditionally. Is MSR_IA32_THERM_CONTROL available
>     architecturally and readable (and has sensible values) on all 64-bit capable
>     x86 CPUs that run this code path?
>
MSR_IA32_THERM_CONTROL is available on Intel Pentium 4, Xeon, Pentium M
and later processors, so I think not all of the 32/64-bit x86 CPUs have
this register. Maybe code like this would be more reasonable?
save:
ctxt->clock_modulation_saved = !rdmsrl_safe(MSR_IA32_THERM_CONTROL,
	&ctxt->clock_modulation);
restore:
if (ctxt->clock_modulation_saved)
	wrmsrl(MSR_IA32_THERM_CONTROL, ctxt->clock_modulation);
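
The guarded save/restore proposed above can be exercised outside the kernel with a small model. This is only a sketch under stated assumptions: `struct fake_cpu`, `fake_rdmsrl_safe()` and `fake_wrmsrl()` are hypothetical stand-ins for the hardware and for the kernel's `rdmsrl_safe()`/`wrmsrl()`, and the `saved_context` here carries just the two fields relevant to this discussion. It keeps the convention that the safe read returns 0 on success and nonzero when the access faults.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one CPU; `present` is false on CPUs lacking the MSR. */
struct fake_cpu {
	bool present;
	uint64_t therm_control;
};

/* Reduced context: only the fields this sketch needs. */
struct saved_context {
	uint64_t clock_modulation;
	bool clock_modulation_saved;
};

/* Stand-in for rdmsrl_safe(): 0 on success, nonzero if the MSR is absent. */
int fake_rdmsrl_safe(const struct fake_cpu *cpu, uint64_t *val)
{
	if (!cpu->present)
		return -1;
	*val = cpu->therm_control;
	return 0;
}

/* Stand-in for wrmsrl(): unconditionally stores the value. */
void fake_wrmsrl(struct fake_cpu *cpu, uint64_t val)
{
	cpu->therm_control = val;
}

/* Suspend path: remember the value only if the read succeeded. */
void save_clock_modulation(const struct fake_cpu *cpu,
			   struct saved_context *ctxt)
{
	ctxt->clock_modulation_saved =
		!fake_rdmsrl_safe(cpu, &ctxt->clock_modulation);
}

/* Resume path: write back only what was actually saved. */
void restore_clock_modulation(struct fake_cpu *cpu,
			      const struct saved_context *ctxt)
{
	if (ctxt->clock_modulation_saved)
		fake_wrmsrl(cpu, ctxt->clock_modulation);
}
```

With this shape, a CPU without the register is never written on resume, because the `clock_modulation_saved` flag stays false when the read fails.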


Best Regards,
Yu
> Thanks,
>
> 	Ingo
Chen Yu Aug. 18, 2015, 2:23 a.m. UTC | #5
Hi Pavel, thanks for your review,
On 08/17/2015 09:27 PM, Pavel Machek wrote:
> On Mon 2015-08-17 12:11:15, Ingo Molnar wrote:
>>
>> * Chen Yu <yu.c.chen@intel.com> wrote:
>>
>>> A bug is reported(https://bugzilla.redhat.com/show_bug.cgi?id=1227208)
>
> Access denied :-(
>
You might need to register to access it.
>
> What HW is this on?
>
Intel Braswell and Broadwell, detail for Broadwell:
Platform: MayanCity
Processor: 2x BROADWELL BDX_EP A0 QHPR
Chipset: Wellsburg B1 QR7E

>
> - So BIOS expects to control MSR_IA32_THERM_CONTROL. Now you suspend
>    in a hot environment but resume in a cool one. BIOS sets up
>    MSR_IA32_THERM_CONTROL the right way, but you override it.
>
>    As BIOS expects to control MSR_IA32_THERM_CONTROL and machine is
>    kept cool, BIOS will not write new value to it, and machine will
>    keep running slowly.
>
Sorry, I don't quite follow. Do you mean we should let the
BIOS modify MSR_IA32_THERM_CONTROL and rely on Linux
to adjust this value at runtime (after S3)?

> Doing this unconditionally is asking for trouble. Blacklist entry with
> affected BIOS info might be acceptable, but...
> 									Pavel
>
You mean a quirk here (according to DMI info, etc.)?


Best Regards,
Yu

Pavel Machek Aug. 18, 2015, 8:02 a.m. UTC | #6
Hi!

> >>>A bug is reported(https://bugzilla.redhat.com/show_bug.cgi?id=1227208)
> >
> >Access denied :-(
> >
> Might need to register for accessing.
> >
> >What HW is this on?
> >
> Intel Braswell and Broadwell, detail for Broadwell:
> Platform: MayanCity
> Processor: 2x BROADWELL BDX_EP A0 QHPR
> Chipset: Wellsburg B1 QR7E

So it is desktop board?

> >Doing this unconditionally is asking for trouble. Blacklist entry with
> >affected BIOS info might be acceptable, but...
> >
> You mean a quirk here (according to DMI info, etc.)?

Yes, please.
									Pavel
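
A quirk of the kind agreed on here would limit the restore to known-affected machines. The sketch below models that idea in plain C, assuming a table of vendor/product strings compared by substring, similar in spirit to how the kernel's DMI matching is typically used. The table entry is a placeholder, not real BIOS data, and `needs_therm_control_restore()` is a hypothetical name; an actual patch would presumably use the kernel's own DMI tables instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One quirk entry: both strings must appear in the machine's DMI fields. */
struct board_id {
	const char *sys_vendor;
	const char *product_name;
};

/* Placeholder entry only; real entries would come from affected BIOSes. */
static const struct board_id therm_control_quirks[] = {
	{ "ExampleVendor", "ExampleBoard" },
};

/* Returns true if the given DMI strings match any quirk entry. */
bool needs_therm_control_restore(const char *vendor, const char *product)
{
	size_t n = sizeof(therm_control_quirks) / sizeof(therm_control_quirks[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		const struct board_id *q = &therm_control_quirks[i];

		if (strstr(vendor, q->sys_vendor) &&
		    strstr(product, q->product_name))
			return true;
	}
	return false;
}
```

The save/restore of the register would then run only when this check succeeds, which leaves machines whose BIOS manages the register correctly untouched.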
Chen Yu Aug. 18, 2015, 8:54 a.m. UTC | #7
Hi!
On 08/18/2015 04:02 PM, Pavel Machek wrote:
> Hi!
>
>>>>> A bug is reported(https://bugzilla.redhat.com/show_bug.cgi?id=1227208)
>>>
>>> Access denied :-(
>>>
>> Might need to register for accessing.
>>>
>>> What HW is this on?
>>>
>> Intel Braswell and Broadwell, detail for Broadwell:
>> Platform: MayanCity
>> Processor: 2x BROADWELL BDX_EP A0 QHPR
>> Chipset: Wellsburg B1 QR7E
>
> So it is desktop board?
>
It is a Xeon-family server.

>>> Doing this unconditionally is asking for trouble. Blacklist entry with
>>> affected BIOS info might be acceptable, but...
>>>
>> You mean a quirk here (according to DMI info, etc.)?
>
> Yes, please.
Will do, thanks.
> 									Pavel
>

Best Regards,
Yu


Patch

diff --git a/arch/x86/include/asm/suspend_64.h b/arch/x86/include/asm/suspend_64.h
index 7ebf0eb..b9f5591 100644
--- a/arch/x86/include/asm/suspend_64.h
+++ b/arch/x86/include/asm/suspend_64.h
@@ -25,6 +25,7 @@  struct saved_context {
 	u64 misc_enable;
 	bool misc_enable_saved;
 	unsigned long efer;
+	unsigned long clock_modulation;
 	u16 gdt_pad; /* Unused */
 	struct desc_ptr gdt_desc;
 	u16 idt_pad;
diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
index 9ab5279..f82577b 100644
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -97,6 +97,7 @@  static void __save_processor_state(struct saved_context *ctxt)
 	mtrr_save_fixed_ranges(NULL);
 
 	rdmsrl(MSR_EFER, ctxt->efer);
+	rdmsrl(MSR_IA32_THERM_CONTROL, ctxt->clock_modulation);
 #endif
 
 	/*
@@ -178,6 +179,7 @@  static void notrace __restore_processor_state(struct saved_context *ctxt)
 #else
 /* CONFIG X86_64 */
 	wrmsrl(MSR_EFER, ctxt->efer);
+	wrmsrl(MSR_IA32_THERM_CONTROL, ctxt->clock_modulation);
 	write_cr8(ctxt->cr8);
 	__write_cr4(ctxt->cr4);
 #endif