Patchworkβ x86, therm: Only read the initial value of thermal LVT entry on BSP

login
register
about
Submitter Suresh Siddha
Date 2009-11-07 02:56:05
Message ID <1257562565.4083.485.camel@sbs-t61.sc.intel.com>
Download mbox | patch
Permalink /patch/58287/
State New
Headers show

Comments

Suresh Siddha - 2009-11-07 02:56:05
On Fri, 2009-11-06 at 16:17 -0800, Yong Wang wrote:
> Only read the initial value of thermal LVT entry on BSP. The initial
> value of thermal LVT entries on all APs always reads 0x10000 because
> APs are woken up by BSP issuing INIT-SIPI-SIPI sequence to them and
> LVT registers are reset to 0s except for mask bits which are set to
> 1s when APs receive INIT IPI.
> 
> Also restore the value that BIOS has programmed on AP based on BSP's
> info we saved since BIOS is always setting the same value for all
> threads/cores.

Yong, I have appended a new patch with an enhanced change log and
subject. In future, when you modify and post another version of the
patch, can you please update the patch version and elaborate what has
changed, why etc, so that it will be easier for the reviewers.

Ingo/Peter, please review and queue this patch from Yong. Thanks.
---

From: Yong Wang <yong.y.wang@intel.com>
Subject: x86: under bios control, restore AP's APIC_LVTTHMR to the BSP value

On platforms where bios handles the thermal monitor interrupt,
APIC_LVTTHMR on each logical CPU is programmed to generate a SMI and OS
can't touch it.

Unfortunately AP bringup sequence using INIT-SIPI-SIPI clear all
the LVT entries except the mask bit. Essentially this results in
all LVT entries including the thermal monitoring interrupt set to masked
(clearing the bios programmed value for APIC_LVTTHMR).

And this leads to kernel take over the thermal monitoring interrupt
on AP's but not on BSP (leaving the bios programmed value only on BSP).

As a result of this, we have seen system hangs when the thermal
monitoring interrupt is generated.

Fix this by reading the initial value of thermal LVT entry on BSP
and if bios has taken over the control, then program the same value
on all AP's and leave the thermal monitoring interrupt control
on all the logical cpu's to the bios.

Signed-off-by: Yong Wang <yong.y.wang@intel.com>
Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: stable@kernel.org
---
 arch/x86/kernel/cpu/mcheck/therm_throt.c |   20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Yong Wang - 2009-11-07 09:52:30
On Fri, Nov 06, 2009 at 06:56:05PM -0800, Suresh Siddha wrote:
> On Fri, 2009-11-06 at 16:17 -0800, Yong Wang wrote:
> > Only read the initial value of thermal LVT entry on BSP. The initial
> > value of thermal LVT entries on all APs always reads 0x10000 because
> > APs are woken up by BSP issuing INIT-SIPI-SIPI sequence to them and
> > LVT registers are reset to 0s except for mask bits which are set to
> > 1s when APs receive INIT IPI.
> > 
> > Also restore the value that BIOS has programmed on AP based on BSP's
> > info we saved since BIOS is always setting the same value for all
> > threads/cores.
> 
> Yong, I have appended a new patch with an enhanced change log and
> subject. In future, when you modify and post another version of the
> patch, can you please update the patch version and elaborate what has
> changed, why etc, so that it will be easier for the reviewers.
> 

Thanks a lot for your review and comments. Will follow the conventions
going forward.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Ingo Molnar - 2009-11-08 10:25:21
* Suresh Siddha <suresh.b.siddha@intel.com> wrote:

> On Fri, 2009-11-06 at 16:17 -0800, Yong Wang wrote:
> > Only read the initial value of thermal LVT entry on BSP. The initial
> > value of thermal LVT entries on all APs always reads 0x10000 because
> > APs are woken up by BSP issuing INIT-SIPI-SIPI sequence to them and
> > LVT registers are reset to 0s except for mask bits which are set to
> > 1s when APs receive INIT IPI.
> > 
> > Also restore the value that BIOS has programmed on AP based on BSP's
> > info we saved since BIOS is always setting the same value for all
> > threads/cores.
> 
> Yong, I have appended a new patch with an enhanced change log and
> subject. In future, when you modify and post another version of the
> patch, can you please update the patch version and elaborate what has
> changed, why etc, so that it will be easier for the reviewers.
> 
> Ingo/Peter, please review and queue this patch from Yong. Thanks.
> ---
> 
> From: Yong Wang <yong.y.wang@intel.com>
> Subject: x86: under bios control, restore AP's APIC_LVTTHMR to the BSP value
> 
> On platforms where bios handles the thermal monitor interrupt,
> APIC_LVTTHMR on each logical CPU is programmed to generate a SMI and OS
> can't touch it.
> 
> Unfortunately AP bringup sequence using INIT-SIPI-SIPI clear all
> the LVT entries except the mask bit. Essentially this results in
> all LVT entries including the thermal monitoring interrupt set to masked
> (clearing the bios programmed value for APIC_LVTTHMR).
> 
> And this leads to kernel take over the thermal monitoring interrupt
> on AP's but not on BSP (leaving the bios programmed value only on BSP).
> 
> As a result of this, we have seen system hangs when the thermal
> monitoring interrupt is generated.
> 
> Fix this by reading the initial value of thermal LVT entry on BSP
> and if bios has taken over the control, then program the same value
> on all AP's and leave the thermal monitoring interrupt control
> on all the logical cpu's to the bios.
> 
> Signed-off-by: Yong Wang <yong.y.wang@intel.com>
> Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com>
> Cc: stable@kernel.org
> ---
>  arch/x86/kernel/cpu/mcheck/therm_throt.c |   20 +++++++++++++++++++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
> index b3a1dba..1fd42db 100644
> --- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
> +++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
> @@ -259,6 +259,7 @@ void intel_init_thermal(struct cpuinfo_x86 *c)
>  	unsigned int cpu = smp_processor_id();
>  	int tm2 = 0;
>  	u32 l, h;
> +	static u32 lvtthmr;
>  
>  	/* Thermal monitoring depends on ACPI and clock modulation*/
>  	if (!cpu_has(c, X86_FEATURE_ACPI) || !cpu_has(c, X86_FEATURE_ACC))
> @@ -270,7 +271,24 @@ void intel_init_thermal(struct cpuinfo_x86 *c)
>  	 * since it might be delivered via SMI already:
>  	 */
>  	rdmsr(MSR_IA32_MISC_ENABLE, l, h);
> -	h = apic_read(APIC_LVTTHMR);
> +
> +	/*
> +	 * Only read the initial value of thermal LVT entry on BSP. The
> +	 * initial value of thermal LVT entries on all APs always reads
> +	 * 0x10000 because APs are woken up by BSP issuing INIT-SIPI-SIPI
> +	 * sequence to them and LVT registers are reset to 0s except for
> +	 * the mask bits which are set to 1s when APs receive INIT IPI.
> +	 * Also restore the value that BIOS has programmed on AP based on
> +	 * BSP's info we saved since BIOS is always setting the same value
> +	 * for all threads/cores
> +	 */
> +	if (cpu == 0)
> +		lvtthmr = apic_read(APIC_LVTTHMR);
> +	else
> +		apic_write(APIC_LVTTHMR, lvtthmr);
> +
> +	h = lvtthmr;
> +

i dont disagree with the fix, but could we please do it a bit cleaner, 
and initialize a proper file-scope lvtthrm_init value from a different 
boot-CPU-only function? (not intel_init_thermal)

that makes it cleaner, and also it will work if we dont boot on cpu==0. 
(should that ever occur)

Thanks,

	Ingo

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Yong Wang - 2009-11-08 12:38:12
On Sun, Nov 08, 2009 at 11:25:21AM +0100, Ingo Molnar wrote:
> 
> i dont disagree with the fix, but could we please do it a bit cleaner, 
> and initialize a proper file-scope lvtthrm_init value from a different 
> boot-CPU-only function? (not intel_init_thermal)
> 

Thanks for your comments. Just want to make sure I understand correctly.
By 'file-scope', do you want me to define lvtthrm_init as a static
variable but not to define it in any function?

> that makes it cleaner, and also it will work if we dont boot on cpu==0. 
> (should that ever occur)
> 

May I know when will this happen?

Thanks
-Yong
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Ingo Molnar - 2009-11-08 13:16:55
* Yong Wang <yong.y.wang@linux.intel.com> wrote:

> On Sun, Nov 08, 2009 at 11:25:21AM +0100, Ingo Molnar wrote:
> > 
> > i dont disagree with the fix, but could we please do it a bit cleaner, 
> > and initialize a proper file-scope lvtthrm_init value from a different 
> > boot-CPU-only function? (not intel_init_thermal)
> > 
> 
> Thanks for your comments. Just want to make sure I understand correctly.
> By 'file-scope', do you want me to define lvtthrm_init as a static
> variable but not to define it in any function?

Correct - i'd suggest to put it next to other file-scope variables at 
the top of the .c file. Maybe make it __read_mostly as well.

> > that makes it cleaner, and also it will work if we dont boot on 
> > cpu==0. (should that ever occur)
> > 
> 
> May I know when will this happen?

It's not really expected - we factorize the CPU IDs (which are logical) 
so that the boot CPU is 0. But relying on cpu==0 is the boot cpu is not 
clean and the resulting code is harder to read.

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Yong Wang - 2009-11-08 13:19:39
On Sun, Nov 08, 2009 at 02:16:55PM +0100, Ingo Molnar wrote:
> 
> * Yong Wang <yong.y.wang@linux.intel.com> wrote:
> 
> > On Sun, Nov 08, 2009 at 11:25:21AM +0100, Ingo Molnar wrote:
> > > 
> > > i dont disagree with the fix, but could we please do it a bit cleaner, 
> > > and initialize a proper file-scope lvtthrm_init value from a different 
> > > boot-CPU-only function? (not intel_init_thermal)
> > > 
> > 
> > Thanks for your comments. Just want to make sure I understand correctly.
> > By 'file-scope', do you want me to define lvtthrm_init as a static
> > variable but not to define it in any function?
> 
> Correct - i'd suggest to put it next to other file-scope variables at 
> the top of the .c file. Maybe make it __read_mostly as well.
> 

OK, will do.

> > > that makes it cleaner, and also it will work if we dont boot on 
> > > cpu==0. (should that ever occur)
> > > 
> > 
> > May I know when will this happen?
> 
> It's not really expected - we factorize the CPU IDs (which are logical) 
> so that the boot CPU is 0. But relying on cpu==0 is the boot cpu is not 
> clean and the resulting code is harder to read.
> 

There does not seem to be 'boot-CPU-only function' in that .c file. What
about changing cpu==0 to cpu==boot_cpu_id? Does that help?

Thanks
-Yong
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Cyrill Gorcunov - 2009-11-08 13:42:30
[Ingo Molnar - Sun, Nov 08, 2009 at 02:16:55PM +0100]
| 
...
|
| > > that makes it cleaner, and also it will work if we dont boot on 
| > > cpu==0. (should that ever occur)
| > > 
| > 
| > May I know when will this happen?
| 
| It's not really expected - we factorize the CPU IDs (which are logical) 
| so that the boot CPU is 0. But relying on cpu==0 is the boot cpu is not 
| clean and the resulting code is harder to read.

Perhaps we need some is_bsp_cpu() helper? Though to cover all x86 places
we need some efforts to apply :/

| 
| 	Ingo
| 
	-- Cyrill
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Ingo Molnar - 2009-11-08 15:59:25
* Yong Wang <yong.y.wang@linux.intel.com> wrote:

> On Sun, Nov 08, 2009 at 02:16:55PM +0100, Ingo Molnar wrote:
> > 
> > * Yong Wang <yong.y.wang@linux.intel.com> wrote:
> > 
> > > On Sun, Nov 08, 2009 at 11:25:21AM +0100, Ingo Molnar wrote:
> > > > 
> > > > i dont disagree with the fix, but could we please do it a bit cleaner, 
> > > > and initialize a proper file-scope lvtthrm_init value from a different 
> > > > boot-CPU-only function? (not intel_init_thermal)
> > > > 
> > > 
> > > Thanks for your comments. Just want to make sure I understand correctly.
> > > By 'file-scope', do you want me to define lvtthrm_init as a static
> > > variable but not to define it in any function?
> > 
> > Correct - i'd suggest to put it next to other file-scope variables at 
> > the top of the .c file. Maybe make it __read_mostly as well.
> > 
> 
> OK, will do.
> 
> > > > that makes it cleaner, and also it will work if we dont boot on 
> > > > cpu==0. (should that ever occur)
> > > > 
> > > 
> > > May I know when will this happen?
> > 
> > It's not really expected - we factorize the CPU IDs (which are logical) 
> > so that the boot CPU is 0. But relying on cpu==0 is the boot cpu is not 
> > clean and the resulting code is harder to read.
> > 
> 
> There does not seem to be 'boot-CPU-only function' in that .c file. What
> about changing cpu==0 to cpu==boot_cpu_id? Does that help?

Then create one and call it - it's still cleaner that way.

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Patch

diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
index b3a1dba..1fd42db 100644
--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -259,6 +259,7 @@  void intel_init_thermal(struct cpuinfo_x86 *c)
 	unsigned int cpu = smp_processor_id();
 	int tm2 = 0;
 	u32 l, h;
+	static u32 lvtthmr;
 
 	/* Thermal monitoring depends on ACPI and clock modulation*/
 	if (!cpu_has(c, X86_FEATURE_ACPI) || !cpu_has(c, X86_FEATURE_ACC))
@@ -270,7 +271,24 @@  void intel_init_thermal(struct cpuinfo_x86 *c)
 	 * since it might be delivered via SMI already:
 	 */
 	rdmsr(MSR_IA32_MISC_ENABLE, l, h);
-	h = apic_read(APIC_LVTTHMR);
+
+	/*
+	 * Only read the initial value of thermal LVT entry on BSP. The
+	 * initial value of thermal LVT entries on all APs always reads
+	 * 0x10000 because APs are woken up by BSP issuing INIT-SIPI-SIPI
+	 * sequence to them and LVT registers are reset to 0s except for
+	 * the mask bits which are set to 1s when APs receive INIT IPI.
+	 * Also restore the value that BIOS has programmed on AP based on
+	 * BSP's info we saved since BIOS is always setting the same value
+	 * for all threads/cores
+	 */
+	if (cpu == 0)
+		lvtthmr = apic_read(APIC_LVTTHMR);
+	else
+		apic_write(APIC_LVTTHMR, lvtthmr);
+
+	h = lvtthmr;
+
 	if ((l & MSR_IA32_MISC_ENABLE_TM1) && (h & APIC_DM_SMI)) {
 		printk(KERN_DEBUG
 		       "CPU%d: Thermal monitoring handled by SMI\n", cpu);