| Submitter | Suresh Siddha |
|---|---|
| Date | 2009-11-07 02:56:05 |
| Message ID | <1257562565.4083.485.camel@sbs-t61.sc.intel.com> |
| Download | mbox | patch |
| Permalink | /patch/58287/ |
| State | New |
| Headers | show |
Comments
On Fri, Nov 06, 2009 at 06:56:05PM -0800, Suresh Siddha wrote: > On Fri, 2009-11-06 at 16:17 -0800, Yong Wang wrote: > > Only read the initial value of thermal LVT entry on BSP. The initial > > value of thermal LVT entries on all APs always reads 0x10000 because > > APs are woken up by BSP issuing INIT-SIPI-SIPI sequence to them and > > LVT registers are reset to 0s except for mask bits which are set to > > 1s when APs receive INIT IPI. > > > > Also restore the value that BIOS has programmed on AP based on BSP's > > info we saved since BIOS is always setting the same value for all > > threads/cores. > > Yong, I have appended a new patch with an enhanced change log and > subject. In future, when you modify and post another version of the > patch, can you please update the patch version and elaborate what has > changed, why etc, so that it will be easier for the reviewers. > Thanks a lot for your review and comments. Will follow the conventions going forward. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
* Suresh Siddha <suresh.b.siddha@intel.com> wrote: > On Fri, 2009-11-06 at 16:17 -0800, Yong Wang wrote: > > Only read the initial value of thermal LVT entry on BSP. The initial > > value of thermal LVT entries on all APs always reads 0x10000 because > > APs are woken up by BSP issuing INIT-SIPI-SIPI sequence to them and > > LVT registers are reset to 0s except for mask bits which are set to > > 1s when APs receive INIT IPI. > > > > Also restore the value that BIOS has programmed on AP based on BSP's > > info we saved since BIOS is always setting the same value for all > > threads/cores. > > Yong, I have appended a new patch with an enhanced change log and > subject. In future, when you modify and post another version of the > patch, can you please update the patch version and elaborate what has > changed, why etc, so that it will be easier for the reviewers. > > Ingo/Peter, please review and queue this patch from Yong. Thanks. > --- > > From: Yong Wang <yong.y.wang@intel.com> > Subject: x86: under bios control, restore AP's APIC_LVTTHMR to the BSP value > > On platforms where bios handles the thermal monitor interrupt, > APIC_LVTTHMR on each logical CPU is programmed to generate a SMI and OS > can't touch it. > > Unfortunately AP bringup sequence using INIT-SIPI-SIPI clear all > the LVT entries except the mask bit. Essentially this results in > all LVT entries including the thermal monitoring interrupt set to masked > (clearing the bios programmed value for APIC_LVTTHMR). > > And this leads to kernel take over the thermal monitoring interrupt > on AP's but not on BSP (leaving the bios programmed value only on BSP). > > As a result of this, we have seen system hangs when the thermal > monitoring interrupt is generated. > > Fix this by reading the initial value of thermal LVT entry on BSP > and if bios has taken over the control, then program the same value > on all AP's and leave the thermal monitoring interrupt control > on all the logical cpu's to the bios. > > Signed-off-by: Yong Wang <yong.y.wang@intel.com> > Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com> > Cc: stable@kernel.org > --- > arch/x86/kernel/cpu/mcheck/therm_throt.c | 20 +++++++++++++++++++- > 1 file changed, 19 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c > index b3a1dba..1fd42db 100644 > --- a/arch/x86/kernel/cpu/mcheck/therm_throt.c > +++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c > @@ -259,6 +259,7 @@ void intel_init_thermal(struct cpuinfo_x86 *c) > unsigned int cpu = smp_processor_id(); > int tm2 = 0; > u32 l, h; > + static u32 lvtthmr; > > /* Thermal monitoring depends on ACPI and clock modulation*/ > if (!cpu_has(c, X86_FEATURE_ACPI) || !cpu_has(c, X86_FEATURE_ACC)) > @@ -270,7 +271,24 @@ void intel_init_thermal(struct cpuinfo_x86 *c) > * since it might be delivered via SMI already: > */ > rdmsr(MSR_IA32_MISC_ENABLE, l, h); > - h = apic_read(APIC_LVTTHMR); > + > + /* > + * Only read the initial value of thermal LVT entry on BSP. The > + * initial value of thermal LVT entries on all APs always reads > + * 0x10000 because APs are woken up by BSP issuing INIT-SIPI-SIPI > + * sequence to them and LVT registers are reset to 0s except for > + * the mask bits which are set to 1s when APs receive INIT IPI. > + * Also restore the value that BIOS has programmed on AP based on > + * BSP's info we saved since BIOS is always setting the same value > + * for all threads/cores > + */ > + if (cpu == 0) > + lvtthmr = apic_read(APIC_LVTTHMR); > + else > + apic_write(APIC_LVTTHMR, lvtthmr); > + > + h = lvtthmr; > + i dont disagree with the fix, but could we please do it a bit cleaner, and initialize a proper file-scope lvtthrm_init value from a different boot-CPU-only function? (not intel_init_thermal) that makes it cleaner, and also it will work if we dont boot on cpu==0. (should that ever occur) Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On Sun, Nov 08, 2009 at 11:25:21AM +0100, Ingo Molnar wrote: > > i dont disagree with the fix, but could we please do it a bit cleaner, > and initialize a proper file-scope lvtthrm_init value from a different > boot-CPU-only function? (not intel_init_thermal) > Thanks for your comments. Just want to make sure I understand correctly. By 'file-scope', do you want me to define lvtthrm_init as a static variable but not to define it in any function? > that makes it cleaner, and also it will work if we dont boot on cpu==0. > (should that ever occur) > May I know when will this happen? Thanks -Yong -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
* Yong Wang <yong.y.wang@linux.intel.com> wrote: > On Sun, Nov 08, 2009 at 11:25:21AM +0100, Ingo Molnar wrote: > > > > i dont disagree with the fix, but could we please do it a bit cleaner, > > and initialize a proper file-scope lvtthrm_init value from a different > > boot-CPU-only function? (not intel_init_thermal) > > > > Thanks for your comments. Just want to make sure I understand correctly. > By 'file-scope', do you want me to define lvtthrm_init as a static > variable but not to define it in any function? Correct - i'd suggest to put it next to other file-scope variables at the top of the .c file. Maybe make it __read_mostly as well. > > that makes it cleaner, and also it will work if we dont boot on > > cpu==0. (should that ever occur) > > > > May I know when will this happen? It's not really expected - we factorize the CPU IDs (which are logical) so that the boot CPU is 0. But relying on cpu==0 is the boot cpu is not clean and the resulting code is harder to read. Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On Sun, Nov 08, 2009 at 02:16:55PM +0100, Ingo Molnar wrote: > > * Yong Wang <yong.y.wang@linux.intel.com> wrote: > > > On Sun, Nov 08, 2009 at 11:25:21AM +0100, Ingo Molnar wrote: > > > > > > i dont disagree with the fix, but could we please do it a bit cleaner, > > > and initialize a proper file-scope lvtthrm_init value from a different > > > boot-CPU-only function? (not intel_init_thermal) > > > > > > > Thanks for your comments. Just want to make sure I understand correctly. > > By 'file-scope', do you want me to define lvtthrm_init as a static > > variable but not to define it in any function? > > Correct - i'd suggest to put it next to other file-scope variables at > the top of the .c file. Maybe make it __read_mostly as well. > OK, will do. > > > that makes it cleaner, and also it will work if we dont boot on > > > cpu==0. (should that ever occur) > > > > > > > May I know when will this happen? > > It's not really expected - we factorize the CPU IDs (which are logical) > so that the boot CPU is 0. But relying on cpu==0 is the boot cpu is not > clean and the resulting code is harder to read. > There does not seem to be 'boot-CPU-only function' in that .c file. What about changing cpu==0 to cpu==boot_cpu_id? Does that help? Thanks -Yong -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[Ingo Molnar - Sun, Nov 08, 2009 at 02:16:55PM +0100] | ... | | > > that makes it cleaner, and also it will work if we dont boot on | > > cpu==0. (should that ever occur) | > > | > | > May I know when will this happen? | | It's not really expected - we factorize the CPU IDs (which are logical) | so that the boot CPU is 0. But relying on cpu==0 is the boot cpu is not | clean and the resulting code is harder to read. Perhaps we need some is_bsp_cpu() helper? Though to cover all x86 places we need some efforts to apply :/ | | Ingo | -- Cyrill -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
* Yong Wang <yong.y.wang@linux.intel.com> wrote: > On Sun, Nov 08, 2009 at 02:16:55PM +0100, Ingo Molnar wrote: > > > > * Yong Wang <yong.y.wang@linux.intel.com> wrote: > > > > > On Sun, Nov 08, 2009 at 11:25:21AM +0100, Ingo Molnar wrote: > > > > > > > > i dont disagree with the fix, but could we please do it a bit cleaner, > > > > and initialize a proper file-scope lvtthrm_init value from a different > > > > boot-CPU-only function? (not intel_init_thermal) > > > > > > > > > > Thanks for your comments. Just want to make sure I understand correctly. > > > By 'file-scope', do you want me to define lvtthrm_init as a static > > > variable but not to define it in any function? > > > > Correct - i'd suggest to put it next to other file-scope variables at > > the top of the .c file. Maybe make it __read_mostly as well. > > > > OK, will do. > > > > > that makes it cleaner, and also it will work if we dont boot on > > > > cpu==0. (should that ever occur) > > > > > > > > > > May I know when will this happen? > > > > It's not really expected - we factorize the CPU IDs (which are logical) > > so that the boot CPU is 0. But relying on cpu==0 is the boot cpu is not > > clean and the resulting code is harder to read. > > > > There does not seem to be 'boot-CPU-only function' in that .c file. What > about changing cpu==0 to cpu==boot_cpu_id? Does that help? Then create one and call it - it's still cleaner that way. Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Patch
diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c index b3a1dba..1fd42db 100644 --- a/arch/x86/kernel/cpu/mcheck/therm_throt.c +++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c @@ -259,6 +259,7 @@ void intel_init_thermal(struct cpuinfo_x86 *c) unsigned int cpu = smp_processor_id(); int tm2 = 0; u32 l, h; + static u32 lvtthmr; /* Thermal monitoring depends on ACPI and clock modulation*/ if (!cpu_has(c, X86_FEATURE_ACPI) || !cpu_has(c, X86_FEATURE_ACC)) @@ -270,7 +271,24 @@ void intel_init_thermal(struct cpuinfo_x86 *c) * since it might be delivered via SMI already: */ rdmsr(MSR_IA32_MISC_ENABLE, l, h); - h = apic_read(APIC_LVTTHMR); + + /* + * Only read the initial value of thermal LVT entry on BSP. The + * initial value of thermal LVT entries on all APs always reads + * 0x10000 because APs are woken up by BSP issuing INIT-SIPI-SIPI + * sequence to them and LVT registers are reset to 0s except for + * the mask bits which are set to 1s when APs receive INIT IPI. + * Also restore the value that BIOS has programmed on AP based on + * BSP's info we saved since BIOS is always setting the same value + * for all threads/cores + */ + if (cpu == 0) + lvtthmr = apic_read(APIC_LVTTHMR); + else + apic_write(APIC_LVTTHMR, lvtthmr); + + h = lvtthmr; + if ((l & MSR_IA32_MISC_ENABLE_TM1) && (h & APIC_DM_SMI)) { printk(KERN_DEBUG "CPU%d: Thermal monitoring handled by SMI\n", cpu);