diff mbox

intel_pstate divide error with v3.13-rc4-256-gb7000ad

Message ID 52C716C1.6070704@gmail.com (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

dirk.brandewie@gmail.com Jan. 3, 2014, 8 p.m. UTC
On 01/03/2014 10:04 AM, Gleb Natapov wrote:
> On Fri, Jan 03, 2014 at 09:30:28AM -0800, Dirk Brandewie wrote:
>> Hi All,
>>
>> Sorry for being late to the party but I just got back from vacation.
>>
>> There is something deeply wrong here.  We should have never gotten to
>> intel_pstate_init_cpu().  The VM had to have returned value from the read
>> of the max pstate at driver init time and 0 when the CPU was being brought up.
>>
>> intel_pstate_msrs_not_valid() was added to solve this issue early on
>> if I remember correctly it was Josh that reported it then.  Is there
>> a definative way to detect whether we are running in a VM?
>>
> Checking for VM is a wrong thing to do here. KVM should behave like it
> does not support p-state.
>
>> Can some one tell me how the nested environment differs in regards to
>> reading MSRs?
>>
> It shouldn't differ, but there may be bug somewhere in nested emulation.
> We shouldn't try and hind the bug by doing more checks in Linux but
> rather fixing KVM bug that causes Linux to behave incorrectly.

Based on the unhandled MSR messages in the host dmesg the following patch
should make sure the correct values are returned for PLAT_FORM_INFO, APERF
and MPERF.  intel_pstate and turbostat are the only users of these registers.

Could you try the folloinw patch minus Rafael's patch please.
Compile tested only.

TIA
--Dirk

commit 5594b89bee7f83200c1a70bf95d50ac35e4fe3f8
Author: Dirk Brandewie <dirk.j.brandewie@intel.com>
Date:   Fri Jan 3 11:44:15 2014 -0800

     x86/kvm: Handle MSR_PLATFORM_INFO, MSR_IA32_MPERF and MSR_IA32_APERF MSRs

     Handle MSRs correctly when read from a nested KVM

     Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
---
  arch/x86/kvm/x86.c | 3 +++
  1 file changed, 3 insertions(+)

  	case MSR_IA32_LASTBRANCHFROMIP:

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Paolo Bonzini Jan. 3, 2014, 10:06 p.m. UTC | #1
Il 03/01/2014 21:00, Dirk Brandewie ha scritto:
> +    case MSR_IA32_MPERF:
> +    case MSR_IA32_APERF:

These should never be accessed.  A KVM VM will always have
CPUID[06H].ECX = 0, and the Intel manual says that the MSRs are only
present if CPUID returns that value with bit 0 set.

I think the actual bug is that intel_pstate_init does not check the
feature bits in CPUID (either manually or via x86_match_cpu).

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
dirk.brandewie@gmail.com Jan. 6, 2014, 5:18 p.m. UTC | #2
On 01/03/2014 02:06 PM, Paolo Bonzini wrote:
> Il 03/01/2014 21:00, Dirk Brandewie ha scritto:
>> +    case MSR_IA32_MPERF:
>> +    case MSR_IA32_APERF:
>
OK I will spin the patch to only add MSR_PLATFORM_INFO.

> These should never be accessed.  A KVM VM will always have
> CPUID[06H].ECX = 0, and the Intel manual says that the MSRs are only
> present if CPUID returns that value with bit 0 set.
>
> I think the actual bug is that intel_pstate_init does not check the
> feature bits in CPUID (either manually or via x86_match_cpu).

I will add the feature check.

What are the differences between the first and the nested KVM's?
At load time intel_pstate checks that APERF and MPERF are incrementing
and that PLATFORM_INFO has some value.  Somehow these checks pass
in the nested environment and we fall over when the CPU is being added
by cpufreq.

--Dirk

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gleb Natapov Jan. 7, 2014, 4:11 p.m. UTC | #3
On Mon, Jan 06, 2014 at 09:18:44AM -0800, Dirk Brandewie wrote:
> On 01/03/2014 02:06 PM, Paolo Bonzini wrote:
> >Il 03/01/2014 21:00, Dirk Brandewie ha scritto:
> >>+    case MSR_IA32_MPERF:
> >>+    case MSR_IA32_APERF:
> >
> OK I will spin the patch to only add MSR_PLATFORM_INFO.
> 
> >These should never be accessed.  A KVM VM will always have
> >CPUID[06H].ECX = 0, and the Intel manual says that the MSRs are only
> >present if CPUID returns that value with bit 0 set.
> >
> >I think the actual bug is that intel_pstate_init does not check the
> >feature bits in CPUID (either manually or via x86_match_cpu).
> 
> I will add the feature check.
> 
> What are the differences between the first and the nested KVM's?
There shouldn't be any. There is a bug in nested emulation probably.

> At load time intel_pstate checks that APERF and MPERF are incrementing
> and that PLATFORM_INFO has some value.  Somehow these checks pass
> in the nested environment and we fall over when the CPU is being added
> by cpufreq.
> 
KVM does not emulate either of those and inject #GP if one is accessed. Linux
catches those #GPs and fixs up rdmsr to return zero. It would be interesting to
see ftrace for nested kvm run.

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5d004da..390ef27 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2336,6 +2336,9 @@  int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 
*pdata)

  	switch (msr) {
  	case MSR_IA32_PLATFORM_ID:
+	case MSR_IA32_MPERF:
+	case MSR_IA32_APERF:
+	case MSR_PLATFORM_INFO:
  	case MSR_IA32_EBL_CR_POWERON:
  	case MSR_IA32_DEBUGCTLMSR: