
KVM: x86/vPMU: Forbid writing to MSR_F15H_PERF MSRs when guest doesn't have X86_FEATURE_PERFCTR_CORE

Message ID 20210323084515.1346540-1-vkuznets@redhat.com (mailing list archive)
State New, archived
Series: KVM: x86/vPMU: Forbid writing to MSR_F15H_PERF MSRs when guest doesn't have X86_FEATURE_PERFCTR_CORE

Commit Message

Vitaly Kuznetsov March 23, 2021, 8:45 a.m. UTC
The MSR_F15H_PERF_CTL0-5 and MSR_F15H_PERF_CTR0-5 MSRs are only available
when the X86_FEATURE_PERFCTR_CORE CPUID bit is exposed to the guest. KVM,
however, allows these MSRs unconditionally because the kvm_pmu_is_valid_msr() ->
amd_msr_idx_to_pmc() check always passes and because kvm_pmu_set_msr() ->
amd_pmu_set_msr() doesn't fail.

In the case of a counter (CTRn), no big harm is done as we only increase
the internal PMC's value, but in the case of an eventsel (CTLn), we go deep
into perf internals with a non-existing counter.

Note, kvm_get_msr_common() just returns '0' when these MSRs don't exist,
which also seems to contradict the architectural behavior, which is #GP
(I did check one old Opteron host), but changing this status quo is a bit
scarier.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/svm/pmu.c | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Haiwei Li March 24, 2021, 12:23 p.m. UTC | #1
On Tue, Mar 23, 2021 at 4:48 PM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
> MSR_F15H_PERF_CTL0-5, MSR_F15H_PERF_CTR0-5 MSRs are only available when
> X86_FEATURE_PERFCTR_CORE CPUID bit was exposed to the guest. KVM, however,
> allows these MSRs unconditionally because kvm_pmu_is_valid_msr() ->
> amd_msr_idx_to_pmc() check always passes and because kvm_pmu_set_msr() ->
> amd_pmu_set_msr() doesn't fail.

I have tested this on an AMD EPYC platform that has perfctr_core
(`grep perfctr_core /proc/cpuinfo`).
I started a VM without perfctr-core (`-cpu host,-perfctr-core`).

Before patch :

$ rdmsr 0xc0010200
0
$ wrmsr 0xc0010200 1
$ rdmsr 0xc0010200
1

After patch:

# rdmsr 0xc0010200
0
# wrmsr 0xc0010200 1
wrmsr: CPU 0 cannot set MSR 0xc0010200 to 0x0000000000000001
# rdmsr 0xc0010200
0

So,

Tested-by: Haiwei Li <lihaiwei@tencent.com>
Haiwei Li March 25, 2021, 7:51 a.m. UTC | #2
On Tue, Mar 23, 2021 at 4:48 PM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
> MSR_F15H_PERF_CTL0-5, MSR_F15H_PERF_CTR0-5 MSRs are only available when
> X86_FEATURE_PERFCTR_CORE CPUID bit was exposed to the guest. KVM, however,
> allows these MSRs unconditionally because kvm_pmu_is_valid_msr() ->
> amd_msr_idx_to_pmc() check always passes and because kvm_pmu_set_msr() ->
> amd_pmu_set_msr() doesn't fail.
>
> In case of a counter (CTRn), no big harm is done as we only increase
> internal PMC's value but in case of an eventsel (CTLn), we go deep into
> perf internals with a non-existing counter.
>
> Note, kvm_get_msr_common() just returns '0' when these MSRs don't exist
> and this also seems to contradict architectural behavior which is #GP
> (I did check one old Opteron host) but changing this status quo is a bit
> scarier.

When an MSR doesn't exist, kvm_get_msr_common() returns KVM_MSR_RET_INVALID
in `default:` and kvm_complete_insn_gp() will inject a #GP into the guest.

Also, I have written a kvm-unit-test and run it on both AMD EPYC and Intel
Cascade Lake. A #GP error was printed, like this:

Unhandled exception 13 #GP at ip 0000000000400420
error_code=0000      rflags=00010006      cs=00000008
rax=0000000000000000 rcx=0000000000000620 rdx=00000000006164a0
rbx=0000000000009500
rbp=0000000000517490 rsi=0000000000616ae0 rdi=0000000000000001
 r8=0000000000000001  r9=00000000000003f8 r10=000000000000000d
r11=0000000000000000
r12=0000000000000000 r13=0000000000000000 r14=0000000000000000
r15=0000000000000000
cr0=0000000080000011 cr2=0000000000000000 cr3=000000000040b000
cr4=0000000000000020
cr8=0000000000000000
STACK: @400420 400338
Vitaly Kuznetsov March 25, 2021, 8:10 a.m. UTC | #3
Haiwei Li <lihaiwei.kernel@gmail.com> writes:

> On Tue, Mar 23, 2021 at 4:48 PM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>>
>> MSR_F15H_PERF_CTL0-5, MSR_F15H_PERF_CTR0-5 MSRs are only available when
>> X86_FEATURE_PERFCTR_CORE CPUID bit was exposed to the guest. KVM, however,
>> allows these MSRs unconditionally because kvm_pmu_is_valid_msr() ->
>> amd_msr_idx_to_pmc() check always passes and because kvm_pmu_set_msr() ->
>> amd_pmu_set_msr() doesn't fail.
>>
>> In case of a counter (CTRn), no big harm is done as we only increase
>> internal PMC's value but in case of an eventsel (CTLn), we go deep into
>> perf internals with a non-existing counter.
>>
>> Note, kvm_get_msr_common() just returns '0' when these MSRs don't exist
>> and this also seems to contradict architectural behavior which is #GP
>> (I did check one old Opteron host) but changing this status quo is a bit
>> scarier.
>
> When msr doesn't exist, kvm_get_msr_common() returns KVM_MSR_RET_INVALID
> in `default:` and kvm_complete_insn_gp() will inject #GP to guest.
>

I'm looking at the following in kvm_get_msr_common():

        switch (msr_info->index) {
        ...
 	case MSR_F15H_PERF_CTL0 ... MSR_F15H_PERF_CTR5:
        ...
		if (kvm_pmu_is_valid_msr(vcpu, msr_info->index))
			return kvm_pmu_get_msr(vcpu, msr_info);
		msr_info->data = 0;
		break;
        ...
	}
	return 0;

so it's kind of 'always exists' or am I wrong?

> Also i have wrote a kvm-unit-test, tested both on amd EPYC and intel
> CascadeLake. A #GP error was printed.
> Just like:
>
> Unhandled exception 13 #GP at ip 0000000000400420
> error_code=0000      rflags=00010006      cs=00000008
> rax=0000000000000000 rcx=0000000000000620 rdx=00000000006164a0
> rbx=0000000000009500
> rbp=0000000000517490 rsi=0000000000616ae0 rdi=0000000000000001
>  r8=0000000000000001  r9=00000000000003f8 r10=000000000000000d
> r11=0000000000000000
> r12=0000000000000000 r13=0000000000000000 r14=0000000000000000
> r15=0000000000000000
> cr0=0000000080000011 cr2=0000000000000000 cr3=000000000040b000
> cr4=0000000000000020
> cr8=0000000000000000
> STACK: @400420 400338

Did this happen on read or write? The latter is expected, the former is
not. Could you maybe drop your code here? I'd like to see what's going
on.
Haiwei Li March 25, 2021, 8:33 a.m. UTC | #4
On Thu, Mar 25, 2021 at 4:10 PM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
> Haiwei Li <lihaiwei.kernel@gmail.com> writes:
>
> > On Tue, Mar 23, 2021 at 4:48 PM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >>
> >> MSR_F15H_PERF_CTL0-5, MSR_F15H_PERF_CTR0-5 MSRs are only available when
> >> X86_FEATURE_PERFCTR_CORE CPUID bit was exposed to the guest. KVM, however,
> >> allows these MSRs unconditionally because kvm_pmu_is_valid_msr() ->
> >> amd_msr_idx_to_pmc() check always passes and because kvm_pmu_set_msr() ->
> >> amd_pmu_set_msr() doesn't fail.
> >>
> >> In case of a counter (CTRn), no big harm is done as we only increase
> >> internal PMC's value but in case of an eventsel (CTLn), we go deep into
> >> perf internals with a non-existing counter.
> >>
> >> Note, kvm_get_msr_common() just returns '0' when these MSRs don't exist
> >> and this also seems to contradict architectural behavior which is #GP
> >> (I did check one old Opteron host) but changing this status quo is a bit
> >> scarier.
> >
> > When msr doesn't exist, kvm_get_msr_common() returns KVM_MSR_RET_INVALID
> > in `default:` and kvm_complete_insn_gp() will inject #GP to guest.
> >
>
> I'm looking at the following in kvm_get_msr_common():
>
>         switch (msr_info->index) {
>         ...
>         case MSR_F15H_PERF_CTL0 ... MSR_F15H_PERF_CTR5:
>         ...
>                 if (kvm_pmu_is_valid_msr(vcpu, msr_info->index))
>                         return kvm_pmu_get_msr(vcpu, msr_info);
>                 msr_info->data = 0;
>                 break;
>         ...
>         }
>         return 0;
>
> so it's kind of 'always exists' or am I wrong?

I am sorry, you are right. You were talking about `MSR_F15H_PERF_CTL0
... MSR_F15H_PERF_CTR5`.
I thought you were talking about the registers not caught in
kvm_get_msr_common().

>
> > Also i have wrote a kvm-unit-test, tested both on amd EPYC and intel
> > CascadeLake. A #GP error was printed.
> > Just like:
> >
> > Unhandled exception 13 #GP at ip 0000000000400420
> > error_code=0000      rflags=00010006      cs=00000008
> > rax=0000000000000000 rcx=0000000000000620 rdx=00000000006164a0
> > rbx=0000000000009500
> > rbp=0000000000517490 rsi=0000000000616ae0 rdi=0000000000000001
> >  r8=0000000000000001  r9=00000000000003f8 r10=000000000000000d
> > r11=0000000000000000
> > r12=0000000000000000 r13=0000000000000000 r14=0000000000000000
> > r15=0000000000000000
> > cr0=0000000080000011 cr2=0000000000000000 cr3=000000000040b000
> > cr4=0000000000000020
> > cr8=0000000000000000
> > STACK: @400420 400338
>
> Did this happen on read or write? The later is expected, the former is
> not. Could you maybe drop your code here, I'd like to see what's going
> on.

I did a bad test. The MSR I tested is not caught in
kvm_get_msr_common(). As I said, I misunderstood
what you meant.

I have tested your patch and replied. If there is anything else to test, I'm
glad to help. :)

--
Haiwei Li
Paolo Bonzini March 26, 2021, 5:16 p.m. UTC | #5
On 23/03/21 09:45, Vitaly Kuznetsov wrote:
> MSR_F15H_PERF_CTL0-5, MSR_F15H_PERF_CTR0-5 MSRs are only available when
> X86_FEATURE_PERFCTR_CORE CPUID bit was exposed to the guest. KVM, however,
> allows these MSRs unconditionally because kvm_pmu_is_valid_msr() ->
> amd_msr_idx_to_pmc() check always passes and because kvm_pmu_set_msr() ->
> amd_pmu_set_msr() doesn't fail.
> 
> In case of a counter (CTRn), no big harm is done as we only increase
> internal PMC's value but in case of an eventsel (CTLn), we go deep into
> perf internals with a non-existing counter.
> 
> Note, kvm_get_msr_common() just returns '0' when these MSRs don't exist
> and this also seems to contradict architectural behavior which is #GP
> (I did check one old Opteron host) but changing this status quo is a bit
> scarier.

Hmm, since these do have a cpuid bit it may not be that scary.

Queued anyway, thanks.

Paolo

> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>   arch/x86/kvm/svm/pmu.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
> index 035da07500e8..fdf587f19c5f 100644
> --- a/arch/x86/kvm/svm/pmu.c
> +++ b/arch/x86/kvm/svm/pmu.c
> @@ -98,6 +98,8 @@ static enum index msr_to_index(u32 msr)
>   static inline struct kvm_pmc *get_gp_pmc_amd(struct kvm_pmu *pmu, u32 msr,
>   					     enum pmu_type type)
>   {
> +	struct kvm_vcpu *vcpu = pmu_to_vcpu(pmu);
> +
>   	switch (msr) {
>   	case MSR_F15H_PERF_CTL0:
>   	case MSR_F15H_PERF_CTL1:
> @@ -105,6 +107,9 @@ static inline struct kvm_pmc *get_gp_pmc_amd(struct kvm_pmu *pmu, u32 msr,
>   	case MSR_F15H_PERF_CTL3:
>   	case MSR_F15H_PERF_CTL4:
>   	case MSR_F15H_PERF_CTL5:
> +		if (!guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
> +			return NULL;
> +		fallthrough;
>   	case MSR_K7_EVNTSEL0 ... MSR_K7_EVNTSEL3:
>   		if (type != PMU_TYPE_EVNTSEL)
>   			return NULL;
> @@ -115,6 +120,9 @@ static inline struct kvm_pmc *get_gp_pmc_amd(struct kvm_pmu *pmu, u32 msr,
>   	case MSR_F15H_PERF_CTR3:
>   	case MSR_F15H_PERF_CTR4:
>   	case MSR_F15H_PERF_CTR5:
> +		if (!guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
> +			return NULL;
> +		fallthrough;
>   	case MSR_K7_PERFCTR0 ... MSR_K7_PERFCTR3:
>   		if (type != PMU_TYPE_COUNTER)
>   			return NULL;
>
Vitaly Kuznetsov March 29, 2021, 8:52 a.m. UTC | #6
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 23/03/21 09:45, Vitaly Kuznetsov wrote:
>> MSR_F15H_PERF_CTL0-5, MSR_F15H_PERF_CTR0-5 MSRs are only available when
>> X86_FEATURE_PERFCTR_CORE CPUID bit was exposed to the guest. KVM, however,
>> allows these MSRs unconditionally because kvm_pmu_is_valid_msr() ->
>> amd_msr_idx_to_pmc() check always passes and because kvm_pmu_set_msr() ->
>> amd_pmu_set_msr() doesn't fail.
>> 
>> In case of a counter (CTRn), no big harm is done as we only increase
>> internal PMC's value but in case of an eventsel (CTLn), we go deep into
>> perf internals with a non-existing counter.
>> 
>> Note, kvm_get_msr_common() just returns '0' when these MSRs don't exist
>> and this also seems to contradict architectural behavior which is #GP
>> (I did check one old Opteron host) but changing this status quo is a bit
>> scarier.
>
> Hmm, since these do have a cpuid bit it may not be that scary.

Well, if you're not scared I can send a patch)
Paolo Bonzini March 29, 2021, 9:39 a.m. UTC | #7
On 29/03/21 10:52, Vitaly Kuznetsov wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
>> On 23/03/21 09:45, Vitaly Kuznetsov wrote:
>>> MSR_F15H_PERF_CTL0-5, MSR_F15H_PERF_CTR0-5 MSRs are only available when
>>> X86_FEATURE_PERFCTR_CORE CPUID bit was exposed to the guest. KVM, however,
>>> allows these MSRs unconditionally because kvm_pmu_is_valid_msr() ->
>>> amd_msr_idx_to_pmc() check always passes and because kvm_pmu_set_msr() ->
>>> amd_pmu_set_msr() doesn't fail.
>>>
>>> In case of a counter (CTRn), no big harm is done as we only increase
>>> internal PMC's value but in case of an eventsel (CTLn), we go deep into
>>> perf internals with a non-existing counter.
>>>
>>> Note, kvm_get_msr_common() just returns '0' when these MSRs don't exist
>>> and this also seems to contradict architectural behavior which is #GP
>>> (I did check one old Opteron host) but changing this status quo is a bit
>>> scarier.
>>
>> Hmm, since these do have a cpuid bit it may not be that scary.
> 
> Well, if you're not scared I can send a patch)

Go ahead.

Paolo

Patch

diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 035da07500e8..fdf587f19c5f 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -98,6 +98,8 @@  static enum index msr_to_index(u32 msr)
 static inline struct kvm_pmc *get_gp_pmc_amd(struct kvm_pmu *pmu, u32 msr,
 					     enum pmu_type type)
 {
+	struct kvm_vcpu *vcpu = pmu_to_vcpu(pmu);
+
 	switch (msr) {
 	case MSR_F15H_PERF_CTL0:
 	case MSR_F15H_PERF_CTL1:
@@ -105,6 +107,9 @@  static inline struct kvm_pmc *get_gp_pmc_amd(struct kvm_pmu *pmu, u32 msr,
 	case MSR_F15H_PERF_CTL3:
 	case MSR_F15H_PERF_CTL4:
 	case MSR_F15H_PERF_CTL5:
+		if (!guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
+			return NULL;
+		fallthrough;
 	case MSR_K7_EVNTSEL0 ... MSR_K7_EVNTSEL3:
 		if (type != PMU_TYPE_EVNTSEL)
 			return NULL;
@@ -115,6 +120,9 @@  static inline struct kvm_pmc *get_gp_pmc_amd(struct kvm_pmu *pmu, u32 msr,
 	case MSR_F15H_PERF_CTR3:
 	case MSR_F15H_PERF_CTR4:
 	case MSR_F15H_PERF_CTR5:
+		if (!guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
+			return NULL;
+		fallthrough;
 	case MSR_K7_PERFCTR0 ... MSR_K7_PERFCTR3:
 		if (type != PMU_TYPE_COUNTER)
 			return NULL;