
[v3,2/4] KVM: SVM: Clear MSR_TSC_AUX[63:32] on write

Message ID 20210423223404.3860547-3-seanjc@google.com (mailing list archive)
State New, archived
Series KVM: x86: MSR_TSC_AUX fixes and improvements

Commit Message

Sean Christopherson April 23, 2021, 10:34 p.m. UTC
Force clear bits 63:32 of MSR_TSC_AUX on write to emulate current AMD
CPUs, which completely ignore the upper 32 bits, including dropping them
on write.  Emulating AMD hardware will also allow migrating a vCPU from
AMD hardware to Intel hardware without requiring userspace to manually
clear the upper bits, which are reserved on Intel hardware.

Presumably, MSR_TSC_AUX[63:32] are intended to be reserved on AMD, but
sadly the APM doesn't say _anything_ about those bits in the context of
MSR access.  The RDTSCP entry simply states that RCX contains bits 31:0
of the MSR, zero extended.  And even worse is that the RDPID description
implies that it can consume all 64 bits of the MSR:

  RDPID reads the value of TSC_AUX MSR used by the RDTSCP instruction
  into the specified destination register. Normal operand size prefixes
  do not apply and the update is either 32 bit or 64 bit based on the
  current mode.

Emulate current hardware behavior to give KVM the best odds of playing
nice with whatever the behavior of future AMD CPUs happens to be.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/svm.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)
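
As a toy illustration of the write behavior being emulated (a standalone
userspace sketch, not part of the patch): the upper 32 bits of the written
value are simply dropped, so the value KVM stores and writes back to
hardware is always one that an Intel host, where bits 63:32 are reserved,
will accept.

	#include <stdint.h>
	#include <stdio.h>

	/* Toy sketch of the truncate-on-write semantics (hypothetical value). */
	int main(void)
	{
		uint64_t written = 0xdeadbeef00000007ULL;	/* bits 63:32 set */
		uint64_t stored  = (uint32_t)written;		/* what KVM keeps */

		printf("written 0x%016llx -> stored 0x%016llx\n",
		       (unsigned long long)written,
		       (unsigned long long)stored);
		return 0;
	}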

Comments

Vitaly Kuznetsov April 26, 2021, 8:57 a.m. UTC | #1
Sean Christopherson <seanjc@google.com> writes:

> Force clear bits 63:32 of MSR_TSC_AUX on write to emulate current AMD
> CPUs, which completely ignore the upper 32 bits, including dropping them
> on write.  Emulating AMD hardware will also allow migrating a vCPU from
> AMD hardware to Intel hardware without requiring userspace to manually
> clear the upper bits, which are reserved on Intel hardware.
>
> Presumably, MSR_TSC_AUX[63:32] are intended to be reserved on AMD, but
> sadly the APM doesn't say _anything_ about those bits in the context of
> MSR access.  The RDTSCP entry simply states that RCX contains bits 31:0
> of the MSR, zero extended.  And even worse is that the RDPID description
> implies that it can consume all 64 bits of the MSR:
>
>   RDPID reads the value of TSC_AUX MSR used by the RDTSCP instruction
>   into the specified destination register. Normal operand size prefixes
>   do not apply and the update is either 32 bit or 64 bit based on the
>   current mode.
>
> Emulate current hardware behavior to give KVM the best odds of playing
> nice with whatever the behavior of future AMD CPUs happens to be.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/svm/svm.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 9ed9c7bd7cfd..71d704f8d569 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -2904,8 +2904,17 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>  		 * direct_access_msrs.  Doing that would require a rdmsr in
>  		 * svm_vcpu_put.
>  		 */
> -		svm->tsc_aux = data;
>  		wrmsrl(MSR_TSC_AUX, svm->tsc_aux);

Hm, shouldn't this be 

wrmsrl(MSR_TSC_AUX, data) or wrmsrl(MSR_TSC_AUX, (u32)data)

then? As svm->tsc_aux wasn't updated yet.

> +
> +		/*
> +		 * Per Intel's SDM, bits 63:32 are reserved, but AMD's APM has
> +		 * incomplete and conflicting architectural behavior.  Current
> +		 * AMD CPUs completely ignore bits 63:32, i.e. they aren't
> +		 * reserved and always read as zeros.  Emulate AMD CPU behavior
> +		 * to avoid explosions if the vCPU is migrated from an AMD host
> +		 * to an Intel host.
> +		 */
> +		svm->tsc_aux = (u32)data;

Actually, shouldn't we just move wrmsrl() here? Assuming we're not sure
how (and if) upper 32 bits are going to be used, it would probably make
sense to not write them to the actual MSR...

>  		break;
>  	case MSR_IA32_DEBUGCTLMSR:
>  		if (!boot_cpu_has(X86_FEATURE_LBRV)) {
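
That is, something along these lines (an untested sketch of the suggested
reordering, reusing the existing names from svm_set_msr):

		svm->tsc_aux = (u32)data;
		wrmsrl(MSR_TSC_AUX, svm->tsc_aux);
		break;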
Sean Christopherson April 26, 2021, 7:18 p.m. UTC | #2
On Mon, Apr 26, 2021, Vitaly Kuznetsov wrote:
> Sean Christopherson <seanjc@google.com> writes:
> 
> > Force clear bits 63:32 of MSR_TSC_AUX on write to emulate current AMD
> > CPUs, which completely ignore the upper 32 bits, including dropping them
> > on write.  Emulating AMD hardware will also allow migrating a vCPU from
> > AMD hardware to Intel hardware without requiring userspace to manually
> > clear the upper bits, which are reserved on Intel hardware.
> >
> > Presumably, MSR_TSC_AUX[63:32] are intended to be reserved on AMD, but
> > sadly the APM doesn't say _anything_ about those bits in the context of
> > MSR access.  The RDTSCP entry simply states that RCX contains bits 31:0
> > of the MSR, zero extended.  And even worse is that the RDPID description
> > implies that it can consume all 64 bits of the MSR:
> >
> >   RDPID reads the value of TSC_AUX MSR used by the RDTSCP instruction
> >   into the specified destination register. Normal operand size prefixes
> >   do not apply and the update is either 32 bit or 64 bit based on the
> >   current mode.
> >
> > Emulate current hardware behavior to give KVM the best odds of playing
> > nice with whatever the behavior of future AMD CPUs happens to be.
> >
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >  arch/x86/kvm/svm/svm.c | 11 ++++++++++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index 9ed9c7bd7cfd..71d704f8d569 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -2904,8 +2904,17 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
> >  		 * direct_access_msrs.  Doing that would require a rdmsr in
> >  		 * svm_vcpu_put.
> >  		 */
> > -		svm->tsc_aux = data;
> >  		wrmsrl(MSR_TSC_AUX, svm->tsc_aux);
> 
> Hm, shouldn't this be 
> 
> wrmsrl(MSR_TSC_AUX, data) or wrmsrl(MSR_TSC_AUX, (u32)data)
> 
> then? As svm->tsc_aux wasn't updated yet.
> 
> > +
> > +		/*
> > +		 * Per Intel's SDM, bits 63:32 are reserved, but AMD's APM has
> > +		 * incomplete and conflicting architectural behavior.  Current
> > +		 * AMD CPUs completely ignore bits 63:32, i.e. they aren't
> > +		 * reserved and always read as zeros.  Emulate AMD CPU behavior
> > +		 * to avoid explosions if the vCPU is migrated from an AMD host
> > +		 * to an Intel host.
> > +		 */
> > +		svm->tsc_aux = (u32)data;
> 
> Actually, shouldn't we just move wrmsrl() here? Assuming we're not sure
> how (and if) upper 32 bits are going to be used, it would probably make
> sense to not write them to the actual MSR...

Argh.  I got too clever in trying to minimize the patch deltas and "broke" this
intermediate patch.  Once the user return framework is used, the result of
setting the MSR is checked before setting svm->tsc_aux, i.e. don't set the
virtual state if the real state could not be set.

I'm very tempted to add yet another patch to this mess to do:

		if (wrmsrl_safe(MSR_TSC_AUX, data))
			return 1;

		svm->tsc_aux = data;

And then this patch becomes:

		data = (u32)data;

		if (wrmsrl_safe(MSR_TSC_AUX, data))
			return 1;

		svm->tsc_aux = data;

The above will also make patch 3 cleaner as it will preserve the ordering of
truncating data and the wrmsr.
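
Folded together, the end state for this patch would then read something like
this (just a sketch, with the pre-existing comment and CPUID checks elided):

	case MSR_TSC_AUX:
		...
		/*
		 * Truncate before the wrmsr so the value written to hardware
		 * and the value cached in svm->tsc_aux can never diverge.
		 */
		data = (u32)data;

		if (wrmsrl_safe(MSR_TSC_AUX, data))
			return 1;

		svm->tsc_aux = data;
		break;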

> >  		break;
> >  	case MSR_IA32_DEBUGCTLMSR:
> >  		if (!boot_cpu_has(X86_FEATURE_LBRV)) {
> 
> -- 
> Vitaly
>
Sean Christopherson April 26, 2021, 7:44 p.m. UTC | #3
On Mon, Apr 26, 2021, Sean Christopherson wrote:
> On Mon, Apr 26, 2021, Vitaly Kuznetsov wrote:
> > Actually, shouldn't we just move wrmsrl() here? Assuming we're not sure
> > how (and if) upper 32 bits are going to be used, it would probably make
> > sense to not write them to the actual MSR...
> 
> Argh.  I got too clever in trying to minimize the patch deltas and "broke" this
> intermediate patch.  Once the user return framework is used, the result of
> setting the MSR is checked before setting svm->tsc_aux, i.e. don't set the
> virtual state if the real state could not be set.
> 
> I'm very tempted to add yet another patch to this mess to do:
> 
> 		if (wrmsrl_safe(MSR_TSC_AUX, data))
> 			return 1;
> 
> 		svm->tsc_aux = data;
> 
> And then this patch becomes:
> 
> 		data = (u32)data;
> 
> 		if (wrmsrl_safe(MSR_TSC_AUX, data))
> 			return 1;
> 
> 		svm->tsc_aux = data;
> 
> The above will also make patch 3 cleaner as it will preserve the ordering of
> truncating data and the wrmsr.

Ah, never mind, Paolo already pushed this to kvm/next with your above fix.

Patch

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 9ed9c7bd7cfd..71d704f8d569 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2904,8 +2904,17 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 		 * direct_access_msrs.  Doing that would require a rdmsr in
 		 * svm_vcpu_put.
 		 */
-		svm->tsc_aux = data;
 		wrmsrl(MSR_TSC_AUX, svm->tsc_aux);
+
+		/*
+		 * Per Intel's SDM, bits 63:32 are reserved, but AMD's APM has
+		 * incomplete and conflicting architectural behavior.  Current
+		 * AMD CPUs completely ignore bits 63:32, i.e. they aren't
+		 * reserved and always read as zeros.  Emulate AMD CPU behavior
+		 * to avoid explosions if the vCPU is migrated from an AMD host
+		 * to an Intel host.
+		 */
+		svm->tsc_aux = (u32)data;
 		break;
 	case MSR_IA32_DEBUGCTLMSR:
 		if (!boot_cpu_has(X86_FEATURE_LBRV)) {