[04/11] KVM: x86: Disable MCE related stuff for TDX

Message ID 20211112153733.2767561-5-xiaoyao.li@intel.com (mailing list archive)
State New, archived
Series KVM: x86: TDX preparation of introducing vm_type and blocking ioctls based on vm_type

Commit Message

Xiaoyao Li Nov. 12, 2021, 3:37 p.m. UTC
From: Sean Christopherson <sean.j.christopherson@intel.com>

MCE is not supported for TDX VMs, and KVM cannot inject #MC into a TDX VM.

Introduce kvm_guest_mce_disallowed(), which reports whether MCE is
unavailable based on vm_type, and use it to guard all MCE-related CAPs
and IOCTLs.

Note: KVM_X86_GET_MCE_CAP_SUPPORTED is KVM-scoped, so what it reports may
not match the behavior of a specific VM (e.g., a TDX VM). The same holds
for KVM_CAP_MCE when queried from /dev/kvm. To query the precise
KVM_CAP_MCE value for a VM, userspace should use the VM's fd.

[ Xiaoyao: Guard MCE related CAPs ]

Co-developed-by: Kai Huang <kai.huang@linux.intel.com>
Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 arch/x86/kvm/x86.c | 10 ++++++++++
 arch/x86/kvm/x86.h |  5 +++++
 2 files changed, 15 insertions(+)

Comments

Sean Christopherson Nov. 12, 2021, 5:01 p.m. UTC | #1
On Fri, Nov 12, 2021, Xiaoyao Li wrote:
> From: Sean Christopherson <sean.j.christopherson@intel.com>
> 
> MCE is not supported for TDX VMs, and KVM cannot inject #MC into a TDX VM.
>
> Introduce kvm_guest_mce_disallowed(), which reports whether MCE is
> unavailable based on vm_type, and use it to guard all MCE-related CAPs
> and IOCTLs.
>
> Note: KVM_X86_GET_MCE_CAP_SUPPORTED is KVM-scoped, so what it reports may
> not match the behavior of a specific VM (e.g., a TDX VM). The same holds
> for KVM_CAP_MCE when queried from /dev/kvm. To query the precise
> KVM_CAP_MCE value for a VM, userspace should use the VM's fd.
> 
> [ Xiaoyao: Guard MCE related CAPs ]
> 
> Co-developed-by: Kai Huang <kai.huang@linux.intel.com>
> Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>  arch/x86/kvm/x86.c | 10 ++++++++++
>  arch/x86/kvm/x86.h |  5 +++++
>  2 files changed, 15 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index b02088343d80..2b21c5169f32 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4150,6 +4150,8 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  		break;
>  	case KVM_CAP_MCE:
>  		r = KVM_MAX_MCE_BANKS;
> +		if (kvm)
> +			r = kvm_guest_mce_disallowed(kvm) ? 0 : r;

		r = KVM_MAX_MCE_BANKS;
		if (kvm && kvm_guest_mce_disallowed(kvm))
			r = 0;

or

		r = (kvm && kvm_guest_mce_disallowed(kvm)) ? 0 : KVM_MAX_MCE_BANKS;

>  		break;
>  	case KVM_CAP_XCRS:
>  		r = boot_cpu_has(X86_FEATURE_XSAVE);
> @@ -5155,6 +5157,10 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>  	case KVM_X86_SETUP_MCE: {
>  		u64 mcg_cap;
>  
> +		r = -EINVAL;
> +		if (kvm_guest_mce_disallowed(vcpu->kvm))
> +			goto out;
> +
>  		r = -EFAULT;
>  		if (copy_from_user(&mcg_cap, argp, sizeof(mcg_cap)))
>  			goto out;
> @@ -5164,6 +5170,10 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>  	case KVM_X86_SET_MCE: {
>  		struct kvm_x86_mce mce;
>  
> +		r = -EINVAL;
> +		if (kvm_guest_mce_disallowed(vcpu->kvm))
> +			goto out;
> +
>  		r = -EFAULT;
>  		if (copy_from_user(&mce, argp, sizeof(mce)))
>  			goto out;
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index a2813892740d..69c60297bef2 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -441,6 +441,11 @@ static __always_inline bool kvm_irq_injection_disallowed(struct kvm_vcpu *vcpu)
>  	return vcpu->kvm->arch.vm_type == KVM_X86_TDX_VM;
>  }
>  
> +static __always_inline bool kvm_guest_mce_disallowed(struct kvm *kvm)

The "guest" part is potentially confusing and inconsistent with e.g.
kvm_irq_injection_disallowed.  And given the current ridiculous spec, CR4.MCE=1
is _required_, so saying "mce disallowed" is arguably wrong from that perspective.

kvm_mce_injection_disallowed() would be more appropriate.

> +{
> +	return kvm->arch.vm_type == KVM_X86_TDX_VM;
> +}
> +
>  void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu);
>  void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu);
>  int kvm_spec_ctrl_test_value(u64 value);
> -- 
> 2.27.0
>
Xiaoyao Li Nov. 15, 2021, 3:39 p.m. UTC | #2
On 11/13/2021 1:01 AM, Sean Christopherson wrote:
> On Fri, Nov 12, 2021, Xiaoyao Li wrote:
>> From: Sean Christopherson <sean.j.christopherson@intel.com>
>>
>> MCE is not supported for TDX VMs, and KVM cannot inject #MC into a TDX VM.
>>
>> Introduce kvm_guest_mce_disallowed(), which reports whether MCE is
>> unavailable based on vm_type, and use it to guard all MCE-related CAPs
>> and IOCTLs.
>>
>> Note: KVM_X86_GET_MCE_CAP_SUPPORTED is KVM-scoped, so what it reports may
>> not match the behavior of a specific VM (e.g., a TDX VM). The same holds
>> for KVM_CAP_MCE when queried from /dev/kvm. To query the precise
>> KVM_CAP_MCE value for a VM, userspace should use the VM's fd.
>>
>> [ Xiaoyao: Guard MCE related CAPs ]
>>
>> Co-developed-by: Kai Huang <kai.huang@linux.intel.com>
>> Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
>> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> ---
>>   arch/x86/kvm/x86.c | 10 ++++++++++
>>   arch/x86/kvm/x86.h |  5 +++++
>>   2 files changed, 15 insertions(+)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index b02088343d80..2b21c5169f32 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -4150,6 +4150,8 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>>   		break;
>>   	case KVM_CAP_MCE:
>>   		r = KVM_MAX_MCE_BANKS;
>> +		if (kvm)
>> +			r = kvm_guest_mce_disallowed(kvm) ? 0 : r;
> 
> 		r = KVM_MAX_MCE_BANKS;
> 		if (kvm && kvm_guest_mce_disallowed(kvm))
> 			r = 0;
> 
> or
> 
> 		r = (kvm && kvm_guest_mce_disallowed(kvm)) ? 0 : KVM_MAX_MCE_BANKS;

I will use this one in the next submission.

>>   		break;
>>   	case KVM_CAP_XCRS:
>>   		r = boot_cpu_has(X86_FEATURE_XSAVE);
>> @@ -5155,6 +5157,10 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>>   	case KVM_X86_SETUP_MCE: {
>>   		u64 mcg_cap;
>>   
>> +		r = -EINVAL;
>> +		if (kvm_guest_mce_disallowed(vcpu->kvm))
>> +			goto out;
>> +
>>   		r = -EFAULT;
>>   		if (copy_from_user(&mcg_cap, argp, sizeof(mcg_cap)))
>>   			goto out;
>> @@ -5164,6 +5170,10 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>>   	case KVM_X86_SET_MCE: {
>>   		struct kvm_x86_mce mce;
>>   
>> +		r = -EINVAL;
>> +		if (kvm_guest_mce_disallowed(vcpu->kvm))
>> +			goto out;
>> +
>>   		r = -EFAULT;
>>   		if (copy_from_user(&mce, argp, sizeof(mce)))
>>   			goto out;
>> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
>> index a2813892740d..69c60297bef2 100644
>> --- a/arch/x86/kvm/x86.h
>> +++ b/arch/x86/kvm/x86.h
>> @@ -441,6 +441,11 @@ static __always_inline bool kvm_irq_injection_disallowed(struct kvm_vcpu *vcpu)
>>   	return vcpu->kvm->arch.vm_type == KVM_X86_TDX_VM;
>>   }
>>   
>> +static __always_inline bool kvm_guest_mce_disallowed(struct kvm *kvm)
> 
> The "guest" part is potentially confusing and inconsistent with e.g.
> kvm_irq_injection_disallowed.  And given the current ridiculous spec, CR4.MCE=1
> is _required_, so saying "mce disallowed" is arguably wrong from that perspective.
> 
> kvm_mce_injection_disallowed() would be more appropriate.

Good advice, I'll rename it accordingly.

>> +{
>> +	return kvm->arch.vm_type == KVM_X86_TDX_VM;
>> +}
>> +
>>   void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu);
>>   void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu);
>>   int kvm_spec_ctrl_test_value(u64 value);
>> -- 
>> 2.27.0
>>

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b02088343d80..2b21c5169f32 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4150,6 +4150,8 @@  int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		break;
 	case KVM_CAP_MCE:
 		r = KVM_MAX_MCE_BANKS;
+		if (kvm)
+			r = kvm_guest_mce_disallowed(kvm) ? 0 : r;
 		break;
 	case KVM_CAP_XCRS:
 		r = boot_cpu_has(X86_FEATURE_XSAVE);
@@ -5155,6 +5157,10 @@  long kvm_arch_vcpu_ioctl(struct file *filp,
 	case KVM_X86_SETUP_MCE: {
 		u64 mcg_cap;
 
+		r = -EINVAL;
+		if (kvm_guest_mce_disallowed(vcpu->kvm))
+			goto out;
+
 		r = -EFAULT;
 		if (copy_from_user(&mcg_cap, argp, sizeof(mcg_cap)))
 			goto out;
@@ -5164,6 +5170,10 @@  long kvm_arch_vcpu_ioctl(struct file *filp,
 	case KVM_X86_SET_MCE: {
 		struct kvm_x86_mce mce;
 
+		r = -EINVAL;
+		if (kvm_guest_mce_disallowed(vcpu->kvm))
+			goto out;
+
 		r = -EFAULT;
 		if (copy_from_user(&mce, argp, sizeof(mce)))
 			goto out;
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index a2813892740d..69c60297bef2 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -441,6 +441,11 @@  static __always_inline bool kvm_irq_injection_disallowed(struct kvm_vcpu *vcpu)
 	return vcpu->kvm->arch.vm_type == KVM_X86_TDX_VM;
 }
 
+static __always_inline bool kvm_guest_mce_disallowed(struct kvm *kvm)
+{
+	return kvm->arch.vm_type == KVM_X86_TDX_VM;
+}
+
 void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu);
 void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu);
 int kvm_spec_ctrl_test_value(u64 value);
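
For readers following the review, here is a standalone model of how the logic
might look with the feedback from the thread applied: the helper renamed to
kvm_mce_injection_disallowed() and the KVM_CAP_MCE case collapsed into a
single expression. This is only a sketch under stated assumptions, not the
next revision of the patch. The struct layouts and the numeric values of
KVM_X86_DEFAULT_VM, KVM_X86_TDX_VM and KVM_MAX_MCE_BANKS are stand-ins so the
code compiles outside the kernel, and check_cap_mce() and mce_ioctl_guard()
are hypothetical names for the relevant pieces of
kvm_vm_ioctl_check_extension() and kvm_arch_vcpu_ioctl().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in values and types; the real definitions live in the kernel tree. */
#define KVM_MAX_MCE_BANKS	32
#define KVM_X86_DEFAULT_VM	0	/* assumed value, for illustration */
#define KVM_X86_TDX_VM		1	/* assumed value, for illustration */

struct kvm_arch { unsigned long vm_type; };
struct kvm { struct kvm_arch arch; };

/* Renamed as suggested in review: the restriction is on #MC injection. */
static inline bool kvm_mce_injection_disallowed(const struct kvm *kvm)
{
	return kvm->arch.vm_type == KVM_X86_TDX_VM;
}

/* Models the KVM_CAP_MCE case; kvm is NULL when queried via /dev/kvm. */
static int check_cap_mce(const struct kvm *kvm)
{
	return (kvm && kvm_mce_injection_disallowed(kvm)) ? 0 : KVM_MAX_MCE_BANKS;
}

/*
 * Models the guard in front of KVM_X86_SETUP_MCE and KVM_X86_SET_MCE.
 * Returns a negative errno on refusal, as kernel ioctl paths do.
 */
static int mce_ioctl_guard(const struct kvm *kvm)
{
	assert(kvm != NULL);
	return kvm_mce_injection_disallowed(kvm) ? -EINVAL : 0;
}
```

In this model a NULL kvm (the /dev/kvm query) still reports
KVM_MAX_MCE_BANKS, which matches the commit message's note that the
KVM-scoped answer may differ from what a particular VM supports.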