[19/19] KVM: x86: Enable supervisor IBT support for guest

Message ID 20220616084643.19564-20-weijiang.yang@intel.com (mailing list archive)
State New, archived
Series Refresh queued CET virtualization series

Commit Message

Yang, Weijiang June 16, 2022, 8:46 a.m. UTC
The mainline kernel now supports supervisor IBT (s-IBT) for kernel code.
To make s-IBT work in a guest (or nested guest), pass MSR_IA32_S_CET
through to the guest (or nested guest) if the host kernel and KVM have
IBT enabled. Note that s-IBT can work independently of host XSAVES
support, because the guest's MSR_IA32_S_CET is saved to and loaded from
a dedicated VMCS field.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/cpuid.h      |  5 +++++
 arch/x86/kvm/vmx/nested.c |  3 +++
 arch/x86/kvm/vmx/vmx.c    | 27 ++++++++++++++++++++++++---
 arch/x86/kvm/x86.c        | 13 ++++++++++++-
 4 files changed, 44 insertions(+), 4 deletions(-)

Comments

Peter Zijlstra June 16, 2022, 11:05 a.m. UTC | #1
On Thu, Jun 16, 2022 at 04:46:43AM -0400, Yang Weijiang wrote:
> Mainline kernel now supports supervisor IBT for kernel code,
> to make s-IBT work in guest(nested guest), pass through
> MSR_IA32_S_CET to guest(nested guest) if host kernel and KVM
> enabled IBT. Note, s-IBT can work independent to host xsaves
> support because guest MSR_IA32_S_CET can be stored/loaded from
> specific VMCS field.
> 
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>  arch/x86/kvm/cpuid.h      |  5 +++++
>  arch/x86/kvm/vmx/nested.c |  3 +++
>  arch/x86/kvm/vmx/vmx.c    | 27 ++++++++++++++++++++++++---
>  arch/x86/kvm/x86.c        | 13 ++++++++++++-
>  4 files changed, 44 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index ac72aabba981..c67c1e2fc11a 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -230,4 +230,9 @@ static __always_inline bool guest_pv_has(struct kvm_vcpu *vcpu,
>  	return vcpu->arch.pv_cpuid.features & (1u << kvm_feature);
>  }
>  
> +static __always_inline bool cet_kernel_ibt_supported(void)
> +{
> +	return HAS_KERNEL_IBT && kvm_cpu_cap_has(X86_FEATURE_IBT);
> +}

As stated before, I would much rather it exposed S_CET unconditionally,
regardless of host kernel config.
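
To make the difference concrete, here is a minimal sketch contrasting the
check as posted with a variant that ignores the host kernel config. It is a
standalone model, not KVM code: MODEL_HAS_KERNEL_IBT stands in for the
kernel's HAS_KERNEL_IBT (assumed to be 1 when the host is built with
CONFIG_X86_KERNEL_IBT and 0 otherwise), the kvm_cap_ibt parameter stands in
for kvm_cpu_cap_has(X86_FEATURE_IBT), and both function names are
hypothetical.

```c
#include <stdbool.h>

/*
 * Stand-in for HAS_KERNEL_IBT. Hard-coded to 0 here to model a host
 * kernel that was built without kernel IBT support.
 */
#define MODEL_HAS_KERNEL_IBT 0

/* As posted: S_CET is offered only if the host kernel itself uses IBT. */
static bool ibt_exposed_as_posted(bool kvm_cap_ibt)
{
	return MODEL_HAS_KERNEL_IBT && kvm_cap_ibt;
}

/* Hypothetical alternative: depend only on the KVM capability bit. */
static bool ibt_exposed_unconditional(bool kvm_cap_ibt)
{
	return kvm_cap_ibt;
}
```

With the host modelled as built without kernel IBT, the first helper never
reports support, while the second follows the capability bit alone.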

Peter Zijlstra June 16, 2022, 11:19 a.m. UTC | #2
On Thu, Jun 16, 2022 at 04:46:43AM -0400, Yang Weijiang wrote:
> Mainline kernel now supports supervisor IBT for kernel code,
> to make s-IBT work in guest(nested guest), pass through
> MSR_IA32_S_CET to guest(nested guest) if host kernel and KVM
> enabled IBT. Note, s-IBT can work independent to host xsaves
> support because guest MSR_IA32_S_CET can be stored/loaded from
> specific VMCS field.


> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index fe049d0e5ecc..c0118b33806a 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1463,6 +1463,7 @@ static const u32 msrs_to_save_all[] = {
>  	MSR_IA32_XFD, MSR_IA32_XFD_ERR,
>  	MSR_IA32_XSS,
>  	MSR_IA32_U_CET, MSR_IA32_PL3_SSP, MSR_KVM_GUEST_SSP,
> +	MSR_IA32_S_CET,
>  };


So much like my local kvm/qemu hacks; this patch suffers the problem of
not exposing S_SHSTK. What happens if a guest tries to use that?

Should we intercept and reject setting those bits or complete this patch
and support full S_SHSTK? (with all the warts and horrors that entails)

I don't think throwing this out in this half-finished state makes much
sense (which is why I never much shared my hacks).
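
For the "intercept and reject" option, a rough sketch of what the write
check could look like is below. This is a standalone model under stated
assumptions, not the patch's code: the MODEL_* macro names and
s_cet_write_ok() are hypothetical, and the bit positions (bit 0 SH_STK_EN,
bit 1 WR_SHSTK_EN, bits 9:6 reserved) follow my reading of the SDM layout
for IA32_S_CET. The reserved-bit mask matches the GENMASK(9, 6) test already
in vmx_set_msr().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the shadow-stack control bits of IA32_S_CET. */
#define MODEL_CET_SHSTK_EN  (1ULL << 0)   /* SH_STK_EN */
#define MODEL_CET_WRSS_EN   (1ULL << 1)   /* WR_SHSTK_EN */
#define MODEL_CET_RESERVED  (0xfULL << 6) /* bits 9:6 */

/*
 * Return true if a guest write of 'data' to S_CET should be accepted.
 * Shadow-stack bits are refused unless the guest was given SHSTK.
 */
static bool s_cet_write_ok(uint64_t data, bool guest_has_shstk)
{
	if (data & MODEL_CET_RESERVED)
		return false;

	if (!guest_has_shstk &&
	    (data & (MODEL_CET_SHSTK_EN | MODEL_CET_WRSS_EN)))
		return false;

	return true;
}
```

A check like this only has an effect while writes to the MSR are
intercepted, so it would not cover a guest that has S_CET passed through.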


> @@ -11830,7 +11835,13 @@ int kvm_arch_hardware_setup(void *opaque)
>  	/* Update CET features now as kvm_caps.supported_xss is finalized. */
>  	if (!kvm_cet_user_supported()) {
>  		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
> -		kvm_cpu_cap_clear(X86_FEATURE_IBT);
> +		/* If CET user bit is disabled due to cmdline option such as
> +		 * noxsaves, but kernel IBT is on, this means we can expose
> +		 * kernel IBT alone to guest since CET user mode msrs are not
> +		 * passed through to guest.
> +		 */

Invalid multi-line comment style.
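
For reference, the preferred kernel style puts the opening "/*" on a line
of its own. The sketch below shows the hunk's comment in that form inside a
small standalone model of the capability logic; struct model_cet_caps,
finalize_cet_caps() and its parameters are hypothetical stand-ins for the
kvm_cpu_cap_* state and the two *_supported() helpers, and the comment
wording is a paraphrase.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two CET-related KVM capability bits. */
struct model_cet_caps {
	bool shstk;
	bool ibt;
};

static struct model_cet_caps
finalize_cet_caps(struct model_cet_caps caps, bool user_cet_supported,
		  bool kernel_ibt_supported)
{
	if (!user_cet_supported) {
		caps.shstk = false;
		/*
		 * If CET user state is unavailable (e.g. because of a
		 * cmdline option such as noxsaves) but kernel IBT is on,
		 * kernel IBT alone can still be exposed to the guest,
		 * since the user-mode CET MSRs are not passed through.
		 */
		if (!kernel_ibt_supported)
			caps.ibt = false;
	}

	return caps;
}
```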

> +		if (!cet_kernel_ibt_supported())
> +			kvm_cpu_cap_clear(X86_FEATURE_IBT);

Yang, Weijiang June 16, 2022, 3:56 p.m. UTC | #3
On 6/16/2022 7:19 PM, Peter Zijlstra wrote:
> On Thu, Jun 16, 2022 at 04:46:43AM -0400, Yang Weijiang wrote:
>> Mainline kernel now supports supervisor IBT for kernel code,
>> to make s-IBT work in guest(nested guest), pass through
>> MSR_IA32_S_CET to guest(nested guest) if host kernel and KVM
>> enabled IBT. Note, s-IBT can work independent to host xsaves
>> support because guest MSR_IA32_S_CET can be stored/loaded from
>> specific VMCS field.
>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index fe049d0e5ecc..c0118b33806a 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -1463,6 +1463,7 @@ static const u32 msrs_to_save_all[] = {
>>   	MSR_IA32_XFD, MSR_IA32_XFD_ERR,
>>   	MSR_IA32_XSS,
>>   	MSR_IA32_U_CET, MSR_IA32_PL3_SSP, MSR_KVM_GUEST_SSP,
>> +	MSR_IA32_S_CET,
>>   };
>
> So much like my local kvm/qemu hacks; this patch suffers the problem of
> not exposing S_SHSTK. What happens if a guest tries to use that?
With the current solution, I think the guest kernel will hit #GP when
reading or writing PL0_SSP.
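
A rough model of why that would happen: the series only adds cases for
U_CET, S_CET and PL3_SSP (plus the synthetic MSR_KVM_GUEST_SSP, left out
here), so an access to PL0_SSP falls through to the default path. The
sketch below is standalone and hypothetical, not KVM code. The MODEL_*
names are mine, the MSR indices are the architectural ones as I understand
them, and it assumes that a nonzero return from the MSR handler ends up as
a #GP in the guest under default settings.

```c
#include <stdint.h>

/* Architectural MSR indices (assumed from the SDM). */
#define MODEL_MSR_IA32_U_CET    0x6a0
#define MODEL_MSR_IA32_S_CET    0x6a2
#define MODEL_MSR_IA32_PL0_SSP  0x6a4
#define MODEL_MSR_IA32_PL3_SSP  0x6a7

/*
 * Return 0 if the access is handled, 1 if it is refused. In this model a
 * refused access corresponds to the guest taking #GP.
 */
static int model_cet_msr_access(uint32_t index)
{
	switch (index) {
	case MODEL_MSR_IA32_U_CET:
	case MODEL_MSR_IA32_S_CET:
	case MODEL_MSR_IA32_PL3_SSP:
		return 0;
	default:
		/* PL0_SSP and the other supervisor SSP MSRs land here. */
		return 1;
	}
}
```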
>
> Should we intercept and reject setting those bits or complete this patch
> and support full S_SHSTK? (with all the warts and horrors that entails)
>
> I don't think throwing this out in this half-finished state makes much
> sense (which is why I never much shared my hacks).

You reminded me to think these cases over, even though I don't have a
solution yet. Thank you!

>
>
>> @@ -11830,7 +11835,13 @@ int kvm_arch_hardware_setup(void *opaque)
>>   	/* Update CET features now as kvm_caps.supported_xss is finalized. */
>>   	if (!kvm_cet_user_supported()) {
>>   		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
>> -		kvm_cpu_cap_clear(X86_FEATURE_IBT);
>> +		/* If CET user bit is disabled due to cmdline option such as
>> +		 * noxsaves, but kernel IBT is on, this means we can expose
>> +		 * kernel IBT alone to guest since CET user mode msrs are not
>> +		 * passed through to guest.
>> +		 */
> Invalid multi-line comment style.
Oops, last minute change messed it up :-(
>
>> +		if (!cet_kernel_ibt_supported())
>> +			kvm_cpu_cap_clear(X86_FEATURE_IBT);

Patch

diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index ac72aabba981..c67c1e2fc11a 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -230,4 +230,9 @@  static __always_inline bool guest_pv_has(struct kvm_vcpu *vcpu,
 	return vcpu->arch.pv_cpuid.features & (1u << kvm_feature);
 }
 
+static __always_inline bool cet_kernel_ibt_supported(void)
+{
+	return HAS_KERNEL_IBT && kvm_cpu_cap_has(X86_FEATURE_IBT);
+}
+
 #endif
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index f31f3d394507..d394136891d0 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -688,6 +688,9 @@  static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
 	nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
 					 MSR_IA32_U_CET, MSR_TYPE_RW);
 
+	nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
+					 MSR_IA32_S_CET, MSR_TYPE_RW);
+
 	nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0,
 					 MSR_IA32_PL3_SSP, MSR_TYPE_RW);
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 00782d1750a5..6e7e596c0147 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -585,6 +585,7 @@  static bool is_valid_passthrough_msr(u32 msr)
 		return true;
 	case MSR_IA32_U_CET:
 	case MSR_IA32_PL3_SSP:
+	case MSR_IA32_S_CET:
 		return true;
 	}
 
@@ -1773,7 +1774,8 @@  static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
 static bool cet_is_msr_accessible(struct kvm_vcpu *vcpu,
 				  struct msr_data *msr)
 {
-	if (!kvm_cet_user_supported())
+	if (!kvm_cet_user_supported() &&
+	    !cet_kernel_ibt_supported())
 		return false;
 
 	if (msr->host_initiated)
@@ -1783,6 +1785,10 @@  static bool cet_is_msr_accessible(struct kvm_vcpu *vcpu,
 	    !guest_cpuid_has(vcpu, X86_FEATURE_IBT))
 		return false;
 
+	if (msr->index == MSR_IA32_S_CET &&
+	    guest_cpuid_has(vcpu, X86_FEATURE_IBT))
+		return true;
+
 	if ((msr->index == MSR_IA32_PL3_SSP ||
 	     msr->index == MSR_KVM_GUEST_SSP) &&
 	    !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
@@ -1933,10 +1939,13 @@  static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_U_CET:
 	case MSR_IA32_PL3_SSP:
 	case MSR_KVM_GUEST_SSP:
+	case MSR_IA32_S_CET:
 		if (!cet_is_msr_accessible(vcpu, msr_info))
 			return 1;
 		if (msr_info->index == MSR_KVM_GUEST_SSP)
 			msr_info->data = vmcs_readl(GUEST_SSP);
+		else if (msr_info->index == MSR_IA32_S_CET)
+			msr_info->data = vmcs_readl(GUEST_S_CET);
 		else
 			kvm_get_xsave_msr(msr_info);
 		break;
@@ -2273,12 +2282,16 @@  static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			vmx->pt_desc.guest.addr_a[index / 2] = data;
 		break;
 	case MSR_IA32_U_CET:
+	case MSR_IA32_S_CET:
 		if (!cet_is_msr_accessible(vcpu, msr_info))
 			return 1;
 		if ((data & GENMASK(9, 6)) ||
 		    is_noncanonical_address(data, vcpu))
 			return 1;
-		kvm_set_xsave_msr(msr_info);
+		if (msr_index == MSR_IA32_S_CET)
+			vmcs_writel(GUEST_S_CET, data);
+		else
+			kvm_set_xsave_msr(msr_info);
 		break;
 	case MSR_IA32_PL3_SSP:
 	case MSR_KVM_GUEST_SSP:
@@ -7615,6 +7628,9 @@  static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu)
 
 	incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
 	vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP, MSR_TYPE_RW, incpt);
+
+	incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_IBT);
+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, MSR_TYPE_RW, incpt);
 }
 
 static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
@@ -7680,7 +7696,7 @@  static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	/* Refresh #PF interception to account for MAXPHYADDR changes. */
 	vmx_update_exception_bitmap(vcpu);
 
-	if (kvm_cet_user_supported())
+	if (kvm_cet_user_supported() || cet_kernel_ibt_supported())
 		vmx_update_intercept_for_cet_msr(vcpu);
 }
 
@@ -7743,6 +7759,11 @@  static __init void vmx_set_cpu_caps(void)
 		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
 #endif
 
+#ifndef CONFIG_X86_KERNEL_IBT
+	if (boot_cpu_has(X86_FEATURE_IBT))
+		kvm_cpu_cap_clear(X86_FEATURE_IBT);
+#endif
+
 }
 
 static void vmx_request_immediate_exit(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fe049d0e5ecc..c0118b33806a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1463,6 +1463,7 @@  static const u32 msrs_to_save_all[] = {
 	MSR_IA32_XFD, MSR_IA32_XFD_ERR,
 	MSR_IA32_XSS,
 	MSR_IA32_U_CET, MSR_IA32_PL3_SSP, MSR_KVM_GUEST_SSP,
+	MSR_IA32_S_CET,
 };
 
 static u32 msrs_to_save[ARRAY_SIZE(msrs_to_save_all)];
@@ -6823,6 +6824,10 @@  static void kvm_init_msr_list(void)
 			if (!kvm_cet_user_supported())
 				continue;
 			break;
+		case MSR_IA32_S_CET:
+			if (!cet_kernel_ibt_supported())
+				continue;
+			break;
 		default:
 			break;
 		}
@@ -11830,7 +11835,13 @@  int kvm_arch_hardware_setup(void *opaque)
 	/* Update CET features now as kvm_caps.supported_xss is finalized. */
 	if (!kvm_cet_user_supported()) {
 		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
-		kvm_cpu_cap_clear(X86_FEATURE_IBT);
+		/* If CET user bit is disabled due to cmdline option such as
+		 * noxsaves, but kernel IBT is on, this means we can expose
+		 * kernel IBT alone to guest since CET user mode msrs are not
+		 * passed through to guest.
+		 */
+		if (!cet_kernel_ibt_supported())
+			kvm_cpu_cap_clear(X86_FEATURE_IBT);
 	}
 
 	/*