
[RFC,1/2] KVM: x86: Add a framework for supporting MSR-based features

Message ID 20180208225846.22074.70944.stgit@tlendack-t1.amdoffice.net (mailing list archive)
State New, archived

Commit Message

Tom Lendacky Feb. 8, 2018, 10:58 p.m. UTC
Provide a new KVM capability that allows bits within MSRs to be recognized
as features.  Two new ioctls are added to the VM ioctl routine to retrieve
the list of these MSRs and their values.  The MSR features can optionally
be exposed based on the CPU model and/or a CPU feature.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    1 
 arch/x86/kvm/x86.c              |  108 ++++++++++++++++++++++++++++++++++++++-
 include/uapi/linux/kvm.h        |    1 
 3 files changed, 108 insertions(+), 2 deletions(-)
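
For orientation, a hypothetical userspace view of the proposed flow (fd
handling, allocation sizes and error checking are illustrative only, and
the last ioctl uses the RFC's name as posted, before the rename suggested
in the review below):

#include <linux/kvm.h>
#include <stdlib.h>
#include <sys/ioctl.h>

static void query_msr_features(int vm_fd)
{
        struct kvm_msr_list probe = { .nmsrs = 0 };
        struct kvm_msr_list *list;
        struct kvm_msrs *msrs;
        unsigned int i, n;

        /* First pass fails with E2BIG but reports how many MSRs exist. */
        ioctl(vm_fd, KVM_GET_MSR_INDEX_LIST, &probe);
        n = probe.nmsrs;

        /* Second pass retrieves the feature MSR indices. */
        list = calloc(1, sizeof(*list) + n * sizeof(__u32));
        list->nmsrs = n;
        ioctl(vm_fd, KVM_GET_MSR_INDEX_LIST, list);

        /* Finally, read each feature MSR's host-supported value. */
        msrs = calloc(1, sizeof(*msrs) + n * sizeof(struct kvm_msr_entry));
        msrs->nmsrs = n;
        for (i = 0; i < n; i++)
                msrs->entries[i].index = list->indices[i];
        ioctl(vm_fd, KVM_CAP_GET_MSR_FEATURES, msrs);

        free(list);
        free(msrs);
}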

Comments

Paolo Bonzini Feb. 13, 2018, 4:21 p.m. UTC | #1
On 08/02/2018 23:58, Tom Lendacky wrote:
> Provide a new KVM capability that allows bits within MSRs to be recognized
> as features.  Two new ioctls are added to the VM ioctl routine to retrieve
> the list of these MSRs and their values. The MSR features can optionally
> be exposed based on a CPU and/or a CPU feature.

Yes, pretty much.  Just two changes:

> +struct kvm_msr_based_features {
> +	u32 msr;			/* MSR to query */
> +	u64 mask;			/* MSR mask */
> +	const struct x86_cpu_id *match;	/* Match criteria */
> +	u64 value;			/* MSR value */

1) These last two fields (match and value) should be replaced by a
kvm_x86_ops callback, because computing the value is sometimes a bit more
complicated than just rdmsr (for example, MSRs for VMX capabilities depend
on the kvm_intel.ko module parameters).
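
A rough sketch of that direction (hypothetical hook and vendor handling,
not a final API; the vendor module computes and filters the value itself):

struct kvm_x86_ops {
        /* ... existing callbacks ... */
        /* Report the host-supported value of a feature MSR. */
        int (*get_msr_feature)(struct kvm_msr_entry *entry);
};

/* e.g. in svm.c, exposing only the LFENCE-serializing bit of DE_CFG: */
static int svm_get_msr_feature(struct kvm_msr_entry *msr)
{
        switch (msr->index) {
        case MSR_F10H_DECFG:
                rdmsrl(MSR_F10H_DECFG, msr->data);
                msr->data &= MSR_F10H_DECFG_LFENCE_SERIALIZE;
                return 0;
        default:
                return 1;       /* MSR not offered as a feature */
        }
}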


> +	case KVM_CAP_GET_MSR_FEATURES:

This should be KVM_GET_MSR.

> +		r = msr_io(NULL, argp, do_get_msr_features, 1);
> +		break;


Bonus points for writing documentation :) and for moving the MSR
handling code to arch/x86/kvm/msr.{c,h}.

Thanks,

Paolo
Paolo Bonzini Feb. 13, 2018, 4:25 p.m. UTC | #2
On 08/02/2018 23:58, Tom Lendacky wrote:
> +bool kvm_valid_msr_feature(u32 msr, u64 data)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < num_msr_based_features; i++) {
> +		struct kvm_msr_based_features *m = msr_based_features + i;
> +
> +		if (msr != m->msr)
> +			continue;
> +
> +		/* Make sure not trying to change unsupported bits */
> +		return (data & ~m->mask) ? false : true;
> +	}
> +
> +	return false;
> +}
> +EXPORT_SYMBOL_GPL(kvm_valid_msr_feature);
> +

This is probably unnecessary too (the allowed values are a bit more
complicated for, you just guessed it, VMX capability MSRs) and you can
just check bits other than LFENCE in svm_set_msr.
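
A sketch of that vendor-side check (illustrative only; the rest of
svm_set_msr is elided):

static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
{
        switch (msr->index) {
        case MSR_F10H_DECFG:
                /* Reject attempts to set anything but the LFENCE bit. */
                if (msr->data & ~MSR_F10H_DECFG_LFENCE_SERIALIZE)
                        return 1;
                break;
        default:
                /* existing MSR handling continues here */
                break;
        }
        return 0;
}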

Paolo
Tom Lendacky Feb. 14, 2018, 4:23 a.m. UTC | #3
On 2/13/2018 10:21 AM, Paolo Bonzini wrote:
> On 08/02/2018 23:58, Tom Lendacky wrote:
>> Provide a new KVM capability that allows bits within MSRs to be recognized
>> as features.  Two new ioctls are added to the VM ioctl routine to retrieve
>> the list of these MSRs and their values. The MSR features can optionally
>> be exposed based on a CPU and/or a CPU feature.
> 
> Yes, pretty much.  Just two changes:
> 
>> +struct kvm_msr_based_features {
>> +	u32 msr;			/* MSR to query */
>> +	u64 mask;			/* MSR mask */
>> +	const struct x86_cpu_id *match;	/* Match criteria */
>> +	u64 value;			/* MSR value */
> 
> 1) These last two fields (match and value) should be replaced by a
> kvm_x86_ops callback, because computing the value is sometimes a bit more
> complicated than just rdmsr (for example, MSRs for VMX capabilities
> depend on the kvm_intel.ko module parameters).

Ok, I'll rework this.

> 
> 
>> +	case KVM_CAP_GET_MSR_FEATURES:
> 
> This should be KVM_GET_MSR.

Yup, not sure what I was thinking there.

> 
>> +		r = msr_io(NULL, argp, do_get_msr_features, 1);
>> +		break;
> 
> 
> Bonus points for writing documentation :) and for moving the MSR
> handling code to arch/x86/kvm/msr.{c,h}.

Yup, there will be documentation on it - I wanted to make sure the
direction was correct first.  Splitting out msr.c/msr.h might be
best as a separate patchset, let me see what's involved.

Thanks,
Tom

> 
> Thanks,
> 
> Paolo
>
Tom Lendacky Feb. 14, 2018, 4:42 a.m. UTC | #4
On 2/13/2018 10:25 AM, Paolo Bonzini wrote:
> On 08/02/2018 23:58, Tom Lendacky wrote:
>> +bool kvm_valid_msr_feature(u32 msr, u64 data)
>> +{
>> +	unsigned int i;
>> +
>> +	for (i = 0; i < num_msr_based_features; i++) {
>> +		struct kvm_msr_based_features *m = msr_based_features + i;
>> +
>> +		if (msr != m->msr)
>> +			continue;
>> +
>> +		/* Make sure not trying to change unsupported bits */
>> +		return (data & ~m->mask) ? false : true;
>> +	}
>> +
>> +	return false;
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_valid_msr_feature);
>> +
> 
> This is probably unnecessary too (the allowed values are a bit more
> complicated for, you just guessed it, VMX capability MSRs) and you can
> just check bits other than LFENCE in svm_set_msr.

The whole routine or just the bit checking?  I can see still needing the
check to be sure the "feature" is present.

Thanks,
Tom

> 
> Paolo
>
Paolo Bonzini Feb. 14, 2018, 4:41 p.m. UTC | #5
On 14/02/2018 05:42, Tom Lendacky wrote:
>>> +bool kvm_valid_msr_feature(u32 msr, u64 data)
>>> +{
>>> +	unsigned int i;
>>> +
>>> +	for (i = 0; i < num_msr_based_features; i++) {
>>> +		struct kvm_msr_based_features *m = msr_based_features + i;
>>> +
>>> +		if (msr != m->msr)
>>> +			continue;
>>> +
>>> +		/* Make sure not trying to change unsupported bits */
>>> +		return (data & ~m->mask) ? false : true;
>>> +	}
>>> +
>>> +	return false;
>>> +}
>>> +EXPORT_SYMBOL_GPL(kvm_valid_msr_feature);
>>> +
>>
>> This is probably unnecessary too (the allowed values are a bit more
>> complicated for, you just guessed it, VMX capability MSRs) and you can
>> just check bits other than LFENCE in svm_set_msr.
>
> The whole routine or just the bit checking?  I can see still needing the
> check to be sure the "feature" is present.

You can return the MSR unconditionally from KVM_GET_MSR_INDEX_LIST.
Then KVM_GET_MSR would return 0 or 1 depending on whether the feature is
present.
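
A sketch of those semantics, reusing the hypothetical get_msr_feature hook
from above (msr_io counts successfully processed MSRs, so "present" falls
out of the per-MSR return value):

static int do_get_msr_features(struct kvm_vcpu *vcpu, unsigned index,
                               u64 *data)
{
        struct kvm_msr_entry msr = { .index = index };

        /* Vendor code fills in the value, or reports "not present". */
        if (kvm_x86_ops->get_msr_feature(&msr))
                return 1;

        *data = msr.data;
        return 0;
}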

Paolo
Borislav Petkov Feb. 14, 2018, 4:44 p.m. UTC | #6
On Thu, Feb 08, 2018 at 04:58:46PM -0600, Tom Lendacky wrote:
> @@ -2681,11 +2731,15 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
>  {
>  	int i, idx;
>  
> -	idx = srcu_read_lock(&vcpu->kvm->srcu);
> +	if (vcpu)
> +		idx = srcu_read_lock(&vcpu->kvm->srcu);
> +
>  	for (i = 0; i < msrs->nmsrs; ++i)
>  		if (do_msr(vcpu, entries[i].index, &entries[i].data))
>  			break;
> -	srcu_read_unlock(&vcpu->kvm->srcu, idx);
> +
> +	if (vcpu)
> +		srcu_read_unlock(&vcpu->kvm->srcu, idx);


./include/linux/srcu.h:175:2: warning: ‘idx’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  __srcu_read_unlock(sp, idx);
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/x86/kvm/x86.c:2739:9: note: ‘idx’ was declared here
  int i, idx;
         ^~~

I know, silly gcc.
Paolo Bonzini Feb. 14, 2018, 4:58 p.m. UTC | #7
On 14/02/2018 17:44, Borislav Petkov wrote:
> On Thu, Feb 08, 2018 at 04:58:46PM -0600, Tom Lendacky wrote:
>> @@ -2681,11 +2731,15 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
>>  {
>>  	int i, idx;
>>  
>> -	idx = srcu_read_lock(&vcpu->kvm->srcu);
>> +	if (vcpu)
>> +		idx = srcu_read_lock(&vcpu->kvm->srcu);
>> +
>>  	for (i = 0; i < msrs->nmsrs; ++i)
>>  		if (do_msr(vcpu, entries[i].index, &entries[i].data))
>>  			break;
>> -	srcu_read_unlock(&vcpu->kvm->srcu, idx);
>> +
>> +	if (vcpu)
>> +		srcu_read_unlock(&vcpu->kvm->srcu, idx);
> 
> 
> ./include/linux/srcu.h:175:2: warning: ‘idx’ may be used uninitialized in this function [-Wmaybe-uninitialized]
>   __srcu_read_unlock(sp, idx);
>   ^~~~~~~~~~~~~~~~~~~~~~~~~~~
> arch/x86/kvm/x86.c:2739:9: note: ‘idx’ was declared here
>   int i, idx;
>          ^~~
> 
> I know, silly gcc.
> 

Nice point---even better, just push srcu_read_lock/unlock to msr_io or
even msr_io's callers.
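
For instance (sketch; the helper name is made up, and __msr_io itself would
no longer touch SRCU), taking the lock only on the vCPU path keeps idx
initialized everywhere it is used:

static int msr_io_locked(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
                         struct kvm_msr_entry *entries,
                         int (*do_msr)(struct kvm_vcpu *vcpu,
                                       unsigned index, u64 *data))
{
        int idx, r;

        if (!vcpu)
                return __msr_io(NULL, msrs, entries, do_msr);

        idx = srcu_read_lock(&vcpu->kvm->srcu);
        r = __msr_io(vcpu, msrs, entries, do_msr);
        srcu_read_unlock(&vcpu->kvm->srcu, idx);
        return r;
}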

Thanks,

Paolo

Patch

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index dd6f57a..5568d0d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1198,6 +1198,7 @@  static inline int emulate_instruction(struct kvm_vcpu *vcpu,
 bool kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer);
 int kvm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr);
 int kvm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr);
+bool kvm_valid_msr_feature(u32 msr, u64 mask);
 
 struct x86_emulate_ctxt;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 07d1c7f..4251c34 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -69,6 +69,7 @@ 
 #include <asm/irq_remapping.h>
 #include <asm/mshyperv.h>
 #include <asm/hypervisor.h>
+#include <asm/cpu_device_id.h>
 
 #define CREATE_TRACE_POINTS
 #include "trace.h"
@@ -1048,6 +1049,41 @@  bool kvm_rdpmc(struct kvm_vcpu *vcpu)
 
 static unsigned num_emulated_msrs;
 
+/*
+ * List of msr numbers which are used to expose MSR-based features that
+ * can be used by QEMU to validate requested CPU features.
+ */
+struct kvm_msr_based_features {
+	u32 msr;			/* MSR to query */
+	u64 mask;			/* MSR mask */
+	const struct x86_cpu_id *match;	/* Match criteria */
+	u64 value;			/* MSR value */
+};
+
+static struct kvm_msr_based_features msr_based_features[] = {
+	{}
+};
+
+static unsigned int num_msr_based_features;
+
+bool kvm_valid_msr_feature(u32 msr, u64 data)
+{
+	unsigned int i;
+
+	for (i = 0; i < num_msr_based_features; i++) {
+		struct kvm_msr_based_features *m = msr_based_features + i;
+
+		if (msr != m->msr)
+			continue;
+
+		/* Make sure not trying to change unsupported bits */
+		return (data & ~m->mask) ? false : true;
+	}
+
+	return false;
+}
+EXPORT_SYMBOL_GPL(kvm_valid_msr_feature);
+
 bool kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer)
 {
 	if (efer & efer_reserved_bits)
@@ -1156,6 +1192,20 @@  static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
 	return kvm_set_msr(vcpu, &msr);
 }
 
+static int do_get_msr_features(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
+{
+	unsigned int i;
+
+	for (i = 0; i < num_msr_based_features; i++) {
+		if (msr_based_features[i].msr == index) {
+			*data = msr_based_features[i].value;
+			return 0;
+		}
+	}
+
+	return 1;
+}
+
 #ifdef CONFIG_X86_64
 struct pvclock_gtod_data {
 	seqcount_t	seq;
@@ -2681,11 +2731,15 @@  static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
 {
 	int i, idx;
 
-	idx = srcu_read_lock(&vcpu->kvm->srcu);
+	if (vcpu)
+		idx = srcu_read_lock(&vcpu->kvm->srcu);
+
 	for (i = 0; i < msrs->nmsrs; ++i)
 		if (do_msr(vcpu, entries[i].index, &entries[i].data))
 			break;
-	srcu_read_unlock(&vcpu->kvm->srcu, idx);
+
+	if (vcpu)
+		srcu_read_unlock(&vcpu->kvm->srcu, idx);
 
 	return i;
 }
@@ -2784,6 +2838,7 @@  int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_SET_BOOT_CPU_ID:
  	case KVM_CAP_SPLIT_IRQCHIP:
 	case KVM_CAP_IMMEDIATE_EXIT:
+	case KVM_CAP_GET_MSR_FEATURES:
 		r = 1;
 		break;
 	case KVM_CAP_ADJUST_CLOCK:
@@ -4408,6 +4463,36 @@  long kvm_arch_vm_ioctl(struct file *filp,
 			r = kvm_x86_ops->mem_enc_unreg_region(kvm, &region);
 		break;
 	}
+	case KVM_GET_MSR_INDEX_LIST: {
+		u32 feature_msrs[ARRAY_SIZE(msr_based_features)];
+		struct kvm_msr_list __user *user_msr_list = argp;
+		struct kvm_msr_list msr_list;
+		unsigned int i, n;
+
+		r = -EFAULT;
+		if (copy_from_user(&msr_list, user_msr_list, sizeof(msr_list)))
+			goto out;
+		n = msr_list.nmsrs;
+		msr_list.nmsrs = num_msr_based_features;
+		if (copy_to_user(user_msr_list, &msr_list, sizeof(msr_list)))
+			goto out;
+		r = -E2BIG;
+		if (n < msr_list.nmsrs)
+			goto out;
+
+		for (i = 0; i < num_msr_based_features; i++)
+			feature_msrs[i] = msr_based_features[i].msr;
+
+		r = -EFAULT;
+		if (copy_to_user(user_msr_list->indices, &feature_msrs,
+				 num_msr_based_features * sizeof(u32)))
+			goto out;
+		r = 0;
+		break;
+	}
+	case KVM_CAP_GET_MSR_FEATURES:
+		r = msr_io(NULL, argp, do_get_msr_features, 1);
+		break;
 	default:
 		r = -ENOTTY;
 	}
@@ -4462,6 +4547,25 @@  static void kvm_init_msr_list(void)
 		j++;
 	}
 	num_emulated_msrs = j;
+
+	for (i = j = 0; ; i++) {
+		struct kvm_msr_based_features *m = msr_based_features + i;
+
+		if (!m->msr)
+			break;
+
+		if (!x86_match_cpu(m->match))
+			continue;
+
+		rdmsrl_safe(m->msr, &m->value);
+		m->value &= m->mask;
+
+		if (j < i)
+			msr_based_features[j] = msr_based_features[i];
+
+		j++;
+	}
+	num_msr_based_features = j;
 }
 
 static int vcpu_mmio_write(struct kvm_vcpu *vcpu, gpa_t addr, int len,
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 0fb5ef9..429784c 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -934,6 +934,7 @@  struct kvm_ppc_resize_hpt {
 #define KVM_CAP_S390_AIS_MIGRATION 150
 #define KVM_CAP_PPC_GET_CPU_CHAR 151
 #define KVM_CAP_S390_BPB 152
+#define KVM_CAP_GET_MSR_FEATURES 153
 
 #ifdef KVM_CAP_IRQ_ROUTING