diff mbox

[3/4] KVM: Add SMAP support when setting CR4

Message ID 1395923135-15329-4-git-send-email-feng.wu@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Wu, Feng March 27, 2014, 12:25 p.m. UTC
This patch adds SMAP handling logic when setting CR4 for guests

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 arch/x86/kvm/cpuid.h |  8 ++++++++
 arch/x86/kvm/mmu.c   | 22 +++++++++++++++++++---
 arch/x86/kvm/mmu.h   |  2 ++
 arch/x86/kvm/x86.c   |  6 ++++++
 4 files changed, 35 insertions(+), 3 deletions(-)

Comments

Paolo Bonzini March 27, 2014, 11:46 a.m. UTC | #1
Il 27/03/2014 13:25, Feng Wu ha scritto:
> +void update_permission_bitmask(struct kvm_vcpu *vcpu,
>  		struct kvm_mmu *mmu, bool ept)
>  {
>  	unsigned bit, byte, pfec;
>  	u8 map;
> -	bool fault, x, w, u, wf, uf, ff, smep;
> +	bool fault, x, w, u, wf, uf, ff, smep, smap;
>
>  	smep = kvm_read_cr4_bits(vcpu, X86_CR4_SMEP);
> +	smap = kvm_read_cr4_bits(vcpu, X86_CR4_SMAP);
>  	for (byte = 0; byte < ARRAY_SIZE(mmu->permissions); ++byte) {
>  		pfec = byte << 1;
>  		map = 0;
> @@ -3617,11 +3618,26 @@ static void update_permission_bitmask(struct kvm_vcpu *vcpu,
>  				w |= !is_write_protection(vcpu) && !uf;
>  				/* Disallow supervisor fetches of user code if cr4.smep */
>  				x &= !(smep && u && !uf);
> +
> +				/*
> +				 * SMAP:kernel-mode data accesses from user-mode
> +				 * mappings should fault. A fault is considered
> +				 * as a SMAP violation if all of the following
> +				 * conditions are ture:
> +				 *   - X86_CR4_SMAP is set in CR4
> +				 *   - An user page is accessed
> +				 *   - !(CPL<3 && X86_EFLAGS_AC is set)
> +				 *   - Page fault in kernel mode
> +				 */
> +				smap = smap && u && !uf &&
> +					!((kvm_x86_ops->get_cpl(vcpu) < 3) &&
> +					((kvm_x86_ops->get_rflags(vcpu) &
> +					X86_EFLAGS_AC) == 1));

Unfortunately this doesn't work.

The reason is that changing X86_EFLAGS_AC doesn't trigger 
update_permission_bitmask.  So the value of CPL < 3 && AC = 1 must not 
be checked in update_permission_bitmask; instead, it must be included in 
the index into the permissions array.  You can reuse the PFERR_RSVD_MASK 
bit, like

	smapf = pfec & PFERR_RSVD_MASK;
	...
		smap = smap && smapf u && !uf;

The VCPU can then be passed to permission_fault in order to get the 
value of the CPL and the AC bit.

Please test nested virtualization too.  I think PFERR_RSVD_MASK should 
be removed in translate_nested_gpa.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zhang, Yang Z March 28, 2014, 5:47 a.m. UTC | #2
Paolo Bonzini wrote on 2014-03-27:
> Il 27/03/2014 13:25, Feng Wu ha scritto:
>> +void update_permission_bitmask(struct kvm_vcpu *vcpu,
>>  		struct kvm_mmu *mmu, bool ept)
>>  {
>>  	unsigned bit, byte, pfec;
>>  	u8 map;
>> -	bool fault, x, w, u, wf, uf, ff, smep;
>> +	bool fault, x, w, u, wf, uf, ff, smep, smap;
>> 
>>  	smep = kvm_read_cr4_bits(vcpu, X86_CR4_SMEP); +	smap =
>>  kvm_read_cr4_bits(vcpu, X86_CR4_SMAP); 	for (byte = 0; byte <
>>  ARRAY_SIZE(mmu->permissions); ++byte) { 		pfec = byte << 1; 		map = 0;
>> @@ -3617,11 +3618,26 @@ static void update_permission_bitmask(struct
> kvm_vcpu *vcpu,
>>  				w |= !is_write_protection(vcpu) && !uf;
>>  				/* Disallow supervisor fetches of user code if cr4.smep */
>>  				x &= !(smep && u && !uf);
>> +
>> +				/*
>> +				 * SMAP:kernel-mode data accesses from user-mode
>> +				 * mappings should fault. A fault is considered
>> +				 * as a SMAP violation if all of the following
>> +				 * conditions are ture:
>> +				 *   - X86_CR4_SMAP is set in CR4
>> +				 *   - An user page is accessed
>> +				 *   - !(CPL<3 && X86_EFLAGS_AC is set)
>> +				 *   - Page fault in kernel mode
>> +				 */
>> +				smap = smap && u && !uf &&
>> +					!((kvm_x86_ops->get_cpl(vcpu) < 3) &&
>> +					((kvm_x86_ops->get_rflags(vcpu) &
>> +					X86_EFLAGS_AC) == 1));
> 
> Unfortunately this doesn't work.
> 
> The reason is that changing X86_EFLAGS_AC doesn't trigger
> update_permission_bitmask.  So the value of CPL < 3 && AC = 1 must not
> be checked in update_permission_bitmask; instead, it must be included
> in the index into the permissions array.  You can reuse the
> PFERR_RSVD_MASK bit, like
> 
> 	smapf = pfec & PFERR_RSVD_MASK;
> 	...
> 		smap = smap && smapf u && !uf;
> 
> The VCPU can then be passed to permission_fault in order to get the
> value of the CPL and the AC bit.

Is CPL check needed? Shouldn't it already have been covered by U bit?

> 
> Please test nested virtualization too.  I think PFERR_RSVD_MASK should
> be removed in translate_nested_gpa.
> 
> Paolo


Best regards,
Yang


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paolo Bonzini March 28, 2014, 6:23 a.m. UTC | #3
Il 28/03/2014 06:47, Zhang, Yang Z ha scritto:
>>> >> +				smap = smap && u && !uf &&
>>> >> +					!((kvm_x86_ops->get_cpl(vcpu) < 3) &&
>>> >> +					((kvm_x86_ops->get_rflags(vcpu) &
>>> >> +					X86_EFLAGS_AC) == 1));
>> >
>> > Unfortunately this doesn't work.
>> >
>> > The reason is that changing X86_EFLAGS_AC doesn't trigger
>> > update_permission_bitmask.  So the value of CPL < 3 && AC = 1 must not
>> > be checked in update_permission_bitmask; instead, it must be included
>> > in the index into the permissions array.  You can reuse the
>> > PFERR_RSVD_MASK bit, like
>> >
>> > 	smapf = pfec & PFERR_RSVD_MASK;
>> > 	...
>> > 		smap = smap && smapf u && !uf;
>> >
>> > The VCPU can then be passed to permission_fault in order to get the
>> > value of the CPL and the AC bit.
>
> Is CPL check needed? Shouldn't it already have been covered by U bit?

It is not needed but actually it is covered by !uf, I think.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wu, Feng March 28, 2014, 7:33 a.m. UTC | #4
> -----Original Message-----
> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] On
> Behalf Of Paolo Bonzini
> Sent: Friday, March 28, 2014 2:23 PM
> To: Zhang, Yang Z; Wu, Feng; gleb@redhat.com; hpa@zytor.com;
> kvm@vger.kernel.org
> Subject: Re: [PATCH 3/4] KVM: Add SMAP support when setting CR4
> 
> Il 28/03/2014 06:47, Zhang, Yang Z ha scritto:
> >>> >> +				smap = smap && u && !uf &&
> >>> >> +					!((kvm_x86_ops->get_cpl(vcpu) < 3) &&
> >>> >> +					((kvm_x86_ops->get_rflags(vcpu) &
> >>> >> +					X86_EFLAGS_AC) == 1));
> >> >
> >> > Unfortunately this doesn't work.
> >> >
> >> > The reason is that changing X86_EFLAGS_AC doesn't trigger
> >> > update_permission_bitmask.  So the value of CPL < 3 && AC = 1 must not
> >> > be checked in update_permission_bitmask; instead, it must be included
> >> > in the index into the permissions array.  You can reuse the
> >> > PFERR_RSVD_MASK bit, like
> >> >
> >> > 	smapf = pfec & PFERR_RSVD_MASK;
> >> > 	...
> >> > 		smap = smap && smapf u && !uf;
> >> >
> >> > The VCPU can then be passed to permission_fault in order to get the
> >> > value of the CPL and the AC bit.
> >
> > Is CPL check needed? Shouldn't it already have been covered by U bit?
> 
> It is not needed but actually it is covered by !uf, I think.

In my understanding it is needed, from Intel SDM:

"Every access to a linear address is either a supervisor-mode access
or a user-mode access. All accesses performed while the current
privilege level (CPL) is less than 3 are supervisor-mode accesses.
If CPL = 3, accesses are generally user-mode accesses. However, some
operations implicitly access system data structures, and the resulting
accesses to those data structures are supervisor-mode accesses regardless
of CPL. Examples of such implicit supervisor accesses include the following:
accesses to the global descriptor table (GDT) or local descriptor table
(LDT) to load a segment descriptor; accesses to the interrupt descriptor
table (IDT) when delivering an interrupt or exception; and accesses to the
task-state segment (TSS) as part of a task switch or change of CPL."

From the above SDM, we can see supervisor-mode access can also 
happen when CPL equals 3.

If CPL < 3, SMAP protections are disabled if EFLAGS.AC = 1. If CPL = 3, 
SMAP applies to all supervisor-mode data accesses (these are implicit
supervisor accesses) regardless of the value of EFLAGS.AC.

So when we check the value of EFLAGS.AC, we also need to check CPL, since AC
bit only takes effect when CPL<3.

U==1 means user-mode access are allowed, while !uf means it is a fault
from Supervisor-mode access, I think both *u* and *uf* cannot reflect the
value of CPL.

Correct me if I am wrong. Thanks a lot!

> 
> Paolo
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Thanks,
Feng
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wu, Feng March 28, 2014, 9:35 a.m. UTC | #5
> -----Original Message-----
> From: Paolo Bonzini [mailto:paolo.bonzini@gmail.com] On Behalf Of Paolo
> Bonzini
> Sent: Thursday, March 27, 2014 7:47 PM
> To: Wu, Feng; gleb@redhat.com; hpa@zytor.com; kvm@vger.kernel.org
> Subject: Re: [PATCH 3/4] KVM: Add SMAP support when setting CR4
> 
> Il 27/03/2014 13:25, Feng Wu ha scritto:
> > +void update_permission_bitmask(struct kvm_vcpu *vcpu,
> >  		struct kvm_mmu *mmu, bool ept)
> >  {
> >  	unsigned bit, byte, pfec;
> >  	u8 map;
> > -	bool fault, x, w, u, wf, uf, ff, smep;
> > +	bool fault, x, w, u, wf, uf, ff, smep, smap;
> >
> >  	smep = kvm_read_cr4_bits(vcpu, X86_CR4_SMEP);
> > +	smap = kvm_read_cr4_bits(vcpu, X86_CR4_SMAP);
> >  	for (byte = 0; byte < ARRAY_SIZE(mmu->permissions); ++byte) {
> >  		pfec = byte << 1;
> >  		map = 0;
> > @@ -3617,11 +3618,26 @@ static void update_permission_bitmask(struct
> kvm_vcpu *vcpu,
> >  				w |= !is_write_protection(vcpu) && !uf;
> >  				/* Disallow supervisor fetches of user code if cr4.smep */
> >  				x &= !(smep && u && !uf);
> > +
> > +				/*
> > +				 * SMAP:kernel-mode data accesses from user-mode
> > +				 * mappings should fault. A fault is considered
> > +				 * as a SMAP violation if all of the following
> > +				 * conditions are ture:
> > +				 *   - X86_CR4_SMAP is set in CR4
> > +				 *   - An user page is accessed
> > +				 *   - !(CPL<3 && X86_EFLAGS_AC is set)
> > +				 *   - Page fault in kernel mode
> > +				 */
> > +				smap = smap && u && !uf &&
> > +					!((kvm_x86_ops->get_cpl(vcpu) < 3) &&
> > +					((kvm_x86_ops->get_rflags(vcpu) &
> > +					X86_EFLAGS_AC) == 1));
> 
> Unfortunately this doesn't work.
> 
> The reason is that changing X86_EFLAGS_AC doesn't trigger
> update_permission_bitmask.  So the value of CPL < 3 && AC = 1 must not
> be checked in update_permission_bitmask; instead, it must be included in
> the index into the permissions array.  You can reuse the PFERR_RSVD_MASK
> bit, like
> 
> 	smapf = pfec & PFERR_RSVD_MASK;
> 	...
> 		smap = smap && smapf u && !uf;
> 
> The VCPU can then be passed to permission_fault in order to get the
> value of the CPL and the AC bit.
> 
> Please test nested virtualization too.  I think PFERR_RSVD_MASK should
> be removed in translate_nested_gpa.

Thanks very much for your comments, I changed the code according to it and
the patch v2 will be sent out for a review. Since setting up the environment for
nested virtualization is a little time consuming and it is in progress now, 
could you please have a look at it to see if it is what you expected first? 
Any comments are appreciated! 

> 
> Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paolo Bonzini March 28, 2014, 2:09 p.m. UTC | #6
Il 28/03/2014 08:33, Wu, Feng ha scritto:
> In my understanding it is needed, from Intel SDM:
>
> "Every access to a linear address is either a supervisor-mode access
> or a user-mode access. All accesses performed while the current
> privilege level (CPL) is less than 3 are supervisor-mode accesses.
> If CPL = 3, accesses are generally user-mode accesses. However, some
> operations implicitly access system data structures, and the resulting
> accesses to those data structures are supervisor-mode accesses regardless
> of CPL. Examples of such implicit supervisor accesses include the following:
> accesses to the global descriptor table (GDT) or local descriptor table
> (LDT) to load a segment descriptor; accesses to the interrupt descriptor
> table (IDT) when delivering an interrupt or exception; and accesses to the
> task-state segment (TSS) as part of a task switch or change of CPL."
>
> From the above SDM, we can see supervisor-mode access can also
> happen when CPL equals 3.
>
> If CPL < 3, SMAP protections are disabled if EFLAGS.AC = 1. If CPL = 3,
> SMAP applies to all supervisor-mode data accesses (these are implicit
> supervisor accesses) regardless of the value of EFLAGS.AC.
>
> So when we check the value of EFLAGS.AC, we also need to check CPL, since AC
> bit only takes effect when CPL<3.
>
> U==1 means user-mode access are allowed, while !uf means it is a fault
> from Supervisor-mode access, I think both *u* and *uf* cannot reflect the
> value of CPL.
>
> Correct me if I am wrong. Thanks a lot!

You're right!

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index f1e4895..63124a2 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -48,6 +48,14 @@  static inline bool guest_cpuid_has_smep(struct kvm_vcpu *vcpu)
 	return best && (best->ebx & bit(X86_FEATURE_SMEP));
 }
 
+static inline bool guest_cpuid_has_smap(struct kvm_vcpu *vcpu)
+{
+	struct kvm_cpuid_entry2 *best;
+
+	best = kvm_find_cpuid_entry(vcpu, 7, 0);
+	return best && (best->ebx & bit(X86_FEATURE_SMAP));
+}
+
 static inline bool guest_cpuid_has_fsgsbase(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best;
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 40772ef..33e656c 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3591,14 +3591,15 @@  static void reset_rsvds_bits_mask_ept(struct kvm_vcpu *vcpu,
 	}
 }
 
-static void update_permission_bitmask(struct kvm_vcpu *vcpu,
+void update_permission_bitmask(struct kvm_vcpu *vcpu,
 		struct kvm_mmu *mmu, bool ept)
 {
 	unsigned bit, byte, pfec;
 	u8 map;
-	bool fault, x, w, u, wf, uf, ff, smep;
+	bool fault, x, w, u, wf, uf, ff, smep, smap;
 
 	smep = kvm_read_cr4_bits(vcpu, X86_CR4_SMEP);
+	smap = kvm_read_cr4_bits(vcpu, X86_CR4_SMAP);
 	for (byte = 0; byte < ARRAY_SIZE(mmu->permissions); ++byte) {
 		pfec = byte << 1;
 		map = 0;
@@ -3617,11 +3618,26 @@  static void update_permission_bitmask(struct kvm_vcpu *vcpu,
 				w |= !is_write_protection(vcpu) && !uf;
 				/* Disallow supervisor fetches of user code if cr4.smep */
 				x &= !(smep && u && !uf);
+
+				/*
+				 * SMAP:kernel-mode data accesses from user-mode
+				 * mappings should fault. A fault is considered
+				 * as a SMAP violation if all of the following
+				 * conditions are ture:
+				 *   - X86_CR4_SMAP is set in CR4
+				 *   - An user page is accessed
+				 *   - !(CPL<3 && X86_EFLAGS_AC is set)
+				 *   - Page fault in kernel mode
+				 */
+				smap = smap && u && !uf &&
+					!((kvm_x86_ops->get_cpl(vcpu) < 3) &&
+					((kvm_x86_ops->get_rflags(vcpu) &
+					X86_EFLAGS_AC) == 1));
 			} else
 				/* Not really needed: no U/S accesses on ept  */
 				u = 1;
 
-			fault = (ff && !x) || (uf && !u) || (wf && !w);
+			fault = (ff && !x) || (uf && !u) || (wf && !w) || smap;
 			map |= fault << bit;
 		}
 		mmu->permissions[byte] = map;
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 2926152..8820f78 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -73,6 +73,8 @@  int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct);
 void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context);
 void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context,
 		bool execonly);
+void update_permission_bitmask(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
+		bool ept);
 
 static inline unsigned int kvm_mmu_available_pages(struct kvm *kvm)
 {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4e33b85..f8293fb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -630,6 +630,9 @@  int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 	if (!guest_cpuid_has_smep(vcpu) && (cr4 & X86_CR4_SMEP))
 		return 1;
 
+	if (!guest_cpuid_has_smap(vcpu) && (cr4 & X86_CR4_SMAP))
+		return 1;
+
 	if (!guest_cpuid_has_fsgsbase(vcpu) && (cr4 & X86_CR4_FSGSBASE))
 		return 1;
 
@@ -658,6 +661,9 @@  int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 	    (!(cr4 & X86_CR4_PCIDE) && (old_cr4 & X86_CR4_PCIDE)))
 		kvm_mmu_reset_context(vcpu);
 
+	if ((cr4 ^ old_cr4) & X86_CR4_SMAP)
+		update_permission_bitmask(vcpu, vcpu->arch.walk_mmu, false);
+
 	if ((cr4 ^ old_cr4) & X86_CR4_OSXSAVE)
 		kvm_update_cpuid(vcpu);