diff mbox series

[v2] KVM: SVM: Fix detection of AMD Errata 1096

Message ID 20190716235658.18185-1-liran.alon@oracle.com (mailing list archive)
State New, archived
Headers show
Series [v2] KVM: SVM: Fix detection of AMD Errata 1096 | expand

Commit Message

Liran Alon July 16, 2019, 11:56 p.m. UTC
AMD Errata 1096:
When CPU raise #NPF on guest data access and vCPU CR4.SMAP=1, it is
possible that CPU microcode implementing DecodeAssist will fail
to read bytes of instruction which caused #NPF. In this case,
GuestIntrBytes field of the VMCB on a VMEXIT will incorrectly
return 0 instead of the correct guest instruction bytes.
This happens because CPU microcode reading instruction bytes
uses a special opcode which attempts to read data using CPL=0
priviledges. The microcode reads CS:RIP and if it hits a SMAP
fault, it gives up and returns no instruction bytes.

Current KVM code which attemps to detect and workaround this errata have
multiple issues:

1) Code mistakenly checks if vCPU CR4.SMAP=0 instead of vCPU CR4.SMAP=1 which
is required for encountering a SMAP fault.

2) Code assumes SMAP fault can only occur when vCPU CPL==3.
However, the condition for a SMAP fault is a data access with CPL<3
to a page mapped in page-tables as user-accessible (i.e. PTE with U/S
bit set to 1).
Therefore, in case vCPU CR4.SMEP=0, guest can execute an instruction
which reside in a user-accessible page with CPL<3 priviledge. If this
instruction raise a #NPF on it's data access, then CPU DecodeAssist
microcode will still encounter a SMAP violation.
Even though no sane OS will do so (as it's an obvious priviledge
escalation vulnerability), we still need to handle this semanticly
correct in KVM side.

As CR4.SMAP=1 is an easy triggerable condition, attempt to avoid
false-positive of detecting errata by taking note that in case vCPU
CR4.SMEP=1, errata could only be encountered in case CPL==3 (As
otherwise, CPU would raise SMEP fault to guest instead of #NPF).
This can be a useful condition to avoid false-positive because guests
usually enable SMAP if they have also enabled SMEP.

In addition, to avoid future confusion and improve code readbility,
comment errata details in code and not just in commit message.

Fixes: 05d5a4863525 ("KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)")
Cc: Singh Brijesh <brijesh.singh@amd.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
---
 arch/x86/kvm/svm.c | 42 +++++++++++++++++++++++++++++++++++-------
 1 file changed, 35 insertions(+), 7 deletions(-)

Comments

Brijesh Singh July 17, 2019, 8:44 p.m. UTC | #1
On 7/16/19 6:56 PM, Liran Alon wrote:
> AMD Errata 1096:
> When CPU raise #NPF on guest data access and vCPU CR4.SMAP=1, it is
> possible that CPU microcode implementing DecodeAssist will fail
> to read bytes of instruction which caused #NPF. In this case,
> GuestIntrBytes field of the VMCB on a VMEXIT will incorrectly
> return 0 instead of the correct guest instruction bytes.
> This happens because CPU microcode reading instruction bytes
> uses a special opcode which attempts to read data using CPL=0
> priviledges. The microcode reads CS:RIP and if it hits a SMAP
> fault, it gives up and returns no instruction bytes.
> 
> Current KVM code which attemps to detect and workaround this errata have
> multiple issues:
> 
> 1) Code mistakenly checks if vCPU CR4.SMAP=0 instead of vCPU CR4.SMAP=1 which
> is required for encountering a SMAP fault.
> 
> 2) Code assumes SMAP fault can only occur when vCPU CPL==3.
> However, the condition for a SMAP fault is a data access with CPL<3
> to a page mapped in page-tables as user-accessible (i.e. PTE with U/S
> bit set to 1).
> Therefore, in case vCPU CR4.SMEP=0, guest can execute an instruction
> which reside in a user-accessible page with CPL<3 priviledge. If this
> instruction raise a #NPF on it's data access, then CPU DecodeAssist
> microcode will still encounter a SMAP violation.
> Even though no sane OS will do so (as it's an obvious priviledge
> escalation vulnerability), we still need to handle this semanticly
> correct in KVM side.
> 
> As CR4.SMAP=1 is an easy triggerable condition, attempt to avoid
> false-positive of detecting errata by taking note that in case vCPU
> CR4.SMEP=1, errata could only be encountered in case CPL==3 (As
> otherwise, CPU would raise SMEP fault to guest instead of #NPF).
> This can be a useful condition to avoid false-positive because guests
> usually enable SMAP if they have also enabled SMEP.
> 
> In addition, to avoid future confusion and improve code readbility,
> comment errata details in code and not just in commit message.
> 
> Fixes: 05d5a4863525 ("KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)")
> Cc: Singh Brijesh <brijesh.singh@amd.com>
> Cc: Sean Christopherson <sean.j.christopherson@intel.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Signed-off-by: Liran Alon <liran.alon@oracle.com>
> ---
>   arch/x86/kvm/svm.c | 42 +++++++++++++++++++++++++++++++++++-------
>   1 file changed, 35 insertions(+), 7 deletions(-)
> 

Reviewed-by: Brijesh Singh <brijesh.singh@amd.com>

thanks
Paolo Bonzini July 18, 2019, 10:49 a.m. UTC | #2
Brijesh,

just one thing I'd like to double check: does the erratum really depend 
on guest CR4.SMAP, or rather on host CR4.SMAP?  (Of course the SMEP 
check instead needs to use guest CR4.SMEP).

And here is a commit message with style fixes:

---
When CPU raise #NPF on guest data access and guest CR4.SMAP=1, it is
possible that CPU microcode implementing DecodeAssist will fail
to read bytes of instruction which caused #NPF. This is AMD errata
1096 and it happens because CPU microcode reading instruction bytes
incorrectly attempts to read code as implicit supervisor-mode data
accesses (that is, just like it would read e.g. a TSS), which are
susceptible to SMAP faults. The microcode reads CS:RIP and if it is
a user-mode address according to the page tables, the processor
gives up and returns no instruction bytes.  In this case,
GuestIntrBytes field of the VMCB on a VMEXIT will incorrectly
return 0 instead of the correct guest instruction bytes.

Current KVM code attemps to detect and workaround this errata, but it
has multiple issues:

1) It mistakenly checks if guest CR4.SMAP=0 instead of guest CR4.SMAP=1,
which is required for encountering a SMAP fault.

2) It assumes SMAP faults can only occur when vCPU CPL==3.
However, in case vCPU CR4.SMEP=0, the guest can execute an instruction
which reside in a user-accessible page with CPL<3 priviledge. If this
instruction raise a #NPF on it's data access, then CPU DecodeAssist
microcode will still encounter a SMAP violation.  Even though no sane
OS will do so (as it's an obvious priviledge escalation vulnerability),
we still need to handle this semanticly correct in KVM side.

Note that (2) *is* a useful optimization, because CR4.SMAP=1 is an easy
triggerable condition and guests usually enable SMAP together with SMEP,
If the vCPU has CR4.SMEP=1, the errata could indeed be encountered only
at guest CPL3; otherwise, the CPU would raise a SMEP fault to guest
instead of #NPF.  We keep this condition to avoid false positives in
the detection of the errata.

In addition, to avoid future confusion and improve code readbility,
include details of the errata in code and not just in commit message.
---

Paolo
Paolo Bonzini July 18, 2019, 10:50 a.m. UTC | #3
On 17/07/19 01:56, Liran Alon wrote:
> AMD Errata 1096:
> When CPU raise #NPF on guest data access and vCPU CR4.SMAP=1, it is
> possible that CPU microcode implementing DecodeAssist will fail
> to read bytes of instruction which caused #NPF. In this case,
> GuestIntrBytes field of the VMCB on a VMEXIT will incorrectly
> return 0 instead of the correct guest instruction bytes.
> This happens because CPU microcode reading instruction bytes
> uses a special opcode which attempts to read data using CPL=0
> priviledges. The microcode reads CS:RIP and if it hits a SMAP
> fault, it gives up and returns no instruction bytes.
> 
> Current KVM code which attemps to detect and workaround this errata have
> multiple issues:
> 
> 1) Code mistakenly checks if vCPU CR4.SMAP=0 instead of vCPU CR4.SMAP=1 which
> is required for encountering a SMAP fault.
> 
> 2) Code assumes SMAP fault can only occur when vCPU CPL==3.
> However, the condition for a SMAP fault is a data access with CPL<3
> to a page mapped in page-tables as user-accessible (i.e. PTE with U/S
> bit set to 1).
> Therefore, in case vCPU CR4.SMEP=0, guest can execute an instruction
> which reside in a user-accessible page with CPL<3 priviledge. If this
> instruction raise a #NPF on it's data access, then CPU DecodeAssist
> microcode will still encounter a SMAP violation.
> Even though no sane OS will do so (as it's an obvious priviledge
> escalation vulnerability), we still need to handle this semanticly
> correct in KVM side.
> 
> As CR4.SMAP=1 is an easy triggerable condition, attempt to avoid
> false-positive of detecting errata by taking note that in case vCPU
> CR4.SMEP=1, errata could only be encountered in case CPL==3 (As
> otherwise, CPU would raise SMEP fault to guest instead of #NPF).
> This can be a useful condition to avoid false-positive because guests
> usually enable SMAP if they have also enabled SMEP.
> 
> In addition, to avoid future confusion and improve code readbility,
> comment errata details in code and not just in commit message.
> 
> Fixes: 05d5a4863525 ("KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)")
> Cc: Singh Brijesh <brijesh.singh@amd.com>
> Cc: Sean Christopherson <sean.j.christopherson@intel.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Signed-off-by: Liran Alon <liran.alon@oracle.com>
> ---
>  arch/x86/kvm/svm.c | 42 +++++++++++++++++++++++++++++++++++-------
>  1 file changed, 35 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 735b8c01895e..7d6410539dd7 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -7120,13 +7120,41 @@ static int nested_enable_evmcs(struct kvm_vcpu *vcpu,
>  
>  static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
>  {
> -	bool is_user, smap;
> -
> -	is_user = svm_get_cpl(vcpu) == 3;
> -	smap = !kvm_read_cr4_bits(vcpu, X86_CR4_SMAP);
> +	unsigned long cr4 = kvm_read_cr4(vcpu);
> +	bool smep = cr4 & X86_CR4_SMEP;
> +	bool smap = cr4 & X86_CR4_SMAP;
> +	bool is_user = svm_get_cpl(vcpu) == 3;
>  
>  	/*
> -	 * Detect and workaround Errata 1096 Fam_17h_00_0Fh
> +	 * Detect and workaround Errata 1096 Fam_17h_00_0Fh.
> +	 *
> +	 * Errata:
> +	 * When CPU raise #NPF on guest data access and vCPU CR4.SMAP=1, it is
> +	 * possible that CPU microcode implementing DecodeAssist will fail
> +	 * to read bytes of instruction which caused #NPF. In this case,
> +	 * GuestIntrBytes field of the VMCB on a VMEXIT will incorrectly
> +	 * return 0 instead of the correct guest instruction bytes.
> +	 *
> +	 * This happens because CPU microcode reading instruction bytes
> +	 * uses a special opcode which attempts to read data using CPL=0
> +	 * priviledges. The microcode reads CS:RIP and if it hits a SMAP
> +	 * fault, it gives up and returns no instruction bytes.
> +	 *
> +	 * Detection:
> +	 * We reach here in case CPU supports DecodeAssist, raised #NPF and
> +	 * returned 0 in GuestIntrBytes field of the VMCB.
> +	 * First, errata can only be triggered in case vCPU CR4.SMAP=1.
> +	 * Second, if vCPU CR4.SMEP=1, errata could only be triggered
> +	 * in case vCPU CPL==3 (Because otherwise guest would have triggered
> +	 * a SMEP fault instead of #NPF).
> +	 * Otherwise, vCPU CR4.SMEP=0, errata could be triggered by any vCPU CPL.
> +	 * As most guests enable SMAP if they have also enabled SMEP, use above
> +	 * logic in order to attempt minimize false-positive of detecting errata
> +	 * while still preserving all cases semantic correctness.
> +	 *
> +	 * Workaround:
> +	 * To determine what instruction the guest was executing, the hypervisor
> +	 * will have to decode the instruction at the instruction pointer.
>  	 *
>  	 * In non SEV guest, hypervisor will be able to read the guest
>  	 * memory to decode the instruction pointer when insn_len is zero
> @@ -7137,11 +7165,11 @@ static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
>  	 * instruction pointer so we will not able to workaround it. Lets
>  	 * print the error and request to kill the guest.
>  	 */
> -	if (is_user && smap) {
> +	if (smap && (!smep || is_user)) {
>  		if (!sev_guest(vcpu->kvm))
>  			return true;
>  
> -		pr_err_ratelimited("KVM: Guest triggered AMD Erratum 1096\n");
> +		pr_err_ratelimited("KVM: SEV Guest triggered AMD Erratum 1096\n");
>  		kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
>  	}
>  
> 

Queued, subject to Brijesh's answer on guest CR4.SMAP vs host CR4.SMAP.

Paolo
Brijesh Singh July 18, 2019, 3:25 p.m. UTC | #4
Hi Paolo,

On 7/18/19 5:49 AM, Paolo Bonzini wrote:
> Brijesh,
> 
> just one thing I'd like to double check: does the erratum really depend
> on guest CR4.SMAP, or rather on host CR4.SMAP?  (Of course the SMEP
> check instead needs to use guest CR4.SMEP).
> 

I double checked with design team to be sure, the erratum depends on
guest CR4.SMAP and not the host CR4.SMAP.

thanks

> And here is a commit message with style fixes:
> 
> ---
> When CPU raise #NPF on guest data access and guest CR4.SMAP=1, it is
> possible that CPU microcode implementing DecodeAssist will fail
> to read bytes of instruction which caused #NPF. This is AMD errata
> 1096 and it happens because CPU microcode reading instruction bytes
> incorrectly attempts to read code as implicit supervisor-mode data
> accesses (that is, just like it would read e.g. a TSS), which are
> susceptible to SMAP faults. The microcode reads CS:RIP and if it is
> a user-mode address according to the page tables, the processor
> gives up and returns no instruction bytes.  In this case,
> GuestIntrBytes field of the VMCB on a VMEXIT will incorrectly
> return 0 instead of the correct guest instruction bytes.
> 
> Current KVM code attemps to detect and workaround this errata, but it
> has multiple issues:
> 
> 1) It mistakenly checks if guest CR4.SMAP=0 instead of guest CR4.SMAP=1,
> which is required for encountering a SMAP fault.
> 
> 2) It assumes SMAP faults can only occur when vCPU CPL==3.
> However, in case vCPU CR4.SMEP=0, the guest can execute an instruction
> which reside in a user-accessible page with CPL<3 priviledge. If this
> instruction raise a #NPF on it's data access, then CPU DecodeAssist
> microcode will still encounter a SMAP violation.  Even though no sane
> OS will do so (as it's an obvious priviledge escalation vulnerability),
> we still need to handle this semanticly correct in KVM side.
> 
> Note that (2) *is* a useful optimization, because CR4.SMAP=1 is an easy
> triggerable condition and guests usually enable SMAP together with SMEP,
> If the vCPU has CR4.SMEP=1, the errata could indeed be encountered only
> at guest CPL3; otherwise, the CPU would raise a SMEP fault to guest
> instead of #NPF.  We keep this condition to avoid false positives in
> the detection of the errata.
> 
> In addition, to avoid future confusion and improve code readbility,
> include details of the errata in code and not just in commit message.
> ---
> 
> Paolo
>
diff mbox series

Patch

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 735b8c01895e..7d6410539dd7 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -7120,13 +7120,41 @@  static int nested_enable_evmcs(struct kvm_vcpu *vcpu,
 
 static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
 {
-	bool is_user, smap;
-
-	is_user = svm_get_cpl(vcpu) == 3;
-	smap = !kvm_read_cr4_bits(vcpu, X86_CR4_SMAP);
+	unsigned long cr4 = kvm_read_cr4(vcpu);
+	bool smep = cr4 & X86_CR4_SMEP;
+	bool smap = cr4 & X86_CR4_SMAP;
+	bool is_user = svm_get_cpl(vcpu) == 3;
 
 	/*
-	 * Detect and workaround Errata 1096 Fam_17h_00_0Fh
+	 * Detect and workaround Errata 1096 Fam_17h_00_0Fh.
+	 *
+	 * Errata:
+	 * When CPU raise #NPF on guest data access and vCPU CR4.SMAP=1, it is
+	 * possible that CPU microcode implementing DecodeAssist will fail
+	 * to read bytes of instruction which caused #NPF. In this case,
+	 * GuestIntrBytes field of the VMCB on a VMEXIT will incorrectly
+	 * return 0 instead of the correct guest instruction bytes.
+	 *
+	 * This happens because CPU microcode reading instruction bytes
+	 * uses a special opcode which attempts to read data using CPL=0
+	 * priviledges. The microcode reads CS:RIP and if it hits a SMAP
+	 * fault, it gives up and returns no instruction bytes.
+	 *
+	 * Detection:
+	 * We reach here in case CPU supports DecodeAssist, raised #NPF and
+	 * returned 0 in GuestIntrBytes field of the VMCB.
+	 * First, errata can only be triggered in case vCPU CR4.SMAP=1.
+	 * Second, if vCPU CR4.SMEP=1, errata could only be triggered
+	 * in case vCPU CPL==3 (Because otherwise guest would have triggered
+	 * a SMEP fault instead of #NPF).
+	 * Otherwise, vCPU CR4.SMEP=0, errata could be triggered by any vCPU CPL.
+	 * As most guests enable SMAP if they have also enabled SMEP, use above
+	 * logic in order to attempt minimize false-positive of detecting errata
+	 * while still preserving all cases semantic correctness.
+	 *
+	 * Workaround:
+	 * To determine what instruction the guest was executing, the hypervisor
+	 * will have to decode the instruction at the instruction pointer.
 	 *
 	 * In non SEV guest, hypervisor will be able to read the guest
 	 * memory to decode the instruction pointer when insn_len is zero
@@ -7137,11 +7165,11 @@  static bool svm_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
 	 * instruction pointer so we will not able to workaround it. Lets
 	 * print the error and request to kill the guest.
 	 */
-	if (is_user && smap) {
+	if (smap && (!smep || is_user)) {
 		if (!sev_guest(vcpu->kvm))
 			return true;
 
-		pr_err_ratelimited("KVM: Guest triggered AMD Erratum 1096\n");
+		pr_err_ratelimited("KVM: SEV Guest triggered AMD Erratum 1096\n");
 		kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
 	}