diff mbox series

[07/12] KVM: x86: SEV: Treat C-bit as legal GPA bit regardless of vCPU mode

Message ID 20210204000117.3303214-8-seanjc@google.com (mailing list archive)
State New, archived
Headers show
Series KVM: x86: Legal GPA fixes and cleanups | expand

Commit Message

Sean Christopherson Feb. 4, 2021, 12:01 a.m. UTC
Rename cr3_lm_rsvd_bits to reserved_gpa_bits, and use it for all GPA
legality checks.  AMD's APM states:

  If the C-bit is an address bit, this bit is masked from the guest
  physical address when it is translated through the nested page tables.

Thus, any access that can conceivably be run through NPT should ignore
the C-bit when checking for validity.

For features that KVM emulates in software, e.g. MTRRs, there is no
clear direction in the APM for how the C-bit should be handled.  For
such cases, follow the SME behavior inasmuch as possible, since SEV is
is essentially a VM-specific variant of SME.  For SME, the APM states:

  In this case the upper physical address bits are treated as reserved
  when the feature is enabled except where otherwise indicated.

Collecting the various relavant SME snippets in the APM and cross-
referencing the omissions with Linux kernel code, this leaves MTTRs and
APIC_BASE as the only flows that KVM emulates that should _not_ ignore
the C-bit.

Note, this means the reserved bit checks in the page tables are
technically broken.  This will be remedied in a future patch.

Although the page table checks are technically broken, in practice, it's
all but guaranteed to be irrelevant.  NPT is required for SEV, i.e.
shadowing page tables isn't needed in the common case.  Theoretically,
the checks could be in play for nested NPT, but it's extremely unlikely
that anyone is running nested VMs on SEV, as doing so would require L1
to expose sensitive data to L0, e.g. the entire VMCB.  And if anyone is
running nested VMs, L0 can't read the guest's encrypted memory, i.e. L1
would need to put its NPT in shared memory, in which case the C-bit will
never be set.  Or, L1 could use shadow paging, but again, if L0 needs to
read page tables, e.g. to load PDPTRs, the memory can't be encrypted if
L1 has any expectation of L0 doing the right thing.

Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h | 2 +-
 arch/x86/kvm/cpuid.c            | 2 +-
 arch/x86/kvm/cpuid.h            | 2 +-
 arch/x86/kvm/svm/nested.c       | 2 +-
 arch/x86/kvm/svm/svm.c          | 2 +-
 arch/x86/kvm/x86.c              | 7 +++----
 6 files changed, 8 insertions(+), 9 deletions(-)

Comments

Edgecombe, Rick P Feb. 4, 2021, 2:03 a.m. UTC | #1
On Wed, 2021-02-03 at 16:01 -0800, Sean Christopherson wrote:
>  
> -       unsigned long cr3_lm_rsvd_bits;
> +       u64 reserved_gpa_bits;

LAM defines bits above the GFN in CR3:
https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

KVM doesn't support this today of course, but it might be confusing to
try to combine the two concepts.
Sean Christopherson Feb. 4, 2021, 2:19 a.m. UTC | #2
On Thu, Feb 04, 2021, Edgecombe, Rick P wrote:
> On Wed, 2021-02-03 at 16:01 -0800, Sean Christopherson wrote:
> >  
> > -       unsigned long cr3_lm_rsvd_bits;
> > +       u64 reserved_gpa_bits;
> 
> LAM defines bits above the GFN in CR3:
> https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
> 
> KVM doesn't support this today of course, but it might be confusing to
> try to combine the two concepts.

Ah, took me a few minutes, but I see what you're saying.  LAM will introduce
bits that are repurposed for CR3, but not generic GPAs.  And, the behavior is
based on CPU support, so it'd make sense to have a mask cached in vcpu->arch
as opposed to constantly generating it on the fly.

Definitely agree that having a separate cr3_lm_rsvd_bits or whatever is the
right way to go when LAM comes along.  Not sure it's worth keeping a duplicate
field in the meantime, though it would avoid a small amount of thrash.

Paolo, any thoughts?
Paolo Bonzini Feb. 4, 2021, 10:34 a.m. UTC | #3
On 04/02/21 03:19, Sean Christopherson wrote:
> Ah, took me a few minutes, but I see what you're saying.  LAM will introduce
> bits that are repurposed for CR3, but not generic GPAs.  And, the behavior is
> based on CPU support, so it'd make sense to have a mask cached in vcpu->arch
> as opposed to constantly generating it on the fly.
> 
> Definitely agree that having a separate cr3_lm_rsvd_bits or whatever is the
> right way to go when LAM comes along.  Not sure it's worth keeping a duplicate
> field in the meantime, though it would avoid a small amount of thrash.

We don't even know if the cr3_lm_rsvd_bits would be a field in 
vcpu->arch, or rather computed on the fly.  So renaming the field in 
vcpu->arch seems like the simplest thing to do now.

Paolo
Edgecombe, Rick P Feb. 4, 2021, 5:31 p.m. UTC | #4
On Thu, 2021-02-04 at 11:34 +0100, Paolo Bonzini wrote:
> On 04/02/21 03:19, Sean Christopherson wrote:
> > Ah, took me a few minutes, but I see what you're saying.  LAM will
> > introduce
> > bits that are repurposed for CR3, but not generic GPAs.  And, the
> > behavior is
> > based on CPU support, so it'd make sense to have a mask cached in
> > vcpu->arch
> > as opposed to constantly generating it on the fly.
> > 
> > Definitely agree that having a separate cr3_lm_rsvd_bits or
> > whatever is the
> > right way to go when LAM comes along.  Not sure it's worth keeping
> > a duplicate
> > field in the meantime, though it would avoid a small amount of
> > thrash.
> 
> We don't even know if the cr3_lm_rsvd_bits would be a field in 
> vcpu->arch, or rather computed on the fly.  So renaming the field in 
> vcpu->arch seems like the simplest thing to do now.

Fair enough. But just to clarify, I meant that I thought the code would
be more confusing to use illegal gpa bit checks for checking cr3. It
seems they are only incidentally the same value. Alternatively there
could be something like a is_rsvd_cr3_bits() helper that just uses
reserved_gpa_bits for now. Probably put the comment in the wrong place.
It's a minor point in any case.
Sean Christopherson Feb. 4, 2021, 5:52 p.m. UTC | #5
On Thu, Feb 04, 2021, Edgecombe, Rick P wrote:
> On Thu, 2021-02-04 at 11:34 +0100, Paolo Bonzini wrote:
> > On 04/02/21 03:19, Sean Christopherson wrote:
> > > Ah, took me a few minutes, but I see what you're saying.  LAM will
> > > introduce
> > > bits that are repurposed for CR3, but not generic GPAs.  And, the
> > > behavior is
> > > based on CPU support, so it'd make sense to have a mask cached in
> > > vcpu->arch
> > > as opposed to constantly generating it on the fly.
> > > 
> > > Definitely agree that having a separate cr3_lm_rsvd_bits or
> > > whatever is the
> > > right way to go when LAM comes along.  Not sure it's worth keeping
> > > a duplicate
> > > field in the meantime, though it would avoid a small amount of
> > > thrash.
> > 
> > We don't even know if the cr3_lm_rsvd_bits would be a field in 
> > vcpu->arch, or rather computed on the fly.  So renaming the field in 
> > vcpu->arch seems like the simplest thing to do now.
> 
> Fair enough. But just to clarify, I meant that I thought the code would
> be more confusing to use illegal gpa bit checks for checking cr3. It
> seems they are only incidentally the same value.

Hmm, yeah, bits 63:52 are incidental.  Bits 52:M are not, though.  If/when we
need to special case CR3, I would like to take a similar approach to
__reset_rsvds_bits_mask(), where the high reserved bits start from
reserved_gpa_bits and mask off the bits that can't be encoded into a PxE.

> Alternatively there could be something like a is_rsvd_cr3_bits() helper that
> just uses reserved_gpa_bits for now. Probably put the comment in the wrong
> place.  It's a minor point in any case.

That thought crossed my mind, too.  Maybe kvm_vcpu_is_illegal_cr3() to match
the gpa helpers?
Paolo Bonzini Feb. 4, 2021, 5:56 p.m. UTC | #6
On 04/02/21 18:52, Sean Christopherson wrote:
>> Alternatively there could be something like a is_rsvd_cr3_bits() helper that
>> just uses reserved_gpa_bits for now. Probably put the comment in the wrong
>> place.  It's a minor point in any case.
> That thought crossed my mind, too.  Maybe kvm_vcpu_is_illegal_cr3() to match
> the gpa helpers?

Yes, that's certainly a good name but it doesn't have to be done now. 
Or at least, if you do it, somebody is guaranteed to send a patch after 
one month because the wrapper is useless. :)

Paolo
Sean Christopherson Feb. 4, 2021, 6:01 p.m. UTC | #7
On Thu, Feb 04, 2021, Paolo Bonzini wrote:
> On 04/02/21 18:52, Sean Christopherson wrote:
> > > Alternatively there could be something like a is_rsvd_cr3_bits() helper that
> > > just uses reserved_gpa_bits for now. Probably put the comment in the wrong
> > > place.  It's a minor point in any case.
> > That thought crossed my mind, too.  Maybe kvm_vcpu_is_illegal_cr3() to match
> > the gpa helpers?
> 
> Yes, that's certainly a good name but it doesn't have to be done now. Or at
> least, if you do it, somebody is guaranteed to send a patch after one month
> because the wrapper is useless. :)

LOL, I can see myself doing that...
diff mbox series

Patch

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 915f716e78e6..1653d49a66ff 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -654,7 +654,7 @@  struct kvm_vcpu_arch {
 	int cpuid_nent;
 	struct kvm_cpuid_entry2 *cpuid_entries;
 
-	unsigned long cr3_lm_rsvd_bits;
+	u64 reserved_gpa_bits;
 	int maxphyaddr;
 	int max_tdp_level;
 
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 944f518ca91b..7bd1331c1bbc 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -194,7 +194,7 @@  static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	vcpu->arch.cr4_guest_rsvd_bits =
 	    __cr4_reserved_bits(guest_cpuid_has, vcpu);
 
-	vcpu->arch.cr3_lm_rsvd_bits = rsvd_bits(cpuid_maxphyaddr(vcpu), 63);
+	vcpu->arch.reserved_gpa_bits = rsvd_bits(cpuid_maxphyaddr(vcpu), 63);
 
 	/* Invoke the vendor callback only after the above state is updated. */
 	static_call(kvm_x86_vcpu_after_set_cpuid)(vcpu);
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index a9d55ab51e3c..f673f45bdf52 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -38,7 +38,7 @@  static inline int cpuid_maxphyaddr(struct kvm_vcpu *vcpu)
 
 static inline bool kvm_vcpu_is_legal_gpa(struct kvm_vcpu *vcpu, gpa_t gpa)
 {
-	return !(gpa >> cpuid_maxphyaddr(vcpu));
+	return !(gpa & vcpu->arch.reserved_gpa_bits);
 }
 
 static inline bool kvm_vcpu_is_illegal_gpa(struct kvm_vcpu *vcpu, gpa_t gpa)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index ac662964cee5..add3cd4295e1 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -241,7 +241,7 @@  static bool nested_vmcb_check_cr3_cr4(struct vcpu_svm *svm,
 	 */
 	if ((save->efer & EFER_LME) && (save->cr0 & X86_CR0_PG)) {
 		if (!(save->cr4 & X86_CR4_PAE) || !(save->cr0 & X86_CR0_PE) ||
-		    (save->cr3 & vcpu->arch.cr3_lm_rsvd_bits))
+		    kvm_vcpu_is_illegal_gpa(vcpu, save->cr3))
 			return false;
 	}
 
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index f53e6377a933..50ad5a3bf880 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4079,7 +4079,7 @@  static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	if (sev_guest(vcpu->kvm)) {
 		best = kvm_find_cpuid_entry(vcpu, 0x8000001F, 0);
 		if (best)
-			vcpu->arch.cr3_lm_rsvd_bits &= ~(1UL << (best->ebx & 0x3f));
+			vcpu->arch.reserved_gpa_bits &= ~(1UL << (best->ebx & 0x3f));
 	}
 
 	if (!kvm_vcpu_apicv_active(vcpu))
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e6fbf2f574a6..1da7ed093650 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1082,8 +1082,7 @@  int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 		return 0;
 	}
 
-	if (is_long_mode(vcpu) &&
-	    (cr3 & vcpu->arch.cr3_lm_rsvd_bits))
+	if (is_long_mode(vcpu) && kvm_vcpu_is_illegal_gpa(vcpu, cr3))
 		return 1;
 	else if (is_pae_paging(vcpu) &&
 		 !load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3))
@@ -9712,7 +9711,7 @@  static bool kvm_is_valid_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
 		 */
 		if (!(sregs->cr4 & X86_CR4_PAE) || !(sregs->efer & EFER_LMA))
 			return false;
-		if (sregs->cr3 & vcpu->arch.cr3_lm_rsvd_bits)
+		if (kvm_vcpu_is_illegal_gpa(vcpu, sregs->cr3))
 			return false;
 	} else {
 		/*
@@ -10091,7 +10090,7 @@  int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 	fx_init(vcpu);
 
 	vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu);
-	vcpu->arch.cr3_lm_rsvd_bits = rsvd_bits(cpuid_maxphyaddr(vcpu), 63);
+	vcpu->arch.reserved_gpa_bits = rsvd_bits(cpuid_maxphyaddr(vcpu), 63);
 
 	vcpu->arch.pat = MSR_IA32_CR_PAT_DEFAULT;