diff mbox series

[04/21] KVM: x86/mmu: Allow non-zero value for non-present SPTE and removed SPTE

Message ID 20240227232100.478238-5-pbonzini@redhat.com (mailing list archive)
State New, archived
Headers show
Series TDX/SNP part 1 of n, for 6.9 | expand

Commit Message

Paolo Bonzini Feb. 27, 2024, 11:20 p.m. UTC
From: Sean Christopherson <seanjc@google.com>

For TD guest, the current way to emulate MMIO doesn't work any more, as KVM
is not able to access the private memory of TD guest and do the emulation.
Instead, TD guest expects to receive #VE when it accesses the MMIO and then
it can explicitly make hypercall to KVM to get the expected information.

To achieve this, the TDX module always enables "EPT-violation #VE" in the
VMCS control.  And accordingly, for the MMIO spte for the shared GPA,
1. KVM needs to set "suppress #VE" bit for the non-present SPTE so that EPT
violation happens on TD accessing MMIO range.  2. On EPT violation, KVM
sets the MMIO spte to clear "suppress #VE" bit so the TD guest can receive
the #VE instead of EPT misconfiguration unlike VMX case.  For the shared GPA
that is not populated yet, EPT violation need to be triggered when TD guest
accesses such shared GPA.  The non-present SPTE value for shared GPA should
set "suppress #VE" bit.

Add "suppress #VE" bit (bit 63) to SHADOW_NONPRESENT_VALUE and
REMOVED_SPTE.  Unconditionally set the "suppress #VE" bit (which is bit 63)
for both AMD and Intel as: 1) AMD hardware doesn't use this bit when
present bit is off; 2) for normal VMX guest, KVM never enables the
"EPT-violation #VE" in VMCS control and "suppress #VE" bit is ignored by
hardware.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Message-Id: <a99cb866897c7083430dce7f24c63b17d7121134.1705965635.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/spte.h | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

Comments

Xu Yilun Feb. 29, 2024, 7 a.m. UTC | #1
On Tue, Feb 27, 2024 at 06:20:43PM -0500, Paolo Bonzini wrote:
> From: Sean Christopherson <seanjc@google.com>
> 
> For TD guest, the current way to emulate MMIO doesn't work any more, as KVM
> is not able to access the private memory of TD guest and do the emulation.
> Instead, TD guest expects to receive #VE when it accesses the MMIO and then
> it can explicitly make hypercall to KVM to get the expected information.
> 
> To achieve this, the TDX module always enables "EPT-violation #VE" in the
> VMCS control.  And accordingly, for the MMIO spte for the shared GPA,
> 1. KVM needs to set "suppress #VE" bit for the non-present SPTE so that EPT
> violation happens on TD accessing MMIO range.  2. On EPT violation, KVM
> sets the MMIO spte to clear "suppress #VE" bit so the TD guest can receive
> the #VE instead of EPT misconfiguration unlike VMX case.  For the shared GPA
> that is not populated yet, EPT violation need to be triggered when TD guest
> accesses such shared GPA.  The non-present SPTE value for shared GPA should
> set "suppress #VE" bit.
> 
> Add "suppress #VE" bit (bit 63) to SHADOW_NONPRESENT_VALUE and
> REMOVED_SPTE.  Unconditionally set the "suppress #VE" bit (which is bit 63)
> for both AMD and Intel as: 1) AMD hardware doesn't use this bit when
> present bit is off; 2) for normal VMX guest, KVM never enables the
> "EPT-violation #VE" in VMCS control and "suppress #VE" bit is ignored by
> hardware.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
> Message-Id: <a99cb866897c7083430dce7f24c63b17d7121134.1705965635.git.isaku.yamahata@intel.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/mmu/spte.h | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
> index 4d1799ba2bf8..26bc95bbc962 100644
> --- a/arch/x86/kvm/mmu/spte.h
> +++ b/arch/x86/kvm/mmu/spte.h
> @@ -149,7 +149,20 @@ static_assert(MMIO_SPTE_GEN_LOW_BITS == 8 && MMIO_SPTE_GEN_HIGH_BITS == 11);
>  
>  #define MMIO_SPTE_GEN_MASK		GENMASK_ULL(MMIO_SPTE_GEN_LOW_BITS + MMIO_SPTE_GEN_HIGH_BITS - 1, 0)
>  
> +/*
> + * Non-present SPTE value for both VMX and SVM for TDP MMU.
> + * For SVM NPT, for non-present spte (bit 0 = 0), other bits are ignored.
> + * For VMX EPT, bit 63 is ignored if #VE is disabled. (EPT_VIOLATION_VE=0)
> + *              bit 63 is #VE suppress if #VE is enabled. (EPT_VIOLATION_VE=1)
> + * For TDX:
> + *   TDX module sets EPT_VIOLATION_VE for Secure-EPT and conventional EPT
> + */
> +#ifdef CONFIG_X86_64
> +#define SHADOW_NONPRESENT_VALUE	BIT_ULL(63)
> +static_assert(!(SHADOW_NONPRESENT_VALUE & SPTE_MMU_PRESENT_MASK));
> +#else
>  #define SHADOW_NONPRESENT_VALUE	0ULL
> +#endif
>  
>  extern u64 __read_mostly shadow_host_writable_mask;
>  extern u64 __read_mostly shadow_mmu_writable_mask;
> @@ -196,7 +209,7 @@ extern u64 __read_mostly shadow_nonpresent_or_rsvd_mask;

    * vulnerability.  Use only low bits to avoid 64-bit immediates.
                      ^

We may remove this comment. Others are fine.

Reviewed-by: Xu Yilun <yilun.xu@linux.intel.com>

>   *
>   * Only used by the TDP MMU.
>   */
> -#define REMOVED_SPTE	0x5a0ULL
> +#define REMOVED_SPTE	(SHADOW_NONPRESENT_VALUE | 0x5a0ULL)
>  
>  /* Removed SPTEs must not be misconstrued as shadow present PTEs. */
>  static_assert(!(REMOVED_SPTE & SPTE_MMU_PRESENT_MASK));
> -- 
> 2.39.0
> 
> 
>
Xiaoyao Li Feb. 29, 2024, 1:55 p.m. UTC | #2
On 2/28/2024 7:20 AM, Paolo Bonzini wrote:
> From: Sean Christopherson <seanjc@google.com>
> 
> For TD guest, the current way to emulate MMIO doesn't work any more, as KVM
> is not able to access the private memory of TD guest and do the emulation.
> Instead, TD guest expects to receive #VE when it accesses the MMIO and then
> it can explicitly make hypercall to KVM to get the expected information.
> 
> To achieve this, the TDX module always enables "EPT-violation #VE" in the
> VMCS control.  And accordingly, for the MMIO spte for the shared GPA,
> 1. KVM needs to set "suppress #VE" bit for the non-present SPTE so that EPT
> violation happens on TD accessing MMIO range.  2. On EPT violation, KVM
> sets the MMIO spte to clear "suppress #VE" bit so the TD guest can receive
> the #VE instead of EPT misconfiguration unlike VMX case.  For the shared GPA
> that is not populated yet, EPT violation need to be triggered when TD guest
> accesses such shared GPA.  The non-present SPTE value for shared GPA should
> set "suppress #VE" bit.
> 
> Add "suppress #VE" bit (bit 63) to SHADOW_NONPRESENT_VALUE and
> REMOVED_SPTE.  Unconditionally set the "suppress #VE" bit (which is bit 63)
> for both AMD and Intel as: 1) AMD hardware doesn't use this bit when
> present bit is off; 2) for normal VMX guest, KVM never enables the
> "EPT-violation #VE" in VMCS control and "suppress #VE" bit is ignored by
> hardware.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
> Message-Id: <a99cb866897c7083430dce7f24c63b17d7121134.1705965635.git.isaku.yamahata@intel.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

+ 1 to the nit pointed by Yilun,

after that,

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Huang, Kai March 11, 2024, 11:26 p.m. UTC | #3
>   
> +/*
> + * Non-present SPTE value for both VMX and SVM for TDP MMU.

In the previous patch, SHADOW_NONPRESENT_VALUE is also used in the 
shadow MMU code.  So here when you change SHADOW_NONPRESENT_VALUE to a 
non-zero value, the "for TDP MMU" part doesn't stand.

I am wondering whether we can just avoid using SHADOW_NONPRESENT_VALUE 
in shadow MMU code in the previous patch, and state explicitly that we 
are only going to support TDP MMU for non-zero value for non-present SPTE?

> + * For SVM NPT, for non-present spte (bit 0 = 0), other bits are ignored.
> + * For VMX EPT, bit 63 is ignored if #VE is disabled. (EPT_VIOLATION_VE=0)
> + *              bit 63 is #VE suppress if #VE is enabled. (EPT_VIOLATION_VE=1)
> + * For TDX:
> + *   TDX module sets EPT_VIOLATION_VE for Secure-EPT and conventional EPT
> + */
> +#ifdef CONFIG_X86_64
> +#define SHADOW_NONPRESENT_VALUE	BIT_ULL(63)
> +static_assert(!(SHADOW_NONPRESENT_VALUE & SPTE_MMU_PRESENT_MASK));
> +#else
>   #define SHADOW_NONPRESENT_VALUE	0ULL
> +#endif
>   
>   extern u64 __read_mostly shadow_host_writable_mask;
>   extern u64 __read_mostly shadow_mmu_writable_mask;
> @@ -196,7 +209,7 @@ extern u64 __read_mostly shadow_nonpresent_or_rsvd_mask;
>    *
>    * Only used by the TDP MMU.
>    */
> -#define REMOVED_SPTE	0x5a0ULL
> +#define REMOVED_SPTE	(SHADOW_NONPRESENT_VALUE | 0x5a0ULL)

I kinda prefer moving this chunk to the previous patch, because the 
reason to have SHADOW_NONPRESENT_VALUE is to have a non-zero value for 
non-present SPTEs, which include the REMOVED_SPTE.

But just my 2cents.
diff mbox series

Patch

diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 4d1799ba2bf8..26bc95bbc962 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -149,7 +149,20 @@  static_assert(MMIO_SPTE_GEN_LOW_BITS == 8 && MMIO_SPTE_GEN_HIGH_BITS == 11);
 
 #define MMIO_SPTE_GEN_MASK		GENMASK_ULL(MMIO_SPTE_GEN_LOW_BITS + MMIO_SPTE_GEN_HIGH_BITS - 1, 0)
 
+/*
+ * Non-present SPTE value for both VMX and SVM for TDP MMU.
+ * For SVM NPT, for non-present spte (bit 0 = 0), other bits are ignored.
+ * For VMX EPT, bit 63 is ignored if #VE is disabled. (EPT_VIOLATION_VE=0)
+ *              bit 63 is #VE suppress if #VE is enabled. (EPT_VIOLATION_VE=1)
+ * For TDX:
+ *   TDX module sets EPT_VIOLATION_VE for Secure-EPT and conventional EPT
+ */
+#ifdef CONFIG_X86_64
+#define SHADOW_NONPRESENT_VALUE	BIT_ULL(63)
+static_assert(!(SHADOW_NONPRESENT_VALUE & SPTE_MMU_PRESENT_MASK));
+#else
 #define SHADOW_NONPRESENT_VALUE	0ULL
+#endif
 
 extern u64 __read_mostly shadow_host_writable_mask;
 extern u64 __read_mostly shadow_mmu_writable_mask;
@@ -196,7 +209,7 @@  extern u64 __read_mostly shadow_nonpresent_or_rsvd_mask;
  *
  * Only used by the TDP MMU.
  */
-#define REMOVED_SPTE	0x5a0ULL
+#define REMOVED_SPTE	(SHADOW_NONPRESENT_VALUE | 0x5a0ULL)
 
 /* Removed SPTEs must not be misconstrued as shadow present PTEs. */
 static_assert(!(REMOVED_SPTE & SPTE_MMU_PRESENT_MASK));