
[RFC,v2,4/6] x86/virt/tdx: Add SEAMCALL wrappers for TDX page cache management

Message ID 20241203010317.827803-5-rick.p.edgecombe@intel.com (mailing list archive)
State New
Series SEAMCALL Wrappers

Commit Message

Edgecombe, Rick P Dec. 3, 2024, 1:03 a.m. UTC
Intel TDX protects guest VMs from a malicious host and certain physical
attacks. The TDX module uses pages provided by the host for both control
structures and TD guest pages. These pages are encrypted using the
MK-TME encryption engine, which has special requirements around cache
invalidation. For its own security, the TDX module ensures pages are
flushed properly and tracks which usage they are currently assigned to.
To create and tear down TD VMs and vCPUs, KVM will need to use the
TDH.PHYMEM.PAGE.RECLAIM, TDH.PHYMEM.CACHE.WB, and TDH.PHYMEM.PAGE.WBINVD
SEAMCALLs.

Add tdh_phymem_page_reclaim() to enable KVM to call
TDH.PHYMEM.PAGE.RECLAIM to reclaim the page for use by the host kernel.
This effectively resets its state in the TDX module's page tracking
(PAMT), if the page is available to be reclaimed. This will be used by KVM
to reclaim the various types of pages owned by the TDX module. KVM will
wrap it in a small helper that retries on the relevant error codes. Don't
implement that retry helper in arch/x86, because KVM's handling of
SEAMCALL retries is better kept in a single place.
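
For illustration only (not part of this patch), the KVM-side retry helper
described above could look roughly like the userspace sketch below. The
status codes DEMO_TDX_SUCCESS and DEMO_TDX_BUSY, the retry bound, and the
demo_* functions are all hypothetical stand-ins; only the prototype of
tdh_phymem_page_reclaim() is taken from the patch.

```c
#include <stdint.h>
#include <stdio.h>

typedef uint64_t u64;

/* Hypothetical status codes; real values come from the TDX module spec. */
#define DEMO_TDX_SUCCESS	0ULL
#define DEMO_TDX_BUSY		1ULL
#define DEMO_MAX_RETRIES	16

struct page;	/* opaque here; the stub never dereferences it */

/* Number of times the stub reports "busy" before it succeeds. */
static int demo_busy_count = 2;

/* Stub standing in for tdh_phymem_page_reclaim(); same prototype as the patch. */
static u64 demo_tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt,
					u64 *tdx_owner, u64 *tdx_size)
{
	(void)page;
	*tdx_pt = *tdx_owner = *tdx_size = 0;

	if (demo_busy_count > 0) {
		demo_busy_count--;
		return DEMO_TDX_BUSY;
	}
	return DEMO_TDX_SUCCESS;
}

/* Retry while the stub reports "busy"; log the raw outputs if it still fails. */
static u64 demo_reclaim_page(struct page *page)
{
	u64 tdx_pt, tdx_owner, tdx_size, err;
	int i;

	for (i = 0; i < DEMO_MAX_RETRIES; i++) {
		err = demo_tdh_phymem_page_reclaim(page, &tdx_pt, &tdx_owner,
						   &tdx_size);
		if (err != DEMO_TDX_BUSY)
			break;
	}

	/* The out values are TDX-defined formats, so only print them raw. */
	if (err != DEMO_TDX_SUCCESS)
		fprintf(stderr, "reclaim failed: err=0x%llx pt=0x%llx owner=0x%llx size=0x%llx\n",
			(unsigned long long)err, (unsigned long long)tdx_pt,
			(unsigned long long)tdx_owner, (unsigned long long)tdx_size);

	return err;
}
```

The out arguments are used only for the error message, matching the intent
that they are returned for error reporting purposes.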

Add tdh_phymem_cache_wb() to enable KVM to call TDH.PHYMEM.CACHE.WB to do
a cache write back in a way that the TDX module can verify, before it
allows a KeyID to be freed. KVM will wrap this in a small helper that
handles retries. Since the TDH.PHYMEM.CACHE.WB operation is
interruptible, have tdh_phymem_cache_wb() take a resume argument to pass
this info to the TDX module for restarts. It is worth noting that this
SEAMCALL uses a SEAM-specific MSR to do the write back in sections. In
this way it does export some new functionality that affects CPU state.
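
As a rough sketch of how a caller might drive the resume argument (again
not part of this patch), the loop below passes resume = false on the first
call and resume = true on every restart. The status code
DEMO_TDX_INTERRUPTED_RESUMABLE and the demo_* functions are assumptions
made for this example; the stub only simulates an interruptible operation.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t u64;

/* Hypothetical status codes; real values come from the TDX module spec. */
#define DEMO_TDX_SUCCESS		0ULL
#define DEMO_TDX_INTERRUPTED_RESUMABLE	2ULL

/* Simulated state: sections still to write back, and restarts observed. */
static int demo_chunks_left = 3;
static int demo_resume_calls;

/* Stub standing in for tdh_phymem_cache_wb(); one "section" per call. */
static u64 demo_tdh_phymem_cache_wb(bool resume)
{
	if (resume)
		demo_resume_calls++;

	if (--demo_chunks_left > 0)
		return DEMO_TDX_INTERRUPTED_RESUMABLE;

	return DEMO_TDX_SUCCESS;
}

/* Call until the write back completes, resuming after each interruption. */
static u64 demo_cache_wb_all(void)
{
	bool resume = false;
	u64 err;

	do {
		err = demo_tdh_phymem_cache_wb(resume);
		resume = true;	/* any further call is a restart */
	} while (err == DEMO_TDX_INTERRUPTED_RESUMABLE);

	return err;
}
```

Any status other than the resumable one ends the loop, so real errors are
returned to the caller instead of being retried.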

Add tdh_phymem_page_wbinvd_tdr() to enable KVM to call
TDH.PHYMEM.PAGE.WBINVD to do a cache write back and invalidate of a TDR,
using the global KeyID. The underlying TDH.PHYMEM.PAGE.WBINVD SEAMCALL
requires the related KeyID to be encoded into the SEAMCALL args. Since the
global KeyID is not exposed to KVM, a dedicated wrapper is needed for
TDR-focused TDH.PHYMEM.PAGE.WBINVD operations.
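
To make the KeyID encoding concrete, here is a small standalone sketch of
packing a KeyID into the upper bits of a physical address, in the same
shape as the expression used by tdh_phymem_page_wbinvd_tdr(). The helper
name demo_set_hkid_to_hpa() and the explicit phys_bits parameter are
assumptions for this example; the kernel code reads the shift from
boot_cpu_data.x86_phys_bits instead.

```c
#include <stdint.h>

typedef uint64_t u64;
typedef uint16_t u16;

/*
 * Place the KeyID directly above the physical address bits.
 *
 * phys_bits is assumed to be the number of address bits below the KeyID
 * field. Because hkid is a u16, the shifted value stays within 64 bits
 * as long as phys_bits <= 48.
 */
static u64 demo_set_hkid_to_hpa(u64 pa, u16 hkid, unsigned int phys_bits)
{
	return pa | ((u64)hkid << phys_bits);
}
```

For example, with an illustrative phys_bits of 46, a page at 0x1000 tagged
with KeyID 3 gives 0xC00000001000.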

Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
---
SEAMCALL RFC v2:
 - Use struct page (Dave)
 - Rename out args for tdh_phymem_page_reclaim() to make it clear these
   are TDX specific values, and not to be interpreted normally. Also,
   add a comment about this.

SEAMCALL RFC:
 - Use struct tdx_td
 - Use arg names with meaning for tdh_phymem_page_reclaim() for out args

uAPI breakout v2:
 - Change to use 'u64' as function parameter to prepare to move
   SEAMCALL wrappers to arch/x86. (Kai)
 - Split to separate patch
 - Move SEAMCALL wrappers from KVM to x86 core
 - Move TDH_xx macros from KVM to x86 core
 - Re-write log

uAPI breakout v1:
 - Make argument to C wrapper function struct kvm_tdx * or
   struct vcpu_tdx *. (Sean)
 - Drop unused helpers (Kai)
 - Fix bisectability issues in headers (Kai)
 - Updates from seamcall overhaul (Kai)

v19:
 - Update the commit message to match the patch (Yuan)
 - Use seamcall() and seamcall_ret() (Paolo)

v18:
 - Remove stub functions for __seamcall{,_ret}()
 - Add Reviewed-by from Binbin
 - Make tdx_seamcall() use struct tdx_module_args instead of taking
   each input individually.
---
 arch/x86/include/asm/tdx.h  |  3 +++
 arch/x86/virt/vmx/tdx/tdx.c | 43 +++++++++++++++++++++++++++++++++++++
 arch/x86/virt/vmx/tdx/tdx.h |  3 +++
 3 files changed, 49 insertions(+)

Comments

Binbin Wu Dec. 3, 2024, 2:33 a.m. UTC | #1
On 12/3/2024 9:03 AM, Rick Edgecombe wrote:
[...]
> +
> +/*
> + * TDX ABI defines output operands as PT, OWNER and SIZE. These are TDX defined fomats.
fomats -> formats

> + * So despite the names, they must be interpted specially as described by the spec. Return
interpted -> interpreted

> + * them only for error reporting purposes.
> + */
> +u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size)
> +{
> +	struct tdx_module_args args = {
> +		.rcx = page_to_pfn(page) << PAGE_SHIFT,
> +	};
> +	u64 ret;
> +
> +	ret = seamcall_ret(TDH_PHYMEM_PAGE_RECLAIM, &args);
> +
> +	*tdx_pt = args.rcx;
> +	*tdx_owner = args.rdx;
> +	*tdx_size = args.r8;
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(tdh_phymem_page_reclaim);
> +
[...]
Edgecombe, Rick P Dec. 4, 2024, 1:58 a.m. UTC | #2
On Tue, 2024-12-03 at 10:33 +0800, Binbin Wu wrote:
> > + * TDX ABI defines output operands as PT, OWNER and SIZE. These are TDX
> > defined fomats.
> fomats -> formats
> 
> > + * So despite the names, they must be interpted specially as described by
> > the spec. Return
> interpted -> interpreted

Oof, thanks.
Yan Zhao Dec. 11, 2024, 1:23 a.m. UTC | #3
On Mon, Dec 02, 2024 at 05:03:14PM -0800, Rick Edgecombe wrote:
...
> +u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td)
> +{
> +	struct tdx_module_args args = {};
> +
> +	args.rcx = tdx_tdr_pa(td) | ((u64)tdx_global_keyid << boot_cpu_data.x86_phys_bits);
> +
> +	return seamcall(TDH_PHYMEM_PAGE_WBINVD, &args);
> +}
> +EXPORT_SYMBOL_GPL(tdh_phymem_page_wbinvd_tdr);
The tdx_global_keyid is of type u16 in TDX spec and TDX module.
As Reinette pointed out, u64 could cause overflow.

Do we need to change all keyids to u16, including those in
tdh.mng.create() in patch 2,
the global_keyid, tdx_guest_keyid_start in arch/x86/virt/vmx/tdx/tdx.c
and kvm_tdx->hkid in arch/x86/kvm/vmx/tdx.c ?

BTW, is it a good idea to move set_hkid_to_hpa() from KVM TDX to x86 common
header?

static __always_inline hpa_t set_hkid_to_hpa(hpa_t pa, u16 hkid)
{
        return pa | ((hpa_t)hkid << boot_cpu_data.x86_phys_bits);
}
Edgecombe, Rick P Dec. 11, 2024, 1:33 a.m. UTC | #4
On Wed, 2024-12-11 at 09:23 +0800, Yan Zhao wrote:
> On Mon, Dec 02, 2024 at 05:03:14PM -0800, Rick Edgecombe wrote:
> ...
> > +u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td)
> > +{
> > +	struct tdx_module_args args = {};
> > +
> > +	args.rcx = tdx_tdr_pa(td) | ((u64)tdx_global_keyid << boot_cpu_data.x86_phys_bits);
> > +
> > +	return seamcall(TDH_PHYMEM_PAGE_WBINVD, &args);
> > +}
> > +EXPORT_SYMBOL_GPL(tdh_phymem_page_wbinvd_tdr);
> The tdx_global_keyid is of type u16 in TDX spec and TDX module.
> As Reinette pointed out, u64 could cause overflow.
> 
> Do we need to change all keyids to u16, including those in
> tdh.mng.create() in patch 2,
> the global_keyid, tdx_guest_keyid_start in arch/x86/virt/vmx/tdx/tdx.c
> and kvm_tdx->hkid in arch/x86/kvm/vmx/tdx.c ?

It seems like a good idea.

> 
> BTW, is it a good idea to move set_hkid_to_hpa() from KVM TDX to x86 common
> header?
> 
> static __always_inline hpa_t set_hkid_to_hpa(hpa_t pa, u16 hkid)
> {
>         return pa | ((hpa_t)hkid << boot_cpu_data.x86_phys_bits);
> }

Ah, yep.

Patch

diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 018bbabf8639..f6ab8a9ea46b 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -151,6 +151,9 @@  u64 tdh_mng_key_freeid(struct tdx_td *td);
 u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err);
 u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx);
 u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid);
+u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size);
+u64 tdh_phymem_cache_wb(bool resume);
+u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td);
 #else
 static inline void tdx_init(void) { }
 static inline int tdx_cpu_enable(void) { return -ENODEV; }
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index d71656868fe4..b2a1ed13d0da 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1692,3 +1692,46 @@  u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid)
 	return seamcall(TDH_VP_INIT | (1ULL << TDX_VERSION_SHIFT), &args);
 }
 EXPORT_SYMBOL_GPL(tdh_vp_init_apicid);
+
+/*
+ * TDX ABI defines output operands as PT, OWNER and SIZE. These are TDX defined formats.
+ * So despite the names, they must be interpreted specially as described by the spec. Return
+ * them only for error reporting purposes.
+ */
+u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size)
+{
+	struct tdx_module_args args = {
+		.rcx = page_to_pfn(page) << PAGE_SHIFT,
+	};
+	u64 ret;
+
+	ret = seamcall_ret(TDH_PHYMEM_PAGE_RECLAIM, &args);
+
+	*tdx_pt = args.rcx;
+	*tdx_owner = args.rdx;
+	*tdx_size = args.r8;
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_phymem_page_reclaim);
+
+u64 tdh_phymem_cache_wb(bool resume)
+{
+	struct tdx_module_args args = {
+		.rcx = resume ? 1 : 0,
+	};
+
+	return seamcall(TDH_PHYMEM_CACHE_WB, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_phymem_cache_wb);
+
+u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td)
+{
+	struct tdx_module_args args = {};
+
+	args.rcx = tdx_tdr_pa(td) | ((u64)tdx_global_keyid << boot_cpu_data.x86_phys_bits);
+
+	return seamcall(TDH_PHYMEM_PAGE_WBINVD, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_phymem_page_wbinvd_tdr);
+
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 3663971a3669..191bdd1e571d 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -26,11 +26,14 @@ 
 #define TDH_MNG_INIT			21
 #define TDH_VP_INIT			22
 #define TDH_PHYMEM_PAGE_RDMD		24
+#define TDH_PHYMEM_PAGE_RECLAIM		28
 #define TDH_SYS_KEY_CONFIG		31
 #define TDH_SYS_INIT			33
 #define TDH_SYS_RD			34
 #define TDH_SYS_LP_INIT			35
 #define TDH_SYS_TDMR_INIT		36
+#define TDH_PHYMEM_CACHE_WB		40
+#define TDH_PHYMEM_PAGE_WBINVD		41
 #define TDH_SYS_CONFIG			45
 
 /*