[v15,09/23] x86/virt/tdx: Get module global metadata for module initialization

Message ID 30906e3cf94fe48d713de21a04ffd260bd1a7268.1699527082.git.kai.huang@intel.com (mailing list archive)
State New, archived
Series TDX host kernel support

Commit Message

Huang, Kai Nov. 9, 2023, 11:55 a.m. UTC
The TDX module global metadata provides system-wide information about
the module.  The TDX module provides SEAMCALLs to allow the kernel to
query one specific global metadata field (entry) or all fields.

TL;DR:

Use the TDH.SYS.RD SEAMCALL to read the essential global metadata for
module initialization and, at the same time, only initialize TDX modules
of version 1.5 and later.

Long Version:

1) Only initialize TDX module with version 1.5 and later

TDX module 1.0 has some compatibility issues with later versions of the
module, as documented in the "Intel TDX module ABI incompatibilities
between TDX1.0 and TDX1.5" spec.  There is no value in using TDX
module 1.0 when TDX module 1.5 and later versions are already available.
To keep things simple, only support initializing TDX module 1.5 and
later.

2) Get the essential global metadata for module initialization

TDX reports a list of "Convertible Memory Regions" (CMRs) to tell the
kernel which memory is TDX compatible.  The kernel needs to build a list
of memory regions (out of CMRs) as "TDX-usable" memory and pass them to
the TDX module.  The kernel does this by constructing a list of "TD
Memory Regions" (TDMRs) to cover all these memory regions and passing
them to the TDX module.

Each TDMR is a TDX architectural data structure containing the memory
region that the TDMR covers, plus the information to track (within this
TDMR): a) the "Physical Address Metadata Table" (PAMT) to track each TDX
memory page's status (such as which TDX guest "owns" a given page), and
b) the "reserved areas" that describe memory holes which cannot be used
as TDX memory.

The kernel needs the following metadata from the TDX module to build the
list of TDMRs: a) the maximum number of supported TDMRs, b) the maximum
number of supported reserved areas per TDMR, and c) the PAMT entry size
for each TDX-supported page size.
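
As a rough illustration of why the PAMT entry sizes matter for building
TDMRs, the sketch below computes how many PAMT bytes one TDMR would need
at each page size.  It is a userspace approximation, not code from this
patch: the helper name pamt_size_bytes() and the ps_shift[] table are
assumptions made for the example, and any alignment a real
implementation applies to the result is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Page-size indices, mirroring arch/x86/include/asm/shared/tdx.h. */
#define TDX_PS_4K	0
#define TDX_PS_2M	1
#define TDX_PS_1G	2
#define TDX_PS_NR	(TDX_PS_1G + 1)

/* Shift for each page size: 4K = 2^12, 2M = 2^21, 1G = 2^30. */
static const unsigned int ps_shift[TDX_PS_NR] = { 12, 21, 30 };

/*
 * Hypothetical helper: bytes of PAMT needed to track a TDMR of
 * 'tdmr_size' bytes at page size 'ps', given the PAMT entry size the
 * TDX module reports for that page size.
 */
static uint64_t pamt_size_bytes(uint64_t tdmr_size, int ps,
				uint16_t entry_size)
{
	assert(ps >= 0 && ps < TDX_PS_NR);

	/* One PAMT entry per page of the given size. */
	return (tdmr_size >> ps_shift[ps]) * entry_size;
}
```

For example, with an assumed entry size of 16 bytes, a 1GB TDMR needs
262144 entries (4MB of PAMT) at 4K granularity but only a single entry
at 1G granularity.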

Note the TDX module internally checks whether the "TDX-usable" memory
regions passed via TDMRs are truly convertible.  So skip reading the
CMRs and manually checking memory regions against them, and let the
TDX module do the check.

== Implementation ==

TDX module 1.0 uses the TDH.SYS.INFO SEAMCALL to report the global metadata
in a fixed-size (1024-byte) structure 'TDSYSINFO_STRUCT'.  TDX module
1.5 adds more metadata fields, and introduces the new TDH.SYS.{RD|RDALL}
SEAMCALLs for reading the metadata.  The new metadata mechanism removes
the fixed-size limitation of the structure 'TDSYSINFO_STRUCT' and allows
the TDX module to support an unlimited number of metadata fields.

TDX module 1.5 and later versions still support TDH.SYS.INFO for
compatibility with TDX module 1.0, but it may only report part of the
metadata via the 'TDSYSINFO_STRUCT'.  For any new metadata the kernel
must use TDH.SYS.{RD|RDALL}.

To achieve the two goals described in 1) and 2), just use TDH.SYS.RD
to read the essential metadata fields related to the TDMRs.

TDH.SYS.RD returns *one* metadata field for a given "Metadata Field ID".
That is enough for getting the few fields needed for module
initialization.  TDH.SYS.RDALL, on the other hand, reports all metadata
fields to a 4KB buffer provided by the kernel, which is overkill here.

It may be beneficial to get all metadata fields at once here so they can
also be used by KVM (some are essential for creating basic TDX guests),
but technically it's unknown how many 4K pages are needed to hold all
the metadata.  Thus it's better to read metadata when needed.
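
To make the "one field per Metadata Field ID" model more concrete, the
sketch below decodes the element size that each field ID carries in bits
33:32, which is what lets a caller sanity-check an ID before narrowing
the 64-bit value returned by TDH.SYS.RD.  The field ID values and the
bit layout are taken from this patch; the helper names
md_field_elem_size_code() and md_field_elem_bits() are hypothetical, and
a plain mask stands in for the kernel's GENMASK_ULL() so the code
builds in userspace.

```c
#include <assert.h>
#include <stdint.h>

/* Field IDs copied from this patch (TDX module 1.5 ABI spec). */
#define MD_FIELD_ID_MAX_TDMRS		0x9100000100000008ULL
#define MD_FIELD_ID_PAMT_4K_ENTRY_SIZE	0x9100000100000010ULL

#define MD_FIELD_ID_ELE_SIZE_16BIT	1

/* Compile-time version of the check: MAX_TDMRS must be a 16-bit field. */
static_assert(((MD_FIELD_ID_MAX_TDMRS >> 32) & 0x3) ==
	      MD_FIELD_ID_ELE_SIZE_16BIT,
	      "MAX_TDMRS is expected to be a 16-bit field");

/* Bits 33:32 of a field ID: 0 = 8, 1 = 16, 2 = 32, 3 = 64 bits. */
static unsigned int md_field_elem_size_code(uint64_t field_id)
{
	return (unsigned int)((field_id >> 32) & 0x3);
}

/* Element width in bits: 8 shifted left by the size code. */
static unsigned int md_field_elem_bits(uint64_t field_id)
{
	return 8u << md_field_elem_size_code(field_id);
}
```

All five field IDs used for TDMR construction decode to size code 1,
which is why a 16-bit reader is sufficient for them.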

Signed-off-by: Kai Huang <kai.huang@intel.com>
---

v14 -> v15:
 - New patch to use TDH.SYS.RD to read TDX module global metadata for
   module initialization and stop initializing 1.0 module.

---
 arch/x86/include/asm/shared/tdx.h |  1 +
 arch/x86/virt/vmx/tdx/tdx.c       | 75 ++++++++++++++++++++++++++++++-
 arch/x86/virt/vmx/tdx/tdx.h       | 39 ++++++++++++++++
 3 files changed, 114 insertions(+), 1 deletion(-)

Comments

Dave Hansen Nov. 9, 2023, 11:29 p.m. UTC | #1
On 11/9/23 03:55, Kai Huang wrote:
...
> +	ret = read_sys_metadata_field16(MD_FIELD_ID_MAX_TDMRS,
> +			&tdmr_sysinfo->max_tdmrs);
> +	if (ret)
> +		return ret;
> +
> +	ret = read_sys_metadata_field16(MD_FIELD_ID_MAX_RESERVED_PER_TDMR,
> +			&tdmr_sysinfo->max_reserved_per_tdmr);
> +	if (ret)
> +		return ret;
> +
> +	ret = read_sys_metadata_field16(MD_FIELD_ID_PAMT_4K_ENTRY_SIZE,
> +			&tdmr_sysinfo->pamt_entry_size[TDX_PS_4K]);
> +	if (ret)
> +		return ret;
> +
> +	ret = read_sys_metadata_field16(MD_FIELD_ID_PAMT_2M_ENTRY_SIZE,
> +			&tdmr_sysinfo->pamt_entry_size[TDX_PS_2M]);
> +	if (ret)
> +		return ret;
> +
> +	return read_sys_metadata_field16(MD_FIELD_ID_PAMT_1G_ENTRY_SIZE,
> +			&tdmr_sysinfo->pamt_entry_size[TDX_PS_1G]);
> +}

I kinda despise how this looks.  It's impossible to read.

I'd much rather do something like the attached where you just map the
field number to a structure member.  Note that this kind of structure
could also be converted to leverage the bulk metadata query in the future.

Any objections to doing something more like the attached completely
untested patch?
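
The attachment is not reproduced in this archive view.  As a rough idea
of what "map the field number to a structure member" could look like,
here is a userspace sketch built on offsetof().  It is not Dave's patch:
the names field_mapping, TD_SYSINFO_MAP, read_field_fn and
stub_read_field are all hypothetical, and the stub stands in for
TDH.SYS.RD by returning made-up values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TDX_PS_4K	0
#define TDX_PS_2M	1
#define TDX_PS_1G	2
#define TDX_PS_NR	(TDX_PS_1G + 1)

#define MD_FIELD_ID_MAX_TDMRS			0x9100000100000008ULL
#define MD_FIELD_ID_MAX_RESERVED_PER_TDMR	0x9100000100000009ULL
#define MD_FIELD_ID_PAMT_4K_ENTRY_SIZE		0x9100000100000010ULL
#define MD_FIELD_ID_PAMT_2M_ENTRY_SIZE		0x9100000100000011ULL
#define MD_FIELD_ID_PAMT_1G_ENTRY_SIZE		0x9100000100000012ULL

struct tdx_tdmr_sysinfo {
	uint16_t max_tdmrs;
	uint16_t max_reserved_per_tdmr;
	uint16_t pamt_entry_size[TDX_PS_NR];
};

/* The loop below stores 16-bit values, so members must be 16 bits wide. */
static_assert(sizeof(((struct tdx_tdmr_sysinfo *)0)->max_tdmrs) ==
	      sizeof(uint16_t), "members must be 16-bit");

/* Hypothetical table entry: which field ID fills which struct member. */
struct field_mapping {
	uint64_t field_id;
	size_t offset;
};

#define TD_SYSINFO_MAP(_field_id, _member)				\
	{ .field_id = MD_FIELD_ID_##_field_id,				\
	  .offset = offsetof(struct tdx_tdmr_sysinfo, _member) }

static const struct field_mapping fields[] = {
	TD_SYSINFO_MAP(MAX_TDMRS,		max_tdmrs),
	TD_SYSINFO_MAP(MAX_RESERVED_PER_TDMR,	max_reserved_per_tdmr),
	TD_SYSINFO_MAP(PAMT_4K_ENTRY_SIZE,	pamt_entry_size[TDX_PS_4K]),
	TD_SYSINFO_MAP(PAMT_2M_ENTRY_SIZE,	pamt_entry_size[TDX_PS_2M]),
	TD_SYSINFO_MAP(PAMT_1G_ENTRY_SIZE,	pamt_entry_size[TDX_PS_1G]),
};

/* Reader callback so the sketch can run without a TDX module. */
typedef int (*read_field_fn)(uint64_t field_id, uint64_t *data);

/* Stand-in for TDH.SYS.RD: returns the low byte of the field ID. */
static int stub_read_field(uint64_t field_id, uint64_t *data)
{
	*data = field_id & 0xff;
	return 0;
}

/* Walk the table, read each field, and store it at the mapped offset. */
static int get_tdmr_sysinfo(struct tdx_tdmr_sysinfo *info,
			    read_field_fn read_field)
{
	size_t i;

	for (i = 0; i < sizeof(fields) / sizeof(fields[0]); i++) {
		uint64_t val;
		uint16_t val16;
		int ret = read_field(fields[i].field_id, &val);

		if (ret)
			return ret;

		val16 = (uint16_t)val;
		memcpy((char *)info + fields[i].offset, &val16,
		       sizeof(val16));
	}

	return 0;
}
```

With a table like this, adding another field becomes a one-line change,
and the same table could in principle drive a bulk query later.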
Huang, Kai Nov. 10, 2023, 2:23 a.m. UTC | #2
On Thu, 2023-11-09 at 15:29 -0800, Dave Hansen wrote:
> On 11/9/23 03:55, Kai Huang wrote:
> ...
> > +	ret = read_sys_metadata_field16(MD_FIELD_ID_MAX_TDMRS,
> > +			&tdmr_sysinfo->max_tdmrs);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = read_sys_metadata_field16(MD_FIELD_ID_MAX_RESERVED_PER_TDMR,
> > +			&tdmr_sysinfo->max_reserved_per_tdmr);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = read_sys_metadata_field16(MD_FIELD_ID_PAMT_4K_ENTRY_SIZE,
> > +			&tdmr_sysinfo->pamt_entry_size[TDX_PS_4K]);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = read_sys_metadata_field16(MD_FIELD_ID_PAMT_2M_ENTRY_SIZE,
> > +			&tdmr_sysinfo->pamt_entry_size[TDX_PS_2M]);
> > +	if (ret)
> > +		return ret;
> > +
> > +	return read_sys_metadata_field16(MD_FIELD_ID_PAMT_1G_ENTRY_SIZE,
> > +			&tdmr_sysinfo->pamt_entry_size[TDX_PS_1G]);
> > +}
> 
> I kinda despise how this looks.  It's impossible to read.
> 
> I'd much rather do something like the attached where you just map the
> field number to a structure member.  Note that this kind of structure
> could also be converted to leverage the bulk metadata query in the future.
> 
> Any objections to doing something more like the attached completely
> untested patch?

Hi Dave,

No objection, and thanks!  I've just tested with your diff and can
successfully initialize the TDX module.
Isaku Yamahata Nov. 15, 2023, 7:35 p.m. UTC | #3
On Fri, Nov 10, 2023 at 12:55:46AM +1300,
Kai Huang <kai.huang@intel.com> wrote:

> The TDX module global metadata provides system-wide information about
> the module.  The TDX module provides SEAMCALLs to allow the kernel to
> query one specific global metadata field (entry) or all fields.
> 
> TL;DR:
> 
> Use the TDH.SYS.RD SEAMCALL to read the essential global metadata for
> module initialization, and at the same time, to only initialize TDX
> module with version 1.5 and later.
> 
> Long Version:
> 
> 1) Only initialize TDX module with version 1.5 and later
> 
> TDX module 1.0 has some compatibility issues with the later versions of
> module, as documented in the "Intel TDX module ABI incompatibilities
> between TDX1.0 and TDX1.5" spec.  Basically there's no value to use TDX
> module 1.0 when TDX module 1.5 and later versions are already available.
> To keep things simple, just support initializing the TDX module 1.5 and
> later.
> 
> 2) Get the essential global metadata for module initialization
> 
> TDX reports a list of "Convertible Memory Region" (CMR) to tell the
> kernel which memory is TDX compatible.  The kernel needs to build a list
> of memory regions (out of CMRs) as "TDX-usable" memory and pass them to
> the TDX module.  The kernel does this by constructing a list of "TD
> Memory Regions" (TDMRs) to cover all these memory regions and passing
> them to the TDX module.
> 
> Each TDMR is a TDX architectural data structure containing the memory
> region that the TDMR covers, plus the information to track (within this
> TDMR): a) the "Physical Address Metadata Table" (PAMT) to track each TDX
> memory page's status (such as which TDX guest "owns" a given page), and
> b) the "reserved areas" to tell memory holes that cannot be used as TDX
> memory.
> 
> The kernel needs to get below metadata from the TDX module to build the
> list of TDMRs: a) the maximum number of supported TDMRs, b) the maximum
> number of supported reserved areas per TDMR and, c) the PAMT entry size
> for each TDX-supported page size.
> 
> Note the TDX module internally checks whether the "TDX-usable" memory
> regions passed via TDMRs are truly convertible.  So skip reading the
> CMRs and manually checking memory regions against them, and let the
> TDX module do the check.
> 
> == Implementation ==
> 
> TDX module 1.0 uses TDH.SYS.INFO SEAMCALL to report the global metadata
> in a fixed-size (1024-bytes) structure 'TDSYSINFO_STRUCT'.  TDX module
> 1.5 adds more metadata fields, and introduces the new TDH.SYS.{RD|RDALL}
> SEAMCALLs for reading the metadata.  The new metadata mechanism removes
> the fixed-size limitation of the structure 'TDSYSINFO_STRUCT' and allows
> the TDX module to support unlimited number of metadata fields.
> 
> TDX module 1.5 and later versions still support the TDH.SYS.INFO for
> compatibility to the TDX module 1.0, but it may only report part of
> metadata via the 'TDSYSINFO_STRUCT'.  For any new metadata the kernel
> must use TDH.SYS.{RD|RDALL} to read.
> 
> To achieve the above two goals mentioned in 1) and 2), just use the
> TDH.SYS.RD to read the essential metadata fields related to the TDMRs.
> 
> TDH.SYS.RD returns *one* metadata field at a given "Metadata Field ID".
> It is enough for getting these few fields for module initialization.
> On the other hand, TDH.SYS.RDALL reports all metadata fields to a 4KB
> buffer provided by the kernel which is a little bit overkill here.
> 
> It may be beneficial to get all metadata fields at once here so they can
> also be used by KVM (some are essential for creating basic TDX guests),
> but technically it's unknown how many 4K pages are needed to fill all
> the metadata.  Thus it's better to read metadata when needed.
> 
> Signed-off-by: Kai Huang <kai.huang@intel.com>
> ---
> 
> v14 -> v15:
>  - New patch to use TDH.SYS.RD to read TDX module global metadata for
>    module initialization and stop initializing 1.0 module.
> 
> ---
>  arch/x86/include/asm/shared/tdx.h |  1 +
>  arch/x86/virt/vmx/tdx/tdx.c       | 75 ++++++++++++++++++++++++++++++-
>  arch/x86/virt/vmx/tdx/tdx.h       | 39 ++++++++++++++++
>  3 files changed, 114 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
> index a4036149c484..fdfd41511b02 100644
> --- a/arch/x86/include/asm/shared/tdx.h
> +++ b/arch/x86/include/asm/shared/tdx.h
> @@ -59,6 +59,7 @@
>  #define TDX_PS_4K	0
>  #define TDX_PS_2M	1
>  #define TDX_PS_1G	2
> +#define TDX_PS_NR	(TDX_PS_1G + 1)
>  
>  #ifndef __ASSEMBLY__
>  
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index d1affb30f74d..d24027993983 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -235,8 +235,75 @@ static int build_tdx_memlist(struct list_head *tmb_list)
>  	return ret;
>  }
>  
> +static int read_sys_metadata_field(u64 field_id, u64 *data)
> +{
> +	struct tdx_module_args args = {};
> +	int ret;
> +
> +	/*
> +	 * TDH.SYS.RD -- reads one global metadata field
> +	 *  - RDX (in): the field to read
> +	 *  - R8 (out): the field data
> +	 */
> +	args.rdx = field_id;
> +	ret = seamcall_prerr_ret(TDH_SYS_RD, &args);
> +	if (ret)
> +		return ret;
> +
> +	*data = args.r8;
> +
> +	return 0;
> +}
> +
> +static int read_sys_metadata_field16(u64 field_id, u16 *data)
> +{
> +	u64 _data;
> +	int ret;
> +
> +	if (WARN_ON_ONCE(MD_FIELD_ID_ELE_SIZE_CODE(field_id) !=
> +			MD_FIELD_ID_ELE_SIZE_16BIT))
> +		return -EINVAL;
> +
> +	ret = read_sys_metadata_field(field_id, &_data);
> +	if (ret)
> +		return ret;
> +
> +	*data = (u16)_data;
> +
> +	return 0;
> +}
> +
> +static int get_tdx_tdmr_sysinfo(struct tdx_tdmr_sysinfo *tdmr_sysinfo)
> +{
> +	int ret;
> +
> +	ret = read_sys_metadata_field16(MD_FIELD_ID_MAX_TDMRS,
> +			&tdmr_sysinfo->max_tdmrs);
> +	if (ret)
> +		return ret;
> +
> +	ret = read_sys_metadata_field16(MD_FIELD_ID_MAX_RESERVED_PER_TDMR,
> +			&tdmr_sysinfo->max_reserved_per_tdmr);
> +	if (ret)
> +		return ret;
> +
> +	ret = read_sys_metadata_field16(MD_FIELD_ID_PAMT_4K_ENTRY_SIZE,
> +			&tdmr_sysinfo->pamt_entry_size[TDX_PS_4K]);
> +	if (ret)
> +		return ret;
> +
> +	ret = read_sys_metadata_field16(MD_FIELD_ID_PAMT_2M_ENTRY_SIZE,
> +			&tdmr_sysinfo->pamt_entry_size[TDX_PS_2M]);
> +	if (ret)
> +		return ret;
> +
> +	return read_sys_metadata_field16(MD_FIELD_ID_PAMT_1G_ENTRY_SIZE,
> +			&tdmr_sysinfo->pamt_entry_size[TDX_PS_1G]);
> +}
> +

Now we don't query the version, build info, attributes, etc.  Because it's
important to know the module's version and attributes, can we query and
print them as before? Maybe with another path.
In the long term, that info would be exported via sysfs, though.
Huang, Kai Nov. 16, 2023, 3:19 a.m. UTC | #4
On Wed, 2023-11-15 at 11:35 -0800, Isaku Yamahata wrote:
> Now we don't query the version, build info, attributes, etc.  Because it's
> important to know the module's version and attributes, can we query and
> print them as before? Maybe with another path.
> In the long term, that info would be exported via sysfs, though.

I am planning to add the sysfs interface soon (not in the long term), after
the basic TDX functionality is merged.  The TDX guest side has the same
requirement, so we can do both together.
Patch

diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
index a4036149c484..fdfd41511b02 100644
--- a/arch/x86/include/asm/shared/tdx.h
+++ b/arch/x86/include/asm/shared/tdx.h
@@ -59,6 +59,7 @@ 
 #define TDX_PS_4K	0
 #define TDX_PS_2M	1
 #define TDX_PS_1G	2
+#define TDX_PS_NR	(TDX_PS_1G + 1)
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index d1affb30f74d..d24027993983 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -235,8 +235,75 @@  static int build_tdx_memlist(struct list_head *tmb_list)
 	return ret;
 }
 
+static int read_sys_metadata_field(u64 field_id, u64 *data)
+{
+	struct tdx_module_args args = {};
+	int ret;
+
+	/*
+	 * TDH.SYS.RD -- reads one global metadata field
+	 *  - RDX (in): the field to read
+	 *  - R8 (out): the field data
+	 */
+	args.rdx = field_id;
+	ret = seamcall_prerr_ret(TDH_SYS_RD, &args);
+	if (ret)
+		return ret;
+
+	*data = args.r8;
+
+	return 0;
+}
+
+static int read_sys_metadata_field16(u64 field_id, u16 *data)
+{
+	u64 _data;
+	int ret;
+
+	if (WARN_ON_ONCE(MD_FIELD_ID_ELE_SIZE_CODE(field_id) !=
+			MD_FIELD_ID_ELE_SIZE_16BIT))
+		return -EINVAL;
+
+	ret = read_sys_metadata_field(field_id, &_data);
+	if (ret)
+		return ret;
+
+	*data = (u16)_data;
+
+	return 0;
+}
+
+static int get_tdx_tdmr_sysinfo(struct tdx_tdmr_sysinfo *tdmr_sysinfo)
+{
+	int ret;
+
+	ret = read_sys_metadata_field16(MD_FIELD_ID_MAX_TDMRS,
+			&tdmr_sysinfo->max_tdmrs);
+	if (ret)
+		return ret;
+
+	ret = read_sys_metadata_field16(MD_FIELD_ID_MAX_RESERVED_PER_TDMR,
+			&tdmr_sysinfo->max_reserved_per_tdmr);
+	if (ret)
+		return ret;
+
+	ret = read_sys_metadata_field16(MD_FIELD_ID_PAMT_4K_ENTRY_SIZE,
+			&tdmr_sysinfo->pamt_entry_size[TDX_PS_4K]);
+	if (ret)
+		return ret;
+
+	ret = read_sys_metadata_field16(MD_FIELD_ID_PAMT_2M_ENTRY_SIZE,
+			&tdmr_sysinfo->pamt_entry_size[TDX_PS_2M]);
+	if (ret)
+		return ret;
+
+	return read_sys_metadata_field16(MD_FIELD_ID_PAMT_1G_ENTRY_SIZE,
+			&tdmr_sysinfo->pamt_entry_size[TDX_PS_1G]);
+}
+
 static int init_tdx_module(void)
 {
+	struct tdx_tdmr_sysinfo tdmr_sysinfo;
 	int ret;
 
 	/*
@@ -255,10 +322,13 @@  static int init_tdx_module(void)
 	if (ret)
 		goto out_put_tdxmem;
 
+	ret = get_tdx_tdmr_sysinfo(&tdmr_sysinfo);
+	if (ret)
+		goto out_free_tdxmem;
+
 	/*
 	 * TODO:
 	 *
-	 *  - Get TDX module "TD Memory Region" (TDMR) global metadata.
 	 *  - Construct a list of TDMRs to cover all TDX-usable memory
 	 *    regions.
 	 *  - Configure the TDMRs and the global KeyID to the TDX module.
@@ -268,6 +338,9 @@  static int init_tdx_module(void)
 	 *  Return error before all steps are done.
 	 */
 	ret = -EINVAL;
+out_free_tdxmem:
+	if (ret)
+		free_tdx_memlist(&tdx_memlist);
 out_put_tdxmem:
 	/*
 	 * @tdx_memlist is written here and read at memory hotplug time.
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index c11e0a7ca664..29cdf5ea5544 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -2,6 +2,8 @@ 
 #ifndef _X86_VIRT_TDX_H
 #define _X86_VIRT_TDX_H
 
+#include <linux/bits.h>
+
 /*
  * This file contains both macros and data structures defined by the TDX
  * architecture and Linux defined software data structures and functions.
@@ -13,8 +15,38 @@ 
  * TDX module SEAMCALL leaf functions
  */
 #define TDH_SYS_INIT		33
+#define TDH_SYS_RD		34
 #define TDH_SYS_LP_INIT		35
 
+/*
+ * Global scope metadata field ID.
+ *
+ * See Table "Global Scope Metadata", TDX module 1.5 ABI spec.
+ */
+#define MD_FIELD_ID_MAX_TDMRS			0x9100000100000008ULL
+#define MD_FIELD_ID_MAX_RESERVED_PER_TDMR	0x9100000100000009ULL
+#define MD_FIELD_ID_PAMT_4K_ENTRY_SIZE		0x9100000100000010ULL
+#define MD_FIELD_ID_PAMT_2M_ENTRY_SIZE		0x9100000100000011ULL
+#define MD_FIELD_ID_PAMT_1G_ENTRY_SIZE		0x9100000100000012ULL
+
+/*
+ * Sub-field definition of metadata field ID.
+ *
+ * See Table "MD_FIELD_ID (Metadata Field Identifier / Sequence Header)
+ * Definition", TDX module 1.5 ABI spec.
+ *
+ *  - Bit 33:32: ELEMENT_SIZE_CODE -- size of a single element of metadata
+ *
+ *	0: 8 bits
+ *	1: 16 bits
+ *	2: 32 bits
+ *	3: 64 bits
+ */
+#define MD_FIELD_ID_ELE_SIZE_CODE(_field_id)	\
+		(((_field_id) & GENMASK_ULL(33, 32)) >> 32)
+
+#define MD_FIELD_ID_ELE_SIZE_16BIT	1
+
 /*
  * Do not put any hardware-defined TDX structure representations below
  * this comment!
@@ -33,4 +65,11 @@  struct tdx_memblock {
 	unsigned long end_pfn;
 };
 
+/* "TDMR info" part of "Global Scope Metadata" for constructing TDMRs */
+struct tdx_tdmr_sysinfo {
+	u16 max_tdmrs;
+	u16 max_reserved_per_tdmr;
+	u16 pamt_entry_size[TDX_PS_NR];
+};
+
 #endif