[v8,08/16] x86/virt/tdx: Add placeholder to construct TDMRs to cover all TDX memory regions

Message ID ef6fe9247007ee8e15272de01ded1e0a9152be02.1670566861.git.kai.huang@intel.com (mailing list archive)
State New
Series TDX host kernel support

Commit Message

Kai Huang Dec. 9, 2022, 6:52 a.m. UTC
After the kernel selects all TDX-usable memory regions, the kernel needs
to pass those regions to the TDX module via the "TD Memory Region"
(TDMR) data structure.

Add a placeholder to construct a list of TDMRs (in multiple steps) to
cover all TDX-usable memory regions.

=== Long Version ===

TDX provides increased levels of memory confidentiality and integrity.
This requires special hardware support for features like memory
encryption and storage of memory integrity checksums.  Not all memory
satisfies these requirements.

As a result, TDX introduced the concept of a "Convertible Memory Region"
(CMR).  During boot, the firmware builds a list of all of the memory
ranges which can provide the TDX security guarantees.  The list of these
ranges is available to the kernel by querying the TDX module.

The TDX architecture needs additional metadata to record things like
which TD guest "owns" a given page of memory.  This metadata essentially
serves as the 'struct page' for the TDX module.  The space for this
metadata is not reserved by the hardware up front and must be allocated
by the kernel and given to the TDX module.

Since this metadata consumes space, the VMM can choose whether or not to
allocate it for a given area of convertible memory.  If it chooses not
to, the memory cannot receive TDX protections and cannot be used by TDX
guests as private memory.

For every memory region that the VMM wants to use as TDX memory, it sets
up a "TD Memory Region" (TDMR).  Each TDMR represents a physically
contiguous convertible range and must also have its own physically
contiguous metadata table, referred to as a Physical Address Metadata
Table (PAMT), to track status for each page in the TDMR range.

Unlike a CMR, each TDMR requires 1G granularity and alignment.  To
support physical RAM areas that don't meet those strict requirements,
each TDMR permits a number of internal "reserved areas" which can be
placed over memory holes.  If PAMT metadata is placed within a TDMR it
must be covered by one of these reserved areas.

Let's summarize the concepts:

 CMR - Firmware-enumerated physical ranges that support TDX.  CMRs are
       4K aligned.
TDMR - Physical address range which is chosen by the kernel to support
       TDX.  1G granularity and alignment required.  Each TDMR has
       reserved areas where TDX memory holes and overlapping PAMTs can
       be represented.
PAMT - Physically contiguous TDX metadata.  One table for each page size
       per TDMR.  Roughly 1/256th of TDMR in size.  256G TDMR = ~1G
       PAMT.
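
For illustration only (not part of the patch), the 1G alignment rule and
the "roughly 1/256th" PAMT estimate can be sketched in plain userspace C.
The helper names and the 16-byte PAMT entry size are assumptions made for
this sketch; the real entry size is reported by the TDX module.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K	0x1000ULL
#define SZ_1G	0x40000000ULL

/* Assumption for this sketch only; the TDX module reports the real value. */
#define PAMT_ENTRY_SIZE_ASSUMED	16ULL

/* Hypothetical helper type: a half-open physical range [start, end). */
struct span {
	uint64_t start;
	uint64_t end;
};

/* Widen [start, end) so that both ends sit on a 1G boundary. */
static struct span tdmr_span(uint64_t start, uint64_t end)
{
	struct span s;

	assert(start < end);
	s.start = start & ~(SZ_1G - 1);			/* round down */
	s.end = (end + SZ_1G - 1) & ~(SZ_1G - 1);	/* round up */
	return s;
}

/* Bytes of 4K-page PAMT needed for a TDMR of @tdmr_size bytes. */
static uint64_t pamt_4k_bytes(uint64_t tdmr_size)
{
	return tdmr_size / SZ_4K * PAMT_ENTRY_SIZE_ASSUMED;
}
```

With these assumptions, a 4K-aligned range such as [1M, 2G - 1M) widens to
[0, 2G), and the part of the TDMR that is not usable memory would have to
be covered by reserved areas.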

As one step of initializing the TDX module, the kernel configures
TDX-usable memory regions by passing a list of TDMRs to the TDX module.

Constructing the list of TDMRs consists of the following steps:

1) Fill out TDMRs to cover all memory regions that the TDX module will
   use for TD memory.
2) Allocate and set up PAMT for each TDMR.
3) Designate reserved areas for each TDMR.

Add a placeholder to construct TDMRs following the above steps.  Always
free the space for TDMRs at the end of module initialization (whether it
succeeds or not), as TDMRs are only used during initialization.
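
For illustration only, the intended unwind order can be modelled in
userspace C.  init_sketch() and 'struct init_trace' are hypothetical names
for this sketch, and calloc()/free() stand in for the kernel allocators.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical bookkeeping so the cleanup order can be observed. */
struct init_trace {
	int tdmrs_freed;
	int memlist_freed;
};

/*
 * Userspace model of the unwind order.  @construct_ret stands in for the
 * result of constructing the TDMRs (0 on success, negative on error).
 */
static int init_sketch(int construct_ret, struct init_trace *trace)
{
	void *tdmrs;
	int ret;

	assert(trace);
	trace->tdmrs_freed = 0;
	trace->memlist_freed = 0;

	tdmrs = calloc(1, 4096);	/* stands in for the TDMR buffer */
	if (!tdmrs) {
		ret = -ENOMEM;
		goto out_free_memlist;
	}

	ret = construct_ret;
	if (ret)
		goto out_free_tdmrs;

	/* The remaining initialization steps would go here. */

out_free_tdmrs:
	/* The TDMR buffer is only needed during initialization. */
	free(tdmrs);
	trace->tdmrs_freed = 1;
out_free_memlist:
	/* The memory list is released only when initialization failed. */
	if (ret)
		trace->memlist_freed = 1;
	return ret;
}
```

The fall-through into out_free_tdmrs on the success path is what makes the
TDMR buffer go away in both cases.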

Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---

v7 -> v8:
 - Improved changelog to say that this is one step of the "TODO list" in
   init_tdx_module().
 - Other changelog improvement suggested by Dave (with "Create TDMRs" to
   "Fill out TDMRs" to align with the code).
 - Added a "TODO list" comment to lay out the steps to construct TDMRs,
   following the same idea as the "TODO list" in init_tdx_module().
 - Introduced 'struct tdmr_info_list' (Dave)
 - Further added additional members (tdmr_sz/max_tdmrs/nr_tdmrs) to
   simplify getting TDMR by given index, and reduce passing arguments
   around functions.
 - Added alloc_tdmr_list()/free_tdmr_list() accordingly, which internally
   uses tdmr_size_single() (Dave).
 - tdmr_num -> nr_tdmrs (Dave).

v6 -> v7:
 - Improved commit message to explain 'int' overflow cannot happen
   in cal_tdmr_size() and alloc_tdmr_array(). -- Andy/Dave.

v5 -> v6:
 - construct_tdmrs_memblock() -> construct_tdmrs() as 'tdx_memblock' is
   used instead of memblock.
 - Added Isaku's Reviewed-by.

v3 -> v5 (no feedback on v4):
 - Moved calculating TDMR size to this patch.
 - Changed to use alloc_pages_exact() to allocate buffer for all TDMRs
   once, instead of allocating each TDMR individually.
 - Removed "crypto protection" in the changelog.
 - -EFAULT -> -EINVAL in a couple of places.

---
 arch/x86/virt/vmx/tdx/tdx.c | 104 +++++++++++++++++++++++++++++++++++-
 arch/x86/virt/vmx/tdx/tdx.h |  23 ++++++++
 2 files changed, 125 insertions(+), 2 deletions(-)

Comments

Dave Hansen Jan. 6, 2023, 7:24 p.m. UTC | #1
> +struct tdmr_info_list {
> +	struct tdmr_info *first_tdmr;

This is named badly.  This is really a pointer to an array.  While it
_does_ of course point to the first member of the array, the naming
should make it clear that there are multiple tdmr_infos here.

> +	int tdmr_sz;
> +	int max_tdmrs;
> +	int nr_tdmrs;	/* Actual number of TDMRs */
> +};

This 'tdmr_info_list' is declared in an unfortunate place.  I thought
the tdmr_size_single() function below was related to it.

Also, tdmr_sz and max_tdmrs can both be derived from 'sysinfo'.  Do they
really need to be stored here?  If so, I think I'd probably do something
like this with the structure:

struct tdmr_info_list {
	struct tdmr_info *tdmrs;
	int nr_consumed_tdmrs; // How many @tdmrs are in use

	/* Metadata for freeing this structure: */
	int tdmr_sz;   // Size of one 'tdmr_info' (has a flex array)
	int max_tdmrs; // How many @tdmrs are allocated
};

Modulo whatever folks are doing for comments these days.

> +/* Calculate the actual TDMR size */
> +static int tdmr_size_single(u16 max_reserved_per_tdmr)
> +{
> +	int tdmr_sz;
> +
> +	/*
> +	 * The actual size of TDMR depends on the maximum
> +	 * number of reserved areas.
> +	 */
> +	tdmr_sz = sizeof(struct tdmr_info);
> +	tdmr_sz += sizeof(struct tdmr_reserved_area) * max_reserved_per_tdmr;
> +
> +	return ALIGN(tdmr_sz, TDMR_INFO_ALIGNMENT);
> +}
> +
> +static int alloc_tdmr_list(struct tdmr_info_list *tdmr_list,
> +			   struct tdsysinfo_struct *sysinfo)
> +{
> +	size_t tdmr_sz, tdmr_array_sz;
> +	void *tdmr_array;
> +
> +	tdmr_sz = tdmr_size_single(sysinfo->max_reserved_per_tdmr);
> +	tdmr_array_sz = tdmr_sz * sysinfo->max_tdmrs;
> +
> +	/*
> +	 * To keep things simple, allocate all TDMRs together.
> +	 * The buffer needs to be physically contiguous to make
> +	 * sure each TDMR is physically contiguous.
> +	 */
> +	tdmr_array = alloc_pages_exact(tdmr_array_sz,
> +			GFP_KERNEL | __GFP_ZERO);
> +	if (!tdmr_array)
> +		return -ENOMEM;
> +
> +	tdmr_list->first_tdmr = tdmr_array;
> +	/*

	^ probably missing whitespace before the comment

> +	 * Keep the size of TDMR to find the target TDMR
> +	 * at a given index in the TDMR list.
> +	 */
> +	tdmr_list->tdmr_sz = tdmr_sz;
> +	tdmr_list->max_tdmrs = sysinfo->max_tdmrs;
> +	tdmr_list->nr_tdmrs = 0;
> +
> +	return 0;
> +}
> +
> +static void free_tdmr_list(struct tdmr_info_list *tdmr_list)
> +{
> +	free_pages_exact(tdmr_list->first_tdmr,
> +			tdmr_list->max_tdmrs * tdmr_list->tdmr_sz);
> +}
> +
> +/*
> + * Construct a list of TDMRs on the preallocated space in @tdmr_list
> + * to cover all TDX memory regions in @tmb_list based on the TDX module
> + * information in @sysinfo.
> + */
> +static int construct_tdmrs(struct list_head *tmb_list,
> +			   struct tdmr_info_list *tdmr_list,
> +			   struct tdsysinfo_struct *sysinfo)
> +{
> +	/*
> +	 * TODO:
> +	 *
> +	 *  - Fill out TDMRs to cover all TDX memory regions.
> +	 *  - Allocate and set up PAMTs for each TDMR.
> +	 *  - Designate reserved areas for each TDMR.
> +	 *
> +	 * Return -EINVAL until constructing TDMRs is done
> +	 */
> +	return -EINVAL;
> +}
> +
>  static int init_tdx_module(void)
>  {
>  	/*
> @@ -358,6 +439,7 @@ static int init_tdx_module(void)
>  			TDSYSINFO_STRUCT_SIZE, TDSYSINFO_STRUCT_ALIGNMENT);
>  	struct cmr_info cmr_array[MAX_CMRS] __aligned(CMR_INFO_ARRAY_ALIGNMENT);
>  	struct tdsysinfo_struct *sysinfo = &PADDED_STRUCT(tdsysinfo);
> +	struct tdmr_info_list tdmr_list;
>  	int ret;
>  
>  	ret = tdx_get_sysinfo(sysinfo, cmr_array);
> @@ -380,11 +462,19 @@ static int init_tdx_module(void)
>  	if (ret)
>  		goto out;
>  
> +	/* Allocate enough space for constructing TDMRs */
> +	ret = alloc_tdmr_list(&tdmr_list, sysinfo);
> +	if (ret)
> +		goto out_free_tdx_mem;
> +
> +	/* Cover all TDX-usable memory regions in TDMRs */
> +	ret = construct_tdmrs(&tdx_memlist, &tdmr_list, sysinfo);
> +	if (ret)
> +		goto out_free_tdmrs;
> +
>  	/*
>  	 * TODO:
>  	 *
> -	 *  - Construct a list of TDMRs to cover all TDX-usable memory
> -	 *    regions.
>  	 *  - Pick up one TDX private KeyID as the global KeyID.
>  	 *  - Configure the TDMRs and the global KeyID to the TDX module.
>  	 *  - Configure the global KeyID on all packages.
> @@ -393,6 +483,16 @@ static int init_tdx_module(void)
>  	 *  Return error before all steps are done.
>  	 */
>  	ret = -EINVAL;
> +out_free_tdmrs:
> +	/*
> +	 * Free the space for the TDMRs no matter the initialization is
> +	 * successful or not.  They are not needed anymore after the
> +	 * module initialization.
> +	 */
> +	free_tdmr_list(&tdmr_list);
> +out_free_tdx_mem:
> +	if (ret)
> +		free_tdx_memlist(&tdx_memlist);
>  out:
>  	/*
>  	 * @tdx_memlist is written here and read at memory hotplug time.
> diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
> index 6d32f62e4182..d0c762f1a94c 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.h
> +++ b/arch/x86/virt/vmx/tdx/tdx.h
> @@ -90,6 +90,29 @@ struct tdsysinfo_struct {
>  	DECLARE_FLEX_ARRAY(struct cpuid_config, cpuid_configs);
>  } __packed;
>  
> +struct tdmr_reserved_area {
> +	u64 offset;
> +	u64 size;
> +} __packed;
> +
> +#define TDMR_INFO_ALIGNMENT	512
> +
> +struct tdmr_info {
> +	u64 base;
> +	u64 size;
> +	u64 pamt_1g_base;
> +	u64 pamt_1g_size;
> +	u64 pamt_2m_base;
> +	u64 pamt_2m_size;
> +	u64 pamt_4k_base;
> +	u64 pamt_4k_size;
> +	/*
> +	 * Actual number of reserved areas depends on
> +	 * 'struct tdsysinfo_struct'::max_reserved_per_tdmr.
> +	 */
> +	DECLARE_FLEX_ARRAY(struct tdmr_reserved_area, reserved_areas);
> +} __packed __aligned(TDMR_INFO_ALIGNMENT);
> +
>  /*
>   * Do not put any hardware-defined TDX structure representations below
>   * this comment!
Kai Huang Jan. 10, 2023, 12:40 a.m. UTC | #2
On Fri, 2023-01-06 at 11:24 -0800, Dave Hansen wrote:
> > +struct tdmr_info_list {
> > +	struct tdmr_info *first_tdmr;
> 
> This is named badly.  This is really a pointer to an array.  While it
> _does_ of course point to the first member of the array, the naming
> should make it clear that there are multiple tdmr_infos here.

Will change to 'tdmrs' as in your code.

> 
> > +	int tdmr_sz;
> > +	int max_tdmrs;
> > +	int nr_tdmrs;	/* Actual number of TDMRs */
> > +};
> 
> This 'tdmr_info_list's is declared in an unfortunate place.  I thought
> the tdmr_size_single() function below was related to it.

I think I can move it to "tdx.h", which is described as holding both TDX-arch
data structures and Linux-defined structures anyway.

I think I can also move 'enum tdx_module_status_t' and 'struct tdx_memblock'
declarations to "tdx.h" too so that all declarations are in "tdx.h".

> 
> Also, tdmr_sz and max_tdmrs can both be derived from 'sysinfo'.  Do they
> really need to be stored here?  

It's not mandatory to keep them here.  I did it mainly because I want to avoid
passing 'sysinfo' as argument for almost all functions related to constructing
TDMRs.

For instance, 'tdmr_sz' is used to calculate the position of each individual
TDMR at a given index.  Instead of passing an additional 'sysinfo' (or
sysinfo->max_reserved_per_tdmr):

	struct tdmr_info *tdmr_entry(struct tdmr_info_list *tdmr_list, int idx,
				     struct tdsysinfo_struct *sysinfo) { ... }

I prefer:

	struct tdmr_info *tdmr_entry(struct tdmr_info_list *tdmr_list, int idx)
	{...}

tdmr_entry() is basically called in all 3 steps (fill out TDMRs, allocate PAMTs,
and designate reserved areas).  Having 'sysinfo' in it will require almost all
functions related to constructing TDMRs to have 'sysinfo' as argument, which
only makes the code more complicated and hurts the readability IMHO.
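
For illustration only, a tdmr_entry() that relies on the stored 'tdmr_sz'
could look like the userspace sketch below.  The body is an assumption (the
thread only shows the prototype), and the field names follow the structure
layout suggested earlier in this thread.

```c
#include <assert.h>
#include <stddef.h>

struct tdmr_info;	/* opaque here; only addresses are computed */

struct tdmr_info_list {
	struct tdmr_info *tdmrs;	/* base of the contiguous buffer */
	int nr_consumed_tdmrs;		/* how many @tdmrs are in use */
	int tdmr_sz;			/* size of one entry, in bytes */
	int max_tdmrs;			/* how many @tdmrs are allocated */
};

static struct tdmr_info *tdmr_entry(struct tdmr_info_list *tdmr_list, int idx)
{
	assert(idx >= 0 && idx < tdmr_list->max_tdmrs);

	/*
	 * Each entry carries a variable number of reserved areas, so its
	 * size is not sizeof(struct tdmr_info).  Step by the stored byte
	 * size instead of using array indexing.
	 */
	return (struct tdmr_info *)((char *)tdmr_list->tdmrs +
				    (size_t)idx * tdmr_list->tdmr_sz);
}
```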

> If so, I think I'd probably do something
> like this with the structure:
> 
> struct tdmr_info_list {
> 	struct tdmr_info *tdmrs;
> 	int nr_consumed_tdmrs; // How many @tdmrs are in use
> 
> 	/* Metadata for freeing this structure: */
> 	int tdmr_sz;   // Size of one 'tdmr_info' (has a flex array)
> 	int max_tdmrs; // How many @tdmrs are allocated
> };
> 
> Modulo whatever folks are doing for comments these days.

Looks nice to me.  Will use.  One small point: 'tdmr_sz' is also used to get
the TDMR at a given index, not just to free the structure.

Btw, are C++-style "//" comments OK in kernel code?
> 
> > +/* Calculate the actual TDMR size */
> > +static int tdmr_size_single(u16 max_reserved_per_tdmr)
> > +{
> > +	int tdmr_sz;
> > +
> > +	/*
> > +	 * The actual size of TDMR depends on the maximum
> > +	 * number of reserved areas.
> > +	 */
> > +	tdmr_sz = sizeof(struct tdmr_info);
> > +	tdmr_sz += sizeof(struct tdmr_reserved_area) * max_reserved_per_tdmr;
> > +
> > +	return ALIGN(tdmr_sz, TDMR_INFO_ALIGNMENT);
> > +}
> > +
> > +static int alloc_tdmr_list(struct tdmr_info_list *tdmr_list,
> > +			   struct tdsysinfo_struct *sysinfo)
> > +{
> > +	size_t tdmr_sz, tdmr_array_sz;
> > +	void *tdmr_array;
> > +
> > +	tdmr_sz = tdmr_size_single(sysinfo->max_reserved_per_tdmr);
> > +	tdmr_array_sz = tdmr_sz * sysinfo->max_tdmrs;
> > +
> > +	/*
> > +	 * To keep things simple, allocate all TDMRs together.
> > +	 * The buffer needs to be physically contiguous to make
> > +	 * sure each TDMR is physically contiguous.
> > +	 */
> > +	tdmr_array = alloc_pages_exact(tdmr_array_sz,
> > +			GFP_KERNEL | __GFP_ZERO);
> > +	if (!tdmr_array)
> > +		return -ENOMEM;
> > +
> > +	tdmr_list->first_tdmr = tdmr_array;
> > +	/*
> 
> 	^ probably missing whitespace before the comment
> 

Will add, assuming you mean a new empty line.  Thanks for the tip.


> > +	 * Keep the size of TDMR to find the target TDMR
> > +	 * at a given index in the TDMR list.
> > +	 */
> > +	tdmr_list->tdmr_sz = tdmr_sz;
> > +	tdmr_list->max_tdmrs = sysinfo->max_tdmrs;
> > +	tdmr_list->nr_tdmrs = 0;
> > +
> > +	return 0;
> > +}
> > +

[snip]
Dave Hansen Jan. 10, 2023, 12:47 a.m. UTC | #3
On 1/9/23 16:40, Huang, Kai wrote:
> On Fri, 2023-01-06 at 11:24 -0800, Dave Hansen wrote:
...
>> Also, tdmr_sz and max_tdmrs can both be derived from 'sysinfo'.  Do they
>> really need to be stored here?
> 
> It's not mandatory to keep them here.  I did it mainly because I want to avoid
> passing 'sysinfo' as argument for almost all functions related to constructing
> TDMRs.

I don't think it hurts readability that much.  On the contrary, it makes
it more clear what data is needed for initialization.

>> If so, I think I'd probably do something
>> like this with the structure:
>>
>> struct tdmr_info_list {
>>       struct tdmr_info *tdmrs;
>>       int nr_consumed_tdmrs; // How many @tdmrs are in use
>>
>>       /* Metadata for freeing this structure: */
>>       int tdmr_sz;   // Size of one 'tdmr_info' (has a flex array)
>>       int max_tdmrs; // How many @tdmrs are allocated
>> };
>>
>> Modulo whataver folks are doing for comments these days.
> 
> Looks nice to me.  Will use.  A slight thing is 'tdmr_sz' is also used to get
> the TDMR at a given index, but not just freeing the structure.
> 
> Btw, is C++ style comment "//" OK in kernel code?

It's OK with me, but I don't think there's much consensus on it.
Probably best to stick with normal arch/x86 style for now.
Kai Huang Jan. 10, 2023, 2:23 a.m. UTC | #4
On Mon, 2023-01-09 at 16:47 -0800, Dave Hansen wrote:
> On 1/9/23 16:40, Huang, Kai wrote:
> > On Fri, 2023-01-06 at 11:24 -0800, Dave Hansen wrote:
> ...
> > > Also, tdmr_sz and max_tdmrs can both be derived from 'sysinfo'.  Do they
> > > really need to be stored here?
> > 
> > It's not mandatory to keep them here.  I did it mainly because I want to avoid
> > passing 'sysinfo' as argument for almost all functions related to constructing
> > TDMRs.
> 
> I don't think it hurts readability that much.  On the contrary, it makes
> it more clear what data is needed for initialization.

Sorry, one thing I forgot to mention: if we keep 'tdmr_sz' in 'struct
tdmr_info_list', it only needs to be calculated once, when allocating the
buffer.  Otherwise, we need to calculate it from
sysinfo->max_reserved_per_tdmr each time we want to get a TDMR at a given index.

To me putting relevant fields (tdmrs, tdmr_sz, max_tdmrs, nr_consumed_tdmrs)
together makes how the TDMR list is organized more clear.  But please let me
know if you prefer removing 'tdmr_sz' and 'max_tdmrs'.

Btw, if we remove 'tdmr_sz' and 'max_tdmrs', even nr_consumed_tdmrs is not
absolutely necessary here.  It can be a local variable of init_tdx_module() (as
shown in v7), and the 'struct tdmr_info_list' will only have the 'tdmrs' member
(as you commented in v7):

https://lore.kernel.org/linux-mm/cc195eb6499cf021b4ce2e937200571915bfe66f.camel@intel.com/T/#mb9826e2bcf8bf6399c13cc5f95a948fe4b3a46d9

Please let me know your preference.

> 
> > > If so, I think I'd probably do something
> > > like this with the structure:
> > > 
> > > struct tdmr_info_list {
> > >       struct tdmr_info *tdmrs;
> > >       int nr_consumed_tdmrs; // How many @tdmrs are in use
> > > 
> > >       /* Metadata for freeing this structure: */
> > >       int tdmr_sz;   // Size of one 'tdmr_info' (has a flex array)
> > >       int max_tdmrs; // How many @tdmrs are allocated
> > > };
> > > 
> > > Modulo whataver folks are doing for comments these days.
> > 
> > Looks nice to me.  Will use.  A slight thing is 'tdmr_sz' is also used to get
> > the TDMR at a given index, but not just freeing the structure.
> > 
> > Btw, is C++ style comment "//" OK in kernel code?
> 
> It's OK with me, but I don't think there's much consensus on it.
> Probably best to stick with normal arch/x86 style for now.
> 
> 

Will use normal arch/x86 style for now.  Thanks for the info.
Dave Hansen Jan. 10, 2023, 7:12 p.m. UTC | #5
On 1/9/23 18:23, Huang, Kai wrote:
> On Mon, 2023-01-09 at 16:47 -0800, Dave Hansen wrote:
>> On 1/9/23 16:40, Huang, Kai wrote:
>>> On Fri, 2023-01-06 at 11:24 -0800, Dave Hansen wrote:
>> ...
>>>> Also, tdmr_sz and max_tdmrs can both be derived from 'sysinfo'.  Do they
>>>> really need to be stored here?
>>>
>>> It's not mandatory to keep them here.  I did it mainly because I want to avoid
>>> passing 'sysinfo' as argument for almost all functions related to constructing
>>> TDMRs.
>>
>> I don't think it hurts readability that much.  On the contrary, it makes
>> it more clear what data is needed for initialization.
> 
> Sorry one thing I forgot to mention is if we keep 'tdmr_sz' in 'struct
> tdmr_info_list', it only needs to be calculated at once when allocating the
> buffer.  Otherwise, we need to calculate it based on
> sysinfo->max_reserved_per_tdmr each time we want to get a TDMR at a given index.

What's the problem with recalculating it?  It is calculated like this:

	tdmr_sz = ALIGN(constant1 + constant2 * variable);

So, what's the problem?  You're concerned about too many multiplications?

> To me putting relevant fields (tdmrs, tdmr_sz, max_tdmrs, nr_consumed_tdmrs)
> together makes how the TDMR list is organized more clear.  But please let me
> know if you prefer removing 'tdmr_sz' and 'max_tdmrs'.
> 
> Btw, if we remove 'tdmr_sz' and 'max_tdmrs', even nr_consumed_tdmrs is not
> absolutely necessary here.  It can be a local variable of init_tdx_module() (as
> shown in v7), and the 'struct tdmr_info_list' will only have the 'tdmrs' member
> (as you commented in v7):
> 
> https://lore.kernel.org/linux-mm/cc195eb6499cf021b4ce2e937200571915bfe66f.camel@intel.com/T/#mb9826e2bcf8bf6399c13cc5f95a948fe4b3a46d9
> 
> Please let me know what's your preference?

I dunno.  My gut says that passing sysinfo around and just deriving the
size values from that with helpers is the best way.  'struct
tdmr_info_list' isn't a horrible idea in and of itself, but I think it's
a confusing structure because it's not clear how the pieces fit together
when half of it is *required* and the other half is just for some kind
of perceived convenience.
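
For illustration only, deriving the entry size from sysinfo on each lookup
could look like the userspace sketch below.  'struct tdsysinfo_sketch',
ALIGN_UP() and the three-argument tdmr_entry() are stand-ins written for
this sketch, and the structures omit the packed/aligned attributes from the
patch, so sizeof() here does not necessarily match the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TDMR_INFO_ALIGNMENT	512
/* Round @x up to a multiple of @a (stand-in for the kernel's ALIGN()). */
#define ALIGN_UP(x, a)		(((x) + (a) - 1) / (a) * (a))

struct tdmr_reserved_area {
	uint64_t offset;
	uint64_t size;
};

/* Same fields as in the patch, without the packed/aligned attributes. */
struct tdmr_info {
	uint64_t base, size;
	uint64_t pamt_1g_base, pamt_1g_size;
	uint64_t pamt_2m_base, pamt_2m_size;
	uint64_t pamt_4k_base, pamt_4k_size;
	struct tdmr_reserved_area reserved_areas[];
};

/* Stand-in holding only the two sysinfo fields used here. */
struct tdsysinfo_sketch {
	uint32_t max_tdmrs;
	uint16_t max_reserved_per_tdmr;
};

/* tdmr_sz = ALIGN(constant1 + constant2 * variable) */
static size_t tdmr_size_single(uint16_t max_reserved_per_tdmr)
{
	size_t sz = sizeof(struct tdmr_info) +
		    sizeof(struct tdmr_reserved_area) * max_reserved_per_tdmr;

	return ALIGN_UP(sz, TDMR_INFO_ALIGNMENT);
}

/* Recompute the entry size on every lookup instead of caching it. */
static struct tdmr_info *tdmr_entry(void *tdmrs, int idx,
				    const struct tdsysinfo_sketch *sysinfo)
{
	size_t sz = tdmr_size_single(sysinfo->max_reserved_per_tdmr);

	assert(idx >= 0 && (uint32_t)idx < sysinfo->max_tdmrs);
	return (struct tdmr_info *)((char *)tdmrs + (size_t)idx * sz);
}
```

The cost per lookup is one multiplication, one addition and one rounding
step, which is the trade-off being weighed against caching the size.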
Kai Huang Jan. 11, 2023, 9:23 a.m. UTC | #6
On Tue, 2023-01-10 at 11:12 -0800, Hansen, Dave wrote:
> On 1/9/23 18:23, Huang, Kai wrote:
> > On Mon, 2023-01-09 at 16:47 -0800, Dave Hansen wrote:
> > > On 1/9/23 16:40, Huang, Kai wrote:
> > > > On Fri, 2023-01-06 at 11:24 -0800, Dave Hansen wrote:
> > > ...
> > > > > Also, tdmr_sz and max_tdmrs can both be derived from 'sysinfo'.  Do they
> > > > > really need to be stored here?
> > > > 
> > > > It's not mandatory to keep them here.  I did it mainly because I want to avoid
> > > > passing 'sysinfo' as argument for almost all functions related to constructing
> > > > TDMRs.
> > > 
> > > I don't think it hurts readability that much.  On the contrary, it makes
> > > it more clear what data is needed for initialization.
> > 
> > Sorry one thing I forgot to mention is if we keep 'tdmr_sz' in 'struct
> > tdmr_info_list', it only needs to be calculated at once when allocating the
> > buffer.  Otherwise, we need to calculate it based on
> > sysinfo->max_reserved_per_tdmr each time we want to get a TDMR at a given index.
> 
> What's the problem with recalculating it?  It is calculated like this:
> 
> 	tdmr_sz = ALIGN(constant1 + constant2 * variable);
> 
> So, what's the problem?  You're concerned about too many multiplications?

No problem.  I don't have concern about multiplications, but since they can be
avoided, I thought perhaps it's better to avoid.

So I am fine with either way, no problem.

> 
> > To me putting relevant fields (tdmrs, tdmr_sz, max_tdmrs, nr_consumed_tdmrs)
> > together makes how the TDMR list is organized more clear.  But please let me
> > know if you prefer removing 'tdmr_sz' and 'max_tdmrs'.
> > 
> > Btw, if we remove 'tdmr_sz' and 'max_tdmrs', even nr_consumed_tdmrs is not
> > absolutely necessary here.  It can be a local variable of init_tdx_module() (as
> > shown in v7), and the 'struct tdmr_info_list' will only have the 'tdmrs' member
> > (as you commented in v7):
> > 
> > https://lore.kernel.org/linux-mm/cc195eb6499cf021b4ce2e937200571915bfe66f.camel@intel.com/T/#mb9826e2bcf8bf6399c13cc5f95a948fe4b3a46d9
> > 
> > Please let me know what's your preference?
> 
> I dunno.  My gut says that passing sysinfo around and just deriving the
> sizes values from that with helpers is the best way.  'struct
> tdmr_info_list' isn't a horrible idea in and of itself, but I think it's
> a confusing structure because it's not clear how the pieces fit together
> when half of it is *required* and the other half is just for some kind
> of perceived convenience.
> 

Sure.  No more argument about this.

However, to avoid adding more review burden for you, how about keeping
'struct tdmr_info_list' as is this time?  Of course, I am willing to remove
'tdmr_sz' and 'max_tdmrs' from 'struct tdmr_info_list' and keep only 'tdmrs'
and 'nr_consumed_tdmrs', if you are willing to look at what the new code
would look like.

Please let me know.
Patch

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index f010402f443d..d36ac72ef299 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -20,6 +20,7 @@ 
 #include <linux/minmax.h>
 #include <linux/sizes.h>
 #include <linux/pfn.h>
+#include <linux/align.h>
 #include <asm/pgtable_types.h>
 #include <asm/msr.h>
 #include <asm/tdx.h>
@@ -347,6 +348,86 @@  static int build_tdx_memlist(struct list_head *tmb_list)
 	return ret;
 }
 
+struct tdmr_info_list {
+	struct tdmr_info *first_tdmr;
+	int tdmr_sz;
+	int max_tdmrs;
+	int nr_tdmrs;	/* Actual number of TDMRs */
+};
+
+/* Calculate the actual TDMR size */
+static int tdmr_size_single(u16 max_reserved_per_tdmr)
+{
+	int tdmr_sz;
+
+	/*
+	 * The actual size of TDMR depends on the maximum
+	 * number of reserved areas.
+	 */
+	tdmr_sz = sizeof(struct tdmr_info);
+	tdmr_sz += sizeof(struct tdmr_reserved_area) * max_reserved_per_tdmr;
+
+	return ALIGN(tdmr_sz, TDMR_INFO_ALIGNMENT);
+}
+
+static int alloc_tdmr_list(struct tdmr_info_list *tdmr_list,
+			   struct tdsysinfo_struct *sysinfo)
+{
+	size_t tdmr_sz, tdmr_array_sz;
+	void *tdmr_array;
+
+	tdmr_sz = tdmr_size_single(sysinfo->max_reserved_per_tdmr);
+	tdmr_array_sz = tdmr_sz * sysinfo->max_tdmrs;
+
+	/*
+	 * To keep things simple, allocate all TDMRs together.
+	 * The buffer needs to be physically contiguous to make
+	 * sure each TDMR is physically contiguous.
+	 */
+	tdmr_array = alloc_pages_exact(tdmr_array_sz,
+			GFP_KERNEL | __GFP_ZERO);
+	if (!tdmr_array)
+		return -ENOMEM;
+
+	tdmr_list->first_tdmr = tdmr_array;
+	/*
+	 * Keep the size of TDMR to find the target TDMR
+	 * at a given index in the TDMR list.
+	 */
+	tdmr_list->tdmr_sz = tdmr_sz;
+	tdmr_list->max_tdmrs = sysinfo->max_tdmrs;
+	tdmr_list->nr_tdmrs = 0;
+
+	return 0;
+}
+
+static void free_tdmr_list(struct tdmr_info_list *tdmr_list)
+{
+	free_pages_exact(tdmr_list->first_tdmr,
+			tdmr_list->max_tdmrs * tdmr_list->tdmr_sz);
+}
+
+/*
+ * Construct a list of TDMRs on the preallocated space in @tdmr_list
+ * to cover all TDX memory regions in @tmb_list based on the TDX module
+ * information in @sysinfo.
+ */
+static int construct_tdmrs(struct list_head *tmb_list,
+			   struct tdmr_info_list *tdmr_list,
+			   struct tdsysinfo_struct *sysinfo)
+{
+	/*
+	 * TODO:
+	 *
+	 *  - Fill out TDMRs to cover all TDX memory regions.
+	 *  - Allocate and set up PAMTs for each TDMR.
+	 *  - Designate reserved areas for each TDMR.
+	 *
+	 * Return -EINVAL until constructing TDMRs is done
+	 */
+	return -EINVAL;
+}
+
 static int init_tdx_module(void)
 {
 	/*
@@ -358,6 +439,7 @@  static int init_tdx_module(void)
 			TDSYSINFO_STRUCT_SIZE, TDSYSINFO_STRUCT_ALIGNMENT);
 	struct cmr_info cmr_array[MAX_CMRS] __aligned(CMR_INFO_ARRAY_ALIGNMENT);
 	struct tdsysinfo_struct *sysinfo = &PADDED_STRUCT(tdsysinfo);
+	struct tdmr_info_list tdmr_list;
 	int ret;
 
 	ret = tdx_get_sysinfo(sysinfo, cmr_array);
@@ -380,11 +462,19 @@  static int init_tdx_module(void)
 	if (ret)
 		goto out;
 
+	/* Allocate enough space for constructing TDMRs */
+	ret = alloc_tdmr_list(&tdmr_list, sysinfo);
+	if (ret)
+		goto out_free_tdx_mem;
+
+	/* Cover all TDX-usable memory regions in TDMRs */
+	ret = construct_tdmrs(&tdx_memlist, &tdmr_list, sysinfo);
+	if (ret)
+		goto out_free_tdmrs;
+
 	/*
 	 * TODO:
 	 *
-	 *  - Construct a list of TDMRs to cover all TDX-usable memory
-	 *    regions.
 	 *  - Pick up one TDX private KeyID as the global KeyID.
 	 *  - Configure the TDMRs and the global KeyID to the TDX module.
 	 *  - Configure the global KeyID on all packages.
@@ -393,6 +483,16 @@  static int init_tdx_module(void)
 	 *  Return error before all steps are done.
 	 */
 	ret = -EINVAL;
+out_free_tdmrs:
+	/*
+	 * Free the space for the TDMRs no matter the initialization is
+	 * successful or not.  They are not needed anymore after the
+	 * module initialization.
+	 */
+	free_tdmr_list(&tdmr_list);
+out_free_tdx_mem:
+	if (ret)
+		free_tdx_memlist(&tdx_memlist);
 out:
 	/*
 	 * @tdx_memlist is written here and read at memory hotplug time.
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 6d32f62e4182..d0c762f1a94c 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -90,6 +90,29 @@  struct tdsysinfo_struct {
 	DECLARE_FLEX_ARRAY(struct cpuid_config, cpuid_configs);
 } __packed;
 
+struct tdmr_reserved_area {
+	u64 offset;
+	u64 size;
+} __packed;
+
+#define TDMR_INFO_ALIGNMENT	512
+
+struct tdmr_info {
+	u64 base;
+	u64 size;
+	u64 pamt_1g_base;
+	u64 pamt_1g_size;
+	u64 pamt_2m_base;
+	u64 pamt_2m_size;
+	u64 pamt_4k_base;
+	u64 pamt_4k_size;
+	/*
+	 * Actual number of reserved areas depends on
+	 * 'struct tdsysinfo_struct'::max_reserved_per_tdmr.
+	 */
+	DECLARE_FLEX_ARRAY(struct tdmr_reserved_area, reserved_areas);
+} __packed __aligned(TDMR_INFO_ALIGNMENT);
+
 /*
  * Do not put any hardware-defined TDX structure representations below
  * this comment!