x86/pv: limit GDT and LDT mappings areas to max number of vCPUs

Message ID 20241121111218.50984-1-roger.pau@citrix.com (mailing list archive)
State New
Series x86/pv: limit GDT and LDT mappings areas to max number of vCPUs

Commit Message

Roger Pau Monne Nov. 21, 2024, 11:12 a.m. UTC
The allocation of the paging structures in the per-domain area for mapping the
guest GDT and LDT can be limited to the maximum number of vCPUs the guest can
have.  The maximum number of vCPUs is available at domain creation since commit
4737fa52ce86.

Limiting to the actual number of vCPUs avoids wasting memory for paging
structures that will never be used.  Current logic unconditionally uses 513
pages, one page for the L3, plus 512 L1 pages.  For guests with equal or less
than 16 vCPUs only 2 pages are used (each guest vCPU GDT/LDT can only consume
32 L1 slots).
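
As a rough sketch of the arithmetic behind the numbers above (the constants
are assumed here rather than quoted from the headers: GDT_LDT_VCPU_SHIFT is
taken as 5 and an L1 table as 512 entries):

/* Illustrative only -- per-vCPU GDT/LDT mapping arithmetic. */
pages_per_vcpu = 1u << GDT_LDT_VCPU_SHIFT;            /* 32 pages per vCPU  */
vcpus_per_l1   = 512 / pages_per_vcpu;                /* 16 vCPUs per L1    */
l1_tables      = DIV_ROUND_UP(d->max_vcpus, vcpus_per_l1);
mapping_pages  = d->max_vcpus << GDT_LDT_VCPU_SHIFT;  /* size passed below  */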

No functional change intended, all possible domain vCPUs should have the GDT
and LDT paging structures allocated and setup at domain creation, just like
before the change.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/pv/domain.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Andrew Cooper Nov. 21, 2024, 11:26 a.m. UTC | #1
On 21/11/2024 11:12 am, Roger Pau Monne wrote:
> The allocation of the paging structures in the per-domain area for mapping the
> guest GDT and LDT can be limited to the maximum number of vCPUs the guest can
> have.  The maximum number of vCPUs is available at domain creation since commit
> 4737fa52ce86.
>
> Limiting to the actual number of vCPUs avoids wasting memory for paging
> structures that will never be used.  Current logic unconditionally uses 513
> pages, one page for the L3, plus 512 L1 pages.  For guests with equal or less
> than 16 vCPUs only 2 pages are used (each guest vCPU GDT/LDT can only consume
> 32 L1 slots).
>
> No functional change intended, all possible domain vCPUs should have the GDT
> and LDT paging structures allocated and setup at domain creation, just like
> before the change.
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Ooh, that's an optimisation I'd not considered when doing the work.

But, is it really only the LDT/GDT area which can benefit from
this?  The XLAT area seems like another good candidate.

> ---
>  xen/arch/x86/pv/domain.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
> index d5a8564c1cbe..e861e3ce71d9 100644
> --- a/xen/arch/x86/pv/domain.c
> +++ b/xen/arch/x86/pv/domain.c
> @@ -346,7 +346,7 @@ void pv_domain_destroy(struct domain *d)
>      pv_l1tf_domain_destroy(d);
>  
>      destroy_perdomain_mapping(d, GDT_LDT_VIRT_START,
> -                              GDT_LDT_MBYTES << (20 - PAGE_SHIFT));
> +                              d->max_vcpus << GDT_LDT_VCPU_SHIFT);

Probably not for this patch, but, should we really be passing in a size
here?

Don't we just want to tear down everything in the relevant slot?

>  
>      XFREE(d->arch.pv.cpuidmasks);
>  
> @@ -377,7 +377,7 @@ int pv_domain_initialise(struct domain *d)
>          goto fail;
>  
>      rc = create_perdomain_mapping(d, GDT_LDT_VIRT_START,
> -                                  GDT_LDT_MBYTES << (20 - PAGE_SHIFT),
> +                                  d->max_vcpus << GDT_LDT_VCPU_SHIFT,
>                                    NULL, NULL);

I'd suggest putting a note here saying that the maximum bound for PV
vCPUs is governed by GDT_LDT_MBYTES.

Or alternatively, we could have create_perdomain_mapping() fail if it
tries to allocate more than one slot's worth of mappings?   It feels
like an acceptable safety net.

~Andrew
Jan Beulich Nov. 21, 2024, 11:32 a.m. UTC | #2
On 21.11.2024 12:12, Roger Pau Monne wrote:
> The allocation of the paging structures in the per-domain area for mapping the
> guest GDT and LDT can be limited to the maximum number of vCPUs the guest can
> have.  The maximum number of vCPUs is available at domain creation since commit
> 4737fa52ce86.
> 
> Limiting to the actual number of vCPUs avoids wasting memory for paging
> structures that will never be used.  Current logic unconditionally uses 513
> pages, one page for the L3, plus 512 L1 pages.  For guests with equal or less
> than 16 vCPUs only 2 pages are used (each guest vCPU GDT/LDT can only consume
> 32 L1 slots).
> 
> No functional change intended, all possible domain vCPUs should have the GDT
> and LDT paging structures allocated and setup at domain creation, just like
> before the change.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
perhaps with the small adjustment Andrew has asked for.

Jan
Roger Pau Monne Nov. 21, 2024, 11:39 a.m. UTC | #3
On Thu, Nov 21, 2024 at 11:26:19AM +0000, Andrew Cooper wrote:
> On 21/11/2024 11:12 am, Roger Pau Monne wrote:
> > The allocation of the paging structures in the per-domain area for mapping the
> > guest GDT and LDT can be limited to the maximum number of vCPUs the guest can
> > have.  The maximum number of vCPUs is available at domain creation since commit
> > 4737fa52ce86.
> >
> > Limiting to the actual number of vCPUs avoids wasting memory for paging
> > structures that will never be used.  Current logic unconditionally uses 513
> > pages, one page for the L3, plus 512 L1 pages.  For guests with equal or less
> > than 16 vCPUs only 2 pages are used (each guest vCPU GDT/LDT can only consume
> > 32 L1 slots).
> >
> > No functional change intended, all possible domain vCPUs should have the GDT
> > and LDT paging structures allocated and setup at domain creation, just like
> > before the change.
> >
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Ooh, that's an optimisation I'd not considered when doing the work.
> 
> But, is it really only the LDT/GDT area which can benefit from
> this?  The XLAT area seems like another good candidate.

I don't see the XLAT area being pre-allocated like the GDT/LDT area is; it's
only populated on a per-vCPU basis in setup_compat_arg_xlat(), which is
already bounded by the number of initialized vCPUs.
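
For reference (quoting from memory rather than the tree, so treat the exact
shape as an assumption), setup_compat_arg_xlat() is just a thin per-vCPU
wrapper along the lines of:

int setup_compat_arg_xlat(struct vcpu *v)
{
    /* Populate only this vCPU's slice of the XLAT area. */
    return create_perdomain_mapping(v->domain, ARG_XLAT_START(v),
                                    PFN_UP(COMPAT_ARG_XLAT_SIZE),
                                    NULL, NIL(struct page_info *));
}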

> > ---
> >  xen/arch/x86/pv/domain.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
> > index d5a8564c1cbe..e861e3ce71d9 100644
> > --- a/xen/arch/x86/pv/domain.c
> > +++ b/xen/arch/x86/pv/domain.c
> > @@ -346,7 +346,7 @@ void pv_domain_destroy(struct domain *d)
> >      pv_l1tf_domain_destroy(d);
> >  
> >      destroy_perdomain_mapping(d, GDT_LDT_VIRT_START,
> > -                              GDT_LDT_MBYTES << (20 - PAGE_SHIFT));
> > +                              d->max_vcpus << GDT_LDT_VCPU_SHIFT);
> 
> Probably not for this patch, but, should we really be passing in a size
> here?
> 
> Don't we just want to tear down everything in the relevant slot?

Hm, I did consider leaving that alone and keeping GDT_LDT_MBYTES here, so
that the whole slot gets torn down.  The performance advantage of not
iterating over the known-empty entries is negligible IMO.  No strong
opinion; I can leave that call as-is if it's considered better.

> >  
> >      XFREE(d->arch.pv.cpuidmasks);
> >  
> > @@ -377,7 +377,7 @@ int pv_domain_initialise(struct domain *d)
> >          goto fail;
> >  
> >      rc = create_perdomain_mapping(d, GDT_LDT_VIRT_START,
> > -                                  GDT_LDT_MBYTES << (20 - PAGE_SHIFT),
> > +                                  d->max_vcpus << GDT_LDT_VCPU_SHIFT,
> >                                    NULL, NULL);
> 
> I'd suggest putting a note here saying that the maximum bound for PV
> vCPUs is governed by GDT_LDT_MBYTES.

Yeah, MAX_VIRT_CPUS is already calculated based on GDT_LDT_MBYTES.
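
For completeness, the relationship as I remember it from the headers (names
and values hedged, not quoted verbatim) is roughly:

#define GDT_LDT_VCPU_SHIFT    5
#define GDT_LDT_VCPU_VA_SHIFT (GDT_LDT_VCPU_SHIFT + PAGE_SHIFT)
#define GDT_LDT_MBYTES        PERDOMAIN_SLOT_MBYTES
#define MAX_VIRT_CPUS         (GDT_LDT_MBYTES << (20 - GDT_LDT_VCPU_VA_SHIFT))

so the size of the GDT/LDT slot already bounds MAX_VIRT_CPUS.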

> Or alternatively, we could have create_perdomain_mapping() fail if it
> tries to allocate more than one slot's worth of mappings?   It feels
> like an acceptable safety net.

I would rather use something like:

if ( (d->max_vcpus << GDT_LDT_VCPU_SHIFT) >
     (PERDOMAIN_SLOT_MBYTES << (20 - PAGE_SHIFT)) )
{
    ASSERT_UNREACHABLE();
    return -EINVAL;
}

As it should be impossible to call pv_domain_initialise() with a
number of vCPUs past what fits in the GDT/LDT slot:
arch_sanitise_domain_config() should have filtered out any such
attempt before it gets to pv_domain_initialise().

Thanks, Roger.
Roger Pau Monne Nov. 21, 2024, 3:56 p.m. UTC | #4
On Thu, Nov 21, 2024 at 12:12:18PM +0100, Roger Pau Monne wrote:
> The allocation of the paging structures in the per-domain area for mapping the
> guest GDT and LDT can be limited to the maximum number of vCPUs the guest can
> have.  The maximum number of vCPUs is available at domain creation since commit
> 4737fa52ce86.
> 
> Limiting to the actual number of vCPUs avoids wasting memory for paging
> structures that will never be used.  Current logic unconditionally uses 513
> pages, one page for the L3, plus 512 L1 pages.

> This is not true; I was confused by the logic in
create_perdomain_mapping().  When create_perdomain_mapping() is called
with pl1tab == NULL and ppg == NULL it just allocates the L2, but not
the L1 tables.

So the purpose of the create_perdomain_mapping(d, GDT_LDT_VIRT_START,
...) in pv_domain_initialise() is even more dubious now - as it just
allocates a page to use as L2.  I'm tempted to just remove it if you
agree, since I don't consider this useful.  The allocation will
already be done at vCPU initialization.

Thanks, Roger.
Jan Beulich Nov. 21, 2024, 4:03 p.m. UTC | #5
On 21.11.2024 16:56, Roger Pau Monné wrote:
> On Thu, Nov 21, 2024 at 12:12:18PM +0100, Roger Pau Monne wrote:
>> The allocation of the paging structures in the per-domain area for mapping the
>> guest GDT and LDT can be limited to the maximum number of vCPUs the guest can
>> have.  The maximum number of vCPUs is available at domain creation since commit
>> 4737fa52ce86.
>>
>> Limiting to the actual number of vCPUs avoids wasting memory for paging
>> structures that will never be used.  Current logic unconditionally uses 513
>> pages, one page for the L3, plus 512 L1 pages.
> 
> This is not true; I was confused by the logic in
> create_perdomain_mapping().  When create_perdomain_mapping() is called
> with pl1tab == NULL and ppg == NULL it just allocates the L2, but not
> the L1 tables.
> 
> So the purpose of the create_perdomain_mapping(d, GDT_LDT_VIRT_START,
> ...) in pv_domain_initialise() is even more dubious now - as it just
> allocates a page to use as L2.  I'm tempted to just remove it if you
> agree, since I don't consider this useful.  The allocation will
> already be done at vCPU initialization.

If it's done implicitly there, removing is likely fine. It feels like it may
have been necessary to do this explicitly earlier on.

Jan
Roger Pau Monne Nov. 21, 2024, 5:11 p.m. UTC | #6
On Thu, Nov 21, 2024 at 05:03:21PM +0100, Jan Beulich wrote:
> On 21.11.2024 16:56, Roger Pau Monné wrote:
> > On Thu, Nov 21, 2024 at 12:12:18PM +0100, Roger Pau Monne wrote:
> >> The allocation of the paging structures in the per-domain area for mapping the
> >> guest GDT and LDT can be limited to the maximum number of vCPUs the guest can
> >> have.  The maximum number of vCPUs is available at domain creation since commit
> >> 4737fa52ce86.
> >>
> >> Limiting to the actual number of vCPUs avoids wasting memory for paging
> >> structures that will never be used.  Current logic unconditionally uses 513
> >> pages, one page for the L3, plus 512 L1 pages.
> > 
> > This is not true; I was confused by the logic in
> > create_perdomain_mapping().  When create_perdomain_mapping() is called
> > with pl1tab == NULL and ppg == NULL it just allocates the L2, but not
> > the L1 tables.
> > 
> > So the purpose of the create_perdomain_mapping(d, GDT_LDT_VIRT_START,
> > ...) in pv_domain_initialise() is even more dubious now - as it just
> > allocates a page to use as L2.  I'm tempted to just remove it if you
> > agree, since I don't consider this useful.  The allocation will
> > already be done at vCPU initialization.
> 
> If it's done implicitly there, removing is likely fine. It feels like it may
> have been necessary to do this explicitly earlier on.

Possibly, before you introduced create_perdomain_mapping(), the
initialization and allocation of the L2 was needed explicitly in
pv_domain_initialise().

I have a patch to remove it, and as expected nothing seems to explode,
so I will send it.

Thanks, Roger.

Patch

diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
index d5a8564c1cbe..e861e3ce71d9 100644
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -346,7 +346,7 @@  void pv_domain_destroy(struct domain *d)
     pv_l1tf_domain_destroy(d);
 
     destroy_perdomain_mapping(d, GDT_LDT_VIRT_START,
-                              GDT_LDT_MBYTES << (20 - PAGE_SHIFT));
+                              d->max_vcpus << GDT_LDT_VCPU_SHIFT);
 
     XFREE(d->arch.pv.cpuidmasks);
 
@@ -377,7 +377,7 @@  int pv_domain_initialise(struct domain *d)
         goto fail;
 
     rc = create_perdomain_mapping(d, GDT_LDT_VIRT_START,
-                                  GDT_LDT_MBYTES << (20 - PAGE_SHIFT),
+                                  d->max_vcpus << GDT_LDT_VCPU_SHIFT,
                                   NULL, NULL);
     if ( rc )
         goto fail;