diff mbox series

[v4] xen/arm: p2m: Populate pages for GICv2 mapping in arch_domain_create()

Message ID 20221017165133.17066-1-Henry.Wang@arm.com (mailing list archive)
State New, archived
Headers show
Series [v4] xen/arm: p2m: Populate pages for GICv2 mapping in arch_domain_create() | expand

Commit Message

Henry Wang Oct. 17, 2022, 4:51 p.m. UTC
Hardware using GICv2 needs to create a P2M mapping of 8KB GICv2 area
when the domain is created. Considering the worst case of page tables
which requires 6 P2M pages as the two pages will be consecutive but not
necessarily in the same L3 page table and keep a buffer, populate 16
pages as the default value to the P2M pages pool in arch_domain_create()
at the domain creation stage to satisfy the GICv2 requirement. For
GICv3, the above-mentioned P2M mapping is not necessary, but since the
allocated 16 pages here would not be lost, hence populate these pages
unconditionally.

With the default 16 P2M pages populated, there would be a case that
failures would happen in the domain creation with P2M pages already in
use. To properly free the P2M for this case, firstly support the
optionally preemption of p2m_teardown(), then call p2m_teardown() and
p2m_set_allocation(d, 0, NULL) non-preemptively in p2m_final_teardown().
As non-preemptive p2m_teardown() should only return 0, use a
BUG_ON to confirm that.

Since p2m_final_teardown() is called either after
domain_relinquish_resources() where relinquish_p2m_mapping() has been
called, or from failure path of domain_create()/arch_domain_create()
where mappings that require p2m_put_l3_page() should never be created,
relinquish_p2m_mapping() is not added in p2m_final_teardown(), add
in-code comments to refer this.

Fixes: cbea5a1149ca ("xen/arm: Allocate and free P2M pages from the P2M pool")
Suggested-by: Julien Grall <jgrall@amazon.com>
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
---
This should also be backported to 4.13, 4.14, 4.15 and 4.16.
v4 changes:
- Move the initial population of 16 default pages to the end of
  p2m_init(), add if(rc) return rc; after p2m_alloc_table()
v3 changes:
- Move the population of default pages to p2m_init().
- Use a loop over p2m_teardown_allocation() to implement the
  non-preemptive p2m_teardown_allocation() and avoid open-coding.
- Reorder assertions in p2m_final_teardown().
- Add p2m_teardown() will always return 0 if called non-preemptively in
  doc, move the page_list_empty(&p2m->pages) check to p2m_teardown()
  and use a BUG_ON to confirm p2m_teardown() will return 0 in
  p2m_final_teardown().
- Add a comment in p2m_final_teardown() to mention relinquish_p2m_mapping()
  does not need to be called, also update commit message.
v2 changes:
- Move the p2m_set_allocation(d, 0, NULL); to p2m_final_teardown().
- Support optionally preemption of p2m_teardown(), and make the calling of
  p2m_teardown() preemptively when relinquish the resources, non-preemptively
  in p2m_final_teardown().
- Refactor the error handling to make the code use less spin_unlock.
- Explain the worst case of page tables and the unconditional population
  of pages in commit message.
- Mention the unconditional population of pages in in-code comment.
---
 xen/arch/arm/domain.c          |  2 +-
 xen/arch/arm/include/asm/p2m.h | 14 ++++++++++----
 xen/arch/arm/p2m.c             | 34 ++++++++++++++++++++++++++++++++--
 3 files changed, 43 insertions(+), 7 deletions(-)

Comments

Andrew Cooper Oct. 17, 2022, 7:07 p.m. UTC | #1
On 17/10/2022 17:51, Henry Wang wrote:
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index f17500ddf3..0b4e24218e 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1784,6 +1800,8 @@ int p2m_init(struct domain *d)
>          !iommu_has_feature(d, IOMMU_FEAT_COHERENT_WALK);
>  
>      rc = p2m_alloc_table(d);
> +    if ( rc != 0 )
> +        return rc;

This error path now leaks the VMID.

And to preempt the obvious change, if you simply move this lower, you'll
introduce a path where p2m_teardown() falls over a NULL pointer.

~Andrew
Julien Grall Oct. 18, 2022, 12:43 p.m. UTC | #2
Hi Henry,

On 17/10/2022 17:51, Henry Wang wrote:
> @@ -1736,7 +1739,20 @@ void p2m_final_teardown(struct domain *d)
>       if ( !p2m->domain )
>           return;
>   
> +    /*
> +     * No need to call relinquish_p2m_mapping() here because
> +     * p2m_final_teardown() is called either after domain_relinquish_resources()
> +     * where relinquish_p2m_mapping() has been called, or from failure path of
> +     * domain_create()/arch_domain_create() where mappings that require
> +     * p2m_put_l3_page() should never be created. For the latter case, also see
> +     * comment on top of the p2m_set_entry() for more info.
> +     */
> +
> +    BUG_ON(p2m_teardown(d, false));
>       ASSERT(page_list_empty(&p2m->pages));
> +
> +    while ( p2m_teardown_allocation(d) == -ERESTART )
> +        continue; /* No preemption support here */
>       ASSERT(page_list_empty(&d->arch.paging.p2m_freelist));
>   
>       if ( p2m->root )
> @@ -1784,6 +1800,8 @@ int p2m_init(struct domain *d)

As Andrew pointed out the change in p2m_init() will end up leaking 
either the VMID or the root table.

Andrew's patch #1 [1] should help to solve the problem. So I would 
suggest to rebase on top of it.

Other than that, the logic looks good to me. This is even knowning that 
Andrew said the code is buggy. I spent some time starring at the code 
and can't figure out where the issue lies because p2m_teardown() will do 
nothing when the list is empty. If it is not empty, then it is 
guaranteed that the VMID and root table is allocated. So the code looks 
functional but just not efficient.

We already discussed the latter point earlier in the review and agreed 
to look at improve it post 4.17.

For the former, I am happy to be proven wrong. But this is going to 
require more substantial explanations.

Cheers,
Henry Wang Oct. 18, 2022, 12:54 p.m. UTC | #3
Hi Julien,

> -----Original Message-----
> From: Julien Grall <julien@xen.org>
> Subject: Re: [PATCH v4] xen/arm: p2m: Populate pages for GICv2 mapping in
> arch_domain_create()
> >       if ( p2m->root )
> > @@ -1784,6 +1800,8 @@ int p2m_init(struct domain *d)
> 
> As Andrew pointed out the change in p2m_init() will end up leaking
> either the VMID or the root table.
> 
> Andrew's patch #1 [1] should help to solve the problem. So I would
> suggest to rebase on top of it.

Of course. I appreciate Andrew very much for finding the issue. To
properly solve the problem and reduce the amount of work from
maintainers, I will try to respin the patch series (with #1 from Andrew
and #2 from my rebased patch with commit message properly adapted).
So that either we decided to pick Andrew's whole series or Andrew's #1
plus my #2, we can fetch the series directly from ML.

[...]

> 
> Other than that, the logic looks good to me. This is even knowning that
> Andrew said the code is buggy. I spent some time starring at the code

Thanks for your time :)

> and can't figure out where the issue lies because p2m_teardown() will do
> nothing when the list is empty. If it is not empty, then it is
> guaranteed that the VMID and root table is allocated. So the code looks
> functional but just not efficient.
> 
> We already discussed the latter point earlier in the review and agreed
> to look at improve it post 4.17.
> 
> For the former, I am happy to be proven wrong. But this is going to
> require more substantial explanations.

...having said that, definitely no problem for me to wait for a bit for
the discussion continues. We can pick the most suitable series when we
reach to the conclusion.

Kind regards,
Henry

> 
> Cheers,
> 
> --
> Julien Grall
Bertrand Marquis Oct. 18, 2022, 3:07 p.m. UTC | #4
Hi Andrew,

> On 17 Oct 2022, at 20:07, Andrew Cooper <Andrew.Cooper3@citrix.com> wrote:
> 
> On 17/10/2022 17:51, Henry Wang wrote:
>> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
>> index f17500ddf3..0b4e24218e 100644
>> --- a/xen/arch/arm/p2m.c
>> +++ b/xen/arch/arm/p2m.c
>> @@ -1784,6 +1800,8 @@ int p2m_init(struct domain *d)
>>         !iommu_has_feature(d, IOMMU_FEAT_COHERENT_WALK);
>> 
>>     rc = p2m_alloc_table(d);
>> +    if ( rc != 0 )
>> +        return rc;
> 
> This error path now leaks the VMID.
> 
> And to preempt the obvious change, if you simply move this lower, you'll
> introduce a path where p2m_teardown() falls over a NULL pointer.

Could you explain me the in which case we would end up with a NULL pointer ?
If the list is empty we bail out so I do not see how this could happen.

Cheers
Bertrand

> 
> ~Andrew
diff mbox series

Patch

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 2c84e6dbbb..38e22f12af 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -1064,7 +1064,7 @@  int domain_relinquish_resources(struct domain *d)
             return ret;
 
     PROGRESS(p2m):
-        ret = p2m_teardown(d);
+        ret = p2m_teardown(d, true);
         if ( ret )
             return ret;
 
diff --git a/xen/arch/arm/include/asm/p2m.h b/xen/arch/arm/include/asm/p2m.h
index 42bfd548c4..c8f14d13c2 100644
--- a/xen/arch/arm/include/asm/p2m.h
+++ b/xen/arch/arm/include/asm/p2m.h
@@ -194,14 +194,18 @@  int p2m_init(struct domain *d);
 
 /*
  * The P2M resources are freed in two parts:
- *  - p2m_teardown() will be called when relinquish the resources. It
- *    will free large resources (e.g. intermediate page-tables) that
- *    requires preemption.
+ *  - p2m_teardown() will be called preemptively when relinquish the
+ *    resources, in which case it will free large resources (e.g. intermediate
+ *    page-tables) that requires preemption.
  *  - p2m_final_teardown() will be called when domain struct is been
  *    freed. This *cannot* be preempted and therefore one small
  *    resources should be freed here.
+ *  Note that p2m_final_teardown() will also call p2m_teardown(), to properly
+ *  free the P2M when failures happen in the domain creation with P2M pages
+ *  already in use. In this case p2m_teardown() is called non-preemptively and
+ *  p2m_teardown() will always return 0.
  */
-int p2m_teardown(struct domain *d);
+int p2m_teardown(struct domain *d, bool allow_preemption);
 void p2m_final_teardown(struct domain *d);
 
 /*
@@ -266,6 +270,8 @@  mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
 /*
  * Direct set a p2m entry: only for use by the P2M code.
  * The P2M write lock should be taken.
+ * TODO: Add a check in __p2m_set_entry() to avoid creating a mapping in
+ * arch_domain_create() that requires p2m_put_l3_page() to be called.
  */
 int p2m_set_entry(struct p2m_domain *p2m,
                   gfn_t sgfn,
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index f17500ddf3..0b4e24218e 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1685,7 +1685,7 @@  static void p2m_free_vmid(struct domain *d)
     spin_unlock(&vmid_alloc_lock);
 }
 
-int p2m_teardown(struct domain *d)
+int p2m_teardown(struct domain *d, bool allow_preemption)
 {
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
     unsigned long count = 0;
@@ -1693,6 +1693,9 @@  int p2m_teardown(struct domain *d)
     unsigned int i;
     int rc = 0;
 
+    if ( page_list_empty(&p2m->pages) )
+        return 0;
+
     p2m_write_lock(p2m);
 
     /*
@@ -1716,7 +1719,7 @@  int p2m_teardown(struct domain *d)
         p2m_free_page(p2m->domain, pg);
         count++;
         /* Arbitrarily preempt every 512 iterations */
-        if ( !(count % 512) && hypercall_preempt_check() )
+        if ( allow_preemption && !(count % 512) && hypercall_preempt_check() )
         {
             rc = -ERESTART;
             break;
@@ -1736,7 +1739,20 @@  void p2m_final_teardown(struct domain *d)
     if ( !p2m->domain )
         return;
 
+    /*
+     * No need to call relinquish_p2m_mapping() here because
+     * p2m_final_teardown() is called either after domain_relinquish_resources()
+     * where relinquish_p2m_mapping() has been called, or from failure path of
+     * domain_create()/arch_domain_create() where mappings that require
+     * p2m_put_l3_page() should never be created. For the latter case, also see
+     * comment on top of the p2m_set_entry() for more info.
+     */
+
+    BUG_ON(p2m_teardown(d, false));
     ASSERT(page_list_empty(&p2m->pages));
+
+    while ( p2m_teardown_allocation(d) == -ERESTART )
+        continue; /* No preemption support here */
     ASSERT(page_list_empty(&d->arch.paging.p2m_freelist));
 
     if ( p2m->root )
@@ -1784,6 +1800,8 @@  int p2m_init(struct domain *d)
         !iommu_has_feature(d, IOMMU_FEAT_COHERENT_WALK);
 
     rc = p2m_alloc_table(d);
+    if ( rc != 0 )
+        return rc;
 
     /*
      * Make sure that the type chosen to is able to store the an vCPU ID
@@ -1803,6 +1821,18 @@  int p2m_init(struct domain *d)
      */
     p2m->domain = d;
 
+    /*
+     * Hardware using GICv2 needs to create a P2M mapping of 8KB GICv2 area
+     * when the domain is created. Considering the worst case for page
+     * tables and keep a buffer, populate 16 pages to the P2M pages pool here.
+     * For GICv3, the above-mentioned P2M mapping is not necessary, but since
+     * the allocated 16 pages here would not be lost, hence populate these
+     * pages unconditionally.
+     */
+    spin_lock(&d->arch.paging.lock);
+    rc = p2m_set_allocation(d, 16, NULL);
+    spin_unlock(&d->arch.paging.lock);
+
     return rc;
 }