diff mbox series

[v6,3/4] mm: make MEMF_no_refcount pages safe to assign

Message ID 20200129143831.1369-4-pdurrant@amazon.com (mailing list archive)
State Superseded
Headers show
Series purge free_shared_domheap_page() | expand

Commit Message

Paul Durrant Jan. 29, 2020, 2:38 p.m. UTC
Currently it is unsafe to assign a domheap page allocated with
MEMF_no_refcount to a domain because the domain't 'tot_pages' will not
be incremented, but will be decrement when the page is freed (since
free_domheap_pages() has no way of telling that the increment was skipped).

This patch allocates a new 'count_info' bit for a PGC_no_refcount flag
which is then used to mark domheap pages allocated with MEMF_no_refcount.
This then allows free_domheap_pages() to skip decrementing tot_pages when
appropriate and hence makes the pages safe to assign.

NOTE: The patch sets MEMF_no_refcount directly in alloc_domheap_pages()
      rather than in assign_pages() because the latter is called with
      MEMF_no_refcount by memory_exchange().

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Wei Liu <wl@xen.org>
Cc: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>

v6:
 - Add an extra ASSERT into assign_pages() that PGC_no_refcount is not
   set if MEMF_no_refcount is clear
 - ASSERT that count_info is 0 in alloc_domheap_pages() and set to
   PGC_no_refcount rather than ORing

v5:
 - Make sure PGC_no_refcount is set before assign_pages() is called
 - Don't bother to clear PGC_no_refcount in free_domheap_pages() and
   drop ASSERT in free_heap_pages()
 - Don't latch count_info in free_heap_pages()

v4:
 - New in v4
---
 xen/common/page_alloc.c  | 40 ++++++++++++++++++++++++++++------------
 xen/include/asm-arm/mm.h |  5 ++++-
 xen/include/asm-x86/mm.h |  7 +++++--
 3 files changed, 37 insertions(+), 15 deletions(-)

Comments

Jan Beulich Jan. 29, 2020, 3:15 p.m. UTC | #1
On 29.01.2020 15:38, Paul Durrant wrote:
> @@ -2371,6 +2383,8 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
>  
>          if ( likely(d) && likely(d != dom_cow) )
>          {
> +            long pages = 0;
> +
>              /* NB. May recursively lock from relinquish_memory(). */
>              spin_lock_recursive(&d->page_alloc_lock);
>  
> @@ -2386,9 +2400,11 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
>                      BUG();
>                  }
>                  arch_free_heap_page(d, &pg[i]);
> +                if ( !(pg[i].count_info & PGC_no_refcount) )
> +                    pages--;
>              }
>  
> -            drop_dom_ref = !domain_adjust_tot_pages(d, -(1 << order));
> +            drop_dom_ref = !domain_adjust_tot_pages(d, pages);

Following from what I've just said on the previous patch, this needs
further changing then as well. There'll need to be a per-domain
"non-refcounted-pages" count, which - when transitioning from zero
to non-zero is accompanied by obtaining a domain ref, and when
transitioning back to zero causes this domain ref to be dropped.
Otherwise, once the last ref-counted page was freed, the domain
may become ready for final destruction, no matter how many non-
refcounted pages there still are on its page lists. (An alternative
model might be to include all pages in ->tot_pages, keep using just
that for the domain ref acquire/release, and subtract the new
count when e.g. comparing against ->max_pages.)

Jan
Durrant, Paul Jan. 29, 2020, 3:21 p.m. UTC | #2
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 29 January 2020 15:15
> To: Durrant, Paul <pdurrant@amazon.co.uk>
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> <andrew.cooper3@citrix.com>; George Dunlap <George.Dunlap@eu.citrix.com>;
> Ian Jackson <ian.jackson@eu.citrix.com>; Julien Grall <julien@xen.org>;
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Wei Liu <wl@xen.org>; Volodymyr Babchuk
> <Volodymyr_Babchuk@epam.com>; Roger Pau Monné <roger.pau@citrix.com>
> Subject: Re: [PATCH v6 3/4] mm: make MEMF_no_refcount pages safe to assign
> 
> On 29.01.2020 15:38, Paul Durrant wrote:
> > @@ -2371,6 +2383,8 @@ void free_domheap_pages(struct page_info *pg,
> unsigned int order)
> >
> >          if ( likely(d) && likely(d != dom_cow) )
> >          {
> > +            long pages = 0;
> > +
> >              /* NB. May recursively lock from relinquish_memory(). */
> >              spin_lock_recursive(&d->page_alloc_lock);
> >
> > @@ -2386,9 +2400,11 @@ void free_domheap_pages(struct page_info *pg,
> unsigned int order)
> >                      BUG();
> >                  }
> >                  arch_free_heap_page(d, &pg[i]);
> > +                if ( !(pg[i].count_info & PGC_no_refcount) )
> > +                    pages--;
> >              }
> >
> > -            drop_dom_ref = !domain_adjust_tot_pages(d, -(1 << order));
> > +            drop_dom_ref = !domain_adjust_tot_pages(d, pages);
> 
> Following from what I've just said on the previous patch, this needs
> further changing then as well. There'll need to be a per-domain
> "non-refcounted-pages" count, which - when transitioning from zero
> to non-zero is accompanied by obtaining a domain ref, and when
> transitioning back to zero causes this domain ref to be dropped.
> Otherwise, once the last ref-counted page was freed, the domain
> may become ready for final destruction, no matter how many non-
> refcounted pages there still are on its page lists. (An alternative
> model might be to include all pages in ->tot_pages, keep using just
> that for the domain ref acquire/release, and subtract the new
> count when e.g. comparing against ->max_pages.)

Yes, I think I'll adjust totpages unconditionally and then subtract the secondary count for comparison as it means I can leave the ref counting alone.

  Paul

> 
> Jan
diff mbox series

Patch

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 135e15bae0..12b2c5a3d6 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -2287,11 +2287,16 @@  int assign_pages(
 
     for ( i = 0; i < (1 << order); i++ )
     {
+        unsigned long count_info = pg[i].count_info;
+
         ASSERT(page_get_owner(&pg[i]) == NULL);
-        ASSERT(!pg[i].count_info);
+        ASSERT(!(count_info & ~PGC_no_refcount));
+        ASSERT((memflags & MEMF_no_refcount) ||
+               !(count_info & PGC_no_refcount));
         page_set_owner(&pg[i], d);
         smp_wmb(); /* Domain pointer must be visible before updating refcnt. */
-        pg[i].count_info = PGC_allocated | 1;
+        count_info &= PGC_no_refcount;
+        pg[i].count_info = count_info | PGC_allocated | 1;
         page_list_add_tail(&pg[i], &d->page_list);
     }
 
@@ -2317,11 +2322,6 @@  struct page_info *alloc_domheap_pages(
 
     if ( memflags & MEMF_no_owner )
         memflags |= MEMF_no_refcount;
-    else if ( (memflags & MEMF_no_refcount) && d )
-    {
-        ASSERT(!(memflags & MEMF_no_refcount));
-        return NULL;
-    }
 
     if ( !dma_bitsize )
         memflags &= ~MEMF_no_dma;
@@ -2334,11 +2334,23 @@  struct page_info *alloc_domheap_pages(
                                   memflags, d)) == NULL)) )
          return NULL;
 
-    if ( d && !(memflags & MEMF_no_owner) &&
-         assign_pages(d, pg, order, memflags) )
+    if ( d && !(memflags & MEMF_no_owner) )
     {
-        free_heap_pages(pg, order, memflags & MEMF_no_scrub);
-        return NULL;
+        if ( memflags & MEMF_no_refcount )
+        {
+            unsigned long i;
+
+            for ( i = 0; i < (1ul << order); i++ )
+            {
+                ASSERT(!pg[i].count_info);
+                pg[i].count_info = PGC_no_refcount;
+            }
+        }
+        if ( assign_pages(d, pg, order, memflags) )
+        {
+            free_heap_pages(pg, order, memflags & MEMF_no_scrub);
+            return NULL;
+        }
     }
 
     return pg;
@@ -2371,6 +2383,8 @@  void free_domheap_pages(struct page_info *pg, unsigned int order)
 
         if ( likely(d) && likely(d != dom_cow) )
         {
+            long pages = 0;
+
             /* NB. May recursively lock from relinquish_memory(). */
             spin_lock_recursive(&d->page_alloc_lock);
 
@@ -2386,9 +2400,11 @@  void free_domheap_pages(struct page_info *pg, unsigned int order)
                     BUG();
                 }
                 arch_free_heap_page(d, &pg[i]);
+                if ( !(pg[i].count_info & PGC_no_refcount) )
+                    pages--;
             }
 
-            drop_dom_ref = !domain_adjust_tot_pages(d, -(1 << order));
+            drop_dom_ref = !domain_adjust_tot_pages(d, pages);
 
             spin_unlock_recursive(&d->page_alloc_lock);
 
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index 333efd3a60..1076cc9713 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -119,9 +119,12 @@  struct page_info
 #define PGC_state_offlined PG_mask(2, 9)
 #define PGC_state_free    PG_mask(3, 9)
 #define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st)
+/* Page is not reference counted */
+#define _PGC_no_refcount  PG_shift(10)
+#define PGC_no_refcount   PG_mask(1, 10)
 
 /* Count of references to this frame. */
-#define PGC_count_width   PG_shift(9)
+#define PGC_count_width   PG_shift(10)
 #define PGC_count_mask    ((1UL<<PGC_count_width)-1)
 
 /*
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 2ca8882ad0..e75feea15e 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -77,9 +77,12 @@ 
 #define PGC_state_offlined PG_mask(2, 9)
 #define PGC_state_free    PG_mask(3, 9)
 #define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st)
+/* Page is not reference counted */
+#define _PGC_no_refcount  PG_shift(10)
+#define PGC_no_refcount   PG_mask(1, 10)
 
- /* Count of references to this frame. */
-#define PGC_count_width   PG_shift(9)
+/* Count of references to this frame. */
+#define PGC_count_width   PG_shift(10)
 #define PGC_count_mask    ((1UL<<PGC_count_width)-1)
 
 /*