
[v4,1/4] x86/mem_sharing: reorder when pages are unlocked and released

Message ID 20190502221345.18459-1-tamas@tklengyel.com (mailing list archive)
State Superseded
Series [v4,1/4] x86/mem_sharing: reorder when pages are unlocked and released

Commit Message

Tamas K Lengyel May 2, 2019, 10:13 p.m. UTC
Calling _put_page_type while also holding the page_lock
for that page can cause a deadlock.

The comment being dropped is out-of-date and thus no longer accurate.

Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Roger Pau Monne <roger.pau@citrix.com>
---
This series is based on Andrew Cooper's x86-next branch

v4: drop grabbing extra references
---
 xen/arch/x86/mm/mem_sharing.c | 41 ++++++++++-------------------------
 1 file changed, 11 insertions(+), 30 deletions(-)
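
A minimal, self-contained sketch of the ordering problem the patch addresses; this is not Xen code, and struct page_info, page_lock(), page_unlock() and put_page_and_type() below are simplified stand-ins for the real structures and helpers. The point it models is the one from the commit message: dropping the final type reference while the per-page lock is still held can deadlock, because dropping the last type reference may itself need the page lock, so the patch unlocks first and only then drops the reference.

#include <stdbool.h>
#include <stdio.h>

struct page_info {
    bool locked;           /* stand-in for the per-page lock          */
    unsigned int typerefs; /* stand-in for the type reference count   */
};

static void page_lock(struct page_info *pg)   { pg->locked = true;  }
static void page_unlock(struct page_info *pg) { pg->locked = false; }

static void put_page_and_type(struct page_info *pg)
{
    /* Dropping the last type reference may need the page lock again. */
    if ( --pg->typerefs == 0 && pg->locked )
        printf("would deadlock: page lock still held\n");
}

static void make_private(struct page_info *pg)
{
    page_lock(pg);
    /* ... type checks happen here ... */
    page_unlock(pg);        /* the patch moves the unlock up here ...  */
    put_page_and_type(pg);  /* ... so the final put runs lock-free     */
}

int main(void)
{
    struct page_info pg = { .locked = false, .typerefs = 1 };
    make_private(&pg);
    return 0;
}
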

Comments

Jan Beulich May 3, 2019, 8:12 a.m. UTC | #1
>>> On 03.05.19 at 00:13, <tamas@tklengyel.com> wrote:
> @@ -1002,7 +989,10 @@ static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
>      /* Free the client page */
>      if(test_and_clear_bit(_PGC_allocated, &cpage->count_info))
>          put_page(cpage);
> -    put_page(cpage);
> +
> +    BUG_ON(!put_count);
> +    while ( put_count-- )
> +        put_page_and_type(cpage);

Strictly speaking I think the BUG_ON() should be moved ahead of the
if() in context, so that a problematic put_page() would not get
executed in the first place (even if the system is to die soon after).

Jan
Tamas K Lengyel May 3, 2019, 1:48 p.m. UTC | #2
On Fri, May 3, 2019 at 2:12 AM Jan Beulich <JBeulich@suse.com> wrote:
>
> >>> On 03.05.19 at 00:13, <tamas@tklengyel.com> wrote:
> > @@ -1002,7 +989,10 @@ static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
> >      /* Free the client page */
> >      if(test_and_clear_bit(_PGC_allocated, &cpage->count_info))
> >          put_page(cpage);
> > -    put_page(cpage);
> > +
> > +    BUG_ON(!put_count);
> > +    while ( put_count-- )
> > +        put_page_and_type(cpage);
>
> Strictly speaking I think the BUG_ON() should be moved ahead of the
> if() in context, so that a problematic put_page() would not get
> executed in the first place (even if the system is to die soon after).

I don't follow - where is the problematic put_page()? And why is it problematic?

Tamas
Jan Beulich May 3, 2019, 1:55 p.m. UTC | #3
>>> On 03.05.19 at 15:48, <tamas@tklengyel.com> wrote:
> On Fri, May 3, 2019 at 2:12 AM Jan Beulich <JBeulich@suse.com> wrote:
>>
>> >>> On 03.05.19 at 00:13, <tamas@tklengyel.com> wrote:
>> > @@ -1002,7 +989,10 @@ static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
>> >      /* Free the client page */
>> >      if(test_and_clear_bit(_PGC_allocated, &cpage->count_info))
>> >          put_page(cpage);

This should be after ...

>> > -    put_page(cpage);
>> > +
>> > +    BUG_ON(!put_count);

... this, because ...

>> > +    while ( put_count-- )
>> > +        put_page_and_type(cpage);
>>
>> Strictly speaking I think the BUG_ON() should be moved ahead of the
>> if() in context, so that a problematic put_page() would not get
>> executed in the first place (even if the system is to die soon after).
> 
> I don't follow - where is the problematic put_page()? And why is it 
> problematic?

... if indeed the BUG_ON() triggers, then it should do so before
potentially putting the last ref of a page which shouldn't be put
that way.

Jan
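
A hypothetical rearrangement of the hunk under discussion, illustrating the ordering Jan is asking for: the BUG_ON() ahead of the _PGC_allocated check, so that a bogus put_count of zero is caught before any reference on cpage is dropped. This is a sketch of the suggestion, not the code that was committed:

    BUG_ON(!put_count);

    /* Free the client page */
    if ( test_and_clear_bit(_PGC_allocated, &cpage->count_info) )
        put_page(cpage);

    while ( put_count-- )
        put_page_and_type(cpage);
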
Tamas K Lengyel May 3, 2019, 2:14 p.m. UTC | #4
On Fri, May 3, 2019 at 7:56 AM Jan Beulich <JBeulich@suse.com> wrote:
>
> >>> On 03.05.19 at 15:48, <tamas@tklengyel.com> wrote:
> > On Fri, May 3, 2019 at 2:12 AM Jan Beulich <JBeulich@suse.com> wrote:
> >>
> >> >>> On 03.05.19 at 00:13, <tamas@tklengyel.com> wrote:
> >> > @@ -1002,7 +989,10 @@ static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
> >> >      /* Free the client page */
> >> >      if(test_and_clear_bit(_PGC_allocated, &cpage->count_info))
> >> >          put_page(cpage);
>
> This should be after ...
>
> >> > -    put_page(cpage);
> >> > +
> >> > +    BUG_ON(!put_count);
>
> ... this, because ...
>
> >> > +    while ( put_count-- )
> >> > +        put_page_and_type(cpage);
> >>
> >> Strictly speaking I think the BUG_ON() should be moved ahead of the
> >> if() in context, so that a problematic put_page() would not get
> >> executed in the first place (even if the system is to die soon after).
> >
> > I don't follow - where is the problematic put_page()? And why is it
> > problematic?
>
> ... if indeed the BUG_ON() triggers, then it should do so before
> potentially putting the last ref of a page which shouldn't be put
> that way.

I see, thanks.

Tamas

Patch

diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index dfc279d371..4b3a094481 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -648,10 +648,6 @@  static int page_make_private(struct domain *d, struct page_info *page)
         return -EBUSY;
     }
 
-    /* We can only change the type if count is one */
-    /* Because we are locking pages individually, we need to drop
-     * the lock here, while the page is typed. We cannot risk the 
-     * race of page_unlock and then put_page_type. */
     expected_type = (PGT_shared_page | PGT_validated | PGT_locked | 2);
     if ( page->u.inuse.type_info != expected_type )
     {
@@ -660,12 +656,11 @@  static int page_make_private(struct domain *d, struct page_info *page)
         return -EEXIST;
     }
 
+    mem_sharing_page_unlock(page);
+
     /* Drop the final typecount */
     put_page_and_type(page);
 
-    /* Now that we've dropped the type, we can unlock */
-    mem_sharing_page_unlock(page);
-
     /* Change the owner */
     ASSERT(page_get_owner(page) == dom_cow);
     page_set_owner(page, d);
@@ -900,6 +895,7 @@  static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
     p2m_type_t smfn_type, cmfn_type;
     struct two_gfns tg;
     struct rmap_iterator ri;
+    unsigned long put_count = 0;
 
     get_two_gfns(sd, sgfn, &smfn_type, NULL, &smfn,
                  cd, cgfn, &cmfn_type, NULL, &cmfn, 0, &tg);
@@ -964,15 +960,6 @@  static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
         goto err_out;
     }
 
-    /* Acquire an extra reference, for the freeing below to be safe. */
-    if ( !get_page(cpage, dom_cow) )
-    {
-        ret = -EOVERFLOW;
-        mem_sharing_page_unlock(secondpg);
-        mem_sharing_page_unlock(firstpg);
-        goto err_out;
-    }
-
     /* Merge the lists together */
     rmap_seed_iterator(cpage, &ri);
     while ( (gfn = rmap_iterate(cpage, &ri)) != NULL)
@@ -984,7 +971,7 @@  static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
          * Don't change the type of rmap for the client page. */
         rmap_del(gfn, cpage, 0);
         rmap_add(gfn, spage);
-        put_page_and_type(cpage);
+        put_count++;
         d = get_domain_by_id(gfn->domain);
         BUG_ON(!d);
         BUG_ON(set_shared_p2m_entry(d, gfn->gfn, smfn));
@@ -1002,7 +989,10 @@  static int share_pages(struct domain *sd, gfn_t sgfn, shr_handle_t sh,
     /* Free the client page */
     if(test_and_clear_bit(_PGC_allocated, &cpage->count_info))
         put_page(cpage);
-    put_page(cpage);
+
+    BUG_ON(!put_count);
+    while ( put_count-- )
+        put_page_and_type(cpage);
 
     /* We managed to free a domain page. */
     atomic_dec(&nr_shared_mfns);
@@ -1167,20 +1157,11 @@  int __mem_sharing_unshare_page(struct domain *d,
     {
         if ( !last_gfn )
             mem_sharing_gfn_destroy(page, d, gfn_info);
-        put_page_and_type(page);
         mem_sharing_page_unlock(page);
-        if ( last_gfn )
-        {
-            if ( !get_page(page, dom_cow) )
-            {
-                put_gfn(d, gfn);
-                domain_crash(d);
-                return -EOVERFLOW;
-            }
-            if ( test_and_clear_bit(_PGC_allocated, &page->count_info) )
-                put_page(page);
+        if ( last_gfn &&
+            test_and_clear_bit(_PGC_allocated, &page->count_info) )
             put_page(page);
-        }
+        put_page_and_type(page);
         put_gfn(d, gfn);
 
         return 0;