
xen/grant-table: Properly acquire the vCPU maptrack freelist lock

Message ID 20230428102633.86473-1-hakor@amazon.com
State New, archived
Series xen/grant-table: Properly acquire the vCPU maptrack freelist lock

Commit Message

Ruben Hakobyan April 28, 2023, 10:26 a.m. UTC
Introduced as part of XSA-228, the maptrack_freelist_lock is meant to
protect all accesses to entries in the vCPU freelist as well as the
head and tail pointers.
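
For reference, the relevant per-vCPU state is roughly as follows
(simplified sketch with field names as in the Xen source; the real
struct vcpu carries many more fields):

    struct vcpu {
        /* ... many other fields ... */

        /* Protects the maptrack free list and its head/tail pointers. */
        spinlock_t   maptrack_freelist_lock;
        unsigned int maptrack_head;
        unsigned int maptrack_tail;
    };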

However, this principle is violated twice in get_maptrack_handle(),
where the tail pointer is directly accessed without taking the lock.
The first occurrence is when stealing an extra entry for the tail
pointer, and the second occurrence is when directly setting the tail of
an empty freelist after allocating its first page.

Make sure to correctly acquire the freelist lock before accessing and
modifying the tail pointer to fully comply with XSA-228.

It should be noted that, with the current code, these accesses cannot
race with anything. However, it is still important to take the lock
here to guard against possible future races. For example, a race with
put_maptrack_handle() would become possible if the maptrack code were
modified to allow vCPU freelists to temporarily include handles that
are not directly assigned to them in the maptrack table.
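
As an illustration, such a modification could open up an interleaving
along these lines (hypothetical; not reachable with the current code):

    /*
     * CPU0: get_maptrack_handle()      CPU1: put_maptrack_handle()
     *
     *   reads curr->maptrack_tail        spin_lock(freelist_lock)
     *   without holding the lock         updates v->maptrack_tail
     *                                    spin_unlock(freelist_lock)
     *
     * CPU0 can observe a stale tail value and wrongly conclude that the
     * free list is still uninitialised.
     */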

Note that the tail and head pointers can still be accessed without
taking the lock when initialising the freelist in
grant_table_init_vcpu(), as no concurrent access is possible at that
point.
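
For completeness, that initialisation is roughly the following
(simplified sketch, assuming the current shape of
grant_table_init_vcpu()):

    void grant_table_init_vcpu(struct vcpu *v)
    {
        /* No locking needed: the vCPU cannot be issuing grant ops yet. */
        spin_lock_init(&v->maptrack_freelist_lock);
        v->maptrack_head = MAPTRACK_TAIL;
        v->maptrack_tail = MAPTRACK_TAIL;
    }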

Signed-off-by: Ruben Hakobyan <hakor@amazon.com>
---
 xen/common/grant_table.c | 30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

Comments

Jan Beulich May 2, 2023, 8:38 a.m. UTC | #1
On 28.04.2023 12:26, Ruben Hakobyan wrote:
> Introduced as part of XSA-228, the maptrack_freelist_lock is meant to
> protect all accesses to entries in the vCPU freelist as well as the
> head and tail pointers.
> 
> However, this principle is violated twice in get_maptrack_handle(),
> where the tail pointer is directly accessed without taking the lock.
> The first occurrence is when stealing an extra entry for the tail
> pointer, and the second occurrence is when directly setting the tail of
> an empty freelist after allocating its first page.

I don't read the doc we have as saying that this is a violation (the
lock still fully protects the list; it's just - as you say below - that
in two cases taking the lock isn't necessary to achieve that goal), and
iirc the relaxation was done quite deliberately. Did you, as an
alternative, consider making the doc more explicit?

> Make sure to correctly acquire the freelist lock before accessing and
> modifying the tail pointer to fully comply with XSA-228.
> 
> It should be noted that, with the current code, these accesses cannot
> race with anything. However, it is still important to take the lock
> here to guard against possible future races. For example, a race with
> put_maptrack_handle() would become possible if the maptrack code were
> modified to allow vCPU freelists to temporarily include handles that
> are not directly assigned to them in the maptrack table.
> 
> Note that the tail and head pointers can still be accessed without
> taking the lock when initialising the freelist in
> grant_table_init_vcpu(), as no concurrent access is possible at that
> point.

This "no concurrent accesses" is the justification also for at least the
"set tail directly ..." case, aiui.

> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -660,23 +660,27 @@ get_maptrack_handle(
>      if ( !new_mt )
>      {
>          spin_unlock(&lgt->maptrack_lock);
> +        handle = steal_maptrack_handle(lgt, curr);
> +        if ( handle == INVALID_MAPTRACK_HANDLE )
> +            return handle;
> +
> +        spin_lock(&curr->maptrack_freelist_lock);
> +        if ( curr->maptrack_tail != MAPTRACK_TAIL )
> +        {
> +            spin_unlock(&curr->maptrack_freelist_lock);
> +            return handle;
> +        }
>  
>          /*
>           * Uninitialized free list? Steal an extra entry for the tail
>           * sentinel.
>           */
> -        if ( curr->maptrack_tail == MAPTRACK_TAIL )
> -        {
> -            handle = steal_maptrack_handle(lgt, curr);
> -            if ( handle == INVALID_MAPTRACK_HANDLE )
> -                return handle;
> -            spin_lock(&curr->maptrack_freelist_lock);
> -            maptrack_entry(lgt, handle).ref = MAPTRACK_TAIL;
> -            curr->maptrack_tail = handle;
> -            if ( curr->maptrack_head == MAPTRACK_TAIL )
> -                curr->maptrack_head = handle;
> -            spin_unlock(&curr->maptrack_freelist_lock);
> -        }
> +        maptrack_entry(lgt, handle).ref = MAPTRACK_TAIL;
> +        curr->maptrack_tail = handle;
> +        if ( curr->maptrack_head == MAPTRACK_TAIL )
> +            curr->maptrack_head = handle;
> +        spin_unlock(&curr->maptrack_freelist_lock);
> +
>          return steal_maptrack_handle(lgt, curr);
>      }

While this transformation looks to provide the intended guarantees
(though the comment would then need some re-wording), ...

> @@ -696,8 +700,10 @@ get_maptrack_handle(
>      }
>  
>      /* Set tail directly if this is the first page for the local vCPU. */
> +    spin_lock(&curr->maptrack_freelist_lock);
>      if ( curr->maptrack_tail == MAPTRACK_TAIL )
>          curr->maptrack_tail = handle + MAPTRACK_PER_PAGE - 1;
> +    spin_unlock(&curr->maptrack_freelist_lock);
>  
>      lgt->maptrack[nr_maptrack_frames(lgt)] = new_mt;
>      smp_wmb();

... I don't think this change does: It now leaves the free list in a
transiently corrupt state until the freelist lock is taken again a few
lines further down and the list is further manipulated. For the
avoidance of doubt, it's not the state itself that I'm worried about -
that remains as before - but the supposed implication that the list is
now always in a consistent/correct state once the lock has been
successfully acquired. (Iirc one of the goals of omitting the lock here
was to avoid lock nesting, even if this nesting arrangement is
permitted. Plus, of course, there's the observation that acquiring the
lock once here and then once again later in the function doesn't really
help. I don't recall whether there was a reason why setting the tail
needs to be done right here. Yet if it were to be moved into the locked
region further down, it would need to be clarified that such a move is
correct/legitimate.)
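
A rough sketch of that move, purely for illustration (assuming the
locked region further down is the one that links the new page's entries
into the free list), might be:

    spin_unlock(&lgt->maptrack_lock);
    spin_lock(&curr->maptrack_freelist_lock);

    /* Set the tail inside the region that also links the new page. */
    if ( curr->maptrack_tail == MAPTRACK_TAIL )
        curr->maptrack_tail = handle + MAPTRACK_PER_PAGE - 1;

    /* ... link new_mt[] entries and update curr->maptrack_head ... */

    spin_unlock(&curr->maptrack_freelist_lock);

Whether that ordering is actually correct is exactly what would need
clarifying.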

Jan

Patch

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index d87e58a53d..67e346ca64 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -660,23 +660,27 @@ get_maptrack_handle(
     if ( !new_mt )
     {
         spin_unlock(&lgt->maptrack_lock);
+        handle = steal_maptrack_handle(lgt, curr);
+        if ( handle == INVALID_MAPTRACK_HANDLE )
+            return handle;
+
+        spin_lock(&curr->maptrack_freelist_lock);
+        if ( curr->maptrack_tail != MAPTRACK_TAIL )
+        {
+            spin_unlock(&curr->maptrack_freelist_lock);
+            return handle;
+        }
 
         /*
          * Uninitialized free list? Steal an extra entry for the tail
          * sentinel.
          */
-        if ( curr->maptrack_tail == MAPTRACK_TAIL )
-        {
-            handle = steal_maptrack_handle(lgt, curr);
-            if ( handle == INVALID_MAPTRACK_HANDLE )
-                return handle;
-            spin_lock(&curr->maptrack_freelist_lock);
-            maptrack_entry(lgt, handle).ref = MAPTRACK_TAIL;
-            curr->maptrack_tail = handle;
-            if ( curr->maptrack_head == MAPTRACK_TAIL )
-                curr->maptrack_head = handle;
-            spin_unlock(&curr->maptrack_freelist_lock);
-        }
+        maptrack_entry(lgt, handle).ref = MAPTRACK_TAIL;
+        curr->maptrack_tail = handle;
+        if ( curr->maptrack_head == MAPTRACK_TAIL )
+            curr->maptrack_head = handle;
+        spin_unlock(&curr->maptrack_freelist_lock);
+
         return steal_maptrack_handle(lgt, curr);
     }
 
@@ -696,8 +700,10 @@ get_maptrack_handle(
     }
 
     /* Set tail directly if this is the first page for the local vCPU. */
+    spin_lock(&curr->maptrack_freelist_lock);
     if ( curr->maptrack_tail == MAPTRACK_TAIL )
         curr->maptrack_tail = handle + MAPTRACK_PER_PAGE - 1;
+    spin_unlock(&curr->maptrack_freelist_lock);
 
     lgt->maptrack[nr_maptrack_frames(lgt)] = new_mt;
     smp_wmb();