diff mbox series

x86/mm: Add an early PGT_validated exit in _get_page_type()

Message ID 20220617104739.7861-1-andrew.cooper3@citrix.com (mailing list archive)
State New, archived
Headers show
Series x86/mm: Add an early PGT_validated exit in _get_page_type() | expand

Commit Message

Andrew Cooper June 17, 2022, 10:47 a.m. UTC
This is a continuation of the cleanup and commenting in:
  9186e96b199e ("x86/pv: Clean up _get_page_type()")
  8cc5036bc385 ("x86/pv: Fix ABAC cmpxchg() race in _get_page_type()")

With the re-arranged and newly commented logic, it's far more clear that the
second half of _get_page_type() only has work to do for page validation.

Introduce an early exit for PGT_validated.  This makes the fastpath marginally
faster, and simplifies the subsequent logic as it no longer needs to exclude
the fully validated case.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: George Dunlap <george.dunlap@eu.citrix.com>

Not that it's relevant, but bloat-o-meter says:

  add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-300 (-300)
  Function                                     old     new   delta
  _get_page_type                              6618    6318    -300

which is more impressive than I was expecting.
---
 xen/arch/x86/mm.c | 70 +++++++++++++++++++++++++++++++------------------------
 1 file changed, 39 insertions(+), 31 deletions(-)

Comments

Jan Beulich June 22, 2022, 9:58 a.m. UTC | #1
On 17.06.2022 12:47, Andrew Cooper wrote:
> This is a continuation of the cleanup and commenting in:
>   9186e96b199e ("x86/pv: Clean up _get_page_type()")
>   8cc5036bc385 ("x86/pv: Fix ABAC cmpxchg() race in _get_page_type()")
> 
> With the re-arranged and newly commented logic, it's far more clear that the
> second half of _get_page_type() only has work to do for page validation.

To be honest "far more clear" reads misleading to me: Part of the re-
arrangement was to move later the early setting of PGT_validated for
PGT_writable pages, without which the stated fact wasn't entirely true
prior to the re-arrangement. How about s/far more/now/ ?

> Introduce an early exit for PGT_validated.  This makes the fastpath marginally
> faster, and simplifies the subsequent logic as it no longer needs to exclude
> the fully validated case.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Preferably with the wording above adjusted:
Reviewed-by: Jan Beulich <jbeulich@suse.com>

> Not that it's relevant, but bloat-o-meter says:
> 
>   add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-300 (-300)
>   Function                                     old     new   delta
>   _get_page_type                              6618    6318    -300
> 
> which is more impressive than I was expecting.

And I have to admit I'm having trouble seeing why that would be.

Jan
Andrew Cooper June 22, 2022, 10:13 a.m. UTC | #2
On 22/06/2022 10:58, Jan Beulich wrote:
> On 17.06.2022 12:47, Andrew Cooper wrote:
>> This is a continuation of the cleanup and commenting in:
>>   9186e96b199e ("x86/pv: Clean up _get_page_type()")
>>   8cc5036bc385 ("x86/pv: Fix ABAC cmpxchg() race in _get_page_type()")
>>
>> With the re-arranged and newly commented logic, it's far more clear that the
>> second half of _get_page_type() only has work to do for page validation.
> To be honest "far more clear" reads misleading to me: Part of the re-
> arrangement was to move later the early setting of PGT_validated for
> PGT_writable pages, without which the stated fact wasn't entirely true
> prior to the re-arrangement. How about s/far more/now/ ?
>
>> Introduce an early exit for PGT_validated.  This makes the fastpath marginally
>> faster, and simplifies the subsequent logic as it no longer needs to exclude
>> the fully validated case.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Preferably with the wording above adjusted:
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Ok.

>> Not that it's relevant, but bloat-o-meter says:
>>
>>   add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-300 (-300)
>>   Function                                     old     new   delta
>>   _get_page_type                              6618    6318    -300
>>
>> which is more impressive than I was expecting.
> And I have to admit I'm having trouble seeing why that would be.

I was surprised too, but it's deterministic with GCC 11.  I initially
disbelieved it and did full clean rebuilds.

validate_page() is fully inlined, and there were a reasonable number of
jmp.d32 -> jmp.d8 changes, but I very much doubt there are 75 of them. 
I can only assume there's a reasonable chunk of logic which succumbs to DCE.

~Andrew
diff mbox series

Patch

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index ac74ae389c99..57751d2ed70f 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -3002,6 +3002,17 @@  static int _get_page_type(struct page_info *page, unsigned long type,
      * fully validated (PGT_[type] | PGT_validated | >0).
      */
 
+    /* If the page is fully validated, we're done. */
+    if ( likely(nx & PGT_validated) )
+        return 0;
+
+    /*
+     * The page is in the "validate locked" state.  We have exclusive access,
+     * and any concurrent callers are waiting in the cmpxchg() loop above.
+     *
+     * Exclusive access ends when PGT_validated or PGT_partial get set.
+     */
+
     if ( unlikely((x & PGT_count_mask) == 0) )
     {
         struct domain *d = page_get_owner(page);
@@ -3071,43 +3082,40 @@  static int _get_page_type(struct page_info *page, unsigned long type,
         }
     }
 
-    if ( unlikely(!(nx & PGT_validated)) )
+    /*
+     * Flush the cache if there were previously non-coherent mappings of
+     * this page, and we're trying to use it as anything other than a
+     * writeable page.  This forces the page to be coherent before we
+     * validate its contents for safety.
+     */
+    if ( (nx & PGT_non_coherent) && type != PGT_writable_page )
     {
-        /*
-         * Flush the cache if there were previously non-coherent mappings of
-         * this page, and we're trying to use it as anything other than a
-         * writeable page.  This forces the page to be coherent before we
-         * validate its contents for safety.
-         */
-        if ( (nx & PGT_non_coherent) && type != PGT_writable_page )
-        {
-            void *addr = __map_domain_page(page);
+        void *addr = __map_domain_page(page);
 
-            cache_flush(addr, PAGE_SIZE);
-            unmap_domain_page(addr);
+        cache_flush(addr, PAGE_SIZE);
+        unmap_domain_page(addr);
 
-            page->u.inuse.type_info &= ~PGT_non_coherent;
-        }
+        page->u.inuse.type_info &= ~PGT_non_coherent;
+    }
 
-        /*
-         * No special validation needed for writable or shared pages.  Page
-         * tables and GDT/LDT need to have their contents audited.
-         *
-         * per validate_page(), non-atomic updates are fine here.
-         */
-        if ( type == PGT_writable_page || type == PGT_shared_page )
-            page->u.inuse.type_info |= PGT_validated;
-        else
+    /*
+     * No special validation needed for writable or shared pages.  Page
+     * tables and GDT/LDT need to have their contents audited.
+     *
+     * per validate_page(), non-atomic updates are fine here.
+     */
+    if ( type == PGT_writable_page || type == PGT_shared_page )
+        page->u.inuse.type_info |= PGT_validated;
+    else
+    {
+        if ( !(x & PGT_partial) )
         {
-            if ( !(x & PGT_partial) )
-            {
-                page->nr_validated_ptes = 0;
-                page->partial_flags = 0;
-                page->linear_pt_count = 0;
-            }
-
-            rc = validate_page(page, type, preemptible);
+            page->nr_validated_ptes = 0;
+            page->partial_flags = 0;
+            page->linear_pt_count = 0;
         }
+
+        rc = validate_page(page, type, preemptible);
     }
 
  out: