Message ID | 20220617104739.7861-1-andrew.cooper3@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86/mm: Add an early PGT_validated exit in _get_page_type() | expand |
On 17.06.2022 12:47, Andrew Cooper wrote: > This is a continuation of the cleanup and commenting in: > 9186e96b199e ("x86/pv: Clean up _get_page_type()") > 8cc5036bc385 ("x86/pv: Fix ABAC cmpxchg() race in _get_page_type()") > > With the re-arranged and newly commented logic, it's far more clear that the > second half of _get_page_type() only has work to do for page validation. To be honest "far more clear" reads misleading to me: Part of the re- arrangement was to move later the early setting of PGT_validated for PGT_writable pages, without which the stated fact wasn't entirely true prior to the re-arrangement. How about s/far more/now/ ? > Introduce an early exit for PGT_validated. This makes the fastpath marginally > faster, and simplifies the subsequent logic as it no longer needs to exclude > the fully validated case. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Preferably with the wording above adjusted: Reviewed-by: Jan Beulich <jbeulich@suse.com> > Not that it's relevant, but bloat-o-meter says: > > add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-300 (-300) > Function old new delta > _get_page_type 6618 6318 -300 > > which is more impressive than I was expecting. And I have to admit I'm having trouble seeing why that would be. Jan
On 22/06/2022 10:58, Jan Beulich wrote: > On 17.06.2022 12:47, Andrew Cooper wrote: >> This is a continuation of the cleanup and commenting in: >> 9186e96b199e ("x86/pv: Clean up _get_page_type()") >> 8cc5036bc385 ("x86/pv: Fix ABAC cmpxchg() race in _get_page_type()") >> >> With the re-arranged and newly commented logic, it's far more clear that the >> second half of _get_page_type() only has work to do for page validation. > To be honest "far more clear" reads misleading to me: Part of the re- > arrangement was to move later the early setting of PGT_validated for > PGT_writable pages, without which the stated fact wasn't entirely true > prior to the re-arrangement. How about s/far more/now/ ? > >> Introduce an early exit for PGT_validated. This makes the fastpath marginally >> faster, and simplifies the subsequent logic as it no longer needs to exclude >> the fully validated case. >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > Preferably with the wording above adjusted: > Reviewed-by: Jan Beulich <jbeulich@suse.com> Ok. >> Not that it's relevant, but bloat-o-meter says: >> >> add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-300 (-300) >> Function old new delta >> _get_page_type 6618 6318 -300 >> >> which is more impressive than I was expecting. > And I have to admit I'm having trouble seeing why that would be. I was surprised too, but it's deterministic with GCC 11. I initially disbelieved it and did full clean rebuilds. validate_page() is fully inlined, and there were a reasonable number of jmp.d32 -> jmp.d8 changes, but I very much doubt there are 75 of them. I can only assume there's a reasonable chunk of logic which succumbs to DCE. ~Andrew
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index ac74ae389c99..57751d2ed70f 100644 --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -3002,6 +3002,17 @@ static int _get_page_type(struct page_info *page, unsigned long type, * fully validated (PGT_[type] | PGT_validated | >0). */ + /* If the page is fully validated, we're done. */ + if ( likely(nx & PGT_validated) ) + return 0; + + /* + * The page is in the "validate locked" state. We have exclusive access, + * and any concurrent callers are waiting in the cmpxchg() loop above. + * + * Exclusive access ends when PGT_validated or PGT_partial get set. + */ + if ( unlikely((x & PGT_count_mask) == 0) ) { struct domain *d = page_get_owner(page); @@ -3071,43 +3082,40 @@ static int _get_page_type(struct page_info *page, unsigned long type, } } - if ( unlikely(!(nx & PGT_validated)) ) + /* + * Flush the cache if there were previously non-coherent mappings of + * this page, and we're trying to use it as anything other than a + * writeable page. This forces the page to be coherent before we + * validate its contents for safety. + */ + if ( (nx & PGT_non_coherent) && type != PGT_writable_page ) { - /* - * Flush the cache if there were previously non-coherent mappings of - * this page, and we're trying to use it as anything other than a - * writeable page. This forces the page to be coherent before we - * validate its contents for safety. - */ - if ( (nx & PGT_non_coherent) && type != PGT_writable_page ) - { - void *addr = __map_domain_page(page); + void *addr = __map_domain_page(page); - cache_flush(addr, PAGE_SIZE); - unmap_domain_page(addr); + cache_flush(addr, PAGE_SIZE); + unmap_domain_page(addr); - page->u.inuse.type_info &= ~PGT_non_coherent; - } + page->u.inuse.type_info &= ~PGT_non_coherent; + } - /* - * No special validation needed for writable or shared pages. Page - * tables and GDT/LDT need to have their contents audited. - * - * per validate_page(), non-atomic updates are fine here. - */ - if ( type == PGT_writable_page || type == PGT_shared_page ) - page->u.inuse.type_info |= PGT_validated; - else + /* + * No special validation needed for writable or shared pages. Page + * tables and GDT/LDT need to have their contents audited. + * + * per validate_page(), non-atomic updates are fine here. + */ + if ( type == PGT_writable_page || type == PGT_shared_page ) + page->u.inuse.type_info |= PGT_validated; + else + { + if ( !(x & PGT_partial) ) { - if ( !(x & PGT_partial) ) - { - page->nr_validated_ptes = 0; - page->partial_flags = 0; - page->linear_pt_count = 0; - } - - rc = validate_page(page, type, preemptible); + page->nr_validated_ptes = 0; + page->partial_flags = 0; + page->linear_pt_count = 0; } + + rc = validate_page(page, type, preemptible); } out:
This is a continuation of the cleanup and commenting in: 9186e96b199e ("x86/pv: Clean up _get_page_type()") 8cc5036bc385 ("x86/pv: Fix ABAC cmpxchg() race in _get_page_type()") With the re-arranged and newly commented logic, it's far more clear that the second half of _get_page_type() only has work to do for page validation. Introduce an early exit for PGT_validated. This makes the fastpath marginally faster, and simplifies the subsequent logic as it no longer needs to exclude the fully validated case. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> --- CC: Jan Beulich <JBeulich@suse.com> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Wei Liu <wl@xen.org> CC: George Dunlap <george.dunlap@eu.citrix.com> Not that it's relevant, but bloat-o-meter says: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-300 (-300) Function old new delta _get_page_type 6618 6318 -300 which is more impressive than I was expecting. --- xen/arch/x86/mm.c | 70 +++++++++++++++++++++++++++++++------------------------ 1 file changed, 39 insertions(+), 31 deletions(-)