x86/mm: account for PGT_pae_xen_l2 in recently added assertion

Message ID ee7e7e1f-8d3c-0d8f-24ef-c281b09faa25@suse.com (mailing list archive)
State New, archived
Series x86/mm: account for PGT_pae_xen_l2 in recently added assertion

Commit Message

Jan Beulich June 10, 2022, 7:26 a.m. UTC
While PGT_pae_xen_l2 will be zapped once the type refcount of an L2 page
reaches zero, it'll be retained as long as the type refcount is non-
zero. Hence any checking against the requested type needs to either zap
the bit from the type or include it in the used mask.

Fixes: 9186e96b199e ("x86/pv: Clean up _get_page_type()")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
The check around the TLB flush which was moved for XSA-401 also looks to
needlessly trigger a flush when "type" has the bit set (while "x"
wouldn't). That's no different from original behavior, but still looks
inefficient.

Comments

Andrew Cooper June 10, 2022, 8:12 a.m. UTC | #1
On 10/06/2022 08:26, Jan Beulich wrote:
> While PGT_pae_xen_l2 will be zapped once the type refcount of an L2 page
> reaches zero, it'll be retained as long as the type refcount is non-
> zero. Hence any checking against the requested type needs to either zap
> the bit from the type or include it in the used mask.
>
> Fixes: 9186e96b199e ("x86/pv: Clean up _get_page_type()")
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

pae_xen_l2 being outside of the typemask is deeply confusing to work
with.  It also invalidates all of the comments trying to explain the
structure of this logic.

I'm a little concerned with type usage in the non-coherent path too.
It's safe, but is (alongside the IOMMU path) a misleading example for
surrounding code.

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

I can't think of anything better to do in the short term.

> ---
> The check around the TLB flush which was moved for XSA-401 also looks to
> needlessly trigger a flush when "type" has the bit set (while "x"
> wouldn't). That's no different from original behavior, but still looks
> inefficient.

It's not the only inefficiency here.  Still plenty of improvements to be
had in _get_page_type().

~Andrew
Jan Beulich June 10, 2022, 8:17 a.m. UTC | #2
On 10.06.2022 10:12, Andrew Cooper wrote:
> On 10/06/2022 08:26, Jan Beulich wrote:
>> While PGT_pae_xen_l2 will be zapped once the type refcount of an L2 page
>> reaches zero, it'll be retained as long as the type refcount is non-
>> zero. Hence any checking against the requested type needs to either zap
>> the bit from the type or include it in the used mask.
>>
>> Fixes: 9186e96b199e ("x86/pv: Clean up _get_page_type()")
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> pae_xen_l2 being outside of the typemask is deeply confusing to work
> with.  It also renders all of the comments trying to explain the
> structure of this logic wrong.
> 
> I'm a little concerned with type usage in the non-coherent path too. 
> It's safe, but is (along side the IOMMU path) a misleading example to
> surrounding code.
> 
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

Thanks.

> I can't think of anything better to do in the short term.
> 
>> ---
>> The check around the TLB flush which was moved for XSA-401 also looks to
>> needlessly trigger a flush when "type" has the bit set (while "x"
>> wouldn't). That's no different from original behavior, but still looks
>> inefficient.
> 
> It's not the only inefficiency here.  Still plenty of improvements to be
> had in _get_page_type().

You did say you have some follow-up changes pending. It wasn't clear to
me whether this particular aspect was among them. If not, I can make
a(nother) patch ...

Jan
Andrew Cooper June 10, 2022, 8:36 a.m. UTC | #3
On 10/06/2022 09:17, Jan Beulich wrote:
> On 10.06.2022 10:12, Andrew Cooper wrote:
>> On 10/06/2022 08:26, Jan Beulich wrote:
>>> ---
>>> The check around the TLB flush which was moved for XSA-401 also looks to
>>> needlessly trigger a flush when "type" has the bit set (while "x"
>>> wouldn't). That's no different from original behavior, but still looks
>>> inefficient.
>> It's not the only inefficiency here.  Still plenty of improvements to be
>> had in _get_page_type().
> You did say you have some follow-up changes pending. It wasn't clear to
> me whether this particular aspect was among them. If not, I can make
> a(nother) patch ...

At this point, it's probably more accurate to say that I've got a pile
of plans about making improvements, rather than a pile of patches.

The major improvement is the early exit for PGT_validated, making the
second half of the function exclusively for the validate-locked state.

Other improvements (off the top of my head):
 - shuffling the TLB flush setup logic to not even do the TLB filter
   calculations if we're going to skip the flush call anyway,
 - deduping the page_get_owner() calls/variables,
 - sorting out PGC_page_table naming/semantics,
 - reducing the number of redundant ways we've got of expressing the
   same logic (there are a lot of invariants between x, nx and type),
 - better explanation of the IOMMU behaviour, and
 - better explanation of the TLB flushing safety requirements.

If you think some of that's easy enough to do, then feel free.

~Andrew
diff mbox series

Patch

--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2956,7 +2956,8 @@  static int _get_page_type(struct page_in
              * The page is in one of two states (depending on PGT_partial),
              * and should have exactly one reference.
              */
-            ASSERT((x & (PGT_type_mask | PGT_count_mask)) == (type | 1));
+            ASSERT((x & (PGT_type_mask | PGT_pae_xen_l2 | PGT_count_mask)) ==
+                   (type | 1));
 
             if ( !(x & PGT_partial) )
             {