diff mbox series

[2/4] x86/mm: Implement common put_data_pages for put_page_from_l[23]e

Message ID 20191212173203.1692762-3-george.dunlap@citrix.com (mailing list archive)
State Superseded
Headers show
Series Post-299 cleanups | expand

Commit Message

George Dunlap Dec. 12, 2019, 5:32 p.m. UTC
Both put_page_from_l2e and put_page_from_l3e handle having superpage
entries by looping over each page and "put"-ing each one individually.
As with putting page table entries, this code is functionally
identical, but for some reason different.  Moreover, there is already
a common function, put_data_page(), to handle automatically swapping
between put_page() (for read-only pages) or put_page_and_type() (for
read-write pages).

Replace this with put_data_pages() (plural), which does the entire
loop, as well as the put_page / put_page_and_type switch.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
NB that I've used the "simple for loop" version to make it easy to see
what's going on, rather than the "do { } while()" version which uses &
and compare to zero rather than comparing to the max.

CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/mm.c | 52 ++++++++++++++++++-----------------------------
 1 file changed, 20 insertions(+), 32 deletions(-)

Comments

Andrew Cooper Dec. 12, 2019, 7:57 p.m. UTC | #1
On 12/12/2019 17:32, George Dunlap wrote:
> Both put_page_from_l2e and put_page_from_l3e handle having superpage
> entries by looping over each page and "put"-ing each one individually.
> As with putting page table entries, this code is functionally
> identical, but for some reason different.  Moreover, there is already
> a common function, put_data_page(), to handle automatically swapping
> between put_page() (for read-only pages) or put_page_and_type() (for
> read-write pages).
>
> Replace this with put_data_pages() (plural), which does the entire
> loop, as well as the put_page / put_page_and_type switch.
>
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
> ---
> NB that I've used the "simple for loop" version to make it easy to see
> what's going on, rather than the "do { } while()" version which uses &
> and compare to zero rather than comparing to the max.

So while I think the change is an improvement, and are fine as
presented, I'm -1 towards it overall.

I am going to once again express my firm opinion that the remaining use
of PV superpages do far more harm than good, and should be removed
completely.

We construct dom0 with some superpages for its p2m and/or initrd.

This turned out to be the issue behind pv-l1tf breaking for dom0 (c/s
96f6ee15ad7c), and why we had to ship XSA-273 in an insecure
configuration (and I'd like to point out that Xen is still in an
insecure configuration by default.)

There is a still-outstanding issue with grant map by linear address over
a superpage where things malfunction, and the only reason this doesn't
have an XSA is that it is believed to be restricted to dom0 only.

Finally, an L3_SHIFT loop is a long running operation which we can't put
a continuation into the middle of, and I bet there are fun things which
can be done there in the general case.

Seeing as PV guests decompress and free the initrd, and repositions the
p2m, none of these superpages tend to survive post boot, so I am
currently unconvinced that a perf improvement would be anything but
marginal.

I certainly don't think it is worth the complexity and corner cases it
causes is Xen.

~Andrew
Jan Beulich Dec. 13, 2019, 10:44 a.m. UTC | #2
On 12.12.2019 18:32, George Dunlap wrote:
> Both put_page_from_l2e and put_page_from_l3e handle having superpage
> entries by looping over each page and "put"-ing each one individually.
> As with putting page table entries, this code is functionally
> identical, but for some reason different.  Moreover, there is already
> a common function, put_data_page(), to handle automatically swapping
> between put_page() (for read-only pages) or put_page_and_type() (for
> read-write pages).
> 
> Replace this with put_data_pages() (plural), which does the entire
> loop, as well as the put_page / put_page_and_type switch.
> 
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
> ---
> NB that I've used the "simple for loop" version to make it easy to see
> what's going on, rather than the "do { } while()" version which uses &
> and compare to zero rather than comparing to the max.
> 
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Jan Beulich <jbeulich@suse.com>
> ---
>  xen/arch/x86/mm.c | 52 ++++++++++++++++++-----------------------------
>  1 file changed, 20 insertions(+), 32 deletions(-)
> 
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index d8a0eb2aa5..c05039ab21 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -1289,14 +1289,6 @@ void put_page_from_l1e(l1_pgentry_t l1e, struct domain *l1e_owner)
>  }
>  
>  #ifdef CONFIG_PV
> -static void put_data_page(struct page_info *page, bool writeable)
> -{
> -    if ( writeable )
> -        put_page_and_type(page);
> -    else
> -        put_page(page);
> -}
> -
>  static int put_pt_page(struct page_info *pg, struct page_info *ptpg,
>                         unsigned int flags)
>  {
> @@ -1319,6 +1311,20 @@ static int put_pt_page(struct page_info *pg, struct page_info *ptpg,
>      return rc;
>  }
>  
> +static int put_data_pages(struct page_info *page, bool writeable, int pt_shift)
> +{
> +    int i, count = 1 << (pt_shift - PAGE_SHIFT);

With both "int" here changed to "unsigned int"
Reviewed-by: Jan Beulich <jbeulich@suse.com>
But of course Andrew's objection needs addressing one way or another.

Jan
George Dunlap Dec. 13, 2019, 1:58 p.m. UTC | #3
> On Dec 12, 2019, at 7:57 PM, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> 
> On 12/12/2019 17:32, George Dunlap wrote:
>> Both put_page_from_l2e and put_page_from_l3e handle having superpage
>> entries by looping over each page and "put"-ing each one individually.
>> As with putting page table entries, this code is functionally
>> identical, but for some reason different.  Moreover, there is already
>> a common function, put_data_page(), to handle automatically swapping
>> between put_page() (for read-only pages) or put_page_and_type() (for
>> read-write pages).
>> 
>> Replace this with put_data_pages() (plural), which does the entire
>> loop, as well as the put_page / put_page_and_type switch.
>> 
>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
>> ---
>> NB that I've used the "simple for loop" version to make it easy to see
>> what's going on, rather than the "do { } while()" version which uses &
>> and compare to zero rather than comparing to the max.
> 
> So while I think the change is an improvement, and are fine as
> presented, I'm -1 towards it overall.
> 
> I am going to once again express my firm opinion that the remaining use
> of PV superpages do far more harm than good, and should be removed
> completely.
> 
> We construct dom0 with some superpages for its p2m and/or initrd.
> 
> This turned out to be the issue behind pv-l1tf breaking for dom0 (c/s
> 96f6ee15ad7c), and why we had to ship XSA-273 in an insecure
> configuration (and I'd like to point out that Xen is still in an
> insecure configuration by default.)
> 
> There is a still-outstanding issue with grant map by linear address over
> a superpage where things malfunction, and the only reason this doesn't
> have an XSA is that it is believed to be restricted to dom0 only.
> 
> Finally, an L3_SHIFT loop is a long running operation which we can't put
> a continuation into the middle of, and I bet there are fun things which
> can be done there in the general case.
> 
> Seeing as PV guests decompress and free the initrd, and repositions the
> p2m, none of these superpages tend to survive post boot, so I am
> currently unconvinced that a perf improvement would be anything but
> marginal.
> 
> I certainly don't think it is worth the complexity and corner cases it
> causes is Xen.

That all sounds reasonable, but I don’t really know anything about PV superpages in Xen at the moment (in fact I sort of wondered what this code was about).

I’d recommend taking this patch as-is, and “someone” can put it on their list to get rid of the PV superpages later.  The alternate is I drop this patch from the series and “someone" puts the PV superpage removal on their list to do later.

(My mind is also involuntarily churning through options to regularize superpage promotion and demotion.)

 -George
Andrew Cooper Dec. 13, 2019, 3:04 p.m. UTC | #4
On 13/12/2019 13:58, George Dunlap wrote:
>
>> On Dec 12, 2019, at 7:57 PM, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>
>> On 12/12/2019 17:32, George Dunlap wrote:
>>> Both put_page_from_l2e and put_page_from_l3e handle having superpage
>>> entries by looping over each page and "put"-ing each one individually.
>>> As with putting page table entries, this code is functionally
>>> identical, but for some reason different.  Moreover, there is already
>>> a common function, put_data_page(), to handle automatically swapping
>>> between put_page() (for read-only pages) or put_page_and_type() (for
>>> read-write pages).
>>>
>>> Replace this with put_data_pages() (plural), which does the entire
>>> loop, as well as the put_page / put_page_and_type switch.
>>>
>>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
>>> ---
>>> NB that I've used the "simple for loop" version to make it easy to see
>>> what's going on, rather than the "do { } while()" version which uses &
>>> and compare to zero rather than comparing to the max.
>> So while I think the change is an improvement, and are fine as
>> presented, I'm -1 towards it overall.
>>
>> I am going to once again express my firm opinion that the remaining use
>> of PV superpages do far more harm than good, and should be removed
>> completely.
>>
>> We construct dom0 with some superpages for its p2m and/or initrd.
>>
>> This turned out to be the issue behind pv-l1tf breaking for dom0 (c/s
>> 96f6ee15ad7c), and why we had to ship XSA-273 in an insecure
>> configuration (and I'd like to point out that Xen is still in an
>> insecure configuration by default.)
>>
>> There is a still-outstanding issue with grant map by linear address over
>> a superpage where things malfunction, and the only reason this doesn't
>> have an XSA is that it is believed to be restricted to dom0 only.
>>
>> Finally, an L3_SHIFT loop is a long running operation which we can't put
>> a continuation into the middle of, and I bet there are fun things which
>> can be done there in the general case.
>>
>> Seeing as PV guests decompress and free the initrd, and repositions the
>> p2m, none of these superpages tend to survive post boot, so I am
>> currently unconvinced that a perf improvement would be anything but
>> marginal.
>>
>> I certainly don't think it is worth the complexity and corner cases it
>> causes is Xen.
> That all sounds reasonable, but I don’t really know anything about PV superpages in Xen at the moment (in fact I sort of wondered what this code was about).
>
> I’d recommend taking this patch as-is, and “someone” can put it on their list to get rid of the PV superpages later.  The alternate is I drop this patch from the series and “someone" puts the PV superpage removal on their list to do later.

As I said, I think the patch is fine cleanup (modulo the 3 int's which
Jan commented on).

If you don't want to tackle the bigger problem now, then guess it is
another thing which can sit on my todo list (and possibly never get done...)

~Andrew
diff mbox series

Patch

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index d8a0eb2aa5..c05039ab21 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1289,14 +1289,6 @@  void put_page_from_l1e(l1_pgentry_t l1e, struct domain *l1e_owner)
 }
 
 #ifdef CONFIG_PV
-static void put_data_page(struct page_info *page, bool writeable)
-{
-    if ( writeable )
-        put_page_and_type(page);
-    else
-        put_page(page);
-}
-
 static int put_pt_page(struct page_info *pg, struct page_info *ptpg,
                        unsigned int flags)
 {
@@ -1319,6 +1311,20 @@  static int put_pt_page(struct page_info *pg, struct page_info *ptpg,
     return rc;
 }
 
+static int put_data_pages(struct page_info *page, bool writeable, int pt_shift)
+{
+    int i, count = 1 << (pt_shift - PAGE_SHIFT);
+
+    ASSERT(!(mfn_x(page_to_mfn(page)) & (count - 1)));
+    for ( i = 0; i < count ; i++, page++ )
+        if ( writeable )
+            put_page_and_type(page);
+        else
+            put_page(page);
+
+    return 0;
+}
+
 /*
  * NB. Virtual address 'l2e' maps to a machine address within frame 'pfn'.
  * Note also that this automatically deals correctly with linear p.t.'s.
@@ -1330,18 +1336,9 @@  static int put_page_from_l2e(l2_pgentry_t l2e, unsigned long pfn,
         return 1;
 
     if ( l2e_get_flags(l2e) & _PAGE_PSE )
-    {
-        struct page_info *page = l2e_get_page(l2e);
-        bool writeable = l2e_get_flags(l2e) & _PAGE_RW;
-        unsigned int i;
-
-        ASSERT(!(mfn_x(page_to_mfn(page)) &
-                 ((1UL << (L2_PAGETABLE_SHIFT - PAGE_SHIFT)) - 1)));
-        for ( i = 0; i < (1u << PAGETABLE_ORDER); i++, page++ )
-            put_data_page(page, writeable);
-
-        return 0;
-    }
+        return put_data_pages(l2e_get_page(l2e),
+                              l2e_get_flags(l2e) & _PAGE_RW,
+                              L2_PAGETABLE_SHIFT);
 
     return put_pt_page(l2e_get_page(l2e), mfn_to_page(_mfn(pfn)), flags);
 }
@@ -1353,18 +1350,9 @@  static int put_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn,
         return 1;
 
     if ( unlikely(l3e_get_flags(l3e) & _PAGE_PSE) )
-    {
-        unsigned long mfn = l3e_get_pfn(l3e);
-        bool writeable = l3e_get_flags(l3e) & _PAGE_RW;
-
-        ASSERT(!(flags & PTF_partial_set));
-        ASSERT(!(mfn & ((1UL << (L3_PAGETABLE_SHIFT - PAGE_SHIFT)) - 1)));
-        do {
-            put_data_page(mfn_to_page(_mfn(mfn)), writeable);
-        } while ( ++mfn & ((1UL << (L3_PAGETABLE_SHIFT - PAGE_SHIFT)) - 1) );
-
-        return 0;
-    }
+        return put_data_pages(l3e_get_page(l3e),
+                              l3e_get_flags(l3e) & _PAGE_RW,
+                              L3_PAGETABLE_SHIFT);
 
     return put_pt_page(l3e_get_page(l3e), mfn_to_page(_mfn(pfn)), flags);
 }