
[RFC] x86/ioreq server: Optimize p2m cleaning up code in p2m_finish_type_change().

Message ID 1491382790-30066-1-git-send-email-yu.c.zhang@linux.intel.com (mailing list archive)
State New, archived

Commit Message

Yu Zhang April 5, 2017, 8:59 a.m. UTC
In previous patches, p2m_finish_type_change() was introduced to iterate
over and clean up the p2m table when an ioreq server unmaps from memory
type HVMMEM_ioreq_server. The iteration count is currently set to 256,
and hypercall pre-emption is checked after each batch of iterations.

But it is likely that no p2m change is performed in the just-finished
iterations, which means p2m_finish_type_change() will return quite
quickly. In such a scenario, we can allow the p2m iteration to continue
without checking for hypercall pre-emption.

Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
---
Note: this patch shall only be accepted after the previous x86/ioreq server
patches have been merged.

Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
---
 xen/arch/x86/hvm/dm.c     | 5 ++++-
 xen/arch/x86/mm/p2m.c     | 9 ++++++++-
 xen/include/asm-x86/p2m.h | 2 +-
 3 files changed, 13 insertions(+), 3 deletions(-)

Comments

George Dunlap April 5, 2017, 3:10 p.m. UTC | #1
On 05/04/17 09:59, Yu Zhang wrote:
> In previous patches, p2m_finish_type_change() was introduced to iterate
> over and clean up the p2m table when an ioreq server unmaps from memory
> type HVMMEM_ioreq_server. The iteration count is currently set to 256,
> and hypercall pre-emption is checked after each batch of iterations.
> 
> But it is likely that no p2m change is performed in the just-finished
> iterations, which means p2m_finish_type_change() will return quite
> quickly. In such a scenario, we can allow the p2m iteration to continue
> without checking for hypercall pre-emption.

Suppose you have a guest with 128TiB of RAM, and the ioreq_server p2m
entries are at the very end of RAM.  Won't this run for several minutes
before even allowing preemption?

 -George
George Dunlap April 5, 2017, 3:11 p.m. UTC | #2
On 05/04/17 16:10, George Dunlap wrote:
> On 05/04/17 09:59, Yu Zhang wrote:
>> In previous patches, p2m_finish_type_change() was introduced to iterate
>> over and clean up the p2m table when an ioreq server unmaps from memory
>> type HVMMEM_ioreq_server. The iteration count is currently set to 256,
>> and hypercall pre-emption is checked after each batch of iterations.
>>
>> But it is likely that no p2m change is performed in the just-finished
>> iterations, which means p2m_finish_type_change() will return quite
>> quickly. In such a scenario, we can allow the p2m iteration to continue
>> without checking for hypercall pre-emption.
> 
> Suppose you have a guest with 128TiB of RAM, and the ioreq_server p2m
> entries are at the very end of RAM.  Won't this run for several minutes
> before even allowing preemption?

Sorry, this should be GiB.  But I think you get my point. :-)

 -George
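
For a rough sense of scale, assuming 4 KiB pages and the 256-GFN batch size
from the patch: 128 GiB of guest RAM covers 128 GiB / 4 KiB = 33,554,432
GFNs, i.e. 33,554,432 / 256 = 131,072 batches, each doing an unlocked p2m
query per GFN, before a single preemption check would be reached if no
entry ever matched.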
Yu Zhang April 5, 2017, 4:28 p.m. UTC | #3
On 4/5/2017 11:11 PM, George Dunlap wrote:
> On 05/04/17 16:10, George Dunlap wrote:
>> On 05/04/17 09:59, Yu Zhang wrote:
>>> In previous patches, p2m_finish_type_change() was introduced to iterate
>>> over and clean up the p2m table when an ioreq server unmaps from memory
>>> type HVMMEM_ioreq_server. The iteration count is currently set to 256,
>>> and hypercall pre-emption is checked after each batch of iterations.
>>>
>>> But it is likely that no p2m change is performed in the just-finished
>>> iterations, which means p2m_finish_type_change() will return quite
>>> quickly. In such a scenario, we can allow the p2m iteration to continue
>>> without checking for hypercall pre-emption.
>> Suppose you have a guest with 128TiB of RAM, and the ioreq_server p2m
>> entries are at the very end of RAM.  Won't this run for several minutes
>> before even allowing preemption?
> Sorry, this should be GiB.  But I think you get my point. :-)

Yep. I got it.
I'd better reconsider it - my head is quite dizzy now. Maybe together 
with your generic p2m change solution in 4.10. :-)

Thanks
Yu
>
>   -George
>
>
Jan Beulich May 9, 2017, 4:29 p.m. UTC | #4
>>> On 05.04.17 at 10:59, <yu.c.zhang@linux.intel.com> wrote:
> --- a/xen/arch/x86/hvm/dm.c
> +++ b/xen/arch/x86/hvm/dm.c
> @@ -411,14 +411,17 @@ static int dm_op(domid_t domid,
>              while ( read_atomic(&p2m->ioreq.entry_count) &&
>                      first_gfn <= p2m->max_mapped_pfn )
>              {
> +                bool changed = false;
> +
>                  /* Iterate p2m table for 256 gfns each time. */
>                  p2m_finish_type_change(d, _gfn(first_gfn), 256,
> -                                       p2m_ioreq_server, p2m_ram_rw);
> +                                       p2m_ioreq_server, p2m_ram_rw, &changed);
>  
>                  first_gfn += 256;
>  
>                  /* Check for continuation if it's not the last iteration. */
>                  if ( first_gfn <= p2m->max_mapped_pfn &&
> +                     changed &&
>                       hypercall_preempt_check() )
>                  {
>                      rc = -ERESTART;

I appreciate and support the intention, but you're opening up a
long lasting loop here in case very little or no changes need to
be done. You need to check for preemption every so many
iterations even if you've never seen "changed" come back set.

Jan
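
A minimal sketch of the kind of bound Jan is asking for, applied to the
dm.c loop from the patch; the "unchanged" counter and the cap of 256
batches are illustrative assumptions, not part of the posted patch:

    unsigned int unchanged = 0;

    while ( read_atomic(&p2m->ioreq.entry_count) &&
            first_gfn <= p2m->max_mapped_pfn )
    {
        bool changed = false;

        /* Iterate p2m table for 256 gfns each time. */
        p2m_finish_type_change(d, _gfn(first_gfn), 256,
                               p2m_ioreq_server, p2m_ram_rw, &changed);

        first_gfn += 256;

        /* Count consecutive batches that changed nothing. */
        unchanged = changed ? 0 : unchanged + 1;

        /*
         * Check for continuation if it's not the last iteration, and
         * either this batch did some work or enough empty batches have
         * gone by that preemption must not be delayed any longer.
         */
        if ( first_gfn <= p2m->max_mapped_pfn &&
             (changed || unchanged >= 256) &&
             hypercall_preempt_check() )
        {
            rc = -ERESTART;
            break;
        }
    }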
Jan Beulich May 9, 2017, 4:30 p.m. UTC | #5
>>> On 05.04.17 at 10:59, <yu.c.zhang@linux.intel.com> wrote:
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -1034,12 +1034,13 @@ void p2m_change_type_range(struct domain *d,
>  /* Synchronously modify the p2m type for a range of gfns from ot to nt. */
>  void p2m_finish_type_change(struct domain *d,
>                              gfn_t first_gfn, unsigned long max_nr,
> -                            p2m_type_t ot, p2m_type_t nt)
> +                            p2m_type_t ot, p2m_type_t nt, bool *changed)
>  {
>      struct p2m_domain *p2m = p2m_get_hostp2m(d);
>      p2m_type_t t;
>      unsigned long gfn = gfn_x(first_gfn);
>      unsigned long last_gfn = gfn + max_nr - 1;
> +    bool is_changed = false;
>  
>      ASSERT(ot != nt);
>      ASSERT(p2m_is_changeable(ot) && p2m_is_changeable(nt));
> @@ -1052,12 +1053,18 @@ void p2m_finish_type_change(struct domain *d,
>          get_gfn_query_unlocked(d, gfn, &t);
>  
>          if ( t == ot )
> +        {
>              p2m_change_type_one(d, gfn, t, nt);
> +            is_changed = true;
> +        }
>  
>          gfn++;
>      }
>  
>      p2m_unlock(p2m);
> +
> +    if ( changed )
> +        *changed = is_changed;
>  }

Also, wouldn't it be better to return a count here? If there was just
a single change in the current 256-GFN batch, surely we could take
on another?

Jan
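
A rough sketch of the count-returning variant Jan describes; the return
type and the caller-side threshold are assumptions for illustration, not
the interface the patch actually posts:

/* Return the number of entries converted from ot to nt in this batch. */
unsigned long p2m_finish_type_change(struct domain *d,
                                     gfn_t first_gfn, unsigned long max_nr,
                                     p2m_type_t ot, p2m_type_t nt)
{
    struct p2m_domain *p2m = p2m_get_hostp2m(d);
    p2m_type_t t;
    unsigned long gfn = gfn_x(first_gfn);
    unsigned long last_gfn = gfn + max_nr - 1;
    unsigned long changed = 0;

    ASSERT(ot != nt);
    ASSERT(p2m_is_changeable(ot) && p2m_is_changeable(nt));

    p2m_lock(p2m);

    /* Clamp the batch to the mapped range, as the existing code does. */
    last_gfn = min(last_gfn, p2m->max_mapped_pfn);
    while ( gfn <= last_gfn )
    {
        get_gfn_query_unlocked(d, gfn, &t);

        if ( t == ot )
        {
            p2m_change_type_one(d, gfn, t, nt);
            changed++;
        }

        gfn++;
    }

    p2m_unlock(p2m);

    return changed;
}

The dm.c caller could then accumulate the returned count across batches and
only treat the sweep as preemption-worthy once the total crosses some
threshold (say, 256 changed entries), addressing both of Jan's points at once.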
Yu Zhang May 10, 2017, 5:27 a.m. UTC | #6
On 5/10/2017 12:29 AM, Jan Beulich wrote:
>>>> On 05.04.17 at 10:59, <yu.c.zhang@linux.intel.com> wrote:
>> --- a/xen/arch/x86/hvm/dm.c
>> +++ b/xen/arch/x86/hvm/dm.c
>> @@ -411,14 +411,17 @@ static int dm_op(domid_t domid,
>>               while ( read_atomic(&p2m->ioreq.entry_count) &&
>>                       first_gfn <= p2m->max_mapped_pfn )
>>               {
>> +                bool changed = false;
>> +
>>                   /* Iterate p2m table for 256 gfns each time. */
>>                   p2m_finish_type_change(d, _gfn(first_gfn), 256,
>> -                                       p2m_ioreq_server, p2m_ram_rw);
>> +                                       p2m_ioreq_server, p2m_ram_rw, &changed);
>>   
>>                   first_gfn += 256;
>>   
>>                   /* Check for continuation if it's not the last iteration. */
>>                   if ( first_gfn <= p2m->max_mapped_pfn &&
>> +                     changed &&
>>                        hypercall_preempt_check() )
>>                   {
>>                       rc = -ERESTART;
> I appreciate and support the intention, but you're opening up a
> long lasting loop here in case very little or no changes need to
> be done. You need to check for preemption every so many
> iterations even if you've never seen "changed" come back set.

Thanks for your comments, Jan.
Indeed, this patch is problematic. Another thought: since the current
p2m sweeping implementation disables live migration when there are
ioreq server entries left, and George has previously proposed a generic
p2m change solution, I'd like to defer this optimization to a future
Xen release, together with that generic solution. :-)


Yu

> Jan
>
>

Patch

diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
index d72b7bd..3aa1286 100644
--- a/xen/arch/x86/hvm/dm.c
+++ b/xen/arch/x86/hvm/dm.c
@@ -411,14 +411,17 @@  static int dm_op(domid_t domid,
             while ( read_atomic(&p2m->ioreq.entry_count) &&
                     first_gfn <= p2m->max_mapped_pfn )
             {
+                bool changed = false;
+
                 /* Iterate p2m table for 256 gfns each time. */
                 p2m_finish_type_change(d, _gfn(first_gfn), 256,
-                                       p2m_ioreq_server, p2m_ram_rw);
+                                       p2m_ioreq_server, p2m_ram_rw, &changed);
 
                 first_gfn += 256;
 
                 /* Check for continuation if it's not the last iteration. */
                 if ( first_gfn <= p2m->max_mapped_pfn &&
+                     changed &&
                      hypercall_preempt_check() )
                 {
                     rc = -ERESTART;
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 0daaa86..b171a4b 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1034,12 +1034,13 @@  void p2m_change_type_range(struct domain *d,
 /* Synchronously modify the p2m type for a range of gfns from ot to nt. */
 void p2m_finish_type_change(struct domain *d,
                             gfn_t first_gfn, unsigned long max_nr,
-                            p2m_type_t ot, p2m_type_t nt)
+                            p2m_type_t ot, p2m_type_t nt, bool *changed)
 {
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
     p2m_type_t t;
     unsigned long gfn = gfn_x(first_gfn);
     unsigned long last_gfn = gfn + max_nr - 1;
+    bool is_changed = false;
 
     ASSERT(ot != nt);
     ASSERT(p2m_is_changeable(ot) && p2m_is_changeable(nt));
@@ -1052,12 +1053,18 @@  void p2m_finish_type_change(struct domain *d,
         get_gfn_query_unlocked(d, gfn, &t);
 
         if ( t == ot )
+        {
             p2m_change_type_one(d, gfn, t, nt);
+            is_changed = true;
+        }
 
         gfn++;
     }
 
     p2m_unlock(p2m);
+
+    if ( changed )
+        *changed = is_changed;
 }
 
 /*
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 0e670af..2e02538 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -615,7 +615,7 @@  int p2m_change_type_one(struct domain *d, unsigned long gfn,
 void p2m_finish_type_change(struct domain *d,
                             gfn_t first_gfn,
                             unsigned long max_nr,
-                            p2m_type_t ot, p2m_type_t nt);
+                            p2m_type_t ot, p2m_type_t nt, bool *changed);
 
 /* Report a change affecting memory types. */
 void p2m_memory_type_changed(struct domain *d);