diff mbox series

[v3,5/7] xen/memory: Improve compat XENMEM_acquire_resource handling

Message ID 20210112194841.1537-6-andrew.cooper3@citrix.com (mailing list archive)
State New, archived
Headers show
Series Multiple fixes to XENMEM_acquire_resource | expand

Commit Message

Andrew Cooper Jan. 12, 2021, 7:48 p.m. UTC
The frame_list is an input, or an output, depending on whether the calling
domain is translated or not.  The array does not need marshalling in both
directions.

Furthermore, the copy-in loop was very inefficient, copying 4 bytes at at
time.  Rewrite it to copy in all nr_frames at once, and then expand
compat_pfn_t to xen_pfn_t in place.

Re-position the copy-in loop to simplify continuation support in a future
patch, and reduce the scope of certain variables.

No change in guest observed behaviour.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
---
CC: George Dunlap <George.Dunlap@eu.citrix.com>
CC: Ian Jackson <iwj@xenproject.org>
CC: Jan Beulich <JBeulich@suse.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Wei Liu <wl@xen.org>
CC: Julien Grall <julien@xen.org>
CC: Paul Durrant <paul@xen.org>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>
---
 xen/common/compat/memory.c | 65 ++++++++++++++++++++++++++++------------------
 1 file changed, 40 insertions(+), 25 deletions(-)

Comments

Jan Beulich Jan. 15, 2021, 3:37 p.m. UTC | #1
On 12.01.2021 20:48, Andrew Cooper wrote:
> @@ -446,6 +430,31 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>  
>  #undef XLAT_mem_acquire_resource_HNDL_frame_list
>  
> +            if ( xen_frame_list && cmp.mar.nr_frames )
> +            {
> +                /*
> +                 * frame_list is an input for translated guests, and an output
> +                 * for untranslated guests.  Only copy in for translated guests.
> +                 */
> +                if ( paging_mode_translate(currd) )
> +                {
> +                    compat_pfn_t *compat_frame_list = (void *)xen_frame_list;
> +
> +                    if ( !compat_handle_okay(cmp.mar.frame_list,
> +                                             cmp.mar.nr_frames) ||
> +                         __copy_from_compat_offset(
> +                             compat_frame_list, cmp.mar.frame_list,
> +                             0, cmp.mar.nr_frames) )
> +                        return -EFAULT;
> +
> +                    /*
> +                     * Iterate backwards over compat_frame_list[] expanding
> +                     * compat_pfn_t to xen_pfn_t in place.
> +                     */
> +                    for ( int x = cmp.mar.nr_frames - 1; x >= 0; --x )
> +                        xen_frame_list[x] = compat_frame_list[x];

Just as a nit, without requiring you to adjust (but with the
request to consider adjusting) - x getting used as array index
would generally suggest it wants to be an unsigned type (despite
me guessing the compiler ought to be able to avoid an explicit
sign-extension for the actual memory accesses):

                    for ( unsigned int x = cmp.mar.nr_frames; x--; )
                        xen_frame_list[x] = compat_frame_list[x];

Jan
Andrew Cooper Jan. 28, 2021, 11:32 p.m. UTC | #2
On 15/01/2021 15:37, Jan Beulich wrote:
> On 12.01.2021 20:48, Andrew Cooper wrote:
>> @@ -446,6 +430,31 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>>  
>>  #undef XLAT_mem_acquire_resource_HNDL_frame_list
>>  
>> +            if ( xen_frame_list && cmp.mar.nr_frames )
>> +            {
>> +                /*
>> +                 * frame_list is an input for translated guests, and an output
>> +                 * for untranslated guests.  Only copy in for translated guests.
>> +                 */
>> +                if ( paging_mode_translate(currd) )
>> +                {
>> +                    compat_pfn_t *compat_frame_list = (void *)xen_frame_list;
>> +
>> +                    if ( !compat_handle_okay(cmp.mar.frame_list,
>> +                                             cmp.mar.nr_frames) ||
>> +                         __copy_from_compat_offset(
>> +                             compat_frame_list, cmp.mar.frame_list,
>> +                             0, cmp.mar.nr_frames) )
>> +                        return -EFAULT;
>> +
>> +                    /*
>> +                     * Iterate backwards over compat_frame_list[] expanding
>> +                     * compat_pfn_t to xen_pfn_t in place.
>> +                     */
>> +                    for ( int x = cmp.mar.nr_frames - 1; x >= 0; --x )
>> +                        xen_frame_list[x] = compat_frame_list[x];
> Just as a nit, without requiring you to adjust (but with the
> request to consider adjusting) - x getting used as array index
> would generally suggest it wants to be an unsigned type (despite
> me guessing the compiler ought to be able to avoid an explicit
> sign-extension for the actual memory accesses):
>
>                     for ( unsigned int x = cmp.mar.nr_frames; x--; )
>                         xen_frame_list[x] = compat_frame_list[x];

Signed numbers are not inherently evil.  The range of x is between 0 and
1020 so there is no issue with failing to enter the loop.

It is the compilers job to make this optimisation.  It is a very poor
use of a developers time to write logic which takes extra effort to
figure out whether it is correct or not.

You know what my attitude will be towards a compiler which is incapable
of making the optimisation, and you've got to go back a decade to find a
processor old enough to not have identical performance between the
unoptimised signed and unsigned forms.

Both signs of numbers have their uses, and a rigid policy of using
unsigned numbers does more harm than good (in this case, concerning the
simplicity of the code).

~Andrew
Jan Beulich Feb. 1, 2021, 2:12 p.m. UTC | #3
On 29.01.2021 00:32, Andrew Cooper wrote:
> On 15/01/2021 15:37, Jan Beulich wrote:
>> On 12.01.2021 20:48, Andrew Cooper wrote:
>>> @@ -446,6 +430,31 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
>>>  
>>>  #undef XLAT_mem_acquire_resource_HNDL_frame_list
>>>  
>>> +            if ( xen_frame_list && cmp.mar.nr_frames )
>>> +            {
>>> +                /*
>>> +                 * frame_list is an input for translated guests, and an output
>>> +                 * for untranslated guests.  Only copy in for translated guests.
>>> +                 */
>>> +                if ( paging_mode_translate(currd) )
>>> +                {
>>> +                    compat_pfn_t *compat_frame_list = (void *)xen_frame_list;
>>> +
>>> +                    if ( !compat_handle_okay(cmp.mar.frame_list,
>>> +                                             cmp.mar.nr_frames) ||
>>> +                         __copy_from_compat_offset(
>>> +                             compat_frame_list, cmp.mar.frame_list,
>>> +                             0, cmp.mar.nr_frames) )
>>> +                        return -EFAULT;
>>> +
>>> +                    /*
>>> +                     * Iterate backwards over compat_frame_list[] expanding
>>> +                     * compat_pfn_t to xen_pfn_t in place.
>>> +                     */
>>> +                    for ( int x = cmp.mar.nr_frames - 1; x >= 0; --x )
>>> +                        xen_frame_list[x] = compat_frame_list[x];
>> Just as a nit, without requiring you to adjust (but with the
>> request to consider adjusting) - x getting used as array index
>> would generally suggest it wants to be an unsigned type (despite
>> me guessing the compiler ought to be able to avoid an explicit
>> sign-extension for the actual memory accesses):
>>
>>                     for ( unsigned int x = cmp.mar.nr_frames; x--; )
>>                         xen_frame_list[x] = compat_frame_list[x];
> 
> Signed numbers are not inherently evil.  The range of x is between 0 and
> 1020 so there is no issue with failing to enter the loop.
> 
> It is the compilers job to make this optimisation.  It is a very poor
> use of a developers time to write logic which takes extra effort to
> figure out whether it is correct or not.

I don't see why my suggested alternative is any more difficult to
understand. It's one less expression, so perhaps even less cognitive
load. But yes, this is easily getting subjective.

> You know what my attitude will be towards a compiler which is incapable
> of making the optimisation, and you've got to go back a decade to find a
> processor old enough to not have identical performance between the
> unoptimised signed and unsigned forms.

I'm not sure I see how the compiler could transform this to using
unsigned int. By observation, gcc10 doesn't, despite -O2 (release
build). It still emits an otherwise unnecessary MOVSXD, and the
loop body is one insn shorter with an unsigned induction variable
(albeit that's likely just a side effect in this specific example).

> Both signs of numbers have their uses, and a rigid policy of using
> unsigned numbers does more harm than good (in this case, concerning the
> simplicity of the code).

Of course. But array accesses are where we'd better limit ourselves
to unsigned indexing variables, imo.

Jan
diff mbox series

Patch

diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c
index ed92e05b08..834c5e19d1 100644
--- a/xen/common/compat/memory.c
+++ b/xen/common/compat/memory.c
@@ -55,6 +55,8 @@  static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr,
 
 int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
 {
+    struct vcpu *curr = current;
+    struct domain *currd = curr->domain;
     int split, op = cmd & MEMOP_CMD_MASK;
     long rc;
     unsigned int start_extent = cmd >> MEMOP_EXTENT_SHIFT;
@@ -399,7 +401,7 @@  int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
 
         case XENMEM_acquire_resource:
         {
-            xen_pfn_t *xen_frame_list;
+            xen_pfn_t *xen_frame_list = NULL;
             unsigned int max_nr_frames;
 
             if ( copy_from_guest(&cmp.mar, compat, 1) )
@@ -417,28 +419,10 @@  int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
             if ( cmp.mar.nr_frames > max_nr_frames )
                 return -E2BIG;
 
-            if ( compat_handle_is_null(cmp.mar.frame_list) )
-                xen_frame_list = NULL;
-            else
-            {
+            /* Marshal the frame list in the remainder of the xlat space. */
+            if ( !compat_handle_is_null(cmp.mar.frame_list) )
                 xen_frame_list = (xen_pfn_t *)(nat.mar + 1);
 
-                if ( !compat_handle_okay(cmp.mar.frame_list,
-                                         cmp.mar.nr_frames) )
-                    return -EFAULT;
-
-                for ( i = 0; i < cmp.mar.nr_frames; i++ )
-                {
-                    compat_pfn_t frame;
-
-                    if ( __copy_from_compat_offset(
-                             &frame, cmp.mar.frame_list, i, 1) )
-                        return -EFAULT;
-
-                    xen_frame_list[i] = frame;
-                }
-            }
-
 #define XLAT_mem_acquire_resource_HNDL_frame_list(_d_, _s_) \
             set_xen_guest_handle((_d_)->frame_list, xen_frame_list)
 
@@ -446,6 +430,31 @@  int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
 
 #undef XLAT_mem_acquire_resource_HNDL_frame_list
 
+            if ( xen_frame_list && cmp.mar.nr_frames )
+            {
+                /*
+                 * frame_list is an input for translated guests, and an output
+                 * for untranslated guests.  Only copy in for translated guests.
+                 */
+                if ( paging_mode_translate(currd) )
+                {
+                    compat_pfn_t *compat_frame_list = (void *)xen_frame_list;
+
+                    if ( !compat_handle_okay(cmp.mar.frame_list,
+                                             cmp.mar.nr_frames) ||
+                         __copy_from_compat_offset(
+                             compat_frame_list, cmp.mar.frame_list,
+                             0, cmp.mar.nr_frames) )
+                        return -EFAULT;
+
+                    /*
+                     * Iterate backwards over compat_frame_list[] expanding
+                     * compat_pfn_t to xen_pfn_t in place.
+                     */
+                    for ( int x = cmp.mar.nr_frames - 1; x >= 0; --x )
+                        xen_frame_list[x] = compat_frame_list[x];
+                }
+            }
             break;
         }
         default:
@@ -590,8 +599,6 @@  int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
 
         case XENMEM_acquire_resource:
         {
-            const xen_pfn_t *xen_frame_list = (xen_pfn_t *)(nat.mar + 1);
-            compat_pfn_t *compat_frame_list = (compat_pfn_t *)(nat.mar + 1);
             DEFINE_XEN_GUEST_HANDLE(compat_mem_acquire_resource_t);
 
             if ( compat_handle_is_null(cmp.mar.frame_list) )
@@ -601,9 +608,18 @@  int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
                                            compat_mem_acquire_resource_t),
                          nat.mar, nr_frames) )
                     return -EFAULT;
+                break;
             }
-            else
+
+            /*
+             * frame_list is an input for translated guests, and an output for
+             * untranslated guests.  Only copy out for untranslated guests.
+             */
+            if ( !paging_mode_translate(currd) )
             {
+                const xen_pfn_t *xen_frame_list = (xen_pfn_t *)(nat.mar + 1);
+                compat_pfn_t *compat_frame_list = (compat_pfn_t *)(nat.mar + 1);
+
                 /*
                  * NOTE: the smaller compat array overwrites the native
                  *       array.
@@ -625,7 +641,6 @@  int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
                                              cmp.mar.nr_frames) )
                     return -EFAULT;
             }
-
             break;
         }