diff mbox series

[v3,2/8] domain: update GADDR based runstate guest area

Message ID 65893a9e-e2ae-6853-7c4d-54f2bf19b17b@suse.com (mailing list archive)
State New, archived
Headers show
Series runstate/time area registration by (guest) physical address | expand

Commit Message

Jan Beulich May 3, 2023, 3:55 p.m. UTC
Before adding a new vCPU operation to register the runstate area by
guest-physical address, add code to actually keep such areas up-to-date.

Note that updating of the area will be done exclusively following the
model enabled by VMASST_TYPE_runstate_update_flag for virtual-address
based registered areas.

Note further that pages aren't marked dirty when written to (matching
the handling of space mapped by map_vcpu_info()), on the basis that the
registrations are lost anyway across migration (or would need re-
populating at the target for transparent migration). Plus the contents
of the areas in question have to be deemed volatile in the first place
(so saving a "most recent" value is pretty meaningless even for e.g.
snapshotting).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
RFC: HVM guests (on x86) can change bitness and hence layout (and size!
     and alignment) of the runstate area. I don't think it is an option
     to require 32-bit code to pass a range such that even the 64-bit
     layout wouldn't cross a page boundary (and be suitably aligned). I
     also don't see any other good solution, so for now a crude approach
     with an extra boolean is used (using has_32bit_shinfo() isn't race
     free and could hence lead to overrunning the mapped space).
---
v3: Use assignment instead of memcpy().
v2: Drop VM-assist conditionals.

Comments

Roger Pau Monné Sept. 27, 2023, 9:44 a.m. UTC | #1
On Wed, May 03, 2023 at 05:55:11PM +0200, Jan Beulich wrote:
> Before adding a new vCPU operation to register the runstate area by
> guest-physical address, add code to actually keep such areas up-to-date.
> 
> Note that updating of the area will be done exclusively following the
> model enabled by VMASST_TYPE_runstate_update_flag for virtual-address
> based registered areas.
> 
> Note further that pages aren't marked dirty when written to (matching
> the handling of space mapped by map_vcpu_info()), on the basis that the
> registrations are lost anyway across migration (or would need re-
> populating at the target for transparent migration). Plus the contents
> of the areas in question have to be deemed volatile in the first place
> (so saving a "most recent" value is pretty meaningless even for e.g.
> snapshotting).
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

> ---
> RFC: HVM guests (on x86) can change bitness and hence layout (and size!
>      and alignment) of the runstate area. I don't think it is an option
>      to require 32-bit code to pass a range such that even the 64-bit
>      layout wouldn't cross a page boundary (and be suitably aligned). I
>      also don't see any other good solution, so for now a crude approach
>      with an extra boolean is used (using has_32bit_shinfo() isn't race
>      free and could hence lead to overrunning the mapped space).

Shouldn't a well behaved guest attempt to unmap the runstate area
before changing bitness?  I would expect this to happen for example
when OVMF relinquishes control to the OS.

Thanks, Roger.
Jan Beulich Sept. 27, 2023, 10:19 a.m. UTC | #2
On 27.09.2023 11:44, Roger Pau Monné wrote:
> On Wed, May 03, 2023 at 05:55:11PM +0200, Jan Beulich wrote:
>> Before adding a new vCPU operation to register the runstate area by
>> guest-physical address, add code to actually keep such areas up-to-date.
>>
>> Note that updating of the area will be done exclusively following the
>> model enabled by VMASST_TYPE_runstate_update_flag for virtual-address
>> based registered areas.
>>
>> Note further that pages aren't marked dirty when written to (matching
>> the handling of space mapped by map_vcpu_info()), on the basis that the
>> registrations are lost anyway across migration (or would need re-
>> populating at the target for transparent migration). Plus the contents
>> of the areas in question have to be deemed volatile in the first place
>> (so saving a "most recent" value is pretty meaningless even for e.g.
>> snapshotting).
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

>> ---
>> RFC: HVM guests (on x86) can change bitness and hence layout (and size!
>>      and alignment) of the runstate area. I don't think it is an option
>>      to require 32-bit code to pass a range such that even the 64-bit
>>      layout wouldn't cross a page boundary (and be suitably aligned). I
>>      also don't see any other good solution, so for now a crude approach
>>      with an extra boolean is used (using has_32bit_shinfo() isn't race
>>      free and could hence lead to overrunning the mapped space).
> 
> Shouldn't a well behaved guest attempt to unmap the runstate area
> before changing bitness?  I would expect this to happen for example
> when OVMF relinquishes control to the OS.

Well, the OVMF example falls in a different class: Before transferring
ownership of the system, it wants to unmap regardless of whether bitness
is going to change, or else memory corruption can occur to the following
entity.

Jan
diff mbox series

Patch

--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1615,15 +1615,52 @@  bool update_runstate_area(struct vcpu *v
     bool rc;
     struct guest_memory_policy policy = { };
     void __user *guest_handle = NULL;
-    struct vcpu_runstate_info runstate;
+    struct vcpu_runstate_info runstate = v->runstate;
+    struct vcpu_runstate_info *map = v->runstate_guest_area.map;
+
+    if ( map )
+    {
+        uint64_t *pset;
+#ifdef CONFIG_COMPAT
+        struct compat_vcpu_runstate_info *cmap = NULL;
+
+        if ( v->runstate_guest_area_compat )
+            cmap = (void *)map;
+#endif
+
+        /*
+         * NB: No VM_ASSIST(v->domain, runstate_update_flag) check here.
+         *     Always using that updating model.
+         */
+#ifdef CONFIG_COMPAT
+        if ( cmap )
+            pset = &cmap->state_entry_time;
+        else
+#endif
+            pset = &map->state_entry_time;
+        runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
+        write_atomic(pset, runstate.state_entry_time);
+        smp_wmb();
+
+#ifdef CONFIG_COMPAT
+        if ( cmap )
+            XLAT_vcpu_runstate_info(cmap, &runstate);
+        else
+#endif
+            *map = runstate;
+
+        smp_wmb();
+        runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
+        write_atomic(pset, runstate.state_entry_time);
+
+        return true;
+    }
 
     if ( guest_handle_is_null(runstate_guest(v)) )
         return true;
 
     update_guest_memory_policy(v, &policy);
 
-    memcpy(&runstate, &v->runstate, sizeof(runstate));
-
     if ( VM_ASSIST(v->domain, runstate_update_flag) )
     {
 #ifdef CONFIG_COMPAT
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -231,6 +231,8 @@  struct vcpu
 #ifdef CONFIG_COMPAT
     /* A hypercall is using the compat ABI? */
     bool             hcall_compat;
+    /* Physical runstate area registered via compat ABI? */
+    bool             runstate_guest_area_compat;
 #endif
 
 #ifdef CONFIG_IOREQ_SERVER