diff mbox series

[v3,14/26] memory: Add Error** argument to .log_global*() handlers

Message ID 20240304122844.1888308-15-clg@redhat.com (mailing list archive)
State New, archived
Headers show
Series migration: Improve error reporting | expand

Commit Message

Cédric Le Goater March 4, 2024, 12:28 p.m. UTC
Modify all log_global*() handlers to take an Error** parameter and
return a bool. A new MEMORY_LISTENER_CALL_LOG_GLOBAL macro looping on
the listeners is introduced to handle a possible error, which will
would interrupt the loop if necessary.

To be noted the change in memory_global_dirty_log_start() behavior as
it will return as soon as an error is detected.

Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Paul Durrant <paul@xen.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---

 Changes in v3: 

 - Fixed return value of vfio_listener_log_global_start() and
   vfio_listener_log_global_stop(). Went unnoticed because not tested.
   
 include/exec/memory.h | 15 ++++++--
 hw/i386/xen/xen-hvm.c |  6 ++--
 hw/vfio/common.c      |  8 +++--
 hw/virtio/vhost.c     |  6 ++--
 system/memory.c       | 83 +++++++++++++++++++++++++++++++++++++------
 system/physmem.c      |  5 +--
 6 files changed, 101 insertions(+), 22 deletions(-)

Comments

Peter Xu March 5, 2024, 7:57 a.m. UTC | #1
On Mon, Mar 04, 2024 at 01:28:32PM +0100, Cédric Le Goater wrote:
> @@ -2936,15 +2975,22 @@ void memory_global_dirty_log_start(unsigned int flags)
>      trace_global_dirty_changed(global_dirty_tracking);
>  
>      if (!old_flags) {
> -        MEMORY_LISTENER_CALL_GLOBAL(log_global_start, Forward);
> +        MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_start, Forward,
> +                                        &local_err);
> +        if (local_err) {
> +            error_report_err(local_err);
> +            return;

Returns here means global_dirty_tracking will keep the new value even if
it's not truly commited globally (in memory_region_transaction_commit()
later below).  I think it'll cause inconsistency: global_dirty_tracking
should reflect the global status of dirty tracking, and that should match
with the MR status cached in FlatViews (which is used in memory core to
reflect address space translations).

For some details on how that flag applied to each MR, feel free to have a
quick look in address_space_update_topology_pass() of the "else if (frold
&& frnew && flatrange_equal(frold, frnew))".

Here IIUC if to fully support a graceful failure (IIUC that is the goal for
VFIO.. and this op should be easily triggerable by the user), then we need
to do proper unwind on both:

  - Call proper log_global_stop() on those who has already been started
    successfully before the current failed log_global_start(), then,

  - Reset global_dirty_tracking to old_flags before return

We may want to make sure trace_global_dirty_changed() is only called when
all things succeeded.

I don't have a strong opinion on whether do we need similar error report
interfaces for _stop() and _log_sync().  I'd still suggest the same that we
drop them to make the patch simpler, but only add such error reports for
log_global_start().  If they never get triggered they're dead code anyway,
so I don't think "having errp for all APIs" is a must-to-have at least to me.

Thanks,

> +        }
>          memory_region_transaction_begin();
>          memory_region_update_pending = true;
>          memory_region_transaction_commit();
>      }
>  }
Cédric Le Goater March 5, 2024, 2:25 p.m. UTC | #2
On 3/5/24 08:57, Peter Xu wrote:
> On Mon, Mar 04, 2024 at 01:28:32PM +0100, Cédric Le Goater wrote:
>> @@ -2936,15 +2975,22 @@ void memory_global_dirty_log_start(unsigned int flags)
>>       trace_global_dirty_changed(global_dirty_tracking);
>>   
>>       if (!old_flags) {
>> -        MEMORY_LISTENER_CALL_GLOBAL(log_global_start, Forward);
>> +        MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_start, Forward,
>> +                                        &local_err);
>> +        if (local_err) {
>> +            error_report_err(local_err);
>> +            return;
> 
> Returns here means global_dirty_tracking will keep the new value even if
> it's not truly commited globally (in memory_region_transaction_commit()
> later below).  I think it'll cause inconsistency: global_dirty_tracking
> should reflect the global status of dirty tracking, and that should match
> with the MR status cached in FlatViews (which is used in memory core to
> reflect address space translations).

You are right. FlatRange::dirty_log_mask could be out sync with
global_dirty_tracking.

> For some details on how that flag applied to each MR, feel free to have a
> quick look in address_space_update_topology_pass() of the "else if (frold
> && frnew && flatrange_equal(frold, frnew))".>
> Here IIUC if to fully support a graceful failure (IIUC that is the goal for
> VFIO.. and this op should be easily triggerable by the user), then we need
> to do proper unwind on both:
> 
>    - Call proper log_global_stop() on those who has already been started
>      successfully before the current failed log_global_start(), then,

Yes. This needs more work to restore the initial state. The current
proposal is relying on save_cleanup() to restore the previous state.
This is not enough.

>    - Reset global_dirty_tracking to old_flags before return
> 
> We may want to make sure trace_global_dirty_changed() is only called when
> all things succeeded.

That should be done after the loop on listeners even today.

> I don't have a strong opinion on whether do we need similar error report
> interfaces for _stop() and _log_sync(). 

Yes. Let's focus on log_global_start(). The other changes are not necessary
for VFIO.

> I'd still suggest the same that we
> drop them to make the patch simpler, but only add such error reports for
> log_global_start().  If they never get triggered they're dead code anyway,
> so I don't think "having errp for all APIs" is a must-to-have at least to me.

I am fine with that.

Thanks,

C.
diff mbox series

Patch

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 8626a355b310ed7b1a1db7978ba4b394032c2f15..4bc146c5ebdd377cd14a4e462f32cc945db5a0a8 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -998,8 +998,11 @@  struct MemoryListener {
      * active at that time.
      *
      * @listener: The #MemoryListener.
+     * @errp: pointer to Error*, to store an error if it happens.
+     *
+     * Return: true on success, else false setting @errp with error.
      */
-    void (*log_global_start)(MemoryListener *listener);
+    bool (*log_global_start)(MemoryListener *listener, Error **errp);
 
     /**
      * @log_global_stop:
@@ -1009,8 +1012,11 @@  struct MemoryListener {
      * the address space.
      *
      * @listener: The #MemoryListener.
+     * @errp: pointer to Error*, to store an error if it happens.
+     *
+     * Return: true on success, else false setting @errp with error.
      */
-    void (*log_global_stop)(MemoryListener *listener);
+    bool (*log_global_stop)(MemoryListener *listener, Error **errp);
 
     /**
      * @log_global_after_sync:
@@ -1019,8 +1025,11 @@  struct MemoryListener {
      * for any #MemoryRegionSection.
      *
      * @listener: The #MemoryListener.
+     * @errp: pointer to Error*, to store an error if it happens.
+     *
+     * Return: true on success, else false setting @errp with error.
      */
-    void (*log_global_after_sync)(MemoryListener *listener);
+    bool (*log_global_after_sync)(MemoryListener *listener, Error **errp);
 
     /**
      * @eventfd_add:
diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
index f42621e6742552035122ea58092c91c3458338ff..925a207b494b4eed52d5f360b554f18ac8a9806d 100644
--- a/hw/i386/xen/xen-hvm.c
+++ b/hw/i386/xen/xen-hvm.c
@@ -446,16 +446,18 @@  static void xen_log_sync(MemoryListener *listener, MemoryRegionSection *section)
                           int128_get64(section->size));
 }
 
-static void xen_log_global_start(MemoryListener *listener)
+static bool xen_log_global_start(MemoryListener *listener, Error **errp)
 {
     if (xen_enabled()) {
         xen_in_migration = true;
     }
+    return true;
 }
 
-static void xen_log_global_stop(MemoryListener *listener)
+static bool xen_log_global_stop(MemoryListener *listener, Error **errp)
 {
     xen_in_migration = false;
+    return true;
 }
 
 static const MemoryListener xen_memory_listener = {
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index ff88c3f31ca660b3c0a790601100fdc6116192a0..71352ad6532cad02f3a2ab53bdd531d769ea55b2 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1075,7 +1075,8 @@  out:
     return ret;
 }
 
-static void vfio_listener_log_global_start(MemoryListener *listener)
+static bool vfio_listener_log_global_start(MemoryListener *listener,
+                                           Error **errp)
 {
     VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase,
                                                  listener);
@@ -1092,9 +1093,11 @@  static void vfio_listener_log_global_start(MemoryListener *listener)
                      ret, strerror(-ret));
         vfio_set_migration_error(ret);
     }
+    return !ret;
 }
 
-static void vfio_listener_log_global_stop(MemoryListener *listener)
+static bool vfio_listener_log_global_stop(MemoryListener *listener,
+                                          Error **errp)
 {
     VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase,
                                                  listener);
@@ -1111,6 +1114,7 @@  static void vfio_listener_log_global_stop(MemoryListener *listener)
                      ret, strerror(-ret));
         vfio_set_migration_error(ret);
     }
+    return !ret;
 }
 
 static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova,
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 2c9ac794680ea9b65eba6cc22e70cf141e90aa73..7a555f941934991a72a2817e5505fe0ce6d6fc64 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1044,7 +1044,7 @@  check_dev_state:
     return r;
 }
 
-static void vhost_log_global_start(MemoryListener *listener)
+static bool vhost_log_global_start(MemoryListener *listener, Error **errp)
 {
     int r;
 
@@ -1052,9 +1052,10 @@  static void vhost_log_global_start(MemoryListener *listener)
     if (r < 0) {
         abort();
     }
+    return true;
 }
 
-static void vhost_log_global_stop(MemoryListener *listener)
+static bool vhost_log_global_stop(MemoryListener *listener, Error **errp)
 {
     int r;
 
@@ -1062,6 +1063,7 @@  static void vhost_log_global_stop(MemoryListener *listener)
     if (r < 0) {
         abort();
     }
+    return true;
 }
 
 static void vhost_log_start(MemoryListener *listener,
diff --git a/system/memory.c b/system/memory.c
index a229a79988fce2aa3cb77e3a130db4c694e8cd49..af06157ead5b1272548e87f79ab9fb3036055328 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -127,6 +127,35 @@  enum ListenerDirection { Forward, Reverse };
         }                                                               \
     } while (0)
 
+#define MEMORY_LISTENER_CALL_LOG_GLOBAL(_callback, _direction, _errp,   \
+                                        _args...)                       \
+    do {                                                                \
+        MemoryListener *_listener;                                      \
+                                                                        \
+        switch (_direction) {                                           \
+        case Forward:                                                   \
+            QTAILQ_FOREACH(_listener, &memory_listeners, link) {        \
+                if (_listener->_callback) {                             \
+                    if (!_listener->_callback(_listener, _errp, ##_args)) { \
+                        break;                                          \
+                    }                                                   \
+                }                                                       \
+            }                                                           \
+            break;                                                      \
+        case Reverse:                                                   \
+            QTAILQ_FOREACH_REVERSE(_listener, &memory_listeners, link) { \
+                if (_listener->_callback) {                             \
+                    if (!_listener->_callback(_listener, _errp, ##_args)) { \
+                        break;                                          \
+                    }                                                   \
+                }                                                       \
+            }                                                           \
+            break;                                                      \
+        default:                                                        \
+            abort();                                                    \
+        };                                                              \
+    } while (0)
+
 #define MEMORY_LISTENER_CALL(_as, _callback, _direction, _section, _args...) \
     do {                                                                \
         MemoryListener *_listener;                                      \
@@ -2903,7 +2932,13 @@  void memory_global_dirty_log_sync(bool last_stage)
 
 void memory_global_after_dirty_log_sync(void)
 {
-    MEMORY_LISTENER_CALL_GLOBAL(log_global_after_sync, Forward);
+    Error *local_err = NULL;
+
+    MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_after_sync, Forward,
+                                    &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+    }
 }
 
 /*
@@ -2912,18 +2947,22 @@  void memory_global_after_dirty_log_sync(void)
  */
 static unsigned int postponed_stop_flags;
 static VMChangeStateEntry *vmstate_change;
-static void memory_global_dirty_log_stop_postponed_run(void);
+static bool memory_global_dirty_log_stop_postponed_run(Error **errp);
 
 void memory_global_dirty_log_start(unsigned int flags)
 {
     unsigned int old_flags;
+    Error *local_err = NULL;
 
     assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
 
     if (vmstate_change) {
         /* If there is postponed stop(), operate on it first */
         postponed_stop_flags &= ~flags;
-        memory_global_dirty_log_stop_postponed_run();
+        if (!memory_global_dirty_log_stop_postponed_run(&local_err)) {
+            error_report_err(local_err);
+            return;
+        }
     }
 
     flags &= ~global_dirty_tracking;
@@ -2936,15 +2975,22 @@  void memory_global_dirty_log_start(unsigned int flags)
     trace_global_dirty_changed(global_dirty_tracking);
 
     if (!old_flags) {
-        MEMORY_LISTENER_CALL_GLOBAL(log_global_start, Forward);
+        MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_start, Forward,
+                                        &local_err);
+        if (local_err) {
+            error_report_err(local_err);
+            return;
+        }
         memory_region_transaction_begin();
         memory_region_update_pending = true;
         memory_region_transaction_commit();
     }
 }
 
-static void memory_global_dirty_log_do_stop(unsigned int flags)
+static bool memory_global_dirty_log_do_stop(unsigned int flags, Error **errp)
 {
+    ERRP_GUARD();
+
     assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
     assert((global_dirty_tracking & flags) == flags);
     global_dirty_tracking &= ~flags;
@@ -2955,39 +3001,49 @@  static void memory_global_dirty_log_do_stop(unsigned int flags)
         memory_region_transaction_begin();
         memory_region_update_pending = true;
         memory_region_transaction_commit();
-        MEMORY_LISTENER_CALL_GLOBAL(log_global_stop, Reverse);
+        MEMORY_LISTENER_CALL_LOG_GLOBAL(log_global_stop, Reverse, errp);
     }
+    return !*errp;
 }
 
 /*
  * Execute the postponed dirty log stop operations if there is, then reset
  * everything (including the flags and the vmstate change hook).
  */
-static void memory_global_dirty_log_stop_postponed_run(void)
+static bool memory_global_dirty_log_stop_postponed_run(Error **errp)
 {
+    bool ret = true;
+
     /* This must be called with the vmstate handler registered */
     assert(vmstate_change);
 
     /* Note: postponed_stop_flags can be cleared in log start routine */
     if (postponed_stop_flags) {
-        memory_global_dirty_log_do_stop(postponed_stop_flags);
+        ret = memory_global_dirty_log_do_stop(postponed_stop_flags, errp);
         postponed_stop_flags = 0;
     }
 
     qemu_del_vm_change_state_handler(vmstate_change);
     vmstate_change = NULL;
+    return ret;
 }
 
 static void memory_vm_change_state_handler(void *opaque, bool running,
                                            RunState state)
 {
+    Error *local_err = NULL;
+
     if (running) {
-        memory_global_dirty_log_stop_postponed_run();
+        if (!memory_global_dirty_log_stop_postponed_run(&local_err)) {
+            error_report_err(local_err);
+        }
     }
 }
 
 void memory_global_dirty_log_stop(unsigned int flags)
 {
+    Error *local_err = NULL;
+
     if (!runstate_is_running()) {
         /* Postpone the dirty log stop, e.g., to when VM starts again */
         if (vmstate_change) {
@@ -3001,7 +3057,9 @@  void memory_global_dirty_log_stop(unsigned int flags)
         return;
     }
 
-    memory_global_dirty_log_do_stop(flags);
+    if (!memory_global_dirty_log_do_stop(flags, &local_err)) {
+        error_report_err(local_err);
+    }
 }
 
 static void listener_add_address_space(MemoryListener *listener,
@@ -3009,13 +3067,16 @@  static void listener_add_address_space(MemoryListener *listener,
 {
     FlatView *view;
     FlatRange *fr;
+    Error *local_err = NULL;
 
     if (listener->begin) {
         listener->begin(listener);
     }
     if (global_dirty_tracking) {
         if (listener->log_global_start) {
-            listener->log_global_start(listener);
+            if (!listener->log_global_start(listener, &local_err)) {
+                error_report_err(local_err);
+            }
         }
     }
 
diff --git a/system/physmem.c b/system/physmem.c
index e3ebc19eefd8050a1dee16e3d1449f0c144f751f..9adbf9aea847cd80bdac6dca466fb476844ac048 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -148,7 +148,7 @@  typedef struct subpage_t {
 
 static void io_mem_init(void);
 static void memory_map_init(void);
-static void tcg_log_global_after_sync(MemoryListener *listener);
+static bool tcg_log_global_after_sync(MemoryListener *listener, Error **errp);
 static void tcg_commit(MemoryListener *listener);
 
 /**
@@ -2475,7 +2475,7 @@  static void do_nothing(CPUState *cpu, run_on_cpu_data d)
 {
 }
 
-static void tcg_log_global_after_sync(MemoryListener *listener)
+static bool tcg_log_global_after_sync(MemoryListener *listener, Error **errp)
 {
     CPUAddressSpace *cpuas;
 
@@ -2507,6 +2507,7 @@  static void tcg_log_global_after_sync(MemoryListener *listener)
         cpuas = container_of(listener, CPUAddressSpace, tcg_as_listener);
         run_on_cpu(cpuas->cpu, do_nothing, RUN_ON_CPU_NULL);
     }
+    return true;
 }
 
 static void tcg_commit_cpu(CPUState *cpu, run_on_cpu_data data)