[05/11] migration: move vm_old_running into global state

Message ID 20180103054043.25719-6-peterx@redhat.com (mailing list archive)
State New, archived

Commit Message

Peter Xu Jan. 3, 2018, 5:40 a.m. UTC
Previously, old_vm_running was passed around by pointer.  Let's just move
it into MigrationState, like many other variables that track migration state.

One thing to mention is that for postcopy, we actually don't need this
knowledge at all since postcopy can't resume a VM even if it fails (we
can see that from the old code too: when we try to resume we also check
against "entered_postcopy" variable).  So further we do this:

- in postcopy_start(), we don't update old_vm_running, since the value
  would never be used
- in migration_thread(), we don't need to check entered_postcopy when
  resuming, since the flag is only used for precopy.

Document this in a comment on the variable definition too.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.c | 17 +++++++----------
 migration/migration.h |  6 ++++++
 2 files changed, 13 insertions(+), 10 deletions(-)
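
The shape of the change can be sketched in isolation.  This is only a
minimal illustration, not the QEMU code: DemoState, demo_guest_running and
both demo_completion_*() helpers are hypothetical stand-ins for
MigrationState, runstate_is_running() and migration_completion().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for MigrationState; only the relevant field. */
typedef struct DemoState {
    bool old_vm_running;
} DemoState;

/* Hypothetical stand-in for runstate_is_running(); a plain flag here. */
static bool demo_guest_running = true;

/* Before: the caller owns the flag and threads a pointer through. */
static void demo_completion_before(DemoState *s, bool *old_vm_running)
{
    (void)s;
    assert(old_vm_running != NULL);
    *old_vm_running = demo_guest_running;
}

/* After: the flag lives in the state object, so no extra parameter. */
static void demo_completion_after(DemoState *s)
{
    assert(s != NULL);
    s->old_vm_running = demo_guest_running;
}
```

With the second form, any code that already holds the state pointer can
read the flag without a wider function signature.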

Comments

Juan Quintela Jan. 3, 2018, 9:05 a.m. UTC | #1
Peter Xu <peterx@redhat.com> wrote:
> Firstly, it was passed around.  Let's just move it into MigrationState
> just like many other variables as state of migration.
>
> One thing to mention is that for postcopy, we actually don't need this
> knowledge at all since postcopy can't resume a VM even if it fails (we
> can see that from the old code too: when we try to resume we also check
> against "entered_postcopy" variable).  So further we do this:
>
> - in postcopy_start(), we don't update vm_old_running since useless
> - in migration_thread(), we don't need to check entered_postcopy when
>   resume, since it's only used for precopy.
>
> Comment this out too for that variable definition.

Reviewed-by: Juan Quintela <quintela@redhat.com>

But I wonder if we can come up with a better name.  The best one that I
can think of is
   vm_was_running

Every other name that I came up with is bad for either precopy or colo.

e.g. restart_vm_on_cancel_error

is meaningful for precopy, but not for colo.

> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  migration/migration.c | 17 +++++++----------
>  migration/migration.h |  6 ++++++
>  2 files changed, 13 insertions(+), 10 deletions(-)
>
> diff --git a/migration/migration.h b/migration/migration.h
> index ac74a12713..0f5df2367c 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -111,6 +111,12 @@ struct MigrationState
>      int64_t expected_downtime;
>      bool enabled_capabilities[MIGRATION_CAPABILITY__MAX];
>      int64_t setup_time;
> +    /*
> +     * Whether the old VM is running for the last migration.  This is
> +     * used to resume the VM when precopy failed or cancelled somehow.
> +     * It's never used for postcopy.
> +     */
> +    bool old_vm_running;

I think this comment is right for precopy, but not for colo.  BTW, I
think that I would put the postcopy comment on its use, not here.

/me tries to improve the comment

  Guest was running when we enter the completion stage.  If migration don't
  sucess, we need to continue running guest on source.

What do you think?

Later, Juan.


>  
>      /* Flag set once the migration has been asked to enter postcopy */
>      bool start_postcopy;
Peter Xu Jan. 3, 2018, 9:20 a.m. UTC | #2
On Wed, Jan 03, 2018 at 10:05:07AM +0100, Juan Quintela wrote:
> Peter Xu <peterx@redhat.com> wrote:
> > Firstly, it was passed around.  Let's just move it into MigrationState
> > just like many other variables as state of migration.
> >
> > One thing to mention is that for postcopy, we actually don't need this
> > knowledge at all since postcopy can't resume a VM even if it fails (we
> > can see that from the old code too: when we try to resume we also check
> > against "entered_postcopy" variable).  So further we do this:
> >
> > - in postcopy_start(), we don't update vm_old_running since useless
> > - in migration_thread(), we don't need to check entered_postcopy when
> >   resume, since it's only used for precopy.
> >
> > Comment this out too for that variable definition.
> 
> Reviewed-by: Juan Quintela <quintela@redhat.com>

Taken.

> 
> But I wonder if we can came with a better name.  Best one that I can
> think is
>    vm_was_running
> 
> Any other name that I came is bad for precopy or colo.
> 
> i.e. restart_vm_on_cancel_error
> 
> is meaningful for precopy, but not for colo.

Yeah, I actually spent >10 seconds thinking about a better name but
failed, so the old one was kept.  I'll use vm_was_running.

> 
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  migration/migration.c | 17 +++++++----------
> >  migration/migration.h |  6 ++++++
> >  2 files changed, 13 insertions(+), 10 deletions(-)
> >
> > diff --git a/migration/migration.h b/migration/migration.h
> > index ac74a12713..0f5df2367c 100644
> > --- a/migration/migration.h
> > +++ b/migration/migration.h
> > @@ -111,6 +111,12 @@ struct MigrationState
> >      int64_t expected_downtime;
> >      bool enabled_capabilities[MIGRATION_CAPABILITY__MAX];
> >      int64_t setup_time;
> > +    /*
> > +     * Whether the old VM is running for the last migration.  This is
> > +     * used to resume the VM when precopy failed or cancelled somehow.
> > +     * It's never used for postcopy.
> > +     */
> > +    bool old_vm_running;
> 
> I think this comment is right for precopy, but not for colo.  BTW, I
> think that I would put the postcopy comment on its use, not here.

Or, how about I just don't mention postcopy at all?

> 
> /me tries to improve the comment
> 
>   Guest was running when we enter the completion stage.  If migration don't
>   sucess, we need to continue running guest on source.
> 
> What do you think?

I think it's generally good.  Maybe a tiny fix like:

  s/Guest was/Whether guest was/
  s/If migration don't sucess/If migration failed/

?  Thanks,
Juan Quintela Jan. 3, 2018, 10:26 a.m. UTC | #3
Peter Xu <peterx@redhat.com> wrote:
> On Wed, Jan 03, 2018 at 10:05:07AM +0100, Juan Quintela wrote:
>> Peter Xu <peterx@redhat.com> wrote:
>> > Firstly, it was passed around.  Let's just move it into MigrationState
>> > just like many other variables as state of migration.
>> >
>> > One thing to mention is that for postcopy, we actually don't need this
>> > knowledge at all since postcopy can't resume a VM even if it fails (we
>> > can see that from the old code too: when we try to resume we also check
>> > against "entered_postcopy" variable).  So further we do this:
>> >
>> > - in postcopy_start(), we don't update vm_old_running since useless
>> > - in migration_thread(), we don't need to check entered_postcopy when
>> >   resume, since it's only used for precopy.
>> >
>> > Comment this out too for that variable definition.

>> I think this comment is right for precopy, but not for colo.  BTW, I
>> think that I would put the postcopy comment on its use, not here.
>
> Or, how about I just don't mention postcopy at all?

Fully agree.
>> 
>> /me tries to improve the comment
>> 
>>   Guest was running when we enter the completion stage.  If migration don't
>>   sucess, we need to continue running guest on source.
>> 
>> What do you think?
>
> I think it's generally good.  Maybe a tiny fix like:
>
>   s/Guest was/Whether guest was/

ok.

>   s/If migration don't sucess/If migration failed/

We also use it in the case of migration_cancel.  Cancel is not an error,
which is why I wrote it that way.  What about:

Whether guest was running when we enter the completion stage.  If
migration is interrupted by any reason, we need to continue running the
guest on source.

What do you think?

Later, Juan.


> ?  Thanks,
Peter Xu Jan. 3, 2018, 10:40 a.m. UTC | #4
On Wed, Jan 03, 2018 at 11:26:12AM +0100, Juan Quintela wrote:
> Peter Xu <peterx@redhat.com> wrote:
> > On Wed, Jan 03, 2018 at 10:05:07AM +0100, Juan Quintela wrote:
> >> Peter Xu <peterx@redhat.com> wrote:
> >> > Firstly, it was passed around.  Let's just move it into MigrationState
> >> > just like many other variables as state of migration.
> >> >
> >> > One thing to mention is that for postcopy, we actually don't need this
> >> > knowledge at all since postcopy can't resume a VM even if it fails (we
> >> > can see that from the old code too: when we try to resume we also check
> >> > against "entered_postcopy" variable).  So further we do this:
> >> >
> >> > - in postcopy_start(), we don't update vm_old_running since useless
> >> > - in migration_thread(), we don't need to check entered_postcopy when
> >> >   resume, since it's only used for precopy.
> >> >
> >> > Comment this out too for that variable definition.
> 
> >> I think this comment is right for precopy, but not for colo.  BTW, I
> >> think that I would put the postcopy comment on its use, not here.
> >
> > Or, how about I just don't mention postcopy at all?
> 
> Fully agree.
> >> 
> >> /me tries to improve the comment
> >> 
> >>   Guest was running when we enter the completion stage.  If migration don't
> >>   sucess, we need to continue running guest on source.
> >> 
> >> What do you think?
> >
> > I think it's generally good.  Maybe a tiny fix like:
> >
> >   s/Guest was/Whether guest was/
> 
> ok.
> 
> >   s/If migration don't sucess/If migration failed/
> 
> We also use it in case of migration_cancel.  Cancel is not one error,
> that is why I wrote it that way.  What about:
> 
> Whether guest was running when we enter the completion stage.  If
> migration is interrupted by any reason, we need to continue running the
> guest on source.
> 
> What do you think?

Sure.  I was mostly trying to fix the typo and grammar issue.  Will
take your advice.
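
Putting the points agreed in this thread together (the vm_was_running name,
no mention of postcopy, and the comment wording from #3), the field and its
use might look roughly like the sketch below.  This is an assumption about
the follow-up, not the actual next revision of the patch:
SketchMigrationState, SketchOutcome and sketch_should_resume() are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cut-down state struct; the real one has many more fields. */
typedef struct SketchMigrationState {
    /*
     * Whether guest was running when we enter the completion stage.
     * If migration is interrupted by any reason, we need to continue
     * running the guest on source.
     */
    bool vm_was_running;
} SketchMigrationState;

/* Hypothetical outcomes; cancel is kept distinct from failure. */
typedef enum {
    SKETCH_COMPLETED,
    SKETCH_FAILED,
    SKETCH_CANCELLED
} SketchOutcome;

/* Returns true when the source guest should be started again. */
static bool sketch_should_resume(const SketchMigrationState *s,
                                 SketchOutcome outcome)
{
    assert(s != NULL);
    return outcome != SKETCH_COMPLETED && s->vm_was_running;
}
```

Treating cancel and failure alike here mirrors the point made in #3:
cancel is not an error, but the source guest still has to keep running.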
Patch

diff --git a/migration/migration.c b/migration/migration.c
index b684c2005d..62b3766852 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1272,6 +1272,7 @@  MigrationState *migrate_init(void)
 
     s->mig_start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
     s->mig_total_time = 0;
+    s->old_vm_running = false;
     return s;
 }
 
@@ -1846,7 +1847,7 @@  static int await_return_path_close_on_source(MigrationState *ms)
  * Switch from normal iteration to postcopy
  * Returns non-0 on error
  */
-static int postcopy_start(MigrationState *ms, bool *old_vm_running)
+static int postcopy_start(MigrationState *ms)
 {
     int ret;
     QIOChannelBuffer *bioc;
@@ -1864,7 +1865,6 @@  static int postcopy_start(MigrationState *ms, bool *old_vm_running)
     trace_postcopy_start_set_run();
 
     qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
-    *old_vm_running = runstate_is_running();
     global_state_store();
     ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
     if (ret < 0) {
@@ -2055,11 +2055,9 @@  static int migration_maybe_pause(MigrationState *s,
  *
  * @s: Current migration state
  * @current_active_state: The migration state we expect to be in
- * @*old_vm_running: Pointer to old_vm_running flag
  * @*start_time: Pointer to time to update
  */
 static void migration_completion(MigrationState *s, int current_active_state,
-                                 bool *old_vm_running,
                                  int64_t *start_time)
 {
     int ret;
@@ -2068,7 +2066,7 @@  static void migration_completion(MigrationState *s, int current_active_state,
         qemu_mutex_lock_iothread();
         *start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
         qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
-        *old_vm_running = runstate_is_running();
+        s->old_vm_running = runstate_is_running();
         ret = global_state_store();
 
         if (!ret) {
@@ -2174,7 +2172,6 @@  static void *migration_thread(void *opaque)
     int64_t threshold_size = 0;
     int64_t start_time = initial_time;
     int64_t end_time;
-    bool old_vm_running = false;
     bool entered_postcopy = false;
     /* The active state we expect to be in; ACTIVE or POSTCOPY_ACTIVE */
     enum MigrationStatus current_active_state = MIGRATION_STATUS_ACTIVE;
@@ -2233,7 +2230,7 @@  static void *migration_thread(void *opaque)
                     pend_nonpost <= threshold_size &&
                     atomic_read(&s->start_postcopy)) {
 
-                    if (!postcopy_start(s, &old_vm_running)) {
+                    if (!postcopy_start(s)) {
                         current_active_state = MIGRATION_STATUS_POSTCOPY_ACTIVE;
                         entered_postcopy = true;
                     }
@@ -2245,7 +2242,7 @@  static void *migration_thread(void *opaque)
             } else {
                 trace_migration_thread_low_pending(pending_size);
                 migration_completion(s, current_active_state,
-                                     &old_vm_running, &start_time);
+                                     &start_time);
                 break;
             }
         }
@@ -2311,9 +2308,9 @@  static void *migration_thread(void *opaque)
             * Fixme: we will run VM in COLO no matter its old running state.
             * After exited COLO, we will keep running.
             */
-            old_vm_running = true;
+            s->old_vm_running = true;
         }
-        if (old_vm_running && !entered_postcopy) {
+        if (s->old_vm_running) {
             vm_start();
         } else {
             if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
diff --git a/migration/migration.h b/migration/migration.h
index ac74a12713..0f5df2367c 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -111,6 +111,12 @@  struct MigrationState
     int64_t expected_downtime;
     bool enabled_capabilities[MIGRATION_CAPABILITY__MAX];
     int64_t setup_time;
+    /*
+     * Whether the old VM is running for the last migration.  This is
+     * used to resume the VM when precopy failed or cancelled somehow.
+     * It's never used for postcopy.
+     */
+    bool old_vm_running;
 
     /* Flag set once the migration has been asked to enter postcopy */
     bool start_postcopy;