diff mbox series

[v3,3/8] migration/savevm: Allow immutable device state to be migrated early (i.e., before RAM)

Message ID 20230112164403.105085-4-david@redhat.com (mailing list archive)
State New, archived
Series virtio-mem: Handle preallocation with migration

Commit Message

David Hildenbrand Jan. 12, 2023, 4:43 p.m. UTC
For virtio-mem, we want to have the plugged/unplugged state of memory
blocks available before migrating any actual RAM content, and perform
sanity checks before touching anything on the destination. This
information is immutable on the migration source while migration is active.

We want to use this information for proper preallocation support with
migration: currently, we don't preallocate memory on the migration target,
and especially with hugetlb, we can easily run out of hugetlb pages during
RAM migration and will crash (SIGBUS) instead of catching this gracefully
via preallocation.

Migrating device state via a vmsd before we start iterating is currently
impossible: the only approach that would be possible is avoiding a vmsd
and migrating state manually during save_setup(), to be restored during
load_state().

Let's allow for migrating device state via a vmsd early, during the
setup phase in qemu_savevm_state_setup(). To keep it simple, we
indicate applicable vmsd's using an "immutable" flag.

Note that only very selected devices (i.e., ones seriously messing with
RAM setup) are supposed to make use of such early state migration.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/migration/vmstate.h |  5 +++++
 migration/savevm.c          | 14 ++++++++++++++
 2 files changed, 19 insertions(+)

Comments

Dr. David Alan Gilbert Jan. 12, 2023, 5:56 p.m. UTC | #1
* David Hildenbrand (david@redhat.com) wrote:
> For virtio-mem, we want to have the plugged/unplugged state of memory
> blocks available before migrating any actual RAM content, and perform
> sanity checks before touching anything on the destination. This
> information is immutable on the migration source while migration is active.
> 
> We want to use this information for proper preallocation support with
> migration: currently, we don't preallocate memory on the migration target,
> and especially with hugetlb, we can easily run out of hugetlb pages during
> RAM migration and will crash (SIGBUS) instead of catching this gracefully
> via preallocation.
> 
> Migrating device state via a vmsd before we start iterating is currently
> impossible: the only approach that would be possible is avoiding a vmsd
> and migrating state manually during save_setup(), to be restored during
> load_state().
> 
> Let's allow for migrating device state via a vmsd early, during the
> setup phase in qemu_savevm_state_setup(). To keep it simple, we
> indicate applicable vmsd's using an "immutable" flag.
> 
> Note that only very selected devices (i.e., ones seriously messing with
> RAM setup) are supposed to make use of such early state migration.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  include/migration/vmstate.h |  5 +++++
>  migration/savevm.c          | 14 ++++++++++++++
>  2 files changed, 19 insertions(+)
> 
> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> index ad24aa1934..dd06c3abad 100644
> --- a/include/migration/vmstate.h
> +++ b/include/migration/vmstate.h
> @@ -179,6 +179,11 @@ struct VMStateField {
>  struct VMStateDescription {
>      const char *name;
>      int unmigratable;
> +    /*
> +     * The state is immutable while migration is active and is saved
> +     * during the setup phase, to be restored early on the destination.
> +     */
> +    int immutable;

A bool would be nicer (as it would for unmigratable above)

>      int version_id;
>      int minimum_version_id;
>      MigrationPriority priority;
> diff --git a/migration/savevm.c b/migration/savevm.c
> index ff2b8d0064..536d6f662b 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1200,6 +1200,15 @@ void qemu_savevm_state_setup(QEMUFile *f)
>  
>      trace_savevm_state_setup();
>      QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
> +        if (se->vmsd && se->vmsd->immutable) {
> +            ret = vmstate_save(f, se, ms->vmdesc);
> +            if (ret) {
> +                qemu_file_set_error(f, ret);
> +                break;
> +            }
> +            continue;
> +        }
> +

Does this give you the ordering you want? i.e. there's no guarantee here
that immutables come first?

Dave


>          if (!se->ops || !se->ops->save_setup) {
>              continue;
>          }
> @@ -1402,6 +1411,11 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
>      int ret;
>  
>      QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
> +        if (se->vmsd && se->vmsd->immutable) {
> +            /* Already saved during qemu_savevm_state_setup(). */
> +            continue;
> +        }
> +
>          ret = vmstate_save(f, se, vmdesc);
>          if (ret) {
>              qemu_file_set_error(f, ret);
> -- 
> 2.39.0
>
David Hildenbrand Jan. 12, 2023, 6:21 p.m. UTC | #2
On 12.01.23 18:56, Dr. David Alan Gilbert wrote:
> * David Hildenbrand (david@redhat.com) wrote:
>> For virtio-mem, we want to have the plugged/unplugged state of memory
>> blocks available before migrating any actual RAM content, and perform
>> sanity checks before touching anything on the destination. This
>> information is immutable on the migration source while migration is active.
>>
>> We want to use this information for proper preallocation support with
>> migration: currently, we don't preallocate memory on the migration target,
>> and especially with hugetlb, we can easily run out of hugetlb pages during
>> RAM migration and will crash (SIGBUS) instead of catching this gracefully
>> via preallocation.
>>
>> Migrating device state via a vmsd before we start iterating is currently
>> impossible: the only approach that would be possible is avoiding a vmsd
>> and migrating state manually during save_setup(), to be restored during
>> load_state().
>>
>> Let's allow for migrating device state via a vmsd early, during the
>> setup phase in qemu_savevm_state_setup(). To keep it simple, we
>> indicate applicable vmsd's using an "immutable" flag.
>>
>> Note that only very selected devices (i.e., ones seriously messing with
>> RAM setup) are supposed to make use of such early state migration.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>   include/migration/vmstate.h |  5 +++++
>>   migration/savevm.c          | 14 ++++++++++++++
>>   2 files changed, 19 insertions(+)
>>
>> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
>> index ad24aa1934..dd06c3abad 100644
>> --- a/include/migration/vmstate.h
>> +++ b/include/migration/vmstate.h
>> @@ -179,6 +179,11 @@ struct VMStateField {
>>   struct VMStateDescription {
>>       const char *name;
>>       int unmigratable;
>> +    /*
>> +     * The state is immutable while migration is active and is saved
>> +     * during the setup phase, to be restored early on the destination.
>> +     */
>> +    int immutable;
> 
> A bool would be nicer (as it would for unmigratable above)

Yes, I chose an int for consistency with "unmigratable". I can turn that 
into a bool.

I'd even include a cleanup patch for unmigratable if it wouldn't be ...

$ git grep "unmigratable \=" | wc -l
29

> 
>>       int version_id;
>>       int minimum_version_id;
>>       MigrationPriority priority;
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index ff2b8d0064..536d6f662b 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -1200,6 +1200,15 @@ void qemu_savevm_state_setup(QEMUFile *f)
>>   
>>       trace_savevm_state_setup();
>>       QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
>> +        if (se->vmsd && se->vmsd->immutable) {
>> +            ret = vmstate_save(f, se, ms->vmdesc);
>> +            if (ret) {
>> +                qemu_file_set_error(f, ret);
>> +                break;
>> +            }
>> +            continue;
>> +        }
>> +
> 
> Does this give you the ordering you want? i.e. there's no guarantee here
> that immutables come first?

Yes, for virtio-mem at least this is fine. There are no real ordering 
requirements in regard to save_setup().

I guess one could use vmstate priorities to affect the ordering, if 
required.

So for my use case this is good enough, any suggestions? Thanks.
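
A minimal sketch of the priority idea mentioned above, assuming handlers are kept in a list sorted by descending priority with registration order preserved among equals. DemoHandler and demo_insert_by_priority() are hypothetical stand-ins for illustration, not QEMU's actual types or functions:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a registered save handler. */
typedef struct DemoHandler {
    const char *name;
    int priority;   /* higher values are processed earlier */
} DemoHandler;

/*
 * Insert h into list[0..*n) so that the list stays sorted by descending
 * priority. Entries with equal priority keep their registration order.
 * The caller must make sure the array has room for one more element.
 */
void demo_insert_by_priority(DemoHandler *list, size_t *n, DemoHandler h)
{
    size_t pos = *n;

    /* Shift strictly lower-priority entries one slot to the right. */
    while (pos > 0 && list[pos - 1].priority < h.priority) {
        list[pos] = list[pos - 1];
        pos--;
    }
    list[pos] = h;
    (*n)++;
}
```

With such a list, a single QTAILQ_FOREACH-style walk visits higher-priority entries first, so an early vmsd that must precede another entry would only need a higher priority.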
Dr. David Alan Gilbert Jan. 12, 2023, 7:52 p.m. UTC | #3
* David Hildenbrand (david@redhat.com) wrote:
> On 12.01.23 18:56, Dr. David Alan Gilbert wrote:
> > * David Hildenbrand (david@redhat.com) wrote:
> > > For virtio-mem, we want to have the plugged/unplugged state of memory
> > > blocks available before migrating any actual RAM content, and perform
> > > sanity checks before touching anything on the destination. This
> > > information is immutable on the migration source while migration is active.
> > > 
> > > We want to use this information for proper preallocation support with
> > > migration: currently, we don't preallocate memory on the migration target,
> > > and especially with hugetlb, we can easily run out of hugetlb pages during
> > > RAM migration and will crash (SIGBUS) instead of catching this gracefully
> > > via preallocation.
> > > 
> > > Migrating device state via a vmsd before we start iterating is currently
> > > impossible: the only approach that would be possible is avoiding a vmsd
> > > and migrating state manually during save_setup(), to be restored during
> > > load_state().
> > > 
> > > Let's allow for migrating device state via a vmsd early, during the
> > > setup phase in qemu_savevm_state_setup(). To keep it simple, we
> > > indicate applicable vmsd's using an "immutable" flag.
> > > 
> > > Note that only very selected devices (i.e., ones seriously messing with
> > > RAM setup) are supposed to make use of such early state migration.
> > > 
> > > Signed-off-by: David Hildenbrand <david@redhat.com>
> > > ---
> > >   include/migration/vmstate.h |  5 +++++
> > >   migration/savevm.c          | 14 ++++++++++++++
> > >   2 files changed, 19 insertions(+)
> > > 
> > > diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> > > index ad24aa1934..dd06c3abad 100644
> > > --- a/include/migration/vmstate.h
> > > +++ b/include/migration/vmstate.h
> > > @@ -179,6 +179,11 @@ struct VMStateField {
> > >   struct VMStateDescription {
> > >       const char *name;
> > >       int unmigratable;
> > > +    /*
> > > +     * The state is immutable while migration is active and is saved
> > > +     * during the setup phase, to be restored early on the destination.
> > > +     */
> > > +    int immutable;
> > 
> > A bool would be nicer (as it would for unmigratable above)
> 
> Yes, I chose an int for consistency with "unmigratable". I can turn that
> into a bool.
> 
> I'd even include a cleanup patch for unmigratable if it wouldn't be ...
> 
> $ git grep "unmigratable \=" | wc -l
> 29

It might be OK if you just change the declaration; I mean '1' is pretty
close to true? (I think...)
Anyway, at least make the new one a bool.

> > >       int version_id;
> > >       int minimum_version_id;
> > >       MigrationPriority priority;
> > > diff --git a/migration/savevm.c b/migration/savevm.c
> > > index ff2b8d0064..536d6f662b 100644
> > > --- a/migration/savevm.c
> > > +++ b/migration/savevm.c
> > > @@ -1200,6 +1200,15 @@ void qemu_savevm_state_setup(QEMUFile *f)
> > >       trace_savevm_state_setup();
> > >       QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
> > > +        if (se->vmsd && se->vmsd->immutable) {
> > > +            ret = vmstate_save(f, se, ms->vmdesc);
> > > +            if (ret) {
> > > +                qemu_file_set_error(f, ret);
> > > +                break;
> > > +            }
> > > +            continue;
> > > +        }
> > > +
> > 
> > Does this give you the ordering you want? i.e. there's no guarantee here
> > that immutables come first?
> 
> Yes, for virtio-mem at least this is fine. There are no real ordering
> requirements in regard to save_setup().
> 
> I guess one could use vmstate priorities to affect the ordering, if
> required.
> 
> So for my use case this is good enough, any suggestions? Thanks.

OK, but consider whether it might be better just to have a separate
QTAILQ_FOREACH loop in savevm_state_setup that first does all the
immutables, and then all the setups.

Dave
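
A minimal sketch of the two-loop ordering suggested here: one pass over all entries for the early ("immutable") vmsds, then a second pass for the save_setup() handlers. DemoEntry and demo_setup_order() are hypothetical stand-ins, not QEMU's SaveStateEntry or the real setup function:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one registered entry. */
typedef struct DemoEntry {
    const char *name;
    bool early_vmsd;      /* models se->vmsd && se->vmsd->immutable */
    bool has_save_setup;  /* models se->ops && se->ops->save_setup */
} DemoEntry;

/*
 * Store in out[] the names of the entries handled during setup, in the
 * order they would be processed: all early vmsds first, then all
 * save_setup() handlers. Returns the number of names written.
 * out[] must have room for n pointers.
 */
size_t demo_setup_order(const DemoEntry *entries, size_t n, const char **out)
{
    size_t count = 0;

    /* Pass 1: early vmsds only. */
    for (size_t i = 0; i < n; i++) {
        /* An entry is either vmsd-based or ops-based, never both. */
        assert(!(entries[i].early_vmsd && entries[i].has_save_setup));
        if (entries[i].early_vmsd) {
            out[count++] = entries[i].name;
        }
    }

    /* Pass 2: entries that provide save_setup(). */
    for (size_t i = 0; i < n; i++) {
        if (entries[i].has_save_setup) {
            out[count++] = entries[i].name;
        }
    }
    return count;
}
```

Compared with the single interleaved loop in the patch, this guarantees that every early vmsd is on the wire before any save_setup() data, independent of registration order.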

> -- 
> Thanks,
> 
> David / dhildenb
>
Peter Xu Jan. 12, 2023, 10:14 p.m. UTC | #4
On Thu, Jan 12, 2023 at 07:52:41PM +0000, Dr. David Alan Gilbert wrote:
> * David Hildenbrand (david@redhat.com) wrote:
> > On 12.01.23 18:56, Dr. David Alan Gilbert wrote:
> > > * David Hildenbrand (david@redhat.com) wrote:
> > > > For virtio-mem, we want to have the plugged/unplugged state of memory
> > > > blocks available before migrating any actual RAM content, and perform
> > > > sanity checks before touching anything on the destination. This
> > > > information is immutable on the migration source while migration is active.
> > > > 
> > > > We want to use this information for proper preallocation support with
> > > > migration: currently, we don't preallocate memory on the migration target,
> > > > and especially with hugetlb, we can easily run out of hugetlb pages during
> > > > RAM migration and will crash (SIGBUS) instead of catching this gracefully
> > > > via preallocation.
> > > > 
> > > > Migrating device state via a vmsd before we start iterating is currently
> > > > impossible: the only approach that would be possible is avoiding a vmsd
> > > > and migrating state manually during save_setup(), to be restored during
> > > > load_state().
> > > > 
> > > > Let's allow for migrating device state via a vmsd early, during the
> > > > setup phase in qemu_savevm_state_setup(). To keep it simple, we
> > > > indicate applicable vmsd's using an "immutable" flag.
> > > > 
> > > > Note that only very selected devices (i.e., ones seriously messing with
> > > > RAM setup) are supposed to make use of such early state migration.
> > > > 
> > > > Signed-off-by: David Hildenbrand <david@redhat.com>
> > > > ---
> > > >   include/migration/vmstate.h |  5 +++++
> > > >   migration/savevm.c          | 14 ++++++++++++++
> > > >   2 files changed, 19 insertions(+)
> > > > 
> > > > diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> > > > index ad24aa1934..dd06c3abad 100644
> > > > --- a/include/migration/vmstate.h
> > > > +++ b/include/migration/vmstate.h
> > > > @@ -179,6 +179,11 @@ struct VMStateField {
> > > >   struct VMStateDescription {
> > > >       const char *name;
> > > >       int unmigratable;
> > > > +    /*
> > > > +     * The state is immutable while migration is active and is saved
> > > > +     * during the setup phase, to be restored early on the destination.
> > > > +     */
> > > > +    int immutable;
> > > 
> > > A bool would be nicer (as it would for unmigratable above)
> > 
> > Yes, I chose an int for consistency with "unmigratable". I can turn that
> > into a bool.
> > 
> > I'd even include a cleanup patch for unmigratable if it wouldn't be ...
> > 
> > $ git grep "unmigratable \=" | wc -l
> > 29
> 
> It might be OK if you just change the declaration; I mean '1' is pretty
> close to true? (I think...)
> Anyway, at least make the new one a bool.

Agreed bool is better.  Can we rename it to something like "early_setup"?
"immutable" isn't clear on its most important attribute (on when it'll be
migrated).  Meanwhile I'd hope we can comment that explicitly.  I'd go with:

  /*
   * This VMSD describes something that should be sent during setup phase
   * of migration.  It plays similar role as save_setup() for explicitly
   * registered vmstate entries, the only difference is the vmsd will be
   * sent right at the start of migration.
   */
  bool early_setup;

> 
> > > >       int version_id;
> > > >       int minimum_version_id;
> > > >       MigrationPriority priority;
> > > > diff --git a/migration/savevm.c b/migration/savevm.c
> > > > index ff2b8d0064..536d6f662b 100644
> > > > --- a/migration/savevm.c
> > > > +++ b/migration/savevm.c
> > > > @@ -1200,6 +1200,15 @@ void qemu_savevm_state_setup(QEMUFile *f)
> > > >       trace_savevm_state_setup();
> > > >       QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
> > > > +        if (se->vmsd && se->vmsd->immutable) {
> > > > +            ret = vmstate_save(f, se, ms->vmdesc);
> > > > +            if (ret) {
> > > > +                qemu_file_set_error(f, ret);
> > > > +                break;
> > > > +            }
> > > > +            continue;
> > > > +        }
> > > > +
> > > 
> > > Does this give you the ordering you want? i.e. there's no guarantee here
> > > that immutables come first?
> > 
> > Yes, for virtio-mem at least this is fine. There are no real ordering
> > requirements in regard to save_setup().
> > 
> > I guess one could use vmstate priorities to affect the ordering, if
> > required.
> > 
> > So for my use case this is good enough, any suggestions? Thanks.
> 
> OK, but consider whether it might be better just to have a separate
> QTAILQ_FOREACH loop in savevm_state_setup that first does all the
> immutables, and then all the setups.

After patch 1 the order may not matter iiuc, because each call to the
immutable vmsds calls the new vmstate_save() which will always send
QEMU_VM_SECTION_FULL and footers along the vmsd.

Thanks,
Peter Xu Jan. 12, 2023, 10:28 p.m. UTC | #5
On Thu, Jan 12, 2023 at 05:14:57PM -0500, Peter Xu wrote:
> On Thu, Jan 12, 2023 at 07:52:41PM +0000, Dr. David Alan Gilbert wrote:
> > * David Hildenbrand (david@redhat.com) wrote:
> > > On 12.01.23 18:56, Dr. David Alan Gilbert wrote:
> > > > * David Hildenbrand (david@redhat.com) wrote:
> > > > > For virtio-mem, we want to have the plugged/unplugged state of memory
> > > > > blocks available before migrating any actual RAM content, and perform
> > > > > sanity checks before touching anything on the destination. This
> > > > > information is immutable on the migration source while migration is active.
> > > > > 
> > > > > We want to use this information for proper preallocation support with
> > > > > migration: currently, we don't preallocate memory on the migration target,
> > > > > and especially with hugetlb, we can easily run out of hugetlb pages during
> > > > > RAM migration and will crash (SIGBUS) instead of catching this gracefully
> > > > > via preallocation.
> > > > > 
> > > > > Migrating device state via a vmsd before we start iterating is currently
> > > > > impossible: the only approach that would be possible is avoiding a vmsd
> > > > > and migrating state manually during save_setup(), to be restored during
> > > > > load_state().
> > > > > 
> > > > > Let's allow for migrating device state via a vmsd early, during the
> > > > > setup phase in qemu_savevm_state_setup(). To keep it simple, we
> > > > > indicate applicable vmsd's using an "immutable" flag.
> > > > > 
> > > > > Note that only very selected devices (i.e., ones seriously messing with
> > > > > RAM setup) are supposed to make use of such early state migration.
> > > > > 
> > > > > Signed-off-by: David Hildenbrand <david@redhat.com>
> > > > > ---
> > > > >   include/migration/vmstate.h |  5 +++++
> > > > >   migration/savevm.c          | 14 ++++++++++++++
> > > > >   2 files changed, 19 insertions(+)
> > > > > 
> > > > > diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> > > > > index ad24aa1934..dd06c3abad 100644
> > > > > --- a/include/migration/vmstate.h
> > > > > +++ b/include/migration/vmstate.h
> > > > > @@ -179,6 +179,11 @@ struct VMStateField {
> > > > >   struct VMStateDescription {
> > > > >       const char *name;
> > > > >       int unmigratable;
> > > > > +    /*
> > > > > +     * The state is immutable while migration is active and is saved
> > > > > +     * during the setup phase, to be restored early on the destination.
> > > > > +     */
> > > > > +    int immutable;
> > > > 
> > > > A bool would be nicer (as it would for unmigratable above)
> > > 
> > > Yes, I chose an int for consistency with "unmigratable". I can turn that
> > > into a bool.
> > > 
> > > I'd even include a cleanup patch for unmigratable if it wouldn't be ...
> > > 
> > > $ git grep "unmigratable \=" | wc -l
> > > 29
> > 
> > It might be OK if you just change the declaration; I mean '1' is pretty
> > close to true? (I think...)
> > Anyway, at least make the new one a bool.
> 
> Agreed bool is better.  Can we rename it to something like "early_setup"?
> "immutable" isn't clear on its most important attribute (on when it'll be
> migrated).  Meanwhile I'd hope we can comment that explicitly.  I'd go with:
> 
>   /*
>    * This VMSD describes something that should be sent during setup phase
>    * of migration.  It plays similar role as save_setup() for explicitly
>    * registered vmstate entries, the only difference is the vmsd will be
>    * sent right at the start of migration.
>    */
>   bool early_setup;

Let me try some even better wording..

    /*
     * This VMSD describes something that should be sent during setup phase
     * of migration.  It plays similar role as save_setup() for explicitly
     * registered vmstate entries, so it can be seen as a way to describe
     * save_setup() in vmsd structures.
     *
     * One SaveStateEntry should either have the save_setup() specified or
     * the vmsd with early_setup set to true.  It should never have both
     * things set.
     */
    bool early_setup;

There's one tricky thing that we'll send QEMU_VM_SECTION_START for
save_setup() entries but QEMU_VM_SECTION_FULL for vmsd early_setup
entries.

David, do you think we can slightly modify your new version of
vmstate_save() so as to pass in the section_type?  I think it'll be even
cleaner to send QEMU_VM_SECTION_START for the early vmsds too.  I assume
this shouldn't affect your goal and anything else.

> 
> > 
> > > > >       int version_id;
> > > > >       int minimum_version_id;
> > > > >       MigrationPriority priority;
> > > > > diff --git a/migration/savevm.c b/migration/savevm.c
> > > > > index ff2b8d0064..536d6f662b 100644
> > > > > --- a/migration/savevm.c
> > > > > +++ b/migration/savevm.c
> > > > > @@ -1200,6 +1200,15 @@ void qemu_savevm_state_setup(QEMUFile *f)
> > > > >       trace_savevm_state_setup();
> > > > >       QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
> > > > > +        if (se->vmsd && se->vmsd->immutable) {
> > > > > +            ret = vmstate_save(f, se, ms->vmdesc);
> > > > > +            if (ret) {
> > > > > +                qemu_file_set_error(f, ret);
> > > > > +                break;
> > > > > +            }
> > > > > +            continue;
> > > > > +        }
> > > > > +
> > > > 
> > > > Does this give you the ordering you want? i.e. there's no guarantee here
> > > > that immutables come first?
> > > 
> > > Yes, for virtio-mem at least this is fine. There are no real ordering
> > > requirements in regard to save_setup().
> > > 
> > > I guess one could use vmstate priorities to affect the ordering, if
> > > required.
> > > 
> > > So for my use case this is good enough, any suggestions? Thanks.
> > 
> > OK, but consider whether it might be better just to have a separate
> > QTAILQ_FOREACH loop in savevm_state_setup that first does all the
> > immutables, and then all the setups.
> 
> After patch 1 the order may not matter iiuc, because each call to the
> immutable vmsds calls the new vmstate_save() which will always send
> QEMU_VM_SECTION_FULL and footers along the vmsd.
> 
> Thanks,
> 
> -- 
> Peter Xu
David Hildenbrand Jan. 13, 2023, 1:47 p.m. UTC | #6
[...]

>>> It might be OK if you just change the declaration; I mean '1' is pretty
>>> close to true? (I think...)
>>> Anyway, at least make the new one a bool.
>>
>> Agreed bool is better.  Can we rename it to something like "early_setup"?
>> "immutable" isn't clear on its most important attribute (on when it'll be
>> migrated).  Meanwhile I'd hope we can comment that explicitly.  I'd go with:
>>
>>    /*
>>     * This VMSD describes something that should be sent during setup phase
>>     * of migration.  It plays similar role as save_setup() for explicitly
>>     * registered vmstate entries, the only difference is the vmsd will be
>>     * sent right at the start of migration.
>>     */
>>    bool early_setup;
> 
> Let me try some even better wording..
> 
>      /*
>       * This VMSD describes something that should be sent during setup phase
>       * of migration.  It plays similar role as save_setup() for explicitly
>       * registered vmstate entries, so it can be seen as a way to describe
>       * save_setup() in vmsd structures.
>       *
>       * One SaveStateEntry should either have the save_setup() specified or
>       * the vmsd with early_setup set to true.  It should never have both
>       * things set.
>       */
>      bool early_setup;
> 

Thanks, I'll use that.

> There's one tricky thing that we'll send QEMU_VM_SECTION_START for
> save_setup() entries but QEMU_VM_SECTION_FULL for vmsd early_setup
> entries.

I think that makes sense for now, though: we only transmit a VMSD and 
VMSDs are transmitted once and are not iterable.

In comparison, for iterable things we expect a

QEMU_VM_SECTION_START
0..X QEMU_VM_SECTION_PART
QEMU_VM_SECTION_END
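
The framing difference can be sketched as follows. This is a standalone model, not QEMU code: demo_frame_sequence() is a hypothetical helper, and the DEMO_* values are assumed to mirror QEMU's QEMU_VM_SECTION_* constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Section type bytes; assumed to match QEMU's QEMU_VM_SECTION_* values. */
enum {
    DEMO_SECTION_START = 0x01,
    DEMO_SECTION_PART  = 0x02,
    DEMO_SECTION_END   = 0x03,
    DEMO_SECTION_FULL  = 0x04,
};

/*
 * Write the sequence of section type bytes one entry would produce.
 * An iterable entry emits START, then `parts` PART sections, then END.
 * A vmsd-based entry is sent once as a single FULL section; `parts` is
 * ignored in that case. Returns the number of bytes written, or 0 if
 * the sequence does not fit into `cap` bytes.
 */
size_t demo_frame_sequence(bool iterable, unsigned parts,
                           uint8_t *out, size_t cap)
{
    size_t needed = iterable ? (size_t)parts + 2 : 1;
    size_t pos = 0;

    if (needed > cap) {
        return 0;
    }
    if (!iterable) {
        out[pos++] = DEMO_SECTION_FULL;
        return pos;
    }
    out[pos++] = DEMO_SECTION_START;
    for (unsigned i = 0; i < parts; i++) {
        out[pos++] = DEMO_SECTION_PART;
    }
    out[pos++] = DEMO_SECTION_END;
    return pos;
}
```

An early vmsd therefore never opens a START that would be left without a matching END, which is the concern raised below about sending START for it.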


I assume you're thinking about "mixing" save_state() with an early vmsd 
in a SaveStateEntry. I don't think something like that would currently 
work (I'm pretty sure the core would have a hard time figuring out whether 
to restore a vmsd or to send the input to load_state()?), nor can it be 
configured: we either have se->ops or se->vmsd.

> 
> David, do you think we can slightly modify your new version of
> vmstate_save() so as to pass in the section_type?  I think it'll be even
> cleaner to send QEMU_VM_SECTION_START for the early vmsds too.  I assume
> this shouldn't affect your goal and anything else.

I'd prefer to not go down that path for now. QEMU_VM_SECTION_START 
without QEMU_VM_SECTION_PART and QEMU_VM_SECTION_END feels pretty 
incomplete and wrong to me.

If we want to do that in the future, we should conditionally send 
QEMU_VM_SECTION_START only if we have se->ops I assume?

> 
>>
>>>
>>>>>>        int version_id;
>>>>>>        int minimum_version_id;
>>>>>>        MigrationPriority priority;
>>>>>> diff --git a/migration/savevm.c b/migration/savevm.c
>>>>>> index ff2b8d0064..536d6f662b 100644
>>>>>> --- a/migration/savevm.c
>>>>>> +++ b/migration/savevm.c
>>>>>> @@ -1200,6 +1200,15 @@ void qemu_savevm_state_setup(QEMUFile *f)
>>>>>>        trace_savevm_state_setup();
>>>>>>        QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
>>>>>> +        if (se->vmsd && se->vmsd->immutable) {
>>>>>> +            ret = vmstate_save(f, se, ms->vmdesc);
>>>>>> +            if (ret) {
>>>>>> +                qemu_file_set_error(f, ret);
>>>>>> +                break;
>>>>>> +            }
>>>>>> +            continue;
>>>>>> +        }
>>>>>> +
>>>>>
>>>>> Does this give you the ordering you want? i.e. there's no guarantee here
>>>>> that immutables come first?
>>>>
>>>> Yes, for virtio-mem at least this is fine. There are no real ordering
>>>> requirements in regard to save_setup().
>>>>
>>>> I guess one could use vmstate priorities to affect the ordering, if
>>>> required.
>>>>
>>>> So for my use case this is good enough, any suggestions? Thanks.
>>>
>>> OK, but consider whether it might be better just to have a separate
>>> QTAILQ_FOREACH loop in savevm_state_setup that first does all the
>>> immutables, and then all the setups.
>>
>> After patch 1 the order may not matter iiuc, because each call to the
>> immutable vmsds calls the new vmstate_save() which will always send
>> QEMU_VM_SECTION_FULL and footers along the vmsd.

Agreed. I'll leave it like that for now.

Thanks!
Peter Xu Jan. 13, 2023, 3:20 p.m. UTC | #7
On Fri, Jan 13, 2023 at 02:47:24PM +0100, David Hildenbrand wrote:
> I'd prefer to not go down that path for now. QEMU_VM_SECTION_START without
> QEMU_VM_SECTION_PART and QEMU_VM_SECTION_END feels pretty incomplete and
> wrong to me.

That's fine.

> 
> If we want to do that in the future, we should conditionally send
> QEMU_VM_SECTION_START only if we have se->ops I assume?

Yes.  START/FULL frames are mostly replaceable afaiu in the stream ABI, so
we always have space to change no matter what.  Let's leave that as-is.

Thanks,
Peter Xu Jan. 13, 2023, 3:27 p.m. UTC | #8
On Fri, Jan 13, 2023 at 10:20:31AM -0500, Peter Xu wrote:
> On Fri, Jan 13, 2023 at 02:47:24PM +0100, David Hildenbrand wrote:
> > I'd prefer to not go down that path for now. QEMU_VM_SECTION_START without
> > QEMU_VM_SECTION_PART and QEMU_VM_SECTION_END feels pretty incomplete and
> > wrong to me.
> 
> That's fine.
> 
> > 
> > If we want to do that in the future, we should conditionally send
> > QEMU_VM_SECTION_START only if we have se->ops I assume?
> 
> Yes.  START/FULL frames are mostly replaceable afaiu in the stream ABI, so
> we always have space to change no matter what.  Let's leave that as-is.

If so, please consider adding one more paragraph describing the difference
in vmsd early_setup comments (on using FULL for early vmsd and START for
save_setup), hopefully it'll make things clearer.

Thanks,
David Hildenbrand Jan. 13, 2023, 3:28 p.m. UTC | #9
On 13.01.23 16:20, Peter Xu wrote:
> On Fri, Jan 13, 2023 at 02:47:24PM +0100, David Hildenbrand wrote:
>> I'd prefer to not go down that path for now. QEMU_VM_SECTION_START without
>> QEMU_VM_SECTION_PART and QEMU_VM_SECTION_END feels pretty incomplete and
>> wrong to me.
> 
> That's fine.
> 
>>
>> If we want to do that in the future, we should conditionally send
>> QEMU_VM_SECTION_START only if we have se->ops I assume?
> 
> Yes.  START/FULL frames are mostly replaceable afaiu in the stream ABI, so
> we always have space to change no matter what.  Let's leave that as-is.

Thanks Peter! I'll send a new version early next week.
David Hildenbrand Jan. 16, 2023, 10:35 a.m. UTC | #10
On 13.01.23 16:27, Peter Xu wrote:
> On Fri, Jan 13, 2023 at 10:20:31AM -0500, Peter Xu wrote:
>> On Fri, Jan 13, 2023 at 02:47:24PM +0100, David Hildenbrand wrote:
>>> I'd prefer to not go down that path for now. QEMU_VM_SECTION_START without
>>> QEMU_VM_SECTION_PART and QEMU_VM_SECTION_END feels pretty incomplete and
>>> wrong to me.
>>
>> That's fine.
>>
>>>
>>> If we want to do that in the future, we should conditionally send
>>> QEMU_VM_SECTION_START only if we have se->ops I assume?
>>
>> Yes.  START/FULL frames are mostly replaceable afaiu in the stream ABI, so
>> we always have space to change no matter what.  Let's leave that as-is.
> 
> If so, please consider adding one more paragraph describing the difference
> in vmsd early_setup comments (on using FULL for early vmsd and START for
> save_setup), hopefully it'll make things clearer.

What about the following:

diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 7bc0cd9de9..cc910cab0f 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -188,6 +188,11 @@ struct VMStateDescription {
       * One SaveStateEntry should either have the save_setup() specified or
       * the vmsd with early_setup set to true. It should never have both
       * things set.
+     *
+     * Note that for now, a SaveStateEntry cannot have a VMSD and
+     * operations (e.g., save_setup()) set at the same time. For this reason,
+     * also early_setup VMSDs are migrated in a QEMU_VM_SECTION_FULL section,
+     * while save_setup() data is migrated in a QEMU_VM_SECTION_START section.
       */
      bool early_setup;
      int version_id;

Thanks!
Peter Xu Jan. 16, 2023, 2:56 p.m. UTC | #11
On Mon, Jan 16, 2023 at 11:35:22AM +0100, David Hildenbrand wrote:
> What about the following:
> 
> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> index 7bc0cd9de9..cc910cab0f 100644
> --- a/include/migration/vmstate.h
> +++ b/include/migration/vmstate.h
> @@ -188,6 +188,11 @@ struct VMStateDescription {
>       * One SaveStateEntry should either have the save_setup() specified or
>       * the vmsd with early_setup set to true. It should never have both
>       * things set.
> +     *
> +     * Note that for now, a SaveStateEntry cannot have a VMSD and
> +     * operations (e.g., save_setup()) set at the same time. For this reason,

This slightly duplicates with above?

> +     * also early_setup VMSDs are migrated in a QEMU_VM_SECTION_FULL section,
> +     * while save_setup() data is migrated in a QEMU_VM_SECTION_START section.
>       */

This looks good.

Thanks,
David Hildenbrand Jan. 16, 2023, 2:57 p.m. UTC | #12
On 16.01.23 15:56, Peter Xu wrote:
> On Mon, Jan 16, 2023 at 11:35:22AM +0100, David Hildenbrand wrote:
>> What about the following:
>>
>> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
>> index 7bc0cd9de9..cc910cab0f 100644
>> --- a/include/migration/vmstate.h
>> +++ b/include/migration/vmstate.h
>> @@ -188,6 +188,11 @@ struct VMStateDescription {
>>        * One SaveStateEntry should either have the save_setup() specified or
>>        * the vmsd with early_setup set to true. It should never have both
>>        * things set.
>> +     *
>> +     * Note that for now, a SaveStateEntry cannot have a VMSD and
>> +     * operations (e.g., save_setup()) set at the same time. For this reason,
> 
> This slightly duplicates with above?

Right, will merge both sections and simplify.

Thanks!

Patch

diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index ad24aa1934..dd06c3abad 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -179,6 +179,11 @@  struct VMStateField {
 struct VMStateDescription {
     const char *name;
     int unmigratable;
+    /*
+     * The state is immutable while migration is active and is saved
+     * during the setup phase, to be restored early on the destination.
+     */
+    int immutable;
     int version_id;
     int minimum_version_id;
     MigrationPriority priority;
diff --git a/migration/savevm.c b/migration/savevm.c
index ff2b8d0064..536d6f662b 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1200,6 +1200,15 @@  void qemu_savevm_state_setup(QEMUFile *f)
 
     trace_savevm_state_setup();
     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
+        if (se->vmsd && se->vmsd->immutable) {
+            ret = vmstate_save(f, se, ms->vmdesc);
+            if (ret) {
+                qemu_file_set_error(f, ret);
+                break;
+            }
+            continue;
+        }
+
         if (!se->ops || !se->ops->save_setup) {
             continue;
         }
@@ -1402,6 +1411,11 @@  int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
     int ret;
 
     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
+        if (se->vmsd && se->vmsd->immutable) {
+            /* Already saved during qemu_savevm_state_setup(). */
+            continue;
+        }
+
         ret = vmstate_save(f, se, vmdesc);
         if (ret) {
             qemu_file_set_error(f, ret);