Message ID | 20250327143934.7935-2-farosas@suse.de (mailing list archive) |
---|---|
State | New |
Series | migration: savevm testing |
On Thu, Mar 27, 2025 at 11:39:31AM -0300, Fabiano Rosas wrote:
> It has always been possible to enable arbitrary migration capabilities
> and attempt to take a snapshot of the VM with the savevm/loadvm
> commands as well as their QMP counterparts
> snapshot-save/snapshot-load.
>
> Most migration capabilities are not meant to be used with snapshots
> and there's a risk of crashing QEMU or producing incorrect
> behavior. Ideally, every migration capability would either be
> implemented for savevm or explicitly rejected.

IMHO, this is a prime example of why migration config shouldn't be held
as global state, and instead passed as parameters to the commands
that need them. The snapshot-save/load commands would then only
be able to accept what few settings are actually relevant, instead
of inheriting any/all global migration state.

> Add a compatibility check routine and reject the snapshot command if
> an incompatible capability is enabled. For now only act on the two
> that actually cause a crash: multifd and mapped-ram.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2818

Issue is 2881 not 2818                                  ^^^^^^^

> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  migration/options.c | 26 ++++++++++++++++++++++++++
>  migration/options.h |  1 +
>  migration/savevm.c  |  8 ++++++++
>  3 files changed, 35 insertions(+)

With regards,
Daniel
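The design point raised in this reply can be illustrated with a small standalone C sketch. It is not QEMU code: `GlobalMigrationConfig`, `SnapshotParams`, `snapshot_save_global()` and `snapshot_save_params()` are hypothetical names invented for illustration, and the `name`/`overwrite` fields are only borrowed from the `save_snapshot()` signature visible in the patch. The sketch contrasts a command that has to inspect global state and reject whatever happens to be set with a command whose parameter type simply cannot express the incompatible settings.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for global migration state shared by all commands. */
typedef struct {
    bool multifd;
    bool mapped_ram;
} GlobalMigrationConfig;

/* Hypothetical per-command parameters: only what a snapshot would need. */
typedef struct {
    const char *name;
    bool overwrite;
} SnapshotParams;

/*
 * Global-state style: the snapshot command inherits every setting and
 * must explicitly reject the ones it cannot handle.
 */
bool snapshot_save_global(const GlobalMigrationConfig *cfg, const char *name)
{
    if (cfg->multifd || cfg->mapped_ram) {
        return false; /* incompatible capability enabled elsewhere */
    }
    return name != NULL;
}

/*
 * Parameter style: incompatible settings are not representable, so no
 * compatibility check is needed.
 */
bool snapshot_save_params(const SnapshotParams *p)
{
    return p->name != NULL;
}
```

In the first variant every newly added capability needs a matching entry in the rejection logic; in the second, a new capability is invisible to the snapshot command unless someone deliberately adds a field for it.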
Daniel P. Berrangé <berrange@redhat.com> writes:

> On Thu, Mar 27, 2025 at 11:39:31AM -0300, Fabiano Rosas wrote:
>> It has always been possible to enable arbitrary migration capabilities
>> and attempt to take a snapshot of the VM with the savevm/loadvm
>> commands as well as their QMP counterparts
>> snapshot-save/snapshot-load.
>>
>> Most migration capabilities are not meant to be used with snapshots
>> and there's a risk of crashing QEMU or producing incorrect
>> behavior. Ideally, every migration capability would either be
>> implemented for savevm or explicitly rejected.
>
> IMHO, this is a prime example of why migration config shouldn't be held
> as global state, and instead passed as parameters to the commands
> that need them. The snapshot-save/load commands would then only
> be able to accept what few settings are actually relevant, instead
> of inheriting any/all global migration state.
>

Right, I remember we got caught around the fact that some migration
options are needed during runtime as well... but I don't remember the
details, let me try to find that thread.

>> Add a compatibility check routine and reject the snapshot command if
>> an incompatible capability is enabled. For now only act on the two
>> that actually cause a crash: multifd and mapped-ram.
>>
>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2818
>
> Issue is 2881 not 2818                                  ^^^^^^^
>

It seems that was your own C-t =)

>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> ---
>>  migration/options.c | 26 ++++++++++++++++++++++++++
>>  migration/options.h |  1 +
>>  migration/savevm.c  |  8 ++++++++
>>  3 files changed, 35 insertions(+)
>
> With regards,
> Daniel
Hello Fabiano,

First of all thanks a lot for the quick follow up to my issue!

I just want to point out that with only mapped-ram enabled (without
multifd) savevm/loadvm do not lead to a crash but just to an error
according to my (few) experiments (on upstream).

Ciao
Marco

On Thursday, March 27, 2025 15:39 CET, Fabiano Rosas <farosas@suse.de> wrote:

> It has always been possible to enable arbitrary migration capabilities
> and attempt to take a snapshot of the VM with the savevm/loadvm
> commands as well as their QMP counterparts
> snapshot-save/snapshot-load.
>
> Most migration capabilities are not meant to be used with snapshots
> and there's a risk of crashing QEMU or producing incorrect
> behavior. Ideally, every migration capability would either be
> implemented for savevm or explicitly rejected.
>
> Add a compatibility check routine and reject the snapshot command if
> an incompatible capability is enabled. For now only act on the two
> that actually cause a crash: multifd and mapped-ram.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2881
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  migration/options.c | 26 ++++++++++++++++++++++++++
>  migration/options.h |  1 +
>  migration/savevm.c  |  8 ++++++++
>  3 files changed, 35 insertions(+)
>
> diff --git a/migration/options.c b/migration/options.c
> index b0ac2ea408..8772b98dca 100644
> --- a/migration/options.c
> +++ b/migration/options.c
> @@ -443,11 +443,37 @@ INITIALIZE_MIGRATE_CAPS_SET(check_caps_background_snapshot,
>      MIGRATION_CAPABILITY_VALIDATE_UUID,
>      MIGRATION_CAPABILITY_ZERO_COPY_SEND);
>
> +/* Snapshot compatibility check list */
> +static const
> +INITIALIZE_MIGRATE_CAPS_SET(check_caps_savevm,
> +    MIGRATION_CAPABILITY_MULTIFD,
> +    MIGRATION_CAPABILITY_MAPPED_RAM,
> +);
> +
>  static bool migrate_incoming_started(void)
>  {
>      return !!migration_incoming_get_current()->transport_data;
>  }
>
> +bool migrate_can_snapshot(Error **errp)
> +{
> +    MigrationState *s = migrate_get_current();
> +    int i;
> +
> +    for (i = 0; i < check_caps_savevm.size; i++) {
> +        int incomp_cap = check_caps_savevm.caps[i];
> +
> +        if (s->capabilities[incomp_cap]) {
> +            error_setg(errp,
> +                       "Snapshots are not compatible with %s",
> +                       MigrationCapability_str(incomp_cap));
> +            return false;
> +        }
> +    }
> +
> +    return true;
> +}
> +
>  /**
>   * @migration_caps_check - check capability compatibility
>   *
> diff --git a/migration/options.h b/migration/options.h
> index 762be4e641..20b71b6f2a 100644
> --- a/migration/options.h
> +++ b/migration/options.h
> @@ -58,6 +58,7 @@ bool migrate_tls(void);
>  /* capabilities helpers */
>
>  bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp);
> +bool migrate_can_snapshot(Error **errp);
>
>  /* parameters */
>
> diff --git a/migration/savevm.c b/migration/savevm.c
> index ce158c3512..3be13bcfe8 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -3229,6 +3229,10 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
>
>      GLOBAL_STATE_CODE();
>
> +    if (!migrate_can_snapshot(errp)) {
> +        return false;
> +    }
> +
>      if (migration_is_blocked(errp)) {
>          return false;
>      }
> @@ -3413,6 +3417,10 @@ bool load_snapshot(const char *name, const char *vmstate,
>      int ret;
>      MigrationIncomingState *mis = migration_incoming_get_current();
>
> +    if (!migrate_can_snapshot(errp)) {
> +        return false;
> +    }
> +
>      if (!bdrv_all_can_snapshot(has_devices, devices, errp)) {
>          return false;
>      }
> --
> 2.35.3
>

"Marco Cavenati" <Marco.Cavenati@eurecom.fr> writes:

> Hello Fabiano,
>
> First of all thanks a lot for the quick follow up to my issue!
>
> I just want to point out that with only mapped-ram enabled (without
> multifd) savevm/loadvm do not lead to a crash but just to an error
> according to my (few) experiments (on upstream).
>

Yes, absolutely. I used imprecise language. Thanks for the correction,
I'll explain it better in the following versions.
diff --git a/migration/options.c b/migration/options.c
index b0ac2ea408..8772b98dca 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -443,11 +443,37 @@ INITIALIZE_MIGRATE_CAPS_SET(check_caps_background_snapshot,
     MIGRATION_CAPABILITY_VALIDATE_UUID,
     MIGRATION_CAPABILITY_ZERO_COPY_SEND);
 
+/* Snapshot compatibility check list */
+static const
+INITIALIZE_MIGRATE_CAPS_SET(check_caps_savevm,
+    MIGRATION_CAPABILITY_MULTIFD,
+    MIGRATION_CAPABILITY_MAPPED_RAM,
+);
+
 static bool migrate_incoming_started(void)
 {
     return !!migration_incoming_get_current()->transport_data;
 }
 
+bool migrate_can_snapshot(Error **errp)
+{
+    MigrationState *s = migrate_get_current();
+    int i;
+
+    for (i = 0; i < check_caps_savevm.size; i++) {
+        int incomp_cap = check_caps_savevm.caps[i];
+
+        if (s->capabilities[incomp_cap]) {
+            error_setg(errp,
+                       "Snapshots are not compatible with %s",
+                       MigrationCapability_str(incomp_cap));
+            return false;
+        }
+    }
+
+    return true;
+}
+
 /**
  * @migration_caps_check - check capability compatibility
  *
diff --git a/migration/options.h b/migration/options.h
index 762be4e641..20b71b6f2a 100644
--- a/migration/options.h
+++ b/migration/options.h
@@ -58,6 +58,7 @@ bool migrate_tls(void);
 /* capabilities helpers */
 
 bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp);
+bool migrate_can_snapshot(Error **errp);
 
 /* parameters */
 
diff --git a/migration/savevm.c b/migration/savevm.c
index ce158c3512..3be13bcfe8 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -3229,6 +3229,10 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
 
     GLOBAL_STATE_CODE();
 
+    if (!migrate_can_snapshot(errp)) {
+        return false;
+    }
+
     if (migration_is_blocked(errp)) {
         return false;
     }
@@ -3413,6 +3417,10 @@ bool load_snapshot(const char *name, const char *vmstate,
     int ret;
     MigrationIncomingState *mis = migration_incoming_get_current();
 
+    if (!migrate_can_snapshot(errp)) {
+        return false;
+    }
+
     if (!bdrv_all_can_snapshot(has_devices, devices, errp)) {
         return false;
     }
It has always been possible to enable arbitrary migration capabilities
and attempt to take a snapshot of the VM with the savevm/loadvm
commands as well as their QMP counterparts
snapshot-save/snapshot-load.

Most migration capabilities are not meant to be used with snapshots
and there's a risk of crashing QEMU or producing incorrect
behavior. Ideally, every migration capability would either be
implemented for savevm or explicitly rejected.

Add a compatibility check routine and reject the snapshot command if
an incompatible capability is enabled. For now only act on the two
that actually cause a crash: multifd and mapped-ram.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2881
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/options.c | 26 ++++++++++++++++++++++++++
 migration/options.h |  1 +
 migration/savevm.c  |  8 ++++++++
 3 files changed, 35 insertions(+)
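The check described in this commit message can be modelled outside QEMU with a short standalone C sketch. Everything here is a simplified assumption made for illustration: the `Capability` enum, `cap_names`, `snapshot_incompatible` and `can_snapshot()` are hypothetical stand-ins for `MigrationCapability`, `MigrationCapability_str()`, `check_caps_savevm` and `migrate_can_snapshot()`, and a plain character buffer replaces QEMU's `Error **errp` convention. Only the error string is taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, reduced stand-in for QEMU's MigrationCapability enum. */
typedef enum {
    CAP_XBZRLE,
    CAP_MULTIFD,
    CAP_MAPPED_RAM,
    CAP__MAX,
} Capability;

/* Printable names, playing the role of MigrationCapability_str(). */
static const char *const cap_names[CAP__MAX] = {
    [CAP_XBZRLE] = "xbzrle",
    [CAP_MULTIFD] = "multifd",
    [CAP_MAPPED_RAM] = "mapped-ram",
};

/* Capabilities the snapshot path rejects, modelled on check_caps_savevm. */
static const Capability snapshot_incompatible[] = {
    CAP_MULTIFD,
    CAP_MAPPED_RAM,
};

/*
 * Return true if a snapshot may proceed with the given enabled set.
 * On failure, describe the first offending capability in err (if non-NULL).
 */
bool can_snapshot(const bool enabled[CAP__MAX], char *err, size_t errlen)
{
    size_t n = sizeof(snapshot_incompatible) / sizeof(snapshot_incompatible[0]);

    for (size_t i = 0; i < n; i++) {
        Capability cap = snapshot_incompatible[i];

        if (enabled[cap]) {
            if (err) {
                snprintf(err, errlen,
                         "Snapshots are not compatible with %s",
                         cap_names[cap]);
            }
            return false;
        }
    }

    return true;
}
```

Because this is a deny-list, any capability not listed (here `CAP_XBZRLE`) passes the check unchanged, which matches the commit message's "for now only act on" wording: capabilities are rejected only once they have been added to the list.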