Message ID | 20250130171240.286878-4-kwolf@redhat.com (mailing list archive)
---|---
State | New
Series | block: Managing inactive nodes (QSD migration)
On Thu, Jan 30, 2025 at 06:12:34PM +0100, Kevin Wolf wrote:
> Block devices have an individual active state, a single global flag
> can't cover this correctly. This becomes more important as we allow
> users to manually manage which nodes are active or inactive.
>
> Now that it's allowed to call bdrv_inactivate_all() even when some
> nodes are already inactive, we can remove the flag and just

Is this commit out of order with 5/15, which removes the assertion
failure for inactivating an already-inactive device?

But in the long run, the sentiment is correct, even if the wording is
inaccurate for a window of a couple of patches, so I'm not sure it is
worth a slight rewording to s/it's allowed/it will soon be allowed/.

> unconditionally call bdrv_inactivate_all() and, more importantly,
> bdrv_activate_all() before we make use of the nodes.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  migration/migration.h    |  3 ---
>  migration/block-active.c | 46 ----------------------------------------
>  migration/migration.c    |  8 -------
>  3 files changed, 57 deletions(-)
>

Reviewed-by: Eric Blake <eblake@redhat.com>
On Thu, Jan 30, 2025 at 06:12:34PM +0100, Kevin Wolf wrote:
> Block devices have an individual active state, a single global flag
> can't cover this correctly. This becomes more important as we allow
> users to manually manage which nodes are active or inactive.
>
> Now that it's allowed to call bdrv_inactivate_all() even when some
> nodes are already inactive, we can remove the flag and just
> unconditionally call bdrv_inactivate_all() and, more importantly,
> bdrv_activate_all() before we make use of the nodes.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  migration/migration.h    |  3 ---
>  migration/block-active.c | 46 ----------------------------------------
>  migration/migration.c    |  8 -------
>  3 files changed, 57 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
On 30.01.2025 at 20:50, Eric Blake wrote:
> On Thu, Jan 30, 2025 at 06:12:34PM +0100, Kevin Wolf wrote:
> > Block devices have an individual active state, a single global flag
> > can't cover this correctly. This becomes more important as we allow
> > users to manually manage which nodes are active or inactive.
> >
> > Now that it's allowed to call bdrv_inactivate_all() even when some
> > nodes are already inactive, we can remove the flag and just
>
> Is this commit out of order with 5/15 that removes the assertion
> failure for inactivating an already-inactive device?

It is. Looks like I moved things around a bit too much in this series.
5/15 doesn't seem to depend on anything else, so I'll move it before
this one to fix the ordering.

Kevin

> But in the long run, the sentiment is correct, even if the wording is
> inaccurate for a window of a couple of patches, so I'm not sure it is
> worth a slight rewording to s/it's allows/it will soon be allowed/.
>
> > unconditionally call bdrv_inactivate_all() and, more importantly,
> > bdrv_activate_all() before we make use of the nodes.
> >
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  migration/migration.h    |  3 ---
> >  migration/block-active.c | 46 ----------------------------------------
> >  migration/migration.c    |  8 -------
> >  3 files changed, 57 deletions(-)
> >
> > Reviewed-by: Eric Blake <eblake@redhat.com>
>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.
> Virtualization: qemu.org | libguestfs.org
diff --git a/migration/migration.h b/migration/migration.h
index 0df2a187af..9540e8f04e 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -553,7 +553,4 @@ void migration_bitmap_sync_precopy(bool last_stage);
 /* migration/block-dirty-bitmap.c */
 void dirty_bitmap_mig_init(void);
 
-/* migration/block-active.c */
-void migration_block_active_setup(bool active);
-
 #endif
diff --git a/migration/block-active.c b/migration/block-active.c
index d477cf8182..40e986aade 100644
--- a/migration/block-active.c
+++ b/migration/block-active.c
@@ -12,51 +12,12 @@
 #include "qemu/error-report.h"
 #include "trace.h"
 
-/*
- * Migration-only cache to remember the block layer activation status.
- * Protected by BQL.
- *
- * We need this because..
- *
- * - Migration can fail after block devices are invalidated (during
- *   switchover phase). When that happens, we need to be able to recover
- *   the block drive status by re-activating them.
- *
- * - Currently bdrv_inactivate_all() is not safe to be invoked on top of
- *   invalidated drives (even if bdrv_activate_all() is actually safe to be
- *   called any time!). It means remembering this could help migration to
- *   make sure it won't invalidate twice in a row, crashing QEMU. It can
- *   happen when we migrate a PAUSED VM from host1 to host2, then migrate
- *   again to host3 without starting it. TODO: a cleaner solution is to
- *   allow safe invoke of bdrv_inactivate_all() at anytime, like
- *   bdrv_activate_all().
- *
- * For freshly started QEMU, the flag is initialized to TRUE reflecting the
- * scenario where QEMU owns block device ownerships.
- *
- * For incoming QEMU taking a migration stream, the flag is initialized to
- * FALSE reflecting that the incoming side doesn't own the block devices,
- * not until switchover happens.
- */
-static bool migration_block_active;
-
-/* Setup the disk activation status */
-void migration_block_active_setup(bool active)
-{
-    migration_block_active = active;
-}
-
 bool migration_block_activate(Error **errp)
 {
     ERRP_GUARD();
 
     assert(bql_locked());
 
-    if (migration_block_active) {
-        trace_migration_block_activation("active-skipped");
-        return true;
-    }
-
     trace_migration_block_activation("active");
 
     bdrv_activate_all(errp);
@@ -65,7 +26,6 @@ bool migration_block_activate(Error **errp)
         return false;
     }
 
-    migration_block_active = true;
     return true;
 }
 
@@ -75,11 +35,6 @@ bool migration_block_inactivate(void)
 
     assert(bql_locked());
 
-    if (!migration_block_active) {
-        trace_migration_block_activation("inactive-skipped");
-        return true;
-    }
-
     trace_migration_block_activation("inactive");
 
     ret = bdrv_inactivate_all();
@@ -89,6 +44,5 @@
         return false;
     }
 
-    migration_block_active = false;
     return true;
 }
diff --git a/migration/migration.c b/migration/migration.c
index 2d1da917c7..ae252f24e6 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1838,12 +1838,6 @@ void qmp_migrate_incoming(const char *uri, bool has_channels,
         return;
     }
 
-    /*
-     * Newly setup incoming QEMU. Mark the block active state to reflect
-     * that the src currently owns the disks.
-     */
-    migration_block_active_setup(false);
-
     once = false;
 }
 
@@ -3808,8 +3802,6 @@ static void migration_instance_init(Object *obj)
     ms->state = MIGRATION_STATUS_NONE;
     ms->mbps = -1;
    ms->pages_per_second = -1;
-    /* Freshly started QEMU owns all the block devices */
-    migration_block_active_setup(true);
 
     qemu_sem_init(&ms->pause_sem, 0);
     qemu_mutex_init(&ms->error_mutex);
Block devices have an individual active state, a single global flag
can't cover this correctly. This becomes more important as we allow
users to manually manage which nodes are active or inactive.

Now that it's allowed to call bdrv_inactivate_all() even when some
nodes are already inactive, we can remove the flag and just
unconditionally call bdrv_inactivate_all() and, more importantly,
bdrv_activate_all() before we make use of the nodes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 migration/migration.h    |  3 ---
 migration/block-active.c | 46 ----------------------------------------
 migration/migration.c    |  8 -------
 3 files changed, 57 deletions(-)