Message ID | 20250114230746.3268797-1-peterx@redhat.com (mailing list archive) |
---|---|
Headers | show |
Series | migration: Switchover phase refactoring | expand |
On Tue, Jan 14, 2025 at 18:07:30 -0500, Peter Xu wrote: > CI: https://gitlab.com/peterx/qemu/-/pipelines/1625266692 > (note: warning is present on rust stuff, but shouldn't be relevant) > > This series refactors the migration switchover path quite a bit. I started > this work initially to measure the JSON writer overhead, but then I decided > to cleanup the switchover path in general when I am at it altogether, as I > wanted to do this for a long time. > > A few major things I tried to do: > ... > - DEVICE migration state > > QEMU has a very special DEVICE migration state, that only happens with > precopy, and only when pause-before-switchover capability is enabled. > Due to that specialty we can't merge precopy and postcopy code on > switchover starts, because the state machine will be different. > > However after I checked the history and also with libvirt developers, > this seems unnecessary. So I had one patch making DEVICE state to be > the "switchover" phase for precopy/postcopy unconditionally. That will > make the state machine much easier for both modes, meanwhile nothing is > expected to break with it (but please still shoot if anyone knows / > suspect something will, or could, break..). No problem from libvirt side... Tested-by: Jiri Denemark <jdenemar@redhat.com>
On Wed, Jan 15, 2025 at 10:12:45AM +0100, Jiri Denemark wrote: > On Tue, Jan 14, 2025 at 18:07:30 -0500, Peter Xu wrote: > > CI: https://gitlab.com/peterx/qemu/-/pipelines/1625266692 > > (note: warning is present on rust stuff, but shouldn't be relevant) > > > > This series refactors the migration switchover path quite a bit. I started > > this work initially to measure the JSON writer overhead, but then I decided > > to cleanup the switchover path in general when I am at it altogether, as I > > wanted to do this for a long time. > > > > A few major things I tried to do: > > > ... > > - DEVICE migration state > > > > QEMU has a very special DEVICE migration state, that only happens with > > precopy, and only when pause-before-switchover capability is enabled. > > Due to that specialty we can't merge precopy and postcopy code on > > switchover starts, because the state machine will be different. > > > > However after I checked the history and also with libvirt developers, > > this seems unnecessary. So I had one patch making DEVICE state to be > > the "switchover" phase for precopy/postcopy unconditionally. That will > > make the state machine much easier for both modes, meanwhile nothing is > > expected to break with it (but please still shoot if anyone knows / > > suspect something will, or could, break..). > > No problem from libvirt side... > > Tested-by: Jiri Denemark <jdenemar@redhat.com> This is definitely reassuring.. thanks a lot, Jiri!
On 2025-01-14 18:07, Peter Xu wrote: > CI: https://gitlab.com/peterx/qemu/-/pipelines/1625266692 > (note: warning is present on rust stuff, but shouldn't be relevant) > > This series refactors the migration switchover path quite a bit. I started > this work initially to measure the JSON writer overhead, but then I decided > to cleanup the switchover path in general when I am at it altogether, as I > wanted to do this for a long time. > > A few major things I tried to do: > > - About the JSON writer > > Currently, precopy migration always dumps a chunk of data called VM > description (QEMU_VM_VMDESCRIPTION) for debugging purpose. That is a > JSON blob explaining all the vmstates dumped in the migration stream. > QEMU has a machine property suppress-vmdesc deciding whether migration > will have that JSON chunk included. > > Postcopy does not have such JSON dump because postcopy is live session > and it can't normally be debugged from stream level (e.g. as a streamed > file). > > A tiny problem is we don't yet have a clue on how much cpu cycles we > need to construct and dump these JSONs even if they're only for > debugging, and even if suppress-vmdesc=on QEMU will still try to > construct these JSONs (e.g. also for postcopy). > > This series has a few patches just to make sure the JSON blob won't be > constructed if not needed (either postcopy, or suppress-vmdesc=on). I > tried to measure the downtime diff with/without these changes, the time > QEMU takes to construct / dump the JSON blob is still not measurable. > So I suppose unconditionally having this is ok. Said that, let's still > have these changes around so we avoid JSON operations if not needed. > > - DEVICE migration state > > QEMU has a very special DEVICE migration state, that only happens with > precopy, and only when pause-before-switchover capability is enabled. > Due to that specialty we can't merge precopy and postcopy code on > switchover starts, because the state machine will be different. > > However after I checked the history and also with libvirt developers, > this seems unnecessary. So I had one patch making DEVICE state to be > the "switchover" phase for precopy/postcopy unconditionally. That will > make the state machine much easier for both modes, meanwhile nothing is > expected to break with it (but please still shoot if anyone knows / > suspect something will, or could, break..). > > - General cleanups and fixes > > Most of the rest changes are random cleanups and fixes in the > switchover path. > > E.g., postcopy_start() has some code that isn't easy to read due to > some special flags here and there, mostly around the two calls of > qemu_savevm_state_complete_precopy(). This series will remove most of > those special treatments here and there. > > We could have done something twice in the past in postcopy switchover > (e.g. I believe we sync CPU twice.. but only happens with postcopy), > now they should all be sorted out. > > And quite some other things hopefully can be separately discussed and > justified in each patch. After these cleanups, we will be able to have > an unified entrance for precopy/postcopy on switchover. > > Initially I thought this could optimize the downtime slightly, but after > some tests, it turns out there's no measureable difference, at least in my > current setup... So let's take this as a cleanup series at least for now, > and I hope they would still make some sense. Comments welcomed. > > Thanks, > > Peter Xu (16): > migration: Remove postcopy implications in should_send_vmdesc() > migration: Do not construct JSON description if suppressed > migration: Optimize postcopy on downtime by avoiding JSON writer > migration: Avoid two src-downtime-end tracepoints for postcopy > migration: Drop inactivate_disk param in qemu_savevm_state_complete* > migration: Synchronize all CPU states only for non-iterable dump > migration: Adjust postcopy bandwidth during switchover > migration: Adjust locking in migration_maybe_pause() > migration: Drop cached migration state in migration_maybe_pause() > migration: Take BQL slightly longer in postcopy_start() > migration: Notify COMPLETE once for postcopy > migration: Unwrap qemu_savevm_state_complete_precopy() in postcopy > migration: Cleanup qemu_savevm_state_complete_precopy() > migration: Always set DEVICE state > migration: Merge precopy/postcopy on switchover start > migration: Trivial cleanup on JSON writer of vmstate_save() > > qapi/migration.json | 7 +- > migration/migration.h | 1 + > migration/savevm.h | 6 +- > migration/migration.c | 209 +++++++++++++++++++++++------------- > migration/savevm.c | 116 ++++++++------------ > migration/vmstate.c | 6 +- > tests/qtest/libqos/libqos.c | 3 +- > migration/trace-events | 2 +- > tests/qemu-iotests/194.out | 1 + > tests/qemu-iotests/203.out | 1 + > tests/qemu-iotests/234.out | 2 + > tests/qemu-iotests/262.out | 1 + > tests/qemu-iotests/280.out | 1 + > 13 files changed, 200 insertions(+), 156 deletions(-) > > -- > 2.47.0 > All patches look good to me, nice refactor! Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Peter Xu <peterx@redhat.com> writes: > CI: https://gitlab.com/peterx/qemu/-/pipelines/1625266692 > (note: warning is present on rust stuff, but shouldn't be relevant) > > This series refactors the migration switchover path quite a bit. I started > this work initially to measure the JSON writer overhead, but then I decided > to cleanup the switchover path in general when I am at it altogether, as I > wanted to do this for a long time. > > A few major things I tried to do: > > - About the JSON writer > > Currently, precopy migration always dumps a chunk of data called VM > description (QEMU_VM_VMDESCRIPTION) for debugging purpose. That is a > JSON blob explaining all the vmstates dumped in the migration stream. > QEMU has a machine property suppress-vmdesc deciding whether migration > will have that JSON chunk included. > > Postcopy does not have such JSON dump because postcopy is live session > and it can't normally be debugged from stream level (e.g. as a streamed > file). > > A tiny problem is we don't yet have a clue on how much cpu cycles we > need to construct and dump these JSONs even if they're only for > debugging, and even if suppress-vmdesc=on QEMU will still try to > construct these JSONs (e.g. also for postcopy). > > This series has a few patches just to make sure the JSON blob won't be > constructed if not needed (either postcopy, or suppress-vmdesc=on). I > tried to measure the downtime diff with/without these changes, the time > QEMU takes to construct / dump the JSON blob is still not measurable. > So I suppose unconditionally having this is ok. Said that, let's still > have these changes around so we avoid JSON operations if not needed. > > - DEVICE migration state > > QEMU has a very special DEVICE migration state, that only happens with > precopy, and only when pause-before-switchover capability is enabled. > Due to that specialty we can't merge precopy and postcopy code on > switchover starts, because the state machine will be different. > > However after I checked the history and also with libvirt developers, > this seems unnecessary. So I had one patch making DEVICE state to be > the "switchover" phase for precopy/postcopy unconditionally. That will > make the state machine much easier for both modes, meanwhile nothing is > expected to break with it (but please still shoot if anyone knows / > suspect something will, or could, break..). > > - General cleanups and fixes > > Most of the rest changes are random cleanups and fixes in the > switchover path. > > E.g., postcopy_start() has some code that isn't easy to read due to > some special flags here and there, mostly around the two calls of > qemu_savevm_state_complete_precopy(). This series will remove most of > those special treatments here and there. > > We could have done something twice in the past in postcopy switchover > (e.g. I believe we sync CPU twice.. but only happens with postcopy), > now they should all be sorted out. > > And quite some other things hopefully can be separately discussed and > justified in each patch. After these cleanups, we will be able to have > an unified entrance for precopy/postcopy on switchover. > > Initially I thought this could optimize the downtime slightly, but after > some tests, it turns out there's no measureable difference, at least in my > current setup... So let's take this as a cleanup series at least for now, > and I hope they would still make some sense. Comments welcomed. > > Thanks, > > Peter Xu (16): > migration: Remove postcopy implications in should_send_vmdesc() > migration: Do not construct JSON description if suppressed > migration: Optimize postcopy on downtime by avoiding JSON writer > migration: Avoid two src-downtime-end tracepoints for postcopy > migration: Drop inactivate_disk param in qemu_savevm_state_complete* > migration: Synchronize all CPU states only for non-iterable dump > migration: Adjust postcopy bandwidth during switchover > migration: Adjust locking in migration_maybe_pause() > migration: Drop cached migration state in migration_maybe_pause() > migration: Take BQL slightly longer in postcopy_start() > migration: Notify COMPLETE once for postcopy > migration: Unwrap qemu_savevm_state_complete_precopy() in postcopy > migration: Cleanup qemu_savevm_state_complete_precopy() > migration: Always set DEVICE state > migration: Merge precopy/postcopy on switchover start > migration: Trivial cleanup on JSON writer of vmstate_save() > > qapi/migration.json | 7 +- > migration/migration.h | 1 + > migration/savevm.h | 6 +- > migration/migration.c | 209 +++++++++++++++++++++++------------- > migration/savevm.c | 116 ++++++++------------ > migration/vmstate.c | 6 +- > tests/qtest/libqos/libqos.c | 3 +- > migration/trace-events | 2 +- > tests/qemu-iotests/194.out | 1 + > tests/qemu-iotests/203.out | 1 + > tests/qemu-iotests/234.out | 2 + > tests/qemu-iotests/262.out | 1 + > tests/qemu-iotests/280.out | 1 + > 13 files changed, 200 insertions(+), 156 deletions(-) Queued, thanks!