Message ID | 20241206005834.1050905-3-peterx@redhat.com (mailing list archive) |
---|---|
State | New |
Series | migration/multifd: Some VFIO / postcopy preparations on flush |
Peter Xu <peterx@redhat.com> writes:

> Teach multifd_send_sync_main() to sync with threads only.
>
> We already have such requests, which is when mapped-ram is enabled with
> multifd. In that case, no SYNC messages will be pushed to the stream when
> multifd syncs the sender threads because there's no destination threads
> waiting for that. The whole point of the sync is to make sure all threads
> flushed their jobs.

s/flushed/finished/ otherwise we risk confusing people.

>
> So fundamentally we have a request to do the sync in different ways:
>
>   - Either to sync the threads only,
>   - Or to sync the threads but also with the destination side.
>
> Mapped-ram did it already because of the use_packet check in the sync
> handler of the sender thread. It works.
>
> However it may stop working when e.g. VFIO may start to reuse multifd
> channels to push device states. In that case VFIO has similar request on
> "thread-only sync" however we can't check a flag because such sync request
> can still come from RAM which needs the on-wire notifications.
>
> Paving way for that by allowing the multifd_send_sync_main() to specify
> what kind of sync the caller needs. We can use it for mapped-ram already.
>
> No functional change intended.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  migration/multifd.h        | 19 ++++++++++++++++---
>  migration/multifd-nocomp.c |  7 ++++++-
>  migration/multifd.c        | 15 +++++++++------
>  3 files changed, 31 insertions(+), 10 deletions(-)
>
> diff --git a/migration/multifd.h b/migration/multifd.h
> index 50d58c0c9c..bd337631ec 100644
> --- a/migration/multifd.h
> +++ b/migration/multifd.h
> @@ -19,6 +19,18 @@
>  typedef struct MultiFDRecvData MultiFDRecvData;
>  typedef struct MultiFDSendData MultiFDSendData;
>
> +typedef enum {
> +    /* No sync request */
> +    MULTIFD_SYNC_NONE = 0,
> +    /* Sync locally on the sender threads without pushing messages */
> +    MULTIFD_SYNC_LOCAL,
> +    /*
> +     * Sync not only on the sender threads, but also push "SYNC" message to
> +     * the wire (which is for a remote sync).

s/SYNC/MULTIFD_FLAG_SYNC/

Do we need to also mention that this needs to be paired with a
multifd_recv_sync_main() via the emission of the
RAM_SAVE_FLAG_MULTIFD_FLUSH flag on the stream?

> +     */
> +    MULTIFD_SYNC_ALL,
> +} MultiFDSyncReq;
> +
>  bool multifd_send_setup(void);
>  void multifd_send_shutdown(void);
>  void multifd_send_channel_created(void);
> @@ -28,7 +40,7 @@ void multifd_recv_shutdown(void);
>  bool multifd_recv_all_channels_created(void);
>  void multifd_recv_new_channel(QIOChannel *ioc, Error **errp);
>  void multifd_recv_sync_main(void);
> -int multifd_send_sync_main(void);
> +int multifd_send_sync_main(MultiFDSyncReq req);
>  bool multifd_queue_page(RAMBlock *block, ram_addr_t offset);
>  bool multifd_recv(void);
>  MultiFDRecvData *multifd_get_recv_data(void);
> @@ -143,7 +155,7 @@ typedef struct {
>      /* multifd flags for each packet */
>      uint32_t flags;
>      /*
> -     * The sender thread has work to do if either of below boolean is set.
> +     * The sender thread has work to do if either of below field is set.
>       *
>       * @pending_job: a job is pending
>       * @pending_sync: a sync request is pending
> @@ -152,7 +164,8 @@ typedef struct {
>       * cleared by the multifd sender threads.
>       */
>      bool pending_job;
> -    bool pending_sync;
> +    MultiFDSyncReq pending_sync;
> +
>      MultiFDSendData *data;
>
>      /* thread local variables. No locking required */
> diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c
> index 55191152f9..219f9e58ef 100644
> --- a/migration/multifd-nocomp.c
> +++ b/migration/multifd-nocomp.c
> @@ -345,6 +345,8 @@ retry:
>
>  int multifd_ram_flush_and_sync(void)
>  {
> +    MultiFDSyncReq req;
> +
>      if (!migrate_multifd()) {
>          return 0;
>      }
> @@ -356,7 +358,10 @@ int multifd_ram_flush_and_sync(void)
>          }
>      }
>
> -    return multifd_send_sync_main();
> +    /* File migrations only need to sync with threads */
> +    req = migrate_mapped_ram() ? MULTIFD_SYNC_LOCAL : MULTIFD_SYNC_ALL;
> +
> +    return multifd_send_sync_main(req);
>  }
>
>  bool multifd_send_prepare_common(MultiFDSendParams *p)
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 498e71fd10..2248bd2d46 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -523,7 +523,7 @@ static int multifd_zero_copy_flush(QIOChannel *c)
>      return ret;
>  }
>
> -int multifd_send_sync_main(void)
> +int multifd_send_sync_main(MultiFDSyncReq req)
>  {
>      int i;
>      bool flush_zero_copy;

assert(req != MULTIFD_SYNC_NONE) ?

> @@ -543,8 +543,8 @@ int multifd_send_sync_main(void)
>           * We should be the only user so far, so not possible to be set by
>           * others concurrently.
>           */
> -        assert(qatomic_read(&p->pending_sync) == false);
> -        qatomic_set(&p->pending_sync, true);
> +        assert(qatomic_read(&p->pending_sync) == MULTIFD_SYNC_NONE);
> +        qatomic_set(&p->pending_sync, req);
>          qemu_sem_post(&p->sem);
>      }
>      for (i = 0; i < migrate_multifd_channels(); i++) {
> @@ -635,14 +635,17 @@ static void *multifd_send_thread(void *opaque)
>               */
>              qatomic_store_release(&p->pending_job, false);
>          } else {
> +            MultiFDSyncReq req = qatomic_read(&p->pending_sync);
> +
>              /*
>               * If not a normal job, must be a sync request. Note that
>               * pending_sync is a standalone flag (unlike pending_job), so
>               * it doesn't require explicit memory barriers.
>               */
> -            assert(qatomic_read(&p->pending_sync));
> +            assert(req != MULTIFD_SYNC_NONE);
>
> -            if (use_packets) {
> +            /* Only push the SYNC message if it involves a remote sync */
> +            if (req == MULTIFD_SYNC_ALL) {
>                  p->flags = MULTIFD_FLAG_SYNC;
>                  multifd_send_fill_packet(p);
>                  ret = qio_channel_write_all(p->c, (void *)p->packet,
> @@ -654,7 +657,7 @@ static void *multifd_send_thread(void *opaque)
>                  stat64_add(&mig_stats.multifd_bytes, p->packet_len);
>              }
>
> -            qatomic_set(&p->pending_sync, false);
> +            qatomic_set(&p->pending_sync, MULTIFD_SYNC_NONE);
>              qemu_sem_post(&p->sem_sync);
>          }
>      }
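The review above suggests an `assert(req != MULTIFD_SYNC_NONE)` at the top of multifd_send_sync_main(). A minimal sketch of what such an entry check could look like follows. It is a standalone model, not QEMU code: the enum mirrors the one in the patch, while `model_send_sync_main()` is a hypothetical stand-in that only performs the check.

```c
#include <assert.h>

/* Mirrors the enum introduced by the patch. */
typedef enum {
    MULTIFD_SYNC_NONE = 0,
    MULTIFD_SYNC_LOCAL,
    MULTIFD_SYNC_ALL,
} MultiFDSyncReq;

/*
 * Hypothetical stand-in for multifd_send_sync_main(): it only models the
 * suggested entry check. MULTIFD_SYNC_NONE means "nothing pending", so a
 * caller passing it would be a programming error.
 */
static int model_send_sync_main(MultiFDSyncReq req)
{
    assert(req != MULTIFD_SYNC_NONE);

    /* The real function would now kick every sender thread with req. */
    return 0;
}
```

With the check in place, a caller that mistakenly passes MULTIFD_SYNC_NONE aborts immediately instead of posting a request that the sender thread would later reject with its own assert.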
On Fri, Dec 06, 2024 at 10:26:06AM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
>
> > Teach multifd_send_sync_main() to sync with threads only.
> >
> > We already have such requests, which is when mapped-ram is enabled with
> > multifd. In that case, no SYNC messages will be pushed to the stream when
> > multifd syncs the sender threads because there's no destination threads
> > waiting for that. The whole point of the sync is to make sure all threads
> > flushed their jobs.
>
> s/flushed/finished/ otherwise we risk confusing people.

done.

>
> >
> > So fundamentally we have a request to do the sync in different ways:
> >
> >   - Either to sync the threads only,
> >   - Or to sync the threads but also with the destination side.
> >
> > Mapped-ram did it already because of the use_packet check in the sync
> > handler of the sender thread. It works.
> >
> > However it may stop working when e.g. VFIO may start to reuse multifd
> > channels to push device states. In that case VFIO has similar request on
> > "thread-only sync" however we can't check a flag because such sync request
> > can still come from RAM which needs the on-wire notifications.
> >
> > Paving way for that by allowing the multifd_send_sync_main() to specify
> > what kind of sync the caller needs. We can use it for mapped-ram already.
> >
> > No functional change intended.
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  migration/multifd.h        | 19 ++++++++++++++++---
> >  migration/multifd-nocomp.c |  7 ++++++-
> >  migration/multifd.c        | 15 +++++++++------
> >  3 files changed, 31 insertions(+), 10 deletions(-)
> >
> > diff --git a/migration/multifd.h b/migration/multifd.h
> > index 50d58c0c9c..bd337631ec 100644
> > --- a/migration/multifd.h
> > +++ b/migration/multifd.h
> > @@ -19,6 +19,18 @@
> >  typedef struct MultiFDRecvData MultiFDRecvData;
> >  typedef struct MultiFDSendData MultiFDSendData;
> >
> > +typedef enum {
> > +    /* No sync request */
> > +    MULTIFD_SYNC_NONE = 0,
> > +    /* Sync locally on the sender threads without pushing messages */
> > +    MULTIFD_SYNC_LOCAL,
> > +    /*
> > +     * Sync not only on the sender threads, but also push "SYNC" message to
> > +     * the wire (which is for a remote sync).
>
> s/SYNC/MULTIFD_FLAG_SYNC/
>
> Do we need to also mention that this needs to be paired with a
> multifd_recv_sync_main() via the emission of the
> RAM_SAVE_FLAG_MULTIFD_FLUSH flag on the stream?

If we want to mention something, IMO it would be better about what happens
on the src, not dest. It can be too hard to follow if we connect that
directly to the dest behavior.

Does this look good to you?

    /*
     * Sync not only on the sender threads, but also push MULTIFD_FLAG_SYNC
     * message to the wire for each iochannel (which is for a remote sync).
     *
     * When remote sync is used, need to be paired with a follow up
     * RAM_SAVE_FLAG_EOS / RAM_SAVE_FLAG_MULTIFD_FLUSH message on the main
     * channel.
     */

>
> > +     */
> > +    MULTIFD_SYNC_ALL,
> > +} MultiFDSyncReq;
> > +
> >  bool multifd_send_setup(void);
> >  void multifd_send_shutdown(void);
> >  void multifd_send_channel_created(void);
> > @@ -28,7 +40,7 @@ void multifd_recv_shutdown(void);
> >  bool multifd_recv_all_channels_created(void);
> >  void multifd_recv_new_channel(QIOChannel *ioc, Error **errp);
> >  void multifd_recv_sync_main(void);
> > -int multifd_send_sync_main(void);
> > +int multifd_send_sync_main(MultiFDSyncReq req);
> >  bool multifd_queue_page(RAMBlock *block, ram_addr_t offset);
> >  bool multifd_recv(void);
> >  MultiFDRecvData *multifd_get_recv_data(void);
> > @@ -143,7 +155,7 @@ typedef struct {
> >      /* multifd flags for each packet */
> >      uint32_t flags;
> >      /*
> > -     * The sender thread has work to do if either of below boolean is set.
> > +     * The sender thread has work to do if either of below field is set.
> >       *
> >       * @pending_job: a job is pending
> >       * @pending_sync: a sync request is pending
> > @@ -152,7 +164,8 @@ typedef struct {
> >       * cleared by the multifd sender threads.
> >       */
> >      bool pending_job;
> > -    bool pending_sync;
> > +    MultiFDSyncReq pending_sync;
> > +
> >      MultiFDSendData *data;
> >
> >      /* thread local variables. No locking required */
> > diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c
> > index 55191152f9..219f9e58ef 100644
> > --- a/migration/multifd-nocomp.c
> > +++ b/migration/multifd-nocomp.c
> > @@ -345,6 +345,8 @@ retry:
> >
> >  int multifd_ram_flush_and_sync(void)
> >  {
> > +    MultiFDSyncReq req;
> > +
> >      if (!migrate_multifd()) {
> >          return 0;
> >      }
> > @@ -356,7 +358,10 @@ int multifd_ram_flush_and_sync(void)
> >          }
> >      }
> >
> > -    return multifd_send_sync_main();
> > +    /* File migrations only need to sync with threads */
> > +    req = migrate_mapped_ram() ? MULTIFD_SYNC_LOCAL : MULTIFD_SYNC_ALL;
> > +
> > +    return multifd_send_sync_main(req);
> >  }
> >
> >  bool multifd_send_prepare_common(MultiFDSendParams *p)
> > diff --git a/migration/multifd.c b/migration/multifd.c
> > index 498e71fd10..2248bd2d46 100644
> > --- a/migration/multifd.c
> > +++ b/migration/multifd.c
> > @@ -523,7 +523,7 @@ static int multifd_zero_copy_flush(QIOChannel *c)
> >      return ret;
> >  }
> >
> > -int multifd_send_sync_main(void)
> > +int multifd_send_sync_main(MultiFDSyncReq req)
> >  {
> >      int i;
> >      bool flush_zero_copy;
>
> assert(req != MULTIFD_SYNC_NONE) ?

Sure.

>
> > @@ -543,8 +543,8 @@ int multifd_send_sync_main(void)
> >           * We should be the only user so far, so not possible to be set by
> >           * others concurrently.
> >           */
> > -        assert(qatomic_read(&p->pending_sync) == false);
> > -        qatomic_set(&p->pending_sync, true);
> > +        assert(qatomic_read(&p->pending_sync) == MULTIFD_SYNC_NONE);
> > +        qatomic_set(&p->pending_sync, req);
> >          qemu_sem_post(&p->sem);
> >      }
> >      for (i = 0; i < migrate_multifd_channels(); i++) {
> > @@ -635,14 +635,17 @@ static void *multifd_send_thread(void *opaque)
> >               */
> >              qatomic_store_release(&p->pending_job, false);
> >          } else {
> > +            MultiFDSyncReq req = qatomic_read(&p->pending_sync);
> > +
> >              /*
> >               * If not a normal job, must be a sync request. Note that
> >               * pending_sync is a standalone flag (unlike pending_job), so
> >               * it doesn't require explicit memory barriers.
> >               */
> > -            assert(qatomic_read(&p->pending_sync));
> > +            assert(req != MULTIFD_SYNC_NONE);
> >
> > -            if (use_packets) {
> > +            /* Only push the SYNC message if it involves a remote sync */
> > +            if (req == MULTIFD_SYNC_ALL) {
> >                  p->flags = MULTIFD_FLAG_SYNC;
> >                  multifd_send_fill_packet(p);
> >                  ret = qio_channel_write_all(p->c, (void *)p->packet,
> > @@ -654,7 +657,7 @@ static void *multifd_send_thread(void *opaque)
> >                  stat64_add(&mig_stats.multifd_bytes, p->packet_len);
> >              }
> >
> > -            qatomic_set(&p->pending_sync, false);
> > +            qatomic_set(&p->pending_sync, MULTIFD_SYNC_NONE);
> >              qemu_sem_post(&p->sem_sync);
> >          }
> >      }
>
Peter Xu <peterx@redhat.com> writes:

> On Fri, Dec 06, 2024 at 10:26:06AM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>>
>> > Teach multifd_send_sync_main() to sync with threads only.
>> >
>> > We already have such requests, which is when mapped-ram is enabled with
>> > multifd. In that case, no SYNC messages will be pushed to the stream when
>> > multifd syncs the sender threads because there's no destination threads
>> > waiting for that. The whole point of the sync is to make sure all threads
>> > flushed their jobs.
>>
>> s/flushed/finished/ otherwise we risk confusing people.
>
> done.
>
>>
>> >
>> > So fundamentally we have a request to do the sync in different ways:
>> >
>> >   - Either to sync the threads only,
>> >   - Or to sync the threads but also with the destination side.
>> >
>> > Mapped-ram did it already because of the use_packet check in the sync
>> > handler of the sender thread. It works.
>> >
>> > However it may stop working when e.g. VFIO may start to reuse multifd
>> > channels to push device states. In that case VFIO has similar request on
>> > "thread-only sync" however we can't check a flag because such sync request
>> > can still come from RAM which needs the on-wire notifications.
>> >
>> > Paving way for that by allowing the multifd_send_sync_main() to specify
>> > what kind of sync the caller needs. We can use it for mapped-ram already.
>> >
>> > No functional change intended.
>> >
>> > Signed-off-by: Peter Xu <peterx@redhat.com>
>> > ---
>> >  migration/multifd.h        | 19 ++++++++++++++++---
>> >  migration/multifd-nocomp.c |  7 ++++++-
>> >  migration/multifd.c        | 15 +++++++++------
>> >  3 files changed, 31 insertions(+), 10 deletions(-)
>> >
>> > diff --git a/migration/multifd.h b/migration/multifd.h
>> > index 50d58c0c9c..bd337631ec 100644
>> > --- a/migration/multifd.h
>> > +++ b/migration/multifd.h
>> > @@ -19,6 +19,18 @@
>> >  typedef struct MultiFDRecvData MultiFDRecvData;
>> >  typedef struct MultiFDSendData MultiFDSendData;
>> >
>> > +typedef enum {
>> > +    /* No sync request */
>> > +    MULTIFD_SYNC_NONE = 0,
>> > +    /* Sync locally on the sender threads without pushing messages */
>> > +    MULTIFD_SYNC_LOCAL,
>> > +    /*
>> > +     * Sync not only on the sender threads, but also push "SYNC" message to
>> > +     * the wire (which is for a remote sync).
>>
>> s/SYNC/MULTIFD_FLAG_SYNC/
>>
>> Do we need to also mention that this needs to be paired with a
>> multifd_recv_sync_main() via the emission of the
>> RAM_SAVE_FLAG_MULTIFD_FLUSH flag on the stream?
>
> If we want to mention something, IMO it would be better about what happens
> on the src, not dest. It can be too hard to follow if we connect that
> directly to the dest behavior.
>
> Does this look good to you?
>
>     /*
>      * Sync not only on the sender threads, but also push MULTIFD_FLAG_SYNC
>      * message to the wire for each iochannel (which is for a remote sync).
>      *
>      * When remote sync is used, need to be paired with a follow up
>      * RAM_SAVE_FLAG_EOS / RAM_SAVE_FLAG_MULTIFD_FLUSH message on the main
>      * channel.
>      */

Yes, thanks.
diff --git a/migration/multifd.h b/migration/multifd.h
index 50d58c0c9c..bd337631ec 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -19,6 +19,18 @@
 typedef struct MultiFDRecvData MultiFDRecvData;
 typedef struct MultiFDSendData MultiFDSendData;
 
+typedef enum {
+    /* No sync request */
+    MULTIFD_SYNC_NONE = 0,
+    /* Sync locally on the sender threads without pushing messages */
+    MULTIFD_SYNC_LOCAL,
+    /*
+     * Sync not only on the sender threads, but also push "SYNC" message to
+     * the wire (which is for a remote sync).
+     */
+    MULTIFD_SYNC_ALL,
+} MultiFDSyncReq;
+
 bool multifd_send_setup(void);
 void multifd_send_shutdown(void);
 void multifd_send_channel_created(void);
@@ -28,7 +40,7 @@ void multifd_recv_shutdown(void);
 bool multifd_recv_all_channels_created(void);
 void multifd_recv_new_channel(QIOChannel *ioc, Error **errp);
 void multifd_recv_sync_main(void);
-int multifd_send_sync_main(void);
+int multifd_send_sync_main(MultiFDSyncReq req);
 bool multifd_queue_page(RAMBlock *block, ram_addr_t offset);
 bool multifd_recv(void);
 MultiFDRecvData *multifd_get_recv_data(void);
@@ -143,7 +155,7 @@ typedef struct {
     /* multifd flags for each packet */
     uint32_t flags;
     /*
-     * The sender thread has work to do if either of below boolean is set.
+     * The sender thread has work to do if either of below field is set.
      *
      * @pending_job: a job is pending
      * @pending_sync: a sync request is pending
@@ -152,7 +164,8 @@ typedef struct {
      * cleared by the multifd sender threads.
      */
     bool pending_job;
-    bool pending_sync;
+    MultiFDSyncReq pending_sync;
+
     MultiFDSendData *data;
 
     /* thread local variables. No locking required */
diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c
index 55191152f9..219f9e58ef 100644
--- a/migration/multifd-nocomp.c
+++ b/migration/multifd-nocomp.c
@@ -345,6 +345,8 @@ retry:
 
 int multifd_ram_flush_and_sync(void)
 {
+    MultiFDSyncReq req;
+
     if (!migrate_multifd()) {
         return 0;
     }
@@ -356,7 +358,10 @@ int multifd_ram_flush_and_sync(void)
         }
     }
 
-    return multifd_send_sync_main();
+    /* File migrations only need to sync with threads */
+    req = migrate_mapped_ram() ? MULTIFD_SYNC_LOCAL : MULTIFD_SYNC_ALL;
+
+    return multifd_send_sync_main(req);
 }
 
 bool multifd_send_prepare_common(MultiFDSendParams *p)
diff --git a/migration/multifd.c b/migration/multifd.c
index 498e71fd10..2248bd2d46 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -523,7 +523,7 @@ static int multifd_zero_copy_flush(QIOChannel *c)
     return ret;
 }
 
-int multifd_send_sync_main(void)
+int multifd_send_sync_main(MultiFDSyncReq req)
 {
     int i;
     bool flush_zero_copy;
@@ -543,8 +543,8 @@ int multifd_send_sync_main(void)
          * We should be the only user so far, so not possible to be set by
          * others concurrently.
          */
-        assert(qatomic_read(&p->pending_sync) == false);
-        qatomic_set(&p->pending_sync, true);
+        assert(qatomic_read(&p->pending_sync) == MULTIFD_SYNC_NONE);
+        qatomic_set(&p->pending_sync, req);
         qemu_sem_post(&p->sem);
     }
     for (i = 0; i < migrate_multifd_channels(); i++) {
@@ -635,14 +635,17 @@ static void *multifd_send_thread(void *opaque)
              */
             qatomic_store_release(&p->pending_job, false);
         } else {
+            MultiFDSyncReq req = qatomic_read(&p->pending_sync);
+
             /*
              * If not a normal job, must be a sync request. Note that
              * pending_sync is a standalone flag (unlike pending_job), so
              * it doesn't require explicit memory barriers.
              */
-            assert(qatomic_read(&p->pending_sync));
+            assert(req != MULTIFD_SYNC_NONE);
 
-            if (use_packets) {
+            /* Only push the SYNC message if it involves a remote sync */
+            if (req == MULTIFD_SYNC_ALL) {
                 p->flags = MULTIFD_FLAG_SYNC;
                 multifd_send_fill_packet(p);
                 ret = qio_channel_write_all(p->c, (void *)p->packet,
@@ -654,7 +657,7 @@ static void *multifd_send_thread(void *opaque)
                 stat64_add(&mig_stats.multifd_bytes, p->packet_len);
             }
 
-            qatomic_set(&p->pending_sync, false);
+            qatomic_set(&p->pending_sync, MULTIFD_SYNC_NONE);
             qemu_sem_post(&p->sem_sync);
         }
     }
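The pending_sync handoff in the diff can be modelled outside QEMU as a rough sketch. Everything below is a simplified assumption for illustration: `ModelChannel`, `model_request_sync()` and `model_handle_sync()` are hypothetical names, C11 `atomic_store`/`atomic_load` stand in for `qatomic_set`/`qatomic_read`, a counter stands in for writing a MULTIFD_FLAG_SYNC packet, and the semaphores are left out.

```c
#include <assert.h>
#include <stdatomic.h>

/* Mirrors the enum introduced by the patch. */
typedef enum {
    MULTIFD_SYNC_NONE = 0,
    MULTIFD_SYNC_LOCAL,
    MULTIFD_SYNC_ALL,
} MultiFDSyncReq;

/* Hypothetical, simplified stand-in for one multifd send channel. */
typedef struct {
    _Atomic int pending_sync;   /* holds a MultiFDSyncReq value */
    int sync_packets_sent;      /* counts modelled on-wire SYNC packets */
} ModelChannel;

/* Migration-thread side: publish the request for the sender thread. */
static void model_request_sync(ModelChannel *c, MultiFDSyncReq req)
{
    /* Only one request may be in flight per channel. */
    assert(atomic_load(&c->pending_sync) == MULTIFD_SYNC_NONE);
    atomic_store(&c->pending_sync, (int)req);
}

/* Sender-thread side: consume the request and clear it. */
static void model_handle_sync(ModelChannel *c)
{
    MultiFDSyncReq req = (MultiFDSyncReq)atomic_load(&c->pending_sync);

    assert(req != MULTIFD_SYNC_NONE);

    /* Only a remote sync puts a SYNC packet on the wire. */
    if (req == MULTIFD_SYNC_ALL) {
        c->sync_packets_sent++;
    }

    atomic_store(&c->pending_sync, (int)MULTIFD_SYNC_NONE);
}
```

In this model a MULTIFD_SYNC_LOCAL request clears pending_sync without touching the packet counter, while a MULTIFD_SYNC_ALL request increments it once. The decision depends on the request itself rather than on a global use_packets-style flag.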
Teach multifd_send_sync_main() to sync with threads only.

We already have such requests, which is when mapped-ram is enabled with
multifd. In that case, no SYNC messages will be pushed to the stream when
multifd syncs the sender threads because there's no destination threads
waiting for that. The whole point of the sync is to make sure all threads
flushed their jobs.

So fundamentally we have a request to do the sync in different ways:

  - Either to sync the threads only,
  - Or to sync the threads but also with the destination side.

Mapped-ram did it already because of the use_packet check in the sync
handler of the sender thread. It works.

However it may stop working when e.g. VFIO may start to reuse multifd
channels to push device states. In that case VFIO has similar request on
"thread-only sync" however we can't check a flag because such sync request
can still come from RAM which needs the on-wire notifications.

Paving way for that by allowing the multifd_send_sync_main() to specify
what kind of sync the caller needs. We can use it for mapped-ram already.

No functional change intended.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/multifd.h        | 19 ++++++++++++++++---
 migration/multifd-nocomp.c |  7 ++++++-
 migration/multifd.c        | 15 +++++++++------
 3 files changed, 31 insertions(+), 10 deletions(-)
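
The caller-side choice described in the commit message can be sketched as a small standalone model. The enum mirrors the patch; `ram_sync_req()` and `sync_pushes_packet()` are hypothetical helper names used only for illustration, and the `mapped_ram` parameter stands in for the result of migrate_mapped_ram().

```c
#include <stdbool.h>

/* Mirrors the enum introduced by the patch. */
typedef enum {
    MULTIFD_SYNC_NONE = 0,
    MULTIFD_SYNC_LOCAL,
    MULTIFD_SYNC_ALL,
} MultiFDSyncReq;

/*
 * Hypothetical helper modelling the choice made in
 * multifd_ram_flush_and_sync(): file (mapped-ram) migrations only need to
 * sync with the sender threads, everything else also syncs the destination.
 */
static MultiFDSyncReq ram_sync_req(bool mapped_ram)
{
    return mapped_ram ? MULTIFD_SYNC_LOCAL : MULTIFD_SYNC_ALL;
}

/*
 * Hypothetical helper modelling the sender-thread check: a SYNC packet is
 * pushed to the wire only for a remote sync.
 */
static bool sync_pushes_packet(MultiFDSyncReq req)
{
    return req == MULTIFD_SYNC_ALL;
}
```

Because the sync type travels with each request, a later caller that wants a thread-only sync could pass MULTIFD_SYNC_LOCAL while RAM keeps requesting MULTIFD_SYNC_ALL in the same migration, which a single global flag could not express.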