Message ID | 20230501140141.11743-3-avihaih@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | migration: Add precopy initial data capability and VFIO precopy support | expand |
(I've left high level comment in cover letter, but still some quick comments I noticed when reading) On Mon, May 01, 2023 at 05:01:35PM +0300, Avihai Horon wrote: > Add precopy initial data handshake between source and destination upon > migration setup. The purpose of the handshake is to notify the > destination that precopy initial data is used and which migration users > (i.e., SaveStateEntry) are going to use it. > > The handshake is done in two levels. First, a general enable command is > sent to notify the destination migration code that precopy initial data > is used. > Then, for each migration user in the source that supports precopy > initial data, an enable command is sent to its counterpart in the > destination: > If both support it, precopy initial data will be used for them. > If source doesn't support it, precopy initial data will not be used for > them. > If source supports it and destination doesn't, migration will be failed. > > To implement it, a new migration command MIG_CMD_INITIAL_DATA_ENABLE is > added, as well as a new SaveVMHandlers handler initial_data_advise. > Calling the handler advises the migration user that precopy initial data > is used and its return value indicates whether precopy initial data is > supported by it. > > Signed-off-by: Avihai Horon <avihaih@nvidia.com> > --- > include/migration/register.h | 6 +++ > migration/migration.h | 3 ++ > migration/savevm.h | 1 + > migration/migration.c | 4 ++ > migration/savevm.c | 102 +++++++++++++++++++++++++++++++++++ > migration/trace-events | 2 + > 6 files changed, 118 insertions(+) > > diff --git a/include/migration/register.h b/include/migration/register.h > index a8dfd8fefd..0a73f3883e 100644 > --- a/include/migration/register.h > +++ b/include/migration/register.h > @@ -71,6 +71,12 @@ typedef struct SaveVMHandlers { > int (*load_cleanup)(void *opaque); > /* Called when postcopy migration wants to resume from failure */ > int (*resume_prepare)(MigrationState *s, void *opaque); > + > + /* > + * Advises that precopy initial data was requested to be enabled. Returns > + * true if it's supported or false otherwise. Called both in src and dest. > + */ > + bool (*initial_data_advise)(void *opaque); > } SaveVMHandlers; > > int register_savevm_live(const char *idstr, > diff --git a/migration/migration.h b/migration/migration.h > index 3a918514e7..4f615e9dbc 100644 > --- a/migration/migration.h > +++ b/migration/migration.h > @@ -204,6 +204,9 @@ struct MigrationIncomingState { > * contains valid information. > */ > QemuMutex page_request_mutex; > + > + /* Indicates whether precopy initial data was enabled and should be used */ > + bool initial_data_enabled; > }; > > MigrationIncomingState *migration_incoming_get_current(void); > diff --git a/migration/savevm.h b/migration/savevm.h > index fb636735f0..d47ab4ad18 100644 > --- a/migration/savevm.h > +++ b/migration/savevm.h > @@ -58,6 +58,7 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name, > uint64_t *start_list, > uint64_t *length_list); > void qemu_savevm_send_colo_enable(QEMUFile *f); > +void qemu_savevm_send_initial_data_enable(MigrationState *ms, QEMUFile *f); > void qemu_savevm_live_state(QEMUFile *f); > int qemu_save_device_state(QEMUFile *f); > > diff --git a/migration/migration.c b/migration/migration.c > index abcadbb619..68cdf5b184 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -2964,6 +2964,10 @@ static void *migration_thread(void *opaque) > qemu_savevm_send_colo_enable(s->to_dst_file); > } > > + if (migrate_precopy_initial_data()) { > + qemu_savevm_send_initial_data_enable(s, s->to_dst_file); > + } > + > qemu_savevm_state_setup(s->to_dst_file); > > qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP, > diff --git a/migration/savevm.c b/migration/savevm.c > index a9181b444b..2740defdf0 100644 > --- a/migration/savevm.c > +++ b/migration/savevm.c > @@ -71,6 +71,13 @@ > > const unsigned int postcopy_ram_discard_version; > > +typedef struct { > + uint8_t general_enable; > + uint8_t reserved[7]; > + uint8_t idstr[256]; > + uint32_t instance_id; > +} InitialDataInfo; > + > /* Subcommands for QEMU_VM_COMMAND */ > enum qemu_vm_cmd { > MIG_CMD_INVALID = 0, /* Must be 0 */ > @@ -90,6 +97,8 @@ enum qemu_vm_cmd { > MIG_CMD_ENABLE_COLO, /* Enable COLO */ > MIG_CMD_POSTCOPY_RESUME, /* resume postcopy on dest */ > MIG_CMD_RECV_BITMAP, /* Request for recved bitmap on dst */ > + > + MIG_CMD_INITIAL_DATA_ENABLE, /* Enable precopy initial data in dest */ > MIG_CMD_MAX > }; > > @@ -109,6 +118,8 @@ static struct mig_cmd_args { > [MIG_CMD_POSTCOPY_RESUME] = { .len = 0, .name = "POSTCOPY_RESUME" }, > [MIG_CMD_PACKAGED] = { .len = 4, .name = "PACKAGED" }, > [MIG_CMD_RECV_BITMAP] = { .len = -1, .name = "RECV_BITMAP" }, > + [MIG_CMD_INITIAL_DATA_ENABLE] = { .len = sizeof(InitialDataInfo), > + .name = "INITIAL_DATA_ENABLE" }, > [MIG_CMD_MAX] = { .len = -1, .name = "MAX" }, > }; > > @@ -1036,6 +1047,40 @@ static void qemu_savevm_command_send(QEMUFile *f, > qemu_fflush(f); > } > > +void qemu_savevm_send_initial_data_enable(MigrationState *ms, QEMUFile *f) > +{ > + SaveStateEntry *se; > + InitialDataInfo buf; > + > + /* Enable precopy initial data generally in the migration */ > + memset(&buf, 0, sizeof(buf)); > + buf.general_enable = 1; > + qemu_savevm_command_send(f, MIG_CMD_INITIAL_DATA_ENABLE, sizeof(buf), > + (uint8_t *)&buf); > + trace_savevm_send_initial_data_enable(buf.general_enable, (char *)buf.idstr, > + buf.instance_id); Maybe a generalized helper would be nice here to send one MIG_CMD_INITIAL_DATA_ENABLE packet, then it can be used below too. Here instance_id is multi-bytes, we may want to consider endianess. it seems the common way is use big endian over the wire for qemu migration msgs. > + > + /* Enable precopy initial data for each migration user that supports it */ > + QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { > + if (!se->ops || !se->ops->initial_data_advise) { > + continue; > + } > + > + if (!se->ops->initial_data_advise(se->opaque)) { > + continue; > + } > + > + memset(&buf, 0, sizeof(buf)); > + memcpy(buf.idstr, se->idstr, sizeof(buf.idstr)); > + buf.instance_id = se->instance_id; > + > + qemu_savevm_command_send(f, MIG_CMD_INITIAL_DATA_ENABLE, sizeof(buf), > + (uint8_t *)&buf); > + trace_savevm_send_initial_data_enable( > + buf.general_enable, (char *)buf.idstr, buf.instance_id); > + } > +} > + > void qemu_savevm_send_colo_enable(QEMUFile *f) > { > trace_savevm_send_colo_enable(); > @@ -2314,6 +2359,60 @@ static int loadvm_process_enable_colo(MigrationIncomingState *mis) > return ret; > } > > +static int loadvm_handle_initial_data_enable(MigrationIncomingState *mis) > +{ > + InitialDataInfo buf; > + SaveStateEntry *se; > + ssize_t read_size; > + > + read_size = qemu_get_buffer(mis->from_src_file, (void *)&buf, sizeof(buf)); > + if (read_size != sizeof(buf)) { > + error_report("%s: Could not get data buffer, read_size %ld, len %lu", > + __func__, read_size, sizeof(buf)); > + > + return -EIO; > + } > + > + /* Enable precopy initial data generally in the migration */ > + if (buf.general_enable) { > + mis->initial_data_enabled = true; > + trace_loadvm_handle_initial_data_enable( > + buf.general_enable, (char *)buf.idstr, buf.instance_id); > + > + return 0; > + } > + > + /* Enable precopy initial data for a specific migration user */ > + se = find_se((char *)buf.idstr, buf.instance_id); > + if (!se) { > + error_report("%s: Could not find SaveStateEntry, idstr '%s', " > + "instance_id %" PRIu32, > + __func__, buf.idstr, buf.instance_id); > + > + return -ENOENT; > + } > + > + if (!se->ops || !se->ops->initial_data_advise) { > + error_report("%s: '%s' doesn't have required " > + "initial_data_advise op", > + __func__, buf.idstr); > + > + return -EOPNOTSUPP; > + } > + > + if (!se->ops->initial_data_advise(se->opaque)) { > + error_report("%s: '%s' doesn't support precopy initial data", __func__, > + buf.idstr); > + > + return -EOPNOTSUPP; > + } > + > + trace_loadvm_handle_initial_data_enable(buf.general_enable, > + (char *)buf.idstr, buf.instance_id); > + > + return 0; > +} > + > /* > * Process an incoming 'QEMU_VM_COMMAND' > * 0 just a normal return > @@ -2397,6 +2496,9 @@ static int loadvm_process_command(QEMUFile *f) > > case MIG_CMD_ENABLE_COLO: > return loadvm_process_enable_colo(mis); > + > + case MIG_CMD_INITIAL_DATA_ENABLE: > + return loadvm_handle_initial_data_enable(mis); > } > > return 0; > diff --git a/migration/trace-events b/migration/trace-events > index 92161eeac5..21ae471126 100644 > --- a/migration/trace-events > +++ b/migration/trace-events > @@ -23,6 +23,7 @@ loadvm_postcopy_ram_handle_discard_end(void) "" > loadvm_postcopy_ram_handle_discard_header(const char *ramid, uint16_t len) "%s: %ud" > loadvm_process_command(const char *s, uint16_t len) "com=%s len=%d" > loadvm_process_command_ping(uint32_t val) "0x%x" > +loadvm_handle_initial_data_enable(uint8_t general_enable, const char *idstr, int instance_id) "general_enable=%u, idstr=%s, instance_id=%u" > postcopy_ram_listen_thread_exit(void) "" > postcopy_ram_listen_thread_start(void) "" > qemu_savevm_send_postcopy_advise(void) "" > @@ -38,6 +39,7 @@ savevm_send_postcopy_run(void) "" > savevm_send_postcopy_resume(void) "" > savevm_send_colo_enable(void) "" > savevm_send_recv_bitmap(char *name) "%s" > +savevm_send_initial_data_enable(uint8_t general_enable, const char *idstr, int instance_id) "general_enable=%u, idstr=%s, instance_id=%u" > savevm_state_setup(void) "" > savevm_state_resume_prepare(void) "" > savevm_state_header(void) "" > -- > 2.26.3 >
On 03/05/2023 1:54, Peter Xu wrote: > External email: Use caution opening links or attachments > > > (I've left high level comment in cover letter, but still some quick > comments I noticed when reading) > > On Mon, May 01, 2023 at 05:01:35PM +0300, Avihai Horon wrote: >> Add precopy initial data handshake between source and destination upon >> migration setup. The purpose of the handshake is to notify the >> destination that precopy initial data is used and which migration users >> (i.e., SaveStateEntry) are going to use it. >> >> The handshake is done in two levels. First, a general enable command is >> sent to notify the destination migration code that precopy initial data >> is used. >> Then, for each migration user in the source that supports precopy >> initial data, an enable command is sent to its counterpart in the >> destination: >> If both support it, precopy initial data will be used for them. >> If source doesn't support it, precopy initial data will not be used for >> them. >> If source supports it and destination doesn't, migration will be failed. >> >> To implement it, a new migration command MIG_CMD_INITIAL_DATA_ENABLE is >> added, as well as a new SaveVMHandlers handler initial_data_advise. >> Calling the handler advises the migration user that precopy initial data >> is used and its return value indicates whether precopy initial data is >> supported by it. >> >> Signed-off-by: Avihai Horon <avihaih@nvidia.com> >> --- >> include/migration/register.h | 6 +++ >> migration/migration.h | 3 ++ >> migration/savevm.h | 1 + >> migration/migration.c | 4 ++ >> migration/savevm.c | 102 +++++++++++++++++++++++++++++++++++ >> migration/trace-events | 2 + >> 6 files changed, 118 insertions(+) >> >> diff --git a/include/migration/register.h b/include/migration/register.h >> index a8dfd8fefd..0a73f3883e 100644 >> --- a/include/migration/register.h >> +++ b/include/migration/register.h >> @@ -71,6 +71,12 @@ typedef struct SaveVMHandlers { >> int (*load_cleanup)(void *opaque); >> /* Called when postcopy migration wants to resume from failure */ >> int (*resume_prepare)(MigrationState *s, void *opaque); >> + >> + /* >> + * Advises that precopy initial data was requested to be enabled. Returns >> + * true if it's supported or false otherwise. Called both in src and dest. >> + */ >> + bool (*initial_data_advise)(void *opaque); >> } SaveVMHandlers; >> >> int register_savevm_live(const char *idstr, >> diff --git a/migration/migration.h b/migration/migration.h >> index 3a918514e7..4f615e9dbc 100644 >> --- a/migration/migration.h >> +++ b/migration/migration.h >> @@ -204,6 +204,9 @@ struct MigrationIncomingState { >> * contains valid information. >> */ >> QemuMutex page_request_mutex; >> + >> + /* Indicates whether precopy initial data was enabled and should be used */ >> + bool initial_data_enabled; >> }; >> >> MigrationIncomingState *migration_incoming_get_current(void); >> diff --git a/migration/savevm.h b/migration/savevm.h >> index fb636735f0..d47ab4ad18 100644 >> --- a/migration/savevm.h >> +++ b/migration/savevm.h >> @@ -58,6 +58,7 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name, >> uint64_t *start_list, >> uint64_t *length_list); >> void qemu_savevm_send_colo_enable(QEMUFile *f); >> +void qemu_savevm_send_initial_data_enable(MigrationState *ms, QEMUFile *f); >> void qemu_savevm_live_state(QEMUFile *f); >> int qemu_save_device_state(QEMUFile *f); >> >> diff --git a/migration/migration.c b/migration/migration.c >> index abcadbb619..68cdf5b184 100644 >> --- a/migration/migration.c >> +++ b/migration/migration.c >> @@ -2964,6 +2964,10 @@ static void *migration_thread(void *opaque) >> qemu_savevm_send_colo_enable(s->to_dst_file); >> } >> >> + if (migrate_precopy_initial_data()) { >> + qemu_savevm_send_initial_data_enable(s, s->to_dst_file); >> + } >> + >> qemu_savevm_state_setup(s->to_dst_file); >> >> qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP, >> diff --git a/migration/savevm.c b/migration/savevm.c >> index a9181b444b..2740defdf0 100644 >> --- a/migration/savevm.c >> +++ b/migration/savevm.c >> @@ -71,6 +71,13 @@ >> >> const unsigned int postcopy_ram_discard_version; >> >> +typedef struct { >> + uint8_t general_enable; >> + uint8_t reserved[7]; >> + uint8_t idstr[256]; >> + uint32_t instance_id; >> +} InitialDataInfo; >> + >> /* Subcommands for QEMU_VM_COMMAND */ >> enum qemu_vm_cmd { >> MIG_CMD_INVALID = 0, /* Must be 0 */ >> @@ -90,6 +97,8 @@ enum qemu_vm_cmd { >> MIG_CMD_ENABLE_COLO, /* Enable COLO */ >> MIG_CMD_POSTCOPY_RESUME, /* resume postcopy on dest */ >> MIG_CMD_RECV_BITMAP, /* Request for recved bitmap on dst */ >> + >> + MIG_CMD_INITIAL_DATA_ENABLE, /* Enable precopy initial data in dest */ >> MIG_CMD_MAX >> }; >> >> @@ -109,6 +118,8 @@ static struct mig_cmd_args { >> [MIG_CMD_POSTCOPY_RESUME] = { .len = 0, .name = "POSTCOPY_RESUME" }, >> [MIG_CMD_PACKAGED] = { .len = 4, .name = "PACKAGED" }, >> [MIG_CMD_RECV_BITMAP] = { .len = -1, .name = "RECV_BITMAP" }, >> + [MIG_CMD_INITIAL_DATA_ENABLE] = { .len = sizeof(InitialDataInfo), >> + .name = "INITIAL_DATA_ENABLE" }, >> [MIG_CMD_MAX] = { .len = -1, .name = "MAX" }, >> }; >> >> @@ -1036,6 +1047,40 @@ static void qemu_savevm_command_send(QEMUFile *f, >> qemu_fflush(f); >> } >> >> +void qemu_savevm_send_initial_data_enable(MigrationState *ms, QEMUFile *f) >> +{ >> + SaveStateEntry *se; >> + InitialDataInfo buf; >> + >> + /* Enable precopy initial data generally in the migration */ >> + memset(&buf, 0, sizeof(buf)); >> + buf.general_enable = 1; >> + qemu_savevm_command_send(f, MIG_CMD_INITIAL_DATA_ENABLE, sizeof(buf), >> + (uint8_t *)&buf); >> + trace_savevm_send_initial_data_enable(buf.general_enable, (char *)buf.idstr, >> + buf.instance_id); > Maybe a generalized helper would be nice here to send one > MIG_CMD_INITIAL_DATA_ENABLE packet, then it can be used below too. Sure, I will do that. > > Here instance_id is multi-bytes, we may want to consider endianess. it > seems the common way is use big endian over the wire for qemu migration > msgs. Oh, right. I will fix it. Thanks. >> + >> + /* Enable precopy initial data for each migration user that supports it */ >> + QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { >> + if (!se->ops || !se->ops->initial_data_advise) { >> + continue; >> + } >> + >> + if (!se->ops->initial_data_advise(se->opaque)) { >> + continue; >> + } >> + >> + memset(&buf, 0, sizeof(buf)); >> + memcpy(buf.idstr, se->idstr, sizeof(buf.idstr)); >> + buf.instance_id = se->instance_id; >> + >> + qemu_savevm_command_send(f, MIG_CMD_INITIAL_DATA_ENABLE, sizeof(buf), >> + (uint8_t *)&buf); >> + trace_savevm_send_initial_data_enable( >> + buf.general_enable, (char *)buf.idstr, buf.instance_id); >> + } >> +} >> + >> void qemu_savevm_send_colo_enable(QEMUFile *f) >> { >> trace_savevm_send_colo_enable(); >> @@ -2314,6 +2359,60 @@ static int loadvm_process_enable_colo(MigrationIncomingState *mis) >> return ret; >> } >> >> +static int loadvm_handle_initial_data_enable(MigrationIncomingState *mis) >> +{ >> + InitialDataInfo buf; >> + SaveStateEntry *se; >> + ssize_t read_size; >> + >> + read_size = qemu_get_buffer(mis->from_src_file, (void *)&buf, sizeof(buf)); >> + if (read_size != sizeof(buf)) { >> + error_report("%s: Could not get data buffer, read_size %ld, len %lu", >> + __func__, read_size, sizeof(buf)); >> + >> + return -EIO; >> + } >> + >> + /* Enable precopy initial data generally in the migration */ >> + if (buf.general_enable) { >> + mis->initial_data_enabled = true; >> + trace_loadvm_handle_initial_data_enable( >> + buf.general_enable, (char *)buf.idstr, buf.instance_id); >> + >> + return 0; >> + } >> + >> + /* Enable precopy initial data for a specific migration user */ >> + se = find_se((char *)buf.idstr, buf.instance_id); >> + if (!se) { >> + error_report("%s: Could not find SaveStateEntry, idstr '%s', " >> + "instance_id %" PRIu32, >> + __func__, buf.idstr, buf.instance_id); >> + >> + return -ENOENT; >> + } >> + >> + if (!se->ops || !se->ops->initial_data_advise) { >> + error_report("%s: '%s' doesn't have required " >> + "initial_data_advise op", >> + __func__, buf.idstr); >> + >> + return -EOPNOTSUPP; >> + } >> + >> + if (!se->ops->initial_data_advise(se->opaque)) { >> + error_report("%s: '%s' doesn't support precopy initial data", __func__, >> + buf.idstr); >> + >> + return -EOPNOTSUPP; >> + } >> + >> + trace_loadvm_handle_initial_data_enable(buf.general_enable, >> + (char *)buf.idstr, buf.instance_id); >> + >> + return 0; >> +} >> + >> /* >> * Process an incoming 'QEMU_VM_COMMAND' >> * 0 just a normal return >> @@ -2397,6 +2496,9 @@ static int loadvm_process_command(QEMUFile *f) >> >> case MIG_CMD_ENABLE_COLO: >> return loadvm_process_enable_colo(mis); >> + >> + case MIG_CMD_INITIAL_DATA_ENABLE: >> + return loadvm_handle_initial_data_enable(mis); >> } >> >> return 0; >> diff --git a/migration/trace-events b/migration/trace-events >> index 92161eeac5..21ae471126 100644 >> --- a/migration/trace-events >> +++ b/migration/trace-events >> @@ -23,6 +23,7 @@ loadvm_postcopy_ram_handle_discard_end(void) "" >> loadvm_postcopy_ram_handle_discard_header(const char *ramid, uint16_t len) "%s: %ud" >> loadvm_process_command(const char *s, uint16_t len) "com=%s len=%d" >> loadvm_process_command_ping(uint32_t val) "0x%x" >> +loadvm_handle_initial_data_enable(uint8_t general_enable, const char *idstr, int instance_id) "general_enable=%u, idstr=%s, instance_id=%u" >> postcopy_ram_listen_thread_exit(void) "" >> postcopy_ram_listen_thread_start(void) "" >> qemu_savevm_send_postcopy_advise(void) "" >> @@ -38,6 +39,7 @@ savevm_send_postcopy_run(void) "" >> savevm_send_postcopy_resume(void) "" >> savevm_send_colo_enable(void) "" >> savevm_send_recv_bitmap(char *name) "%s" >> +savevm_send_initial_data_enable(uint8_t general_enable, const char *idstr, int instance_id) "general_enable=%u, idstr=%s, instance_id=%u" >> savevm_state_setup(void) "" >> savevm_state_resume_prepare(void) "" >> savevm_state_header(void) "" >> -- >> 2.26.3 >> > -- > Peter Xu >
Avihai Horon <avihaih@nvidia.com> wrote: > Add precopy initial data handshake between source and destination upon > migration setup. The purpose of the handshake is to notify the > destination that precopy initial data is used and which migration users > (i.e., SaveStateEntry) are going to use it. > > The handshake is done in two levels. First, a general enable command is > sent to notify the destination migration code that precopy initial data > is used. > Then, for each migration user in the source that supports precopy > initial data, an enable command is sent to its counterpart in the > destination: > If both support it, precopy initial data will be used for them. > If source doesn't support it, precopy initial data will not be used for > them. > If source supports it and destination doesn't, migration will be failed. > > To implement it, a new migration command MIG_CMD_INITIAL_DATA_ENABLE is > added, as well as a new SaveVMHandlers handler initial_data_advise. > Calling the handler advises the migration user that precopy initial data > is used and its return value indicates whether precopy initial data is > supported by it. > > Signed-off-by: Avihai Horon <avihaih@nvidia.com> > diff --git a/migration/savevm.c b/migration/savevm.c > index a9181b444b..2740defdf0 100644 > --- a/migration/savevm.c > +++ b/migration/savevm.c > @@ -71,6 +71,13 @@ > > const unsigned int postcopy_ram_discard_version; > > +typedef struct { > + uint8_t general_enable; I miss a comment for this field. I think that you only use the values 0 and 1 And that 1 means something like: we require this feature and do nothing And that 0 means that this is a device that uses the feature and <something, something> > + uint8_t reserved[7]; > + uint8_t idstr[256]; > + uint32_t instance_id; > +} InitialDataInfo; We have several "reserved" space here. Do we want a Version field? It don't seem that we need a size field, as the command is fixed length. > @@ -90,6 +97,8 @@ enum qemu_vm_cmd { > MIG_CMD_ENABLE_COLO, /* Enable COLO */ > MIG_CMD_POSTCOPY_RESUME, /* resume postcopy on dest */ > MIG_CMD_RECV_BITMAP, /* Request for recved bitmap on dst */ > + Spurious blank line > + MIG_CMD_INITIAL_DATA_ENABLE, /* Enable precopy initial data in dest */ > MIG_CMD_MAX > +void qemu_savevm_send_initial_data_enable(MigrationState *ms, QEMUFile *f) > +{ > + SaveStateEntry *se; > + InitialDataInfo buf; Small nit. The new way in the block to declare that something needs to be initialized to zero is: InitialDataInfo buf = {}; And no, I have no clue if this makes the compiler generate any better code. > + /* Enable precopy initial data generally in the migration */ > + memset(&buf, 0, sizeof(buf)); > + buf.general_enable = 1; > + qemu_savevm_command_send(f, MIG_CMD_INITIAL_DATA_ENABLE, sizeof(buf), > + (uint8_t *)&buf); > + trace_savevm_send_initial_data_enable(buf.general_enable, (char *)buf.idstr, > + buf.instance_id); No buf.idstr here. Why do we need a command before the loop and seeing if we are having any device that requires this. > + /* Enable precopy initial data for each migration user that supports it */ > + QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { > + if (!se->ops || !se->ops->initial_data_advise) { > + continue; > + } > + > + if (!se->ops->initial_data_advise(se->opaque)) { > + continue; > + } Is this callback going to send anything? > + > + memset(&buf, 0, sizeof(buf)); > + memcpy(buf.idstr, se->idstr, sizeof(buf.idstr)); > + buf.instance_id = se->instance_id; > + > + qemu_savevm_command_send(f, MIG_CMD_INITIAL_DATA_ENABLE, sizeof(buf), > + (uint8_t *)&buf); > + trace_savevm_send_initial_data_enable( > + buf.general_enable, (char *)buf.idstr, buf.instance_id); It is me or general_enable is always zero here? > + } > +} > + > void qemu_savevm_send_colo_enable(QEMUFile *f) > { > trace_savevm_send_colo_enable(); > @@ -2314,6 +2359,60 @@ static int loadvm_process_enable_colo(MigrationIncomingState *mis) > return ret; > } > > +static int loadvm_handle_initial_data_enable(MigrationIncomingState *mis) > +{ > + InitialDataInfo buf; > + SaveStateEntry *se; > + ssize_t read_size; > + > + read_size = qemu_get_buffer(mis->from_src_file, (void *)&buf, sizeof(buf)); > + if (read_size != sizeof(buf)) { > + error_report("%s: Could not get data buffer, read_size %ld, len %lu", > + __func__, read_size, sizeof(buf)); > + > + return -EIO; > + } > + > + /* Enable precopy initial data generally in the migration */ > + if (buf.general_enable) { > + mis->initial_data_enabled = true; > + trace_loadvm_handle_initial_data_enable( > + buf.general_enable, (char *)buf.idstr, buf.instance_id); > + > + return 0; > + } > + > + /* Enable precopy initial data for a specific migration user */ > + se = find_se((char *)buf.idstr, buf.instance_id); > + if (!se) { > + error_report("%s: Could not find SaveStateEntry, idstr '%s', " > + "instance_id %" PRIu32, > + __func__, buf.idstr, buf.instance_id); > + > + return -ENOENT; > + } > + > + if (!se->ops || !se->ops->initial_data_advise) { > + error_report("%s: '%s' doesn't have required " > + "initial_data_advise op", > + __func__, buf.idstr); > + > + return -EOPNOTSUPP; > + } > + > + if (!se->ops->initial_data_advise(se->opaque)) { > + error_report("%s: '%s' doesn't support precopy initial data", __func__, > + buf.idstr); This is not your fault. Just venting. And here we are, again, with a place where we can't return errors. Sniff. > + > + return -EOPNOTSUPP; > + } I have to wait until I see the usage later in the series, but it is a good idea to have a single handle for source and destination, and not passing at least a parameter telling where are we? Really nice patch, very good done and very good integrated with the surrounded style. A pleasure. Later, Juan.
On 10/05/2023 11:40, Juan Quintela wrote: > External email: Use caution opening links or attachments > > > Avihai Horon <avihaih@nvidia.com> wrote: >> Add precopy initial data handshake between source and destination upon >> migration setup. The purpose of the handshake is to notify the >> destination that precopy initial data is used and which migration users >> (i.e., SaveStateEntry) are going to use it. >> >> The handshake is done in two levels. First, a general enable command is >> sent to notify the destination migration code that precopy initial data >> is used. >> Then, for each migration user in the source that supports precopy >> initial data, an enable command is sent to its counterpart in the >> destination: >> If both support it, precopy initial data will be used for them. >> If source doesn't support it, precopy initial data will not be used for >> them. >> If source supports it and destination doesn't, migration will be failed. >> >> To implement it, a new migration command MIG_CMD_INITIAL_DATA_ENABLE is >> added, as well as a new SaveVMHandlers handler initial_data_advise. >> Calling the handler advises the migration user that precopy initial data >> is used and its return value indicates whether precopy initial data is >> supported by it. >> >> Signed-off-by: Avihai Horon <avihaih@nvidia.com> >> diff --git a/migration/savevm.c b/migration/savevm.c >> index a9181b444b..2740defdf0 100644 >> --- a/migration/savevm.c >> +++ b/migration/savevm.c >> @@ -71,6 +71,13 @@ >> >> const unsigned int postcopy_ram_discard_version; >> >> +typedef struct { >> + uint8_t general_enable; > I miss a comment for this field. > > I think that you only use the values 0 and 1 > And that 1 means something like: we require this feature and do nothing > And that 0 means that this is a device that uses the feature and > <something, something> Correct. But if we assume both source and destination will enable the capability then general_enable can just be dropped. >> + uint8_t reserved[7]; >> + uint8_t idstr[256]; >> + uint32_t instance_id; >> +} InitialDataInfo; > We have several "reserved" space here. Do we want a Version field? You mean SaveStateEntry->version_id field? If so, then I don't think it's necessary, it's checked later on when device data is sent. > It don't seem that we need a size field, as the command is fixed length. > >> @@ -90,6 +97,8 @@ enum qemu_vm_cmd { >> MIG_CMD_ENABLE_COLO, /* Enable COLO */ >> MIG_CMD_POSTCOPY_RESUME, /* resume postcopy on dest */ >> MIG_CMD_RECV_BITMAP, /* Request for recved bitmap on dst */ >> + > Spurious blank line Will remove. > >> + MIG_CMD_INITIAL_DATA_ENABLE, /* Enable precopy initial data in dest */ >> MIG_CMD_MAX > > >> +void qemu_savevm_send_initial_data_enable(MigrationState *ms, QEMUFile *f) >> +{ >> + SaveStateEntry *se; >> + InitialDataInfo buf; > Small nit. > > The new way in the block to declare that something needs to be > initialized to zero is: > > InitialDataInfo buf = {}; > > And no, I have no clue if this makes the compiler generate any better code. Will change. > >> + /* Enable precopy initial data generally in the migration */ >> + memset(&buf, 0, sizeof(buf)); >> + buf.general_enable = 1; >> + qemu_savevm_command_send(f, MIG_CMD_INITIAL_DATA_ENABLE, sizeof(buf), >> + (uint8_t *)&buf); >> + trace_savevm_send_initial_data_enable(buf.general_enable, (char *)buf.idstr, >> + buf.instance_id); > No buf.idstr here. > > Why do we need a command before the loop and seeing if we are having any > device that requires this. If we assume both source and destination will enable the capability then this can be dropped as well. > >> + /* Enable precopy initial data for each migration user that supports it */ >> + QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { >> + if (!se->ops || !se->ops->initial_data_advise) { >> + continue; >> + } >> + >> + if (!se->ops->initial_data_advise(se->opaque)) { >> + continue; >> + } > Is this callback going to send anything? Nope. > >> + >> + memset(&buf, 0, sizeof(buf)); >> + memcpy(buf.idstr, se->idstr, sizeof(buf.idstr)); >> + buf.instance_id = se->instance_id; >> + >> + qemu_savevm_command_send(f, MIG_CMD_INITIAL_DATA_ENABLE, sizeof(buf), >> + (uint8_t *)&buf); >> + trace_savevm_send_initial_data_enable( >> + buf.general_enable, (char *)buf.idstr, buf.instance_id); > It is me or general_enable is always zero here? Yes, it's always 0. But as I wrote above, if we assume both source and destination will enable the capability then general_enable can be dropped. > >> + } >> +} >> + >> void qemu_savevm_send_colo_enable(QEMUFile *f) >> { >> trace_savevm_send_colo_enable(); >> @@ -2314,6 +2359,60 @@ static int loadvm_process_enable_colo(MigrationIncomingState *mis) >> return ret; >> } >> >> +static int loadvm_handle_initial_data_enable(MigrationIncomingState *mis) >> +{ >> + InitialDataInfo buf; >> + SaveStateEntry *se; >> + ssize_t read_size; >> + >> + read_size = qemu_get_buffer(mis->from_src_file, (void *)&buf, sizeof(buf)); >> + if (read_size != sizeof(buf)) { >> + error_report("%s: Could not get data buffer, read_size %ld, len %lu", >> + __func__, read_size, sizeof(buf)); >> + >> + return -EIO; >> + } >> + >> + /* Enable precopy initial data generally in the migration */ >> + if (buf.general_enable) { >> + mis->initial_data_enabled = true; >> + trace_loadvm_handle_initial_data_enable( >> + buf.general_enable, (char *)buf.idstr, buf.instance_id); >> + >> + return 0; >> + } >> + >> + /* Enable precopy initial data for a specific migration user */ >> + se = find_se((char *)buf.idstr, buf.instance_id); >> + if (!se) { >> + error_report("%s: Could not find SaveStateEntry, idstr '%s', " >> + "instance_id %" PRIu32, >> + __func__, buf.idstr, buf.instance_id); >> + >> + return -ENOENT; >> + } >> + >> + if (!se->ops || !se->ops->initial_data_advise) { >> + error_report("%s: '%s' doesn't have required " >> + "initial_data_advise op", >> + __func__, buf.idstr); >> + >> + return -EOPNOTSUPP; >> + } >> + >> + if (!se->ops->initial_data_advise(se->opaque)) { >> + error_report("%s: '%s' doesn't support precopy initial data", __func__, >> + buf.idstr); > This is not your fault. Just venting. > > And here we are, again, with a place where we can't return errors. Sniff. > >> + >> + return -EOPNOTSUPP; >> + } > I have to wait until I see the usage later in the series, but it is a > good idea to have a single handle for source and destination, and not > passing at least a parameter telling where are we? Are you referring here to the initial_data_advise() handler? If so, then I didn't need to distinguish between source and destination. But I can add a parameter if you think it could be useful/good practice. > > Really nice patch, very good done and very good integrated with the > surrounded style. A pleasure. Thanks!
Hello Avihai, > +static int loadvm_handle_initial_data_enable(MigrationIncomingState *mis) > +{ > + InitialDataInfo buf; > + SaveStateEntry *se; > + ssize_t read_size; > + > + read_size = qemu_get_buffer(mis->from_src_file, (void *)&buf, sizeof(buf)); > + if (read_size != sizeof(buf)) { > + error_report("%s: Could not get data buffer, read_size %ld, len %lu", please use %zd and %zu for ssize_t. Thanks, C.
On 14/05/2023 19:42, Cédric Le Goater wrote: > External email: Use caution opening links or attachments > > > Hello Avihai, > >> +static int loadvm_handle_initial_data_enable(MigrationIncomingState >> *mis) >> +{ >> + InitialDataInfo buf; >> + SaveStateEntry *se; >> + ssize_t read_size; >> + >> + read_size = qemu_get_buffer(mis->from_src_file, (void *)&buf, >> sizeof(buf)); >> + if (read_size != sizeof(buf)) { >> + error_report("%s: Could not get data buffer, read_size %ld, >> len %lu", > > please use %zd and %zu for ssize_t. > Ah, right. Sure will change. Thanks.
diff --git a/include/migration/register.h b/include/migration/register.h index a8dfd8fefd..0a73f3883e 100644 --- a/include/migration/register.h +++ b/include/migration/register.h @@ -71,6 +71,12 @@ typedef struct SaveVMHandlers { int (*load_cleanup)(void *opaque); /* Called when postcopy migration wants to resume from failure */ int (*resume_prepare)(MigrationState *s, void *opaque); + + /* + * Advises that precopy initial data was requested to be enabled. Returns + * true if it's supported or false otherwise. Called both in src and dest. + */ + bool (*initial_data_advise)(void *opaque); } SaveVMHandlers; int register_savevm_live(const char *idstr, diff --git a/migration/migration.h b/migration/migration.h index 3a918514e7..4f615e9dbc 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -204,6 +204,9 @@ struct MigrationIncomingState { * contains valid information. */ QemuMutex page_request_mutex; + + /* Indicates whether precopy initial data was enabled and should be used */ + bool initial_data_enabled; }; MigrationIncomingState *migration_incoming_get_current(void); diff --git a/migration/savevm.h b/migration/savevm.h index fb636735f0..d47ab4ad18 100644 --- a/migration/savevm.h +++ b/migration/savevm.h @@ -58,6 +58,7 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name, uint64_t *start_list, uint64_t *length_list); void qemu_savevm_send_colo_enable(QEMUFile *f); +void qemu_savevm_send_initial_data_enable(MigrationState *ms, QEMUFile *f); void qemu_savevm_live_state(QEMUFile *f); int qemu_save_device_state(QEMUFile *f); diff --git a/migration/migration.c b/migration/migration.c index abcadbb619..68cdf5b184 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -2964,6 +2964,10 @@ static void *migration_thread(void *opaque) qemu_savevm_send_colo_enable(s->to_dst_file); } + if (migrate_precopy_initial_data()) { + qemu_savevm_send_initial_data_enable(s, s->to_dst_file); + } + qemu_savevm_state_setup(s->to_dst_file); qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP, diff --git a/migration/savevm.c b/migration/savevm.c index a9181b444b..2740defdf0 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -71,6 +71,13 @@ const unsigned int postcopy_ram_discard_version; +typedef struct { + uint8_t general_enable; + uint8_t reserved[7]; + uint8_t idstr[256]; + uint32_t instance_id; +} InitialDataInfo; + /* Subcommands for QEMU_VM_COMMAND */ enum qemu_vm_cmd { MIG_CMD_INVALID = 0, /* Must be 0 */ @@ -90,6 +97,8 @@ enum qemu_vm_cmd { MIG_CMD_ENABLE_COLO, /* Enable COLO */ MIG_CMD_POSTCOPY_RESUME, /* resume postcopy on dest */ MIG_CMD_RECV_BITMAP, /* Request for recved bitmap on dst */ + + MIG_CMD_INITIAL_DATA_ENABLE, /* Enable precopy initial data in dest */ MIG_CMD_MAX }; @@ -109,6 +118,8 @@ static struct mig_cmd_args { [MIG_CMD_POSTCOPY_RESUME] = { .len = 0, .name = "POSTCOPY_RESUME" }, [MIG_CMD_PACKAGED] = { .len = 4, .name = "PACKAGED" }, [MIG_CMD_RECV_BITMAP] = { .len = -1, .name = "RECV_BITMAP" }, + [MIG_CMD_INITIAL_DATA_ENABLE] = { .len = sizeof(InitialDataInfo), + .name = "INITIAL_DATA_ENABLE" }, [MIG_CMD_MAX] = { .len = -1, .name = "MAX" }, }; @@ -1036,6 +1047,40 @@ static void qemu_savevm_command_send(QEMUFile *f, qemu_fflush(f); } +void qemu_savevm_send_initial_data_enable(MigrationState *ms, QEMUFile *f) +{ + SaveStateEntry *se; + InitialDataInfo buf; + + /* Enable precopy initial data generally in the migration */ + memset(&buf, 0, sizeof(buf)); + buf.general_enable = 1; + qemu_savevm_command_send(f, MIG_CMD_INITIAL_DATA_ENABLE, sizeof(buf), + (uint8_t *)&buf); + trace_savevm_send_initial_data_enable(buf.general_enable, (char *)buf.idstr, + buf.instance_id); + + /* Enable precopy initial data for each migration user that supports it */ + QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { + if (!se->ops || !se->ops->initial_data_advise) { + continue; + } + + if (!se->ops->initial_data_advise(se->opaque)) { + continue; + } + + memset(&buf, 0, sizeof(buf)); + memcpy(buf.idstr, se->idstr, sizeof(buf.idstr)); + buf.instance_id = se->instance_id; + + qemu_savevm_command_send(f, MIG_CMD_INITIAL_DATA_ENABLE, sizeof(buf), + (uint8_t *)&buf); + trace_savevm_send_initial_data_enable( + buf.general_enable, (char *)buf.idstr, buf.instance_id); + } +} + void qemu_savevm_send_colo_enable(QEMUFile *f) { trace_savevm_send_colo_enable(); @@ -2314,6 +2359,60 @@ static int loadvm_process_enable_colo(MigrationIncomingState *mis) return ret; } +static int loadvm_handle_initial_data_enable(MigrationIncomingState *mis) +{ + InitialDataInfo buf; + SaveStateEntry *se; + ssize_t read_size; + + read_size = qemu_get_buffer(mis->from_src_file, (void *)&buf, sizeof(buf)); + if (read_size != sizeof(buf)) { + error_report("%s: Could not get data buffer, read_size %ld, len %lu", + __func__, read_size, sizeof(buf)); + + return -EIO; + } + + /* Enable precopy initial data generally in the migration */ + if (buf.general_enable) { + mis->initial_data_enabled = true; + trace_loadvm_handle_initial_data_enable( + buf.general_enable, (char *)buf.idstr, buf.instance_id); + + return 0; + } + + /* Enable precopy initial data for a specific migration user */ + se = find_se((char *)buf.idstr, buf.instance_id); + if (!se) { + error_report("%s: Could not find SaveStateEntry, idstr '%s', " + "instance_id %" PRIu32, + __func__, buf.idstr, buf.instance_id); + + return -ENOENT; + } + + if (!se->ops || !se->ops->initial_data_advise) { + error_report("%s: '%s' doesn't have required " + "initial_data_advise op", + __func__, buf.idstr); + + return -EOPNOTSUPP; + } + + if (!se->ops->initial_data_advise(se->opaque)) { + error_report("%s: '%s' doesn't support precopy initial data", __func__, + buf.idstr); + + return -EOPNOTSUPP; + } + + trace_loadvm_handle_initial_data_enable(buf.general_enable, + (char *)buf.idstr, buf.instance_id); + + return 0; +} + /* * Process an incoming 'QEMU_VM_COMMAND' * 0 just a normal return @@ -2397,6 +2496,9 @@ static int loadvm_process_command(QEMUFile *f) case MIG_CMD_ENABLE_COLO: return loadvm_process_enable_colo(mis); + + case MIG_CMD_INITIAL_DATA_ENABLE: + return loadvm_handle_initial_data_enable(mis); } return 0; diff --git a/migration/trace-events b/migration/trace-events index 92161eeac5..21ae471126 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -23,6 +23,7 @@ loadvm_postcopy_ram_handle_discard_end(void) "" loadvm_postcopy_ram_handle_discard_header(const char *ramid, uint16_t len) "%s: %ud" loadvm_process_command(const char *s, uint16_t len) "com=%s len=%d" loadvm_process_command_ping(uint32_t val) "0x%x" +loadvm_handle_initial_data_enable(uint8_t general_enable, const char *idstr, int instance_id) "general_enable=%u, idstr=%s, instance_id=%u" postcopy_ram_listen_thread_exit(void) "" postcopy_ram_listen_thread_start(void) "" qemu_savevm_send_postcopy_advise(void) "" @@ -38,6 +39,7 @@ savevm_send_postcopy_run(void) "" savevm_send_postcopy_resume(void) "" savevm_send_colo_enable(void) "" savevm_send_recv_bitmap(char *name) "%s" +savevm_send_initial_data_enable(uint8_t general_enable, const char *idstr, int instance_id) "general_enable=%u, idstr=%s, instance_id=%u" savevm_state_setup(void) "" savevm_state_resume_prepare(void) "" savevm_state_header(void) ""
Add precopy initial data handshake between source and destination upon migration setup. The purpose of the handshake is to notify the destination that precopy initial data is used and which migration users (i.e., SaveStateEntry) are going to use it. The handshake is done in two levels. First, a general enable command is sent to notify the destination migration code that precopy initial data is used. Then, for each migration user in the source that supports precopy initial data, an enable command is sent to its counterpart in the destination: If both support it, precopy initial data will be used for them. If source doesn't support it, precopy initial data will not be used for them. If source supports it and destination doesn't, migration will be failed. To implement it, a new migration command MIG_CMD_INITIAL_DATA_ENABLE is added, as well as a new SaveVMHandlers handler initial_data_advise. Calling the handler advises the migration user that precopy initial data is used and its return value indicates whether precopy initial data is supported by it. Signed-off-by: Avihai Horon <avihaih@nvidia.com> --- include/migration/register.h | 6 +++ migration/migration.h | 3 ++ migration/savevm.h | 1 + migration/migration.c | 4 ++ migration/savevm.c | 102 +++++++++++++++++++++++++++++++++++ migration/trace-events | 2 + 6 files changed, 118 insertions(+)