Message ID | 20230518160026.57414-1-wei.w.wang@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v1] migration: fail the cap check if it requires the use of deferred incoming | expand |
On Fri, May 19, 2023 at 12:00:26AM +0800, Wei Wang wrote: > qemu_start_incoming_migration needs to check the number of multifd > channels or postcopy ram channels to configure the backlog parameter (i.e. > the maximum length to which the queue of pending connections for sockfd > may grow) of listen(). So multifd and postcopy-preempt caps require the > use of deferred incoming, that is, calling qemu_start_incoming_migration > should be deferred via qmp or hmp commands after the cap of multifd and > postcopy-preempt are configured. > > Check if deferred incoming is used when enabling multifd or > postcopy-preempt, and fail the check with error messages if not. > > Signed-off-by: Wei Wang <wei.w.wang@intel.com> IIUC this will unfortunately break things like: -global migration.x-postcopy-preempt=on where the cap is actually applied before incoming starts even with !defer so it should still work. Can we just make socket_start_incoming_migration_internal() listen on a static but larger value?
On Friday, May 19, 2023 3:20 AM, Peter Xu wrote: > On Fri, May 19, 2023 at 12:00:26AM +0800, Wei Wang wrote: > > qemu_start_incoming_migration needs to check the number of multifd > > channels or postcopy ram channels to configure the backlog parameter (i.e. > > the maximum length to which the queue of pending connections for > > sockfd may grow) of listen(). So multifd and postcopy-preempt caps > > require the use of deferred incoming, that is, calling > > qemu_start_incoming_migration should be deferred via qmp or hmp > > commands after the cap of multifd and postcopy-preempt are configured. > > > > Check if deferred incoming is used when enabling multifd or > > postcopy-preempt, and fail the check with error messages if not. > > > > Signed-off-by: Wei Wang <wei.w.wang@intel.com> > > IIUC this will unfortunately break things like: > > -global migration.x-postcopy-preempt=on > > where the cap is actually applied before incoming starts even with !defer so > it should still work. Actually the patch doesn’t check "!defer". It just checks if incoming has been started or not. It allows the 2 caps to be set only before incoming starts. So I think the above should work. > > Can we just make socket_start_incoming_migration_internal() listen on a > static but larger value? Yes, agree for this and that's out initial change. This needs listen() to create a longer queue for pending connections (seems OK to me). Need to see Daniel and Juan's opinion about this.
On Thu, May 18, 2023 at 03:20:02PM -0400, Peter Xu wrote: > On Fri, May 19, 2023 at 12:00:26AM +0800, Wei Wang wrote: > > qemu_start_incoming_migration needs to check the number of multifd > > channels or postcopy ram channels to configure the backlog parameter (i.e. > > the maximum length to which the queue of pending connections for sockfd > > may grow) of listen(). So multifd and postcopy-preempt caps require the > > use of deferred incoming, that is, calling qemu_start_incoming_migration > > should be deferred via qmp or hmp commands after the cap of multifd and > > postcopy-preempt are configured. > > > > Check if deferred incoming is used when enabling multifd or > > postcopy-preempt, and fail the check with error messages if not. > > > > Signed-off-by: Wei Wang <wei.w.wang@intel.com> > > IIUC this will unfortunately break things like: > > -global migration.x-postcopy-preempt=on > > where the cap is actually applied before incoming starts even with !defer > so it should still work. > > Can we just make socket_start_incoming_migration_internal() listen on a > static but larger value? Why do we need todo that ? I thought we just determined the problem was a configuration error, not a code error. With regards, Daniel
On Fri, May 19, 2023 at 02:34:57AM +0000, Wang, Wei W wrote: > On Friday, May 19, 2023 3:20 AM, Peter Xu wrote: > > On Fri, May 19, 2023 at 12:00:26AM +0800, Wei Wang wrote: > > > qemu_start_incoming_migration needs to check the number of multifd > > > channels or postcopy ram channels to configure the backlog parameter (i.e. > > > the maximum length to which the queue of pending connections for > > > sockfd may grow) of listen(). So multifd and postcopy-preempt caps > > > require the use of deferred incoming, that is, calling > > > qemu_start_incoming_migration should be deferred via qmp or hmp > > > commands after the cap of multifd and postcopy-preempt are configured. > > > > > > Check if deferred incoming is used when enabling multifd or > > > postcopy-preempt, and fail the check with error messages if not. > > > > > > Signed-off-by: Wei Wang <wei.w.wang@intel.com> > > > > IIUC this will unfortunately break things like: > > > > -global migration.x-postcopy-preempt=on > > > > where the cap is actually applied before incoming starts even with !defer so > > it should still work. > > Actually the patch doesn’t check "!defer". It just checks if incoming has been started > or not. It allows the 2 caps to be set only before incoming starts. So I think the above > should work. Ah yes indeed it keeps working, because we apply -global bits before setup sockets. Then it's fine by me since that's the only thing I would still like to keep it working. :) If so, can we reword the error message a bit? Obviously as you said we're not really checking against -defer, but established channels. The problem is if something is established without knowing multifd being there it may not work for multifd or preempt, not strictly about defer. How about: "Multifd/Preempt-Mode cannot be modified if incoming channel has setup" ?
On Fri, May 19, 2023 at 09:26:23AM +0100, Daniel P. Berrangé wrote: > On Thu, May 18, 2023 at 03:20:02PM -0400, Peter Xu wrote: > > On Fri, May 19, 2023 at 12:00:26AM +0800, Wei Wang wrote: > > > qemu_start_incoming_migration needs to check the number of multifd > > > channels or postcopy ram channels to configure the backlog parameter (i.e. > > > the maximum length to which the queue of pending connections for sockfd > > > may grow) of listen(). So multifd and postcopy-preempt caps require the > > > use of deferred incoming, that is, calling qemu_start_incoming_migration > > > should be deferred via qmp or hmp commands after the cap of multifd and > > > postcopy-preempt are configured. > > > > > > Check if deferred incoming is used when enabling multifd or > > > postcopy-preempt, and fail the check with error messages if not. > > > > > > Signed-off-by: Wei Wang <wei.w.wang@intel.com> > > > > IIUC this will unfortunately break things like: > > > > -global migration.x-postcopy-preempt=on > > > > where the cap is actually applied before incoming starts even with !defer > > so it should still work. > > > > Can we just make socket_start_incoming_migration_internal() listen on a > > static but larger value? > > Why do we need todo that ? I thought we just determined the problem was > a configuration error, not a code error. Yes, sorry I just read the other thread for more context. So it seems my concern wasn't really a concern anyway, now I'm fine with the current approach. Thanks,
On Fri, May 19, 2023 at 11:30:31AM -0400, Peter Xu wrote: > On Fri, May 19, 2023 at 02:34:57AM +0000, Wang, Wei W wrote: > > On Friday, May 19, 2023 3:20 AM, Peter Xu wrote: > > > On Fri, May 19, 2023 at 12:00:26AM +0800, Wei Wang wrote: > > > > qemu_start_incoming_migration needs to check the number of multifd > > > > channels or postcopy ram channels to configure the backlog parameter (i.e. > > > > the maximum length to which the queue of pending connections for > > > > sockfd may grow) of listen(). So multifd and postcopy-preempt caps > > > > require the use of deferred incoming, that is, calling > > > > qemu_start_incoming_migration should be deferred via qmp or hmp > > > > commands after the cap of multifd and postcopy-preempt are configured. > > > > > > > > Check if deferred incoming is used when enabling multifd or > > > > postcopy-preempt, and fail the check with error messages if not. > > > > > > > > Signed-off-by: Wei Wang <wei.w.wang@intel.com> > > > > > > IIUC this will unfortunately break things like: > > > > > > -global migration.x-postcopy-preempt=on > > > > > > where the cap is actually applied before incoming starts even with !defer so > > > it should still work. > > > > Actually the patch doesn’t check "!defer". It just checks if incoming has been started > > or not. It allows the 2 caps to be set only before incoming starts. So I think the above > > should work. > > Ah yes indeed it keeps working, because we apply -global bits before setup > sockets. Then it's fine by me since that's the only thing I would still > like to keep it working. :) > > If so, can we reword the error message a bit? Obviously as you said we're > not really checking against -defer, but established channels. The problem > is if something is established without knowing multifd being there it may > not work for multifd or preempt, not strictly about defer. > > How about: > > "Multifd/Preempt-Mode cannot be modified if incoming channel has setup" > > ? We may also want to trap the channel setups on num: migrate_params_test_apply(): if (params->has_multifd_channels) { dest->multifd_channels = params->multifd_channels; }
On Friday, May 19, 2023 11:34 PM, Peter Xu wrote: > > Ah yes indeed it keeps working, because we apply -global bits before > > setup sockets. Then it's fine by me since that's the only thing I > > would still like to keep it working. :) > > > > If so, can we reword the error message a bit? Obviously as you said > > we're not really checking against -defer, but established channels. > > The problem is if something is established without knowing multifd > > being there it may not work for multifd or preempt, not strictly about defer. > > > > How about: > > > > "Multifd/Preempt-Mode cannot be modified if incoming channel has > setup" > > > > ? Yes, I'll reword it a bit. > > We may also want to trap the channel setups on num: > > migrate_params_test_apply(): > > if (params->has_multifd_channels) { > dest->multifd_channels = params->multifd_channels; > } Didn’t get this one. What do you want to add to above?
On Sat, May 20, 2023 at 01:42:06AM +0000, Wang, Wei W wrote: > On Friday, May 19, 2023 11:34 PM, Peter Xu wrote: > > > Ah yes indeed it keeps working, because we apply -global bits before > > > setup sockets. Then it's fine by me since that's the only thing I > > > would still like to keep it working. :) > > > > > > If so, can we reword the error message a bit? Obviously as you said > > > we're not really checking against -defer, but established channels. > > > The problem is if something is established without knowing multifd > > > being there it may not work for multifd or preempt, not strictly about defer. > > > > > > How about: > > > > > > "Multifd/Preempt-Mode cannot be modified if incoming channel has > > setup" > > > > > > ? > > Yes, I'll reword it a bit. > > > > > We may also want to trap the channel setups on num: > > > > migrate_params_test_apply(): > > > > if (params->has_multifd_channels) { > > dest->multifd_channels = params->multifd_channels; > > } > > Didn’t get this one. What do you want to add to above? I meant after listen() is called with an explicit number in this case, should we disallow changing of multifd number of channels?
On Tuesday, May 23, 2023 7:36 AM, Peter Xu wrote: > > > We may also want to trap the channel setups on num: > > > > > > migrate_params_test_apply(): > > > > > > if (params->has_multifd_channels) { > > > dest->multifd_channels = params->multifd_channels; > > > } > > > > Didn’t get this one. What do you want to add to above? > > I meant after listen() is called with an explicit number in this case, should we > disallow changing of multifd number of channels? Got you, thanks. That seems unnecessary to me, as the cap setting is required for the use of multifd and patching there already achieves below what we want: - users get the error message when deferred -incoming isn’t used; - fail the cap setting for multifd, meaning that multifd won't be used (i.e. no place that will care about multifd_channels).
On Tue, May 23, 2023 at 01:44:03AM +0000, Wang, Wei W wrote: > On Tuesday, May 23, 2023 7:36 AM, Peter Xu wrote: > > > > We may also want to trap the channel setups on num: > > > > > > > > migrate_params_test_apply(): > > > > > > > > if (params->has_multifd_channels) { > > > > dest->multifd_channels = params->multifd_channels; > > > > } > > > > > > Didn’t get this one. What do you want to add to above? > > > > I meant after listen() is called with an explicit number in this case, should we > > disallow changing of multifd number of channels? > > Got you, thanks. That seems unnecessary to me, as the cap setting is required > for the use of multifd and patching there already achieves below what we want: > - users get the error message when deferred -incoming isn’t used; > - fail the cap setting for multifd, meaning that multifd won't be used (i.e. > no place that will care about multifd_channels). It's about whether we want to protect e.g. below steps: 1. start dest qemu with -incoming defer 2. "migrate-set-capabilities" to enable multifd 3. "migrate-incoming xxx" to setup the sockets 4. "migrate-set-parameters" to setup the num of multifd <--- will be invalid here Would that still be a problem that falls into the same category of what this patch wants to protect qemu from? Thanks,
On Tuesday, May 23, 2023 9:41 PM, Peter Xu wrote: > On Tue, May 23, 2023 at 01:44:03AM +0000, Wang, Wei W wrote: > > On Tuesday, May 23, 2023 7:36 AM, Peter Xu wrote: > > > > > We may also want to trap the channel setups on num: > > > > > > > > > > migrate_params_test_apply(): > > > > > > > > > > if (params->has_multifd_channels) { > > > > > dest->multifd_channels = params->multifd_channels; > > > > > } > > > > > > > > Didn’t get this one. What do you want to add to above? > > > > > > I meant after listen() is called with an explicit number in this > > > case, should we disallow changing of multifd number of channels? > > > > Got you, thanks. That seems unnecessary to me, as the cap setting is > > required for the use of multifd and patching there already achieves below > what we want: > > - users get the error message when deferred -incoming isn’t used; > > - fail the cap setting for multifd, meaning that multifd won't be used (i.e. > > no place that will care about multifd_channels). > > It's about whether we want to protect e.g. below steps: > > 1. start dest qemu with -incoming defer > 2. "migrate-set-capabilities" to enable multifd 3. "migrate-incoming xxx" to > setup the sockets > 4. "migrate-set-parameters" to setup the num of multifd <--- will be invalid > here Yes, step 4 is invalid, but I think nobody cares about that (i.e. no place uses the invalid value) as step2 already fails the cap setting (with error messages). > > Would that still be a problem that falls into the same category of what this > patch wants to protect qemu from? My intension was to notice the user explicitly that the deferred -incoming must be used for multifd (if not the case, stop the use of multifd). I think the patch already achieves this. Adding such check to migrate-set-parameters doesn’t cause problems, but seems a bit redundant to me. Not sure if others would also have strong preferences to do so for any reason. Thanks, Wei
On Tue, May 23, 2023 at 02:30:25PM +0000, Wang, Wei W wrote: > > It's about whether we want to protect e.g. below steps: > > > > 1. start dest qemu with -incoming defer > > 2. "migrate-set-capabilities" to enable multifd > > 3. "migrate-incoming xxx" to setup the sockets > > 4. "migrate-set-parameters" to setup the num of multifd <--- will be invalid here > > Yes, step 4 is invalid, but I think nobody cares about that (i.e. no place uses the > invalid value) as step2 already fails the cap setting (with error messages). Since only until step 3 it setups the transport_data, so step 2 should be fine and not fail? That's the whole point of my example or I missd something here..
On Tuesday, May 23, 2023 10:50 PM, Peter Xu wrote: > On Tue, May 23, 2023 at 02:30:25PM +0000, Wang, Wei W wrote: > > > It's about whether we want to protect e.g. below steps: > > > > > > 1. start dest qemu with -incoming defer 2. > > > "migrate-set-capabilities" to enable multifd 3. "migrate-incoming > > > xxx" to setup the sockets > > > 4. "migrate-set-parameters" to setup the num of multifd <--- will be > invalid here > > > > Yes, step 4 is invalid, but I think nobody cares about that (i.e. no > > place uses the invalid value) as step2 already fails the cap setting (with > error messages). > > Since only until step 3 it setups the transport_data, so step 2 should be fine > and not fail? That's the whole point of my example or I missd something > here.. OK, I misread something before. Good point, thanks.
diff --git a/migration/options.c b/migration/options.c index c2a278ee2d..25b333b3f4 100644 --- a/migration/options.c +++ b/migration/options.c @@ -537,6 +537,11 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp) error_setg(errp, "Postcopy preempt not compatible with compress"); return false; } + + if (mis->transport_data) { + error_setg(errp, "Postcopy preempt should use deferred incoming"); + return false; + } } if (new_caps[MIGRATION_CAPABILITY_MULTIFD]) { @@ -544,6 +549,10 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp) error_setg(errp, "Multifd is not compatible with compress"); return false; } + if (mis->transport_data) { + error_setg(errp, "Multifd should use deferred incoming"); + return false; + } } return true;
qemu_start_incoming_migration needs to check the number of multifd channels or postcopy ram channels to configure the backlog parameter (i.e. the maximum length to which the queue of pending connections for sockfd may grow) of listen(). So multifd and postcopy-preempt caps require the use of deferred incoming, that is, calling qemu_start_incoming_migration should be deferred via qmp or hmp commands after the cap of multifd and postcopy-preempt are configured. Check if deferred incoming is used when enabling multifd or postcopy-preempt, and fail the check with error messages if not. Signed-off-by: Wei Wang <wei.w.wang@intel.com> --- migration/options.c | 9 +++++++++ 1 file changed, 9 insertions(+)