Message ID | 24a7412cc151f8b48d74cd170a3bdc1ce8e6c879.1741124640.git.maciej.szmigiero@oracle.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Multifd | expand |
On Tue, Mar 04, 2025 at 11:03:34PM +0100, Maciej S. Szmigiero wrote: > From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> > > All callers to migration_incoming_state_destroy() other than > postcopy_ram_listen_thread() do this call with BQL held. > > Since migration_incoming_state_destroy() ultimately calls "load_cleanup" > SaveVMHandlers and it will soon call BQL-sensitive code it makes sense > to always call that function under BQL rather than to have it deal with > both cases (with BQL and without BQL). > Add the necessary bql_lock() and bql_unlock() to > postcopy_ram_listen_thread(). > > qemu_loadvm_state_main() in postcopy_ram_listen_thread() could call > "load_state" SaveVMHandlers that are expecting BQL to be held. > > In principle, the only devices that should be arriving on migration > channel serviced by postcopy_ram_listen_thread() are those that are > postcopiable and whose load handlers are safe to be called without BQL > being held. > > But nothing currently prevents the source from sending data for "unsafe" > devices which would cause trouble there. > Add a TODO comment there so it's clear that it would be good to improve > handling of such (erroneous) case in the future. > > Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> > --- > migration/migration.c | 16 ++++++++++++++++ > migration/savevm.c | 4 ++++ > 2 files changed, 20 insertions(+) > > diff --git a/migration/migration.c b/migration/migration.c > index 9e9db26667f1..6b2a8af4231d 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -402,10 +402,26 @@ void migration_incoming_state_destroy(void) > struct MigrationIncomingState *mis = migration_incoming_get_current(); > > multifd_recv_cleanup(); > + > /* > * RAM state cleanup needs to happen after multifd cleanup, because > * multifd threads can use some of its states (receivedmap). > + * > + * This call also needs BQL held since it calls all registered > + * load_cleanup SaveVMHandlers and at least the VFIO implementation is > + * BQL-sensitive. > + * > + * In addition to the above, it also performs cleanup of load threads > + * thread pool. > + * This cleanup operation is BQL-sensitive as it requires unlocking BQL > + * so a thread possibly waiting for it could get unblocked and finally > + * exit. > + * The reason why a load thread may need to hold BQL in the first place > + * is because address space modification operations require it. Hold on... This almost says exactly why load_cleanup() should _not_ take BQL... rather than should.. So I had a closer look at the latest code, it's about this: static void vfio_load_cleanup_load_bufs_thread(VFIOMultifd *multifd) { /* The lock order is load_bufs_mutex -> BQL so unlock BQL here first */ bql_unlock(); WITH_QEMU_LOCK_GUARD(&multifd->load_bufs_mutex) { while (multifd->load_bufs_thread_running) { multifd->load_bufs_thread_want_exit = true; qemu_cond_signal(&multifd->load_bufs_buffer_ready_cond); qemu_cond_signal(&multifd->load_bufs_iter_done_cond); qemu_cond_wait(&multifd->load_bufs_thread_finished_cond, &multifd->load_bufs_mutex); } } bql_lock(); } It doesn't make much sense to me to take it only because we want to drop it unconditionally. Can we guarantee the function not taking BQL instead? I had a quick look on pmem's pmem_persist() (from libpmem, qemu_ram_msync <- qemu_ram_block_writeback <- ram_load_cleanup), it looks ok. So the question is, is it safe to unlock BQL in whatever context (in coroutines, or in a bottom half)? If the answer is yes, we could make migration_incoming_state_destroy() always not taking BQL (and assert(!bql_locked()) instead). If the answer is no, then vfio_load_cleanup_load_bufs_thread()'s current version may not work either.. > + * > + * Check proper BQL state here rather than risk possible deadlock later. > */ > + assert(bql_locked()); > qemu_loadvm_state_cleanup(); > > if (mis->to_src_file) { > diff --git a/migration/savevm.c b/migration/savevm.c > index 7c1aa8ad7b9d..3e86b572cfa8 100644 > --- a/migration/savevm.c > +++ b/migration/savevm.c > @@ -1986,6 +1986,8 @@ static void *postcopy_ram_listen_thread(void *opaque) > * in qemu_file, and thus we must be blocking now. > */ > qemu_file_set_blocking(f, true); > + > + /* TODO: sanity check that only postcopiable data will be loaded here */ > load_res = qemu_loadvm_state_main(f, mis); > > /* > @@ -2046,7 +2048,9 @@ static void *postcopy_ram_listen_thread(void *opaque) > * (If something broke then qemu will have to exit anyway since it's > * got a bad migration state). > */ > + bql_lock(); > migration_incoming_state_destroy(); > + bql_unlock(); > > rcu_unregister_thread(); > mis->have_listen_thread = false; >
On 5.03.2025 13:34, Peter Xu wrote: > On Tue, Mar 04, 2025 at 11:03:34PM +0100, Maciej S. Szmigiero wrote: >> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> >> >> All callers to migration_incoming_state_destroy() other than >> postcopy_ram_listen_thread() do this call with BQL held. >> >> Since migration_incoming_state_destroy() ultimately calls "load_cleanup" >> SaveVMHandlers and it will soon call BQL-sensitive code it makes sense >> to always call that function under BQL rather than to have it deal with >> both cases (with BQL and without BQL). >> Add the necessary bql_lock() and bql_unlock() to >> postcopy_ram_listen_thread(). >> >> qemu_loadvm_state_main() in postcopy_ram_listen_thread() could call >> "load_state" SaveVMHandlers that are expecting BQL to be held. >> >> In principle, the only devices that should be arriving on migration >> channel serviced by postcopy_ram_listen_thread() are those that are >> postcopiable and whose load handlers are safe to be called without BQL >> being held. >> >> But nothing currently prevents the source from sending data for "unsafe" >> devices which would cause trouble there. >> Add a TODO comment there so it's clear that it would be good to improve >> handling of such (erroneous) case in the future. >> >> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> >> --- >> migration/migration.c | 16 ++++++++++++++++ >> migration/savevm.c | 4 ++++ >> 2 files changed, 20 insertions(+) >> >> diff --git a/migration/migration.c b/migration/migration.c >> index 9e9db26667f1..6b2a8af4231d 100644 >> --- a/migration/migration.c >> +++ b/migration/migration.c >> @@ -402,10 +402,26 @@ void migration_incoming_state_destroy(void) >> struct MigrationIncomingState *mis = migration_incoming_get_current(); >> >> multifd_recv_cleanup(); >> + >> /* >> * RAM state cleanup needs to happen after multifd cleanup, because >> * multifd threads can use some of its states (receivedmap). >> + * >> + * This call also needs BQL held since it calls all registered >> + * load_cleanup SaveVMHandlers and at least the VFIO implementation is >> + * BQL-sensitive. >> + * >> + * In addition to the above, it also performs cleanup of load threads >> + * thread pool. >> + * This cleanup operation is BQL-sensitive as it requires unlocking BQL >> + * so a thread possibly waiting for it could get unblocked and finally >> + * exit. >> + * The reason why a load thread may need to hold BQL in the first place >> + * is because address space modification operations require it. > > Hold on... > > This almost says exactly why load_cleanup() should _not_ take BQL... rather > than should.. > > So I had a closer look at the latest code, it's about this: > > static void vfio_load_cleanup_load_bufs_thread(VFIOMultifd *multifd) > { > /* The lock order is load_bufs_mutex -> BQL so unlock BQL here first */ > bql_unlock(); > WITH_QEMU_LOCK_GUARD(&multifd->load_bufs_mutex) { > while (multifd->load_bufs_thread_running) { > multifd->load_bufs_thread_want_exit = true; > > qemu_cond_signal(&multifd->load_bufs_buffer_ready_cond); > qemu_cond_signal(&multifd->load_bufs_iter_done_cond); > qemu_cond_wait(&multifd->load_bufs_thread_finished_cond, > &multifd->load_bufs_mutex); > } > } > bql_lock(); > } > > It doesn't make much sense to me to take it only because we want to drop it > unconditionally. Can we guarantee the function not taking BQL instead? I > had a quick look on pmem's pmem_persist() (from libpmem, qemu_ram_msync <- > qemu_ram_block_writeback <- ram_load_cleanup), it looks ok. > > So the question is, is it safe to unlock BQL in whatever context (in > coroutines, or in a bottom half)? > > If the answer is yes, we could make migration_incoming_state_destroy() > always not taking BQL (and assert(!bql_locked()) instead). All the other callers of migration_incoming_state_destroy() are holding BQL: process_incoming_migration_bh(), process_incoming_migration_co() (called on, failure path only), load_snapshot() and qmp_xen_load_devices_state(). So AFAIK the safer way is to standardize on holding BQL when calling that function. > If the answer is no, then vfio_load_cleanup_load_bufs_thread()'s current > version may not work either.. I think the reason for BQL is to serialize access to the QEMU internals which are not thread-safe. So as long as these internals aren't touched when not holding BQL then we should be safe - I don't see any particular state that's cached around these BQL calls and would need explicit reloading after re-gaining it. Thanks, Maciej
On Wed, Mar 05, 2025 at 04:11:30PM +0100, Maciej S. Szmigiero wrote: > On 5.03.2025 13:34, Peter Xu wrote: > > On Tue, Mar 04, 2025 at 11:03:34PM +0100, Maciej S. Szmigiero wrote: > > > From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> > > > > > > All callers to migration_incoming_state_destroy() other than > > > postcopy_ram_listen_thread() do this call with BQL held. > > > > > > Since migration_incoming_state_destroy() ultimately calls "load_cleanup" > > > SaveVMHandlers and it will soon call BQL-sensitive code it makes sense > > > to always call that function under BQL rather than to have it deal with > > > both cases (with BQL and without BQL). > > > Add the necessary bql_lock() and bql_unlock() to > > > postcopy_ram_listen_thread(). > > > > > > qemu_loadvm_state_main() in postcopy_ram_listen_thread() could call > > > "load_state" SaveVMHandlers that are expecting BQL to be held. > > > > > > In principle, the only devices that should be arriving on migration > > > channel serviced by postcopy_ram_listen_thread() are those that are > > > postcopiable and whose load handlers are safe to be called without BQL > > > being held. > > > > > > But nothing currently prevents the source from sending data for "unsafe" > > > devices which would cause trouble there. > > > Add a TODO comment there so it's clear that it would be good to improve > > > handling of such (erroneous) case in the future. > > > > > > Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> > > > --- > > > migration/migration.c | 16 ++++++++++++++++ > > > migration/savevm.c | 4 ++++ > > > 2 files changed, 20 insertions(+) > > > > > > diff --git a/migration/migration.c b/migration/migration.c > > > index 9e9db26667f1..6b2a8af4231d 100644 > > > --- a/migration/migration.c > > > +++ b/migration/migration.c > > > @@ -402,10 +402,26 @@ void migration_incoming_state_destroy(void) > > > struct MigrationIncomingState *mis = migration_incoming_get_current(); > > > multifd_recv_cleanup(); > > > + > > > /* > > > * RAM state cleanup needs to happen after multifd cleanup, because > > > * multifd threads can use some of its states (receivedmap). > > > + * > > > + * This call also needs BQL held since it calls all registered > > > + * load_cleanup SaveVMHandlers and at least the VFIO implementation is > > > + * BQL-sensitive. > > > + * > > > + * In addition to the above, it also performs cleanup of load threads > > > + * thread pool. > > > + * This cleanup operation is BQL-sensitive as it requires unlocking BQL > > > + * so a thread possibly waiting for it could get unblocked and finally > > > + * exit. > > > + * The reason why a load thread may need to hold BQL in the first place > > > + * is because address space modification operations require it. > > > > Hold on... > > > > This almost says exactly why load_cleanup() should _not_ take BQL... rather > > than should.. > > > > So I had a closer look at the latest code, it's about this: > > > > static void vfio_load_cleanup_load_bufs_thread(VFIOMultifd *multifd) > > { > > /* The lock order is load_bufs_mutex -> BQL so unlock BQL here first */ > > bql_unlock(); > > WITH_QEMU_LOCK_GUARD(&multifd->load_bufs_mutex) { > > while (multifd->load_bufs_thread_running) { > > multifd->load_bufs_thread_want_exit = true; > > > > qemu_cond_signal(&multifd->load_bufs_buffer_ready_cond); > > qemu_cond_signal(&multifd->load_bufs_iter_done_cond); > > qemu_cond_wait(&multifd->load_bufs_thread_finished_cond, > > &multifd->load_bufs_mutex); > > } > > } > > bql_lock(); > > } > > > > It doesn't make much sense to me to take it only because we want to drop it > > unconditionally. Can we guarantee the function not taking BQL instead? I > > had a quick look on pmem's pmem_persist() (from libpmem, qemu_ram_msync <- > > qemu_ram_block_writeback <- ram_load_cleanup), it looks ok. > > > > So the question is, is it safe to unlock BQL in whatever context (in > > coroutines, or in a bottom half)? > > > > If the answer is yes, we could make migration_incoming_state_destroy() > > always not taking BQL (and assert(!bql_locked()) instead). > > All the other callers of migration_incoming_state_destroy() are holding BQL: > process_incoming_migration_bh(), process_incoming_migration_co() (called on, > failure path only), load_snapshot() and qmp_xen_load_devices_state(). > > So AFAIK the safer way is to standardize on holding BQL when calling > that function. > > If the answer is no, then vfio_load_cleanup_load_bufs_thread()'s current > > version may not work either.. > > I think the reason for BQL is to serialize access to the QEMU internals > which are not thread-safe. > > So as long as these internals aren't touched when not holding BQL then > we should be safe - I don't see any particular state that's cached > around these BQL calls and would need explicit reloading after re-gaining > it. OK, I checked with misterious force and looks like it's ok. Would you please rephrase the comment, though? I want to make it crystal clear that what we're looking for is not holding BQL.. Maybe something like this: /* * The VFIO load_cleanup() implementation is BQL-sensitive. It requires * BQL must NOT be taken when recycling load threads, so that it won't * block the load threads from making progress on address space * modification operations. * * To make it work, we could try to not take BQL for all load_cleanup(), * or conditionally unlock BQL only if bql_locked() in VFIO. * * Since most existing call sites take BQL for load_cleanup(), make * it simple by taking BQL always as the rule, so that VFIO can unlock * BQL and retake unconditionally. */ We may also want to update the subject. Currently: "migration: postcopy_ram_listen_thread() should take BQL for some calls" It's not accurate anymore, it could be: "migration: Always take BQL for migration_incoming_state_destroy()" If with all above, please feel free to take: Acked-by: Peter Xu <peterx@redhat.com> I'm OK if it'll be touched up when merge too. Thanks,
On 3/5/25 17:15, Peter Xu wrote: > On Wed, Mar 05, 2025 at 04:11:30PM +0100, Maciej S. Szmigiero wrote: >> On 5.03.2025 13:34, Peter Xu wrote: >>> On Tue, Mar 04, 2025 at 11:03:34PM +0100, Maciej S. Szmigiero wrote: >>>> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> >>>> >>>> All callers to migration_incoming_state_destroy() other than >>>> postcopy_ram_listen_thread() do this call with BQL held. >>>> >>>> Since migration_incoming_state_destroy() ultimately calls "load_cleanup" >>>> SaveVMHandlers and it will soon call BQL-sensitive code it makes sense >>>> to always call that function under BQL rather than to have it deal with >>>> both cases (with BQL and without BQL). >>>> Add the necessary bql_lock() and bql_unlock() to >>>> postcopy_ram_listen_thread(). >>>> >>>> qemu_loadvm_state_main() in postcopy_ram_listen_thread() could call >>>> "load_state" SaveVMHandlers that are expecting BQL to be held. >>>> >>>> In principle, the only devices that should be arriving on migration >>>> channel serviced by postcopy_ram_listen_thread() are those that are >>>> postcopiable and whose load handlers are safe to be called without BQL >>>> being held. >>>> >>>> But nothing currently prevents the source from sending data for "unsafe" >>>> devices which would cause trouble there. >>>> Add a TODO comment there so it's clear that it would be good to improve >>>> handling of such (erroneous) case in the future. >>>> >>>> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> >>>> --- >>>> migration/migration.c | 16 ++++++++++++++++ >>>> migration/savevm.c | 4 ++++ >>>> 2 files changed, 20 insertions(+) >>>> >>>> diff --git a/migration/migration.c b/migration/migration.c >>>> index 9e9db26667f1..6b2a8af4231d 100644 >>>> --- a/migration/migration.c >>>> +++ b/migration/migration.c >>>> @@ -402,10 +402,26 @@ void migration_incoming_state_destroy(void) >>>> struct MigrationIncomingState *mis = migration_incoming_get_current(); >>>> multifd_recv_cleanup(); >>>> + >>>> /* >>>> * RAM state cleanup needs to happen after multifd cleanup, because >>>> * multifd threads can use some of its states (receivedmap). >>>> + * >>>> + * This call also needs BQL held since it calls all registered >>>> + * load_cleanup SaveVMHandlers and at least the VFIO implementation is >>>> + * BQL-sensitive. >>>> + * >>>> + * In addition to the above, it also performs cleanup of load threads >>>> + * thread pool. >>>> + * This cleanup operation is BQL-sensitive as it requires unlocking BQL >>>> + * so a thread possibly waiting for it could get unblocked and finally >>>> + * exit. >>>> + * The reason why a load thread may need to hold BQL in the first place >>>> + * is because address space modification operations require it. >>> >>> Hold on... >>> >>> This almost says exactly why load_cleanup() should _not_ take BQL... rather >>> than should.. >>> >>> So I had a closer look at the latest code, it's about this: >>> >>> static void vfio_load_cleanup_load_bufs_thread(VFIOMultifd *multifd) >>> { >>> /* The lock order is load_bufs_mutex -> BQL so unlock BQL here first */ >>> bql_unlock(); >>> WITH_QEMU_LOCK_GUARD(&multifd->load_bufs_mutex) { >>> while (multifd->load_bufs_thread_running) { >>> multifd->load_bufs_thread_want_exit = true; >>> >>> qemu_cond_signal(&multifd->load_bufs_buffer_ready_cond); >>> qemu_cond_signal(&multifd->load_bufs_iter_done_cond); >>> qemu_cond_wait(&multifd->load_bufs_thread_finished_cond, >>> &multifd->load_bufs_mutex); >>> } >>> } >>> bql_lock(); >>> } >>> >>> It doesn't make much sense to me to take it only because we want to drop it >>> unconditionally. Can we guarantee the function not taking BQL instead? I >>> had a quick look on pmem's pmem_persist() (from libpmem, qemu_ram_msync <- >>> qemu_ram_block_writeback <- ram_load_cleanup), it looks ok. >>> >>> So the question is, is it safe to unlock BQL in whatever context (in >>> coroutines, or in a bottom half)? >>> >>> If the answer is yes, we could make migration_incoming_state_destroy() >>> always not taking BQL (and assert(!bql_locked()) instead). >> >> All the other callers of migration_incoming_state_destroy() are holding BQL: >> process_incoming_migration_bh(), process_incoming_migration_co() (called on, >> failure path only), load_snapshot() and qmp_xen_load_devices_state(). >> >> So AFAIK the safer way is to standardize on holding BQL when calling >> that function. >>> If the answer is no, then vfio_load_cleanup_load_bufs_thread()'s current >>> version may not work either.. >> >> I think the reason for BQL is to serialize access to the QEMU internals >> which are not thread-safe. >> >> So as long as these internals aren't touched when not holding BQL then >> we should be safe - I don't see any particular state that's cached >> around these BQL calls and would need explicit reloading after re-gaining >> it. > > OK, I checked with misterious force and looks like it's ok. > > Would you please rephrase the comment, though? I want to make it crystal > clear that what we're looking for is not holding BQL.. Maybe something like > this: > > /* > * The VFIO load_cleanup() implementation is BQL-sensitive. It requires > * BQL must NOT be taken when recycling load threads, so that it won't > * block the load threads from making progress on address space > * modification operations. > * > * To make it work, we could try to not take BQL for all load_cleanup(), > * or conditionally unlock BQL only if bql_locked() in VFIO. > * > * Since most existing call sites take BQL for load_cleanup(), make > * it simple by taking BQL always as the rule, so that VFIO can unlock > * BQL and retake unconditionally. > */ > > We may also want to update the subject. Currently: > > "migration: postcopy_ram_listen_thread() should take BQL for some calls" > > It's not accurate anymore, it could be: > > "migration: Always take BQL for migration_incoming_state_destroy()" > > If with all above, please feel free to take: > > Acked-by: Peter Xu <peterx@redhat.com> > > I'm OK if it'll be touched up when merge too. Maciej, Could you please resend just that patch as a reply to the series ? No need to resend the whole series. Thanks, C.
On 5.03.2025 17:37, Cédric Le Goater wrote: > On 3/5/25 17:15, Peter Xu wrote: >> On Wed, Mar 05, 2025 at 04:11:30PM +0100, Maciej S. Szmigiero wrote: >>> On 5.03.2025 13:34, Peter Xu wrote: >>>> On Tue, Mar 04, 2025 at 11:03:34PM +0100, Maciej S. Szmigiero wrote: >>>>> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> >>>>> >>>>> All callers to migration_incoming_state_destroy() other than >>>>> postcopy_ram_listen_thread() do this call with BQL held. >>>>> >>>>> Since migration_incoming_state_destroy() ultimately calls "load_cleanup" >>>>> SaveVMHandlers and it will soon call BQL-sensitive code it makes sense >>>>> to always call that function under BQL rather than to have it deal with >>>>> both cases (with BQL and without BQL). >>>>> Add the necessary bql_lock() and bql_unlock() to >>>>> postcopy_ram_listen_thread(). >>>>> >>>>> qemu_loadvm_state_main() in postcopy_ram_listen_thread() could call >>>>> "load_state" SaveVMHandlers that are expecting BQL to be held. >>>>> >>>>> In principle, the only devices that should be arriving on migration >>>>> channel serviced by postcopy_ram_listen_thread() are those that are >>>>> postcopiable and whose load handlers are safe to be called without BQL >>>>> being held. >>>>> >>>>> But nothing currently prevents the source from sending data for "unsafe" >>>>> devices which would cause trouble there. >>>>> Add a TODO comment there so it's clear that it would be good to improve >>>>> handling of such (erroneous) case in the future. >>>>> >>>>> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> >>>>> --- >>>>> migration/migration.c | 16 ++++++++++++++++ >>>>> migration/savevm.c | 4 ++++ >>>>> 2 files changed, 20 insertions(+) >>>>> >>>>> diff --git a/migration/migration.c b/migration/migration.c >>>>> index 9e9db26667f1..6b2a8af4231d 100644 >>>>> --- a/migration/migration.c >>>>> +++ b/migration/migration.c >>>>> @@ -402,10 +402,26 @@ void migration_incoming_state_destroy(void) >>>>> struct MigrationIncomingState *mis = migration_incoming_get_current(); >>>>> multifd_recv_cleanup(); >>>>> + >>>>> /* >>>>> * RAM state cleanup needs to happen after multifd cleanup, because >>>>> * multifd threads can use some of its states (receivedmap). >>>>> + * >>>>> + * This call also needs BQL held since it calls all registered >>>>> + * load_cleanup SaveVMHandlers and at least the VFIO implementation is >>>>> + * BQL-sensitive. >>>>> + * >>>>> + * In addition to the above, it also performs cleanup of load threads >>>>> + * thread pool. >>>>> + * This cleanup operation is BQL-sensitive as it requires unlocking BQL >>>>> + * so a thread possibly waiting for it could get unblocked and finally >>>>> + * exit. >>>>> + * The reason why a load thread may need to hold BQL in the first place >>>>> + * is because address space modification operations require it. >>>> >>>> Hold on... >>>> >>>> This almost says exactly why load_cleanup() should _not_ take BQL... rather >>>> than should.. >>>> >>>> So I had a closer look at the latest code, it's about this: >>>> >>>> static void vfio_load_cleanup_load_bufs_thread(VFIOMultifd *multifd) >>>> { >>>> /* The lock order is load_bufs_mutex -> BQL so unlock BQL here first */ >>>> bql_unlock(); >>>> WITH_QEMU_LOCK_GUARD(&multifd->load_bufs_mutex) { >>>> while (multifd->load_bufs_thread_running) { >>>> multifd->load_bufs_thread_want_exit = true; >>>> >>>> qemu_cond_signal(&multifd->load_bufs_buffer_ready_cond); >>>> qemu_cond_signal(&multifd->load_bufs_iter_done_cond); >>>> qemu_cond_wait(&multifd->load_bufs_thread_finished_cond, >>>> &multifd->load_bufs_mutex); >>>> } >>>> } >>>> bql_lock(); >>>> } >>>> >>>> It doesn't make much sense to me to take it only because we want to drop it >>>> unconditionally. Can we guarantee the function not taking BQL instead? I >>>> had a quick look on pmem's pmem_persist() (from libpmem, qemu_ram_msync <- >>>> qemu_ram_block_writeback <- ram_load_cleanup), it looks ok. >>>> >>>> So the question is, is it safe to unlock BQL in whatever context (in >>>> coroutines, or in a bottom half)? >>>> >>>> If the answer is yes, we could make migration_incoming_state_destroy() >>>> always not taking BQL (and assert(!bql_locked()) instead). >>> >>> All the other callers of migration_incoming_state_destroy() are holding BQL: >>> process_incoming_migration_bh(), process_incoming_migration_co() (called on, >>> failure path only), load_snapshot() and qmp_xen_load_devices_state(). >>> >>> So AFAIK the safer way is to standardize on holding BQL when calling >>> that function. >>>> If the answer is no, then vfio_load_cleanup_load_bufs_thread()'s current >>>> version may not work either.. >>> >>> I think the reason for BQL is to serialize access to the QEMU internals >>> which are not thread-safe. >>> >>> So as long as these internals aren't touched when not holding BQL then >>> we should be safe - I don't see any particular state that's cached >>> around these BQL calls and would need explicit reloading after re-gaining >>> it. >> >> OK, I checked with misterious force and looks like it's ok. >> >> Would you please rephrase the comment, though? I want to make it crystal >> clear that what we're looking for is not holding BQL.. Maybe something like >> this: >> >> /* >> * The VFIO load_cleanup() implementation is BQL-sensitive. It requires >> * BQL must NOT be taken when recycling load threads, so that it won't >> * block the load threads from making progress on address space >> * modification operations. >> * >> * To make it work, we could try to not take BQL for all load_cleanup(), >> * or conditionally unlock BQL only if bql_locked() in VFIO. >> * >> * Since most existing call sites take BQL for load_cleanup(), make >> * it simple by taking BQL always as the rule, so that VFIO can unlock >> * BQL and retake unconditionally. >> */ >> >> We may also want to update the subject. Currently: >> >> "migration: postcopy_ram_listen_thread() should take BQL for some calls" >> >> It's not accurate anymore, it could be: >> >> "migration: Always take BQL for migration_incoming_state_destroy()" >> >> If with all above, please feel free to take: >> >> Acked-by: Peter Xu <peterx@redhat.com> >> >> I'm OK if it'll be touched up when merge too. @Peter: Updated the comment and patch subject. Added your review tag. > Maciej, > > Could you please resend just that patch as a reply to the series ? No need to > resend the whole series. @Cedric: I've sent the updated patch. I also updated the tree at GitLab with collected review tags and updated version of this patch. > Thanks, > > C. > > Thanks, Maciej
diff --git a/migration/migration.c b/migration/migration.c index 9e9db26667f1..6b2a8af4231d 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -402,10 +402,26 @@ void migration_incoming_state_destroy(void) struct MigrationIncomingState *mis = migration_incoming_get_current(); multifd_recv_cleanup(); + /* * RAM state cleanup needs to happen after multifd cleanup, because * multifd threads can use some of its states (receivedmap). + * + * This call also needs BQL held since it calls all registered + * load_cleanup SaveVMHandlers and at least the VFIO implementation is + * BQL-sensitive. + * + * In addition to the above, it also performs cleanup of load threads + * thread pool. + * This cleanup operation is BQL-sensitive as it requires unlocking BQL + * so a thread possibly waiting for it could get unblocked and finally + * exit. + * The reason why a load thread may need to hold BQL in the first place + * is because address space modification operations require it. + * + * Check proper BQL state here rather than risk possible deadlock later. */ + assert(bql_locked()); qemu_loadvm_state_cleanup(); if (mis->to_src_file) { diff --git a/migration/savevm.c b/migration/savevm.c index 7c1aa8ad7b9d..3e86b572cfa8 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -1986,6 +1986,8 @@ static void *postcopy_ram_listen_thread(void *opaque) * in qemu_file, and thus we must be blocking now. */ qemu_file_set_blocking(f, true); + + /* TODO: sanity check that only postcopiable data will be loaded here */ load_res = qemu_loadvm_state_main(f, mis); /* @@ -2046,7 +2048,9 @@ static void *postcopy_ram_listen_thread(void *opaque) * (If something broke then qemu will have to exit anyway since it's * got a bad migration state). */ + bql_lock(); migration_incoming_state_destroy(); + bql_unlock(); rcu_unregister_thread(); mis->have_listen_thread = false;