
[1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever

Message ID 20211126153154.25424-2-lizhang@suse.de (mailing list archive)
State New, archived
Series migration: multifd live migration improvement

Commit Message

Li Zhang Nov. 26, 2021, 3:31 p.m. UTC
When doing live migration with 8, 16, or more multifd channels,
the guest hangs in the presence of network errors such as missing TCP ACKs.

At the sender's side:
The main thread is blocked on qemu_thread_join: migrate_fd_cleanup
is called because one thread fails in qio_channel_write_all when
the network problem happens, while the other send threads are blocked
on sendmsg. They cannot be terminated, so the main thread blocks on
qemu_thread_join waiting for those threads to exit.

(gdb) bt
0  0x00007f30c8dcffc0 in __pthread_clockjoin_ex () at /lib64/libpthread.so.0
1  0x000055cbb716084b in qemu_thread_join (thread=0x55cbb881f418) at ../util/qemu-thread-posix.c:627
2  0x000055cbb6b54e40 in multifd_save_cleanup () at ../migration/multifd.c:542
3  0x000055cbb6b4de06 in migrate_fd_cleanup (s=0x55cbb8024000) at ../migration/migration.c:1808
4  0x000055cbb6b4dfb4 in migrate_fd_cleanup_bh (opaque=0x55cbb8024000) at ../migration/migration.c:1850
5  0x000055cbb7173ac1 in aio_bh_call (bh=0x55cbb7eb98e0) at ../util/async.c:141
6  0x000055cbb7173bcb in aio_bh_poll (ctx=0x55cbb7ebba80) at ../util/async.c:169
7  0x000055cbb715ba4b in aio_dispatch (ctx=0x55cbb7ebba80) at ../util/aio-posix.c:381
8  0x000055cbb7173ffe in aio_ctx_dispatch (source=0x55cbb7ebba80, callback=0x0, user_data=0x0) at ../util/async.c:311
9  0x00007f30c9c8cdf4 in g_main_context_dispatch () at /usr/lib64/libglib-2.0.so.0
10 0x000055cbb71851a2 in glib_pollfds_poll () at ../util/main-loop.c:232
11 0x000055cbb718521c in os_host_main_loop_wait (timeout=42251070366) at ../util/main-loop.c:255
12 0x000055cbb7185321 in main_loop_wait (nonblocking=0) at ../util/main-loop.c:531
13 0x000055cbb6e6ba27 in qemu_main_loop () at ../softmmu/runstate.c:726
14 0x000055cbb6ad6fd7 in main (argc=68, argv=0x7ffc0c578888, envp=0x7ffc0c578ab0) at ../softmmu/main.c:50

At the receiver's side:
Several receive threads are not created successfully, and the ones that
have been created are blocked on qemu_sem_wait. No semaphore is ever
posted: migration does not start unless all receive threads are created
successfully, so multifd_recv_sync_main, which posts the semaphore to
the receive threads, is never called. The receive threads therefore wait
on the semaphore and never return. They should not wait forever:
use qemu_sem_timedwait to wait for a bounded time, then return and close
the channels, so the guest no longer hangs.

(gdb) bt
0  0x00007fd61c43f064 in do_futex_wait.constprop () at /lib64/libpthread.so.0
1  0x00007fd61c43f158 in __new_sem_wait_slow.constprop.0 () at /lib64/libpthread.so.0
2  0x000056075916014a in qemu_sem_wait (sem=0x56075b6515f0) at ../util/qemu-thread-posix.c:358
3  0x0000560758b56643 in multifd_recv_thread (opaque=0x56075b651550) at ../migration/multifd.c:1112
4  0x0000560759160598 in qemu_thread_start (args=0x56075befad00) at ../util/qemu-thread-posix.c:556
5  0x00007fd61c43594a in start_thread () at /lib64/libpthread.so.0
6  0x00007fd61c158d0f in clone () at /lib64/libc.so.6

Signed-off-by: Li Zhang <lizhang@suse.de>
---
 migration/multifd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Daniel P. Berrangé Nov. 26, 2021, 3:49 p.m. UTC | #1
On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
> When doing live migration with multifd channels 8, 16 or larger number,
> the guest hangs in the presence of the network errors such as missing TCP ACKs.
> 
> At sender's side:
> The main thread is blocked on qemu_thread_join, migration_fd_cleanup
> is called because one thread fails on qio_channel_write_all when
> the network problem happens and other send threads are blocked on sendmsg.
> They could not be terminated. So the main thread is blocked on qemu_thread_join
> to wait for the threads terminated.

Isn't the right answer here to ensure we've called 'shutdown' on
all the FDs, so that the threads get kicked out of sendmsg, before
trying to join the thread ?


Regards,
Daniel
Juan Quintela Nov. 26, 2021, 4:33 p.m. UTC | #2
Li Zhang <lizhang@suse.de> wrote:
> When doing live migration with multifd channels 8, 16 or larger number,
> the guest hangs in the presence of the network errors such as missing TCP ACKs.
>
> At sender's side:
> The main thread is blocked on qemu_thread_join, migration_fd_cleanup
> is called because one thread fails on qio_channel_write_all when
> the network problem happens and other send threads are blocked on sendmsg.
> They could not be terminated. So the main thread is blocked on qemu_thread_join
> to wait for the threads terminated.
>
> (gdb) bt
> 0  0x00007f30c8dcffc0 in __pthread_clockjoin_ex () at /lib64/libpthread.so.0
> 1  0x000055cbb716084b in qemu_thread_join (thread=0x55cbb881f418) at ../util/qemu-thread-posix.c:627
> 2  0x000055cbb6b54e40 in multifd_save_cleanup () at ../migration/multifd.c:542
> 3  0x000055cbb6b4de06 in migrate_fd_cleanup (s=0x55cbb8024000) at ../migration/migration.c:1808
> 4  0x000055cbb6b4dfb4 in migrate_fd_cleanup_bh (opaque=0x55cbb8024000) at ../migration/migration.c:1850
> 5  0x000055cbb7173ac1 in aio_bh_call (bh=0x55cbb7eb98e0) at ../util/async.c:141
> 6  0x000055cbb7173bcb in aio_bh_poll (ctx=0x55cbb7ebba80) at ../util/async.c:169
> 7  0x000055cbb715ba4b in aio_dispatch (ctx=0x55cbb7ebba80) at ../util/aio-posix.c:381
> 8  0x000055cbb7173ffe in aio_ctx_dispatch (source=0x55cbb7ebba80, callback=0x0, user_data=0x0) at ../util/async.c:311
> 9  0x00007f30c9c8cdf4 in g_main_context_dispatch () at /usr/lib64/libglib-2.0.so.0
> 10 0x000055cbb71851a2 in glib_pollfds_poll () at ../util/main-loop.c:232
> 11 0x000055cbb718521c in os_host_main_loop_wait (timeout=42251070366) at ../util/main-loop.c:255
> 12 0x000055cbb7185321 in main_loop_wait (nonblocking=0) at ../util/main-loop.c:531
> 13 0x000055cbb6e6ba27 in qemu_main_loop () at ../softmmu/runstate.c:726
> 14 0x000055cbb6ad6fd7 in main (argc=68, argv=0x7ffc0c578888, envp=0x7ffc0c578ab0) at ../softmmu/main.c:50
>
> At receiver's side:
> Several receive threads are not created successfully and the receive threads
> which have been created are blocked on qemu_sem_wait. No semaphores are posted
> because migration is not started if not all the receive threads are created
> successfully and multifd_recv_sync_main is not called which posts the semaphore
> to receive threads. So the receive threads are waiting on the semaphore and
> never return. It shouldn't wait for the semaphore forever.
> Use qemu_sem_timedwait to wait for a while, then return and close the channels.
> So the guest doesn't hang anymore.
>
> (gdb) bt
> 0  0x00007fd61c43f064 in do_futex_wait.constprop () at /lib64/libpthread.so.0
> 1  0x00007fd61c43f158 in __new_sem_wait_slow.constprop.0 () at /lib64/libpthread.so.0
> 2  0x000056075916014a in qemu_sem_wait (sem=0x56075b6515f0) at ../util/qemu-thread-posix.c:358
> 3  0x0000560758b56643 in multifd_recv_thread (opaque=0x56075b651550) at ../migration/multifd.c:1112
> 4  0x0000560759160598 in qemu_thread_start (args=0x56075befad00) at ../util/qemu-thread-posix.c:556
> 5  0x00007fd61c43594a in start_thread () at /lib64/libpthread.so.0
> 6  0x00007fd61c158d0f in clone () at /lib64/libc.so.6
>
> Signed-off-by: Li Zhang <lizhang@suse.de>
> ---
>  migration/multifd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 7c9deb1921..656239ca2a 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -1109,7 +1109,7 @@ static void *multifd_recv_thread(void *opaque)
>  
>          if (flags & MULTIFD_FLAG_SYNC) {
>              qemu_sem_post(&multifd_recv_state->sem_sync);
> -            qemu_sem_wait(&p->sem_sync);
> +            qemu_sem_timedwait(&p->sem_sync, 1000);
>          }
>      }

Problem happens here, but I think that the solution is not wrong.  We
are returning from the semaphore without giving a single error message.

Later, Juan.
Li Zhang Nov. 26, 2021, 4:44 p.m. UTC | #3
On 11/26/21 4:49 PM, Daniel P. Berrangé wrote:
> On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
>> When doing live migration with multifd channels 8, 16 or larger number,
>> the guest hangs in the presence of the network errors such as missing TCP ACKs.
>>
>> At sender's side:
>> The main thread is blocked on qemu_thread_join, migration_fd_cleanup
>> is called because one thread fails on qio_channel_write_all when
>> the network problem happens and other send threads are blocked on sendmsg.
>> They could not be terminated. So the main thread is blocked on qemu_thread_join
>> to wait for the threads terminated.
> Isn't the right answer here to ensure we've called 'shutdown' on
> all the FDs, so that the threads get kicked out of sendmsg, before
> trying to join the thread ?

If we shut down the channels on the sender's side, that terminates the
send threads, but the receive threads are still waiting there.

On the receiver's side, if the semaphore wait times out, the channels
can finally be terminated, and the sender threads then terminate as well.

If we close the channels on both sides, I am not sure it works correctly.


>
> Regards,
> Daniel
Daniel P. Berrangé Nov. 26, 2021, 4:51 p.m. UTC | #4
On Fri, Nov 26, 2021 at 05:44:04PM +0100, Li Zhang wrote:
> 
> On 11/26/21 4:49 PM, Daniel P. Berrangé wrote:
> > On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
> > > When doing live migration with multifd channels 8, 16 or larger number,
> > > the guest hangs in the presence of the network errors such as missing TCP ACKs.
> > > 
> > > At sender's side:
> > > The main thread is blocked on qemu_thread_join, migration_fd_cleanup
> > > is called because one thread fails on qio_channel_write_all when
> > > the network problem happens and other send threads are blocked on sendmsg.
> > > They could not be terminated. So the main thread is blocked on qemu_thread_join
> > > to wait for the threads terminated.
> > Isn't the right answer here to ensure we've called 'shutdown' on
> > all the FDs, so that the threads get kicked out of sendmsg, before
> > trying to join the thread ?
> 
> If we shutdown the channels at sender's side, it could terminate send
> threads. The receive threads are still waiting there.
> 
> From receiver's side, if wait semaphore is timeout, the channels can be
> terminated at last. And the sender threads also be terminated at last.

If something goes wrong on the sender side, the mgmt app should be
tearing down the destination QEMU entirely, so I'm not sure we need
to do anything special to deal with the receive threads.

Using qemu_sem_timedwait just feels risky because it will introduce
false failures if the system/network is under high load such that the
connections don't all establish within 1 second.

Regards,
Daniel
Li Zhang Nov. 26, 2021, 4:56 p.m. UTC | #5
On 11/26/21 5:33 PM, Juan Quintela wrote:
> Li Zhang <lizhang@suse.de> wrote:
>> When doing live migration with multifd channels 8, 16 or larger number,
>> the guest hangs in the presence of the network errors such as missing TCP ACKs.
>>
>> At sender's side:
>> The main thread is blocked on qemu_thread_join, migration_fd_cleanup
>> is called because one thread fails on qio_channel_write_all when
>> the network problem happens and other send threads are blocked on sendmsg.
>> They could not be terminated. So the main thread is blocked on qemu_thread_join
>> to wait for the threads terminated.
>>
>> (gdb) bt
>> 0  0x00007f30c8dcffc0 in __pthread_clockjoin_ex () at /lib64/libpthread.so.0
>> 1  0x000055cbb716084b in qemu_thread_join (thread=0x55cbb881f418) at ../util/qemu-thread-posix.c:627
>> 2  0x000055cbb6b54e40 in multifd_save_cleanup () at ../migration/multifd.c:542
>> 3  0x000055cbb6b4de06 in migrate_fd_cleanup (s=0x55cbb8024000) at ../migration/migration.c:1808
>> 4  0x000055cbb6b4dfb4 in migrate_fd_cleanup_bh (opaque=0x55cbb8024000) at ../migration/migration.c:1850
>> 5  0x000055cbb7173ac1 in aio_bh_call (bh=0x55cbb7eb98e0) at ../util/async.c:141
>> 6  0x000055cbb7173bcb in aio_bh_poll (ctx=0x55cbb7ebba80) at ../util/async.c:169
>> 7  0x000055cbb715ba4b in aio_dispatch (ctx=0x55cbb7ebba80) at ../util/aio-posix.c:381
>> 8  0x000055cbb7173ffe in aio_ctx_dispatch (source=0x55cbb7ebba80, callback=0x0, user_data=0x0) at ../util/async.c:311
>> 9  0x00007f30c9c8cdf4 in g_main_context_dispatch () at /usr/lib64/libglib-2.0.so.0
>> 10 0x000055cbb71851a2 in glib_pollfds_poll () at ../util/main-loop.c:232
>> 11 0x000055cbb718521c in os_host_main_loop_wait (timeout=42251070366) at ../util/main-loop.c:255
>> 12 0x000055cbb7185321 in main_loop_wait (nonblocking=0) at ../util/main-loop.c:531
>> 13 0x000055cbb6e6ba27 in qemu_main_loop () at ../softmmu/runstate.c:726
>> 14 0x000055cbb6ad6fd7 in main (argc=68, argv=0x7ffc0c578888, envp=0x7ffc0c578ab0) at ../softmmu/main.c:50
>>
>> At receiver's side:
>> Several receive threads are not created successfully and the receive threads
>> which have been created are blocked on qemu_sem_wait. No semaphores are posted
>> because migration is not started if not all the receive threads are created
>> successfully and multifd_recv_sync_main is not called which posts the semaphore
>> to receive threads. So the receive threads are waiting on the semaphore and
>> never return. It shouldn't wait for the semaphore forever.
>> Use qemu_sem_timedwait to wait for a while, then return and close the channels.
>> So the guest doesn't hang anymore.
>>
>> (gdb) bt
>> 0  0x00007fd61c43f064 in do_futex_wait.constprop () at /lib64/libpthread.so.0
>> 1  0x00007fd61c43f158 in __new_sem_wait_slow.constprop.0 () at /lib64/libpthread.so.0
>> 2  0x000056075916014a in qemu_sem_wait (sem=0x56075b6515f0) at ../util/qemu-thread-posix.c:358
>> 3  0x0000560758b56643 in multifd_recv_thread (opaque=0x56075b651550) at ../migration/multifd.c:1112
>> 4  0x0000560759160598 in qemu_thread_start (args=0x56075befad00) at ../util/qemu-thread-posix.c:556
>> 5  0x00007fd61c43594a in start_thread () at /lib64/libpthread.so.0
>> 6  0x00007fd61c158d0f in clone () at /lib64/libc.so.6
>>
>> Signed-off-by: Li Zhang <lizhang@suse.de>
>> ---
>>   migration/multifd.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/migration/multifd.c b/migration/multifd.c
>> index 7c9deb1921..656239ca2a 100644
>> --- a/migration/multifd.c
>> +++ b/migration/multifd.c
>> @@ -1109,7 +1109,7 @@ static void *multifd_recv_thread(void *opaque)
>>   
>>           if (flags & MULTIFD_FLAG_SYNC) {
>>               qemu_sem_post(&multifd_recv_state->sem_sync);
>> -            qemu_sem_wait(&p->sem_sync);
>> +            qemu_sem_timedwait(&p->sem_sync, 1000);
>>           }
>>       }
> Problem happens here, but I think that the solution is not worng.  We
> are returning from the semaphore without given a single error message.

Ah, okay. I can add an error message.

Thanks

Li


>
> Later, Juan.
>
Li Zhang Nov. 26, 2021, 5 p.m. UTC | #6
On 11/26/21 5:51 PM, Daniel P. Berrangé wrote:
> On Fri, Nov 26, 2021 at 05:44:04PM +0100, Li Zhang wrote:
>> On 11/26/21 4:49 PM, Daniel P. Berrangé wrote:
>>> On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
>>>> When doing live migration with multifd channels 8, 16 or larger number,
>>>> the guest hangs in the presence of the network errors such as missing TCP ACKs.
>>>>
>>>> At sender's side:
>>>> The main thread is blocked on qemu_thread_join, migration_fd_cleanup
>>>> is called because one thread fails on qio_channel_write_all when
>>>> the network problem happens and other send threads are blocked on sendmsg.
>>>> They could not be terminated. So the main thread is blocked on qemu_thread_join
>>>> to wait for the threads terminated.
>>> Isn't the right answer here to ensure we've called 'shutdown' on
>>> all the FDs, so that the threads get kicked out of sendmsg, before
>>> trying to join the thread ?
>> If we shutdown the channels at sender's side, it could terminate send
>> threads. The receive threads are still waiting there.
>>
>>  From receiver's side, if wait semaphore is timeout, the channels can be
>> terminated at last. And the sender threads also be terminated at last.
> If something goes wrong on the sender side, the mgmt app should be
> tearing down the destination QEMU entirely, so I'm not sure we need
> to do anything special to deal with received threads.
>
> Using semtimedwait just feels risky because it will introduce false
> failures if the system/network is under high load such that the
> connections don't all establish within 1 second.

You are right. This may be a risk. I am not sure whether the interval
is appropriate; we could set it longer.


>
> Regards,
> Daniel
Daniel P. Berrangé Nov. 26, 2021, 5:13 p.m. UTC | #7
On Fri, Nov 26, 2021 at 06:00:24PM +0100, Li Zhang wrote:
> 
> On 11/26/21 5:51 PM, Daniel P. Berrangé wrote:
> > On Fri, Nov 26, 2021 at 05:44:04PM +0100, Li Zhang wrote:
> > > On 11/26/21 4:49 PM, Daniel P. Berrangé wrote:
> > > > On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
> > > > > When doing live migration with multifd channels 8, 16 or larger number,
> > > > > the guest hangs in the presence of the network errors such as missing TCP ACKs.
> > > > > 
> > > > > At sender's side:
> > > > > The main thread is blocked on qemu_thread_join, migration_fd_cleanup
> > > > > is called because one thread fails on qio_channel_write_all when
> > > > > the network problem happens and other send threads are blocked on sendmsg.
> > > > > They could not be terminated. So the main thread is blocked on qemu_thread_join
> > > > > to wait for the threads terminated.
> > > > Isn't the right answer here to ensure we've called 'shutdown' on
> > > > all the FDs, so that the threads get kicked out of sendmsg, before
> > > > trying to join the thread ?
> > > If we shutdown the channels at sender's side, it could terminate send
> > > threads. The receive threads are still waiting there.
> > > 
> > >  From receiver's side, if wait semaphore is timeout, the channels can be
> > > terminated at last. And the sender threads also be terminated at last.
> > If something goes wrong on the sender side, the mgmt app should be
> > tearing down the destination QEMU entirely, so I'm not sure we need
> > to do anything special to deal with received threads.
> > 
> > Using semtimedwait just feels risky because it will introduce false
> > failures if the system/network is under high load such that the
> > connections don't all establish within 1 second.
> 
> You are right. This may be a risk. I am not sure if the interval is proper,
> we can set longer.

I don't think any kind of timeout is right in this context. There should
be a sem_post() invoked in every scenario where we want to tear down the
recv thread. That should only be the case when we see the other end of
the connection close IMHO.

Regards,
Daniel
Li Zhang Nov. 26, 2021, 5:44 p.m. UTC | #8
On 11/26/21 6:13 PM, Daniel P. Berrangé wrote:
> On Fri, Nov 26, 2021 at 06:00:24PM +0100, Li Zhang wrote:
>> On 11/26/21 5:51 PM, Daniel P. Berrangé wrote:
>>> On Fri, Nov 26, 2021 at 05:44:04PM +0100, Li Zhang wrote:
>>>> On 11/26/21 4:49 PM, Daniel P. Berrangé wrote:
>>>>> On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
>>>>>> When doing live migration with multifd channels 8, 16 or larger number,
>>>>>> the guest hangs in the presence of the network errors such as missing TCP ACKs.
>>>>>>
>>>>>> At sender's side:
>>>>>> The main thread is blocked on qemu_thread_join, migration_fd_cleanup
>>>>>> is called because one thread fails on qio_channel_write_all when
>>>>>> the network problem happens and other send threads are blocked on sendmsg.
>>>>>> They could not be terminated. So the main thread is blocked on qemu_thread_join
>>>>>> to wait for the threads terminated.
>>>>> Isn't the right answer here to ensure we've called 'shutdown' on
>>>>> all the FDs, so that the threads get kicked out of sendmsg, before
>>>>> trying to join the thread ?
>>>> If we shutdown the channels at sender's side, it could terminate send
>>>> threads. The receive threads are still waiting there.
>>>>
>>>>   From receiver's side, if wait semaphore is timeout, the channels can be
>>>> terminated at last. And the sender threads also be terminated at last.
>>> If something goes wrong on the sender side, the mgmt app should be
>>> tearing down the destination QEMU entirely, so I'm not sure we need
>>> to do anything special to deal with received threads.
>>>
>>> Using semtimedwait just feels risky because it will introduce false
>>> failures if the system/network is under high load such that the
>>> connections don't all establish within 1 second.
>> You are right. This may be a risk. I am not sure if the interval is proper,
>> we can set longer.
> I don't think any kind of timeout is right in this context. There should
> be a sem_post() invoked in every scenario where we want to tear down the
> recv thread. That should only be the case when we see the other end of
> the connection close IMHO.

OK, I need to think about that. It may be better to shut down the
channels from the sender's side.


Thanks

Li

>
> Regards,
> Daniel
Dr. David Alan Gilbert Nov. 29, 2021, 11:20 a.m. UTC | #9
* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
> > When doing live migration with multifd channels 8, 16 or larger number,
> > the guest hangs in the presence of the network errors such as missing TCP ACKs.
> > 
> > At sender's side:
> > The main thread is blocked on qemu_thread_join, migration_fd_cleanup
> > is called because one thread fails on qio_channel_write_all when
> > the network problem happens and other send threads are blocked on sendmsg.
> > They could not be terminated. So the main thread is blocked on qemu_thread_join
> > to wait for the threads terminated.
> 
> Isn't the right answer here to ensure we've called 'shutdown' on
> all the FDs, so that the threads get kicked out of sendmsg, before
> trying to join the thread ?

I agree a timeout is wrong here; there is no way to get a good timeout
value.
However, I'm a bit confused - we should be able to try a shutdown on the
receive side using the 'yank' command - that's what it's there for. Li,
does this solve your problem?

multifd_load_cleanup already kicks sem_sync before trying to do a
thread_join - so have we managed to trigger that on the receive side?

Dave

> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>
Li Zhang Nov. 29, 2021, 1:37 p.m. UTC | #10
On 11/29/21 12:20 PM, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
>> On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
>>> When doing live migration with multifd channels 8, 16 or larger number,
>>> the guest hangs in the presence of the network errors such as missing TCP ACKs.
>>>
>>> At sender's side:
>>> The main thread is blocked on qemu_thread_join, migration_fd_cleanup
>>> is called because one thread fails on qio_channel_write_all when
>>> the network problem happens and other send threads are blocked on sendmsg.
>>> They could not be terminated. So the main thread is blocked on qemu_thread_join
>>> to wait for the threads terminated.
>> Isn't the right answer here to ensure we've called 'shutdown' on
>> all the FDs, so that the threads get kicked out of sendmsg, before
>> trying to join the thread ?
> I agree a timeout is wrong here; there is no way to get a good timeout
> value.
> However, I'm a bit confused - we should be able to try a shutdown on the
> receive side using the 'yank' command. - that's what it's there for; Li
> does this solve your problem?

No, I tried registering 'yank' on the receive side; the receive threads
are still waiting there.

It seems that on the send side, 'yank' doesn't work either when the send
threads are blocked.

This may not be a case where yank applies; I am not quite sure about it.

>
> multifd_load_cleanup already kicks sem_sync before trying to do a
> thread_join - so have we managed to trigger that on the receive side?

There is no problem with sem_sync in function multifd_load_cleanup.

But it is not called in my case, because no errors are detected on the 
receive side.

The problem is here:

void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
{
    MigrationIncomingState *mis = migration_incoming_get_current();
    Error *local_err = NULL;
    bool start_migration;

    ...

    if (!mis->from_src_file) {
        ...
    } else {
        /* Multiple connections */
        assert(migrate_use_multifd());
        start_migration = multifd_recv_new_channel(ioc, &local_err);
        if (local_err) {
            error_propagate(errp, local_err);
            return;
        }
    }

    if (start_migration) {
        migration_incoming_process();
    }
}

start_migration is always 0, and migration is not started because some
receive threads are not created.

No errors are detected here and the main process works well, but the
receive threads are all waiting on the semaphore.

It's hard to know whether some receive threads were never created. If we
can find a way to check for receive threads that were never created, we
can kick sem_sync and do the cleanup.

From the source code, the thread is created when the QIO channel detects
something via the GIO watch, if I understand correctly. If nothing is
detected, socket_accept_incoming_migration won't be called and the
thread will not be created.

socket_start_incoming_migration_internal ->
    qio_net_listener_set_client_func_full(listener,
                                          socket_accept_incoming_migration,
                                          NULL, NULL,
                                          g_main_context_get_thread_default());

qio_net_listener_set_client_func_full ->
    qio_channel_add_watch_source(
        QIO_CHANNEL(listener->sioc[i]), G_IO_IN,
        qio_net_listener_channel_func,
        listener, (GDestroyNotify)object_unref, context);

socket_accept_incoming_migration ->
    migration_channel_process_incoming ->
        migration_ioc_process_incoming ->
            multifd_recv_new_channel ->
                qemu_thread_create(&p->thread, p->name, multifd_recv_thread,
                                   p, QEMU_THREAD_JOINABLE);

>
> Dave
>
>> Regards,
>> Daniel
>> -- 
>> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
>> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
>> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>>
Dr. David Alan Gilbert Nov. 29, 2021, 2:50 p.m. UTC | #11
* Li Zhang (lizhang@suse.de) wrote:
> 
> On 11/29/21 12:20 PM, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
> > > > When doing live migration with multifd channels 8, 16 or larger number,
> > > > the guest hangs in the presence of the network errors such as missing TCP ACKs.
> > > > 
> > > > At sender's side:
> > > > The main thread is blocked on qemu_thread_join, migration_fd_cleanup
> > > > is called because one thread fails on qio_channel_write_all when
> > > > the network problem happens and other send threads are blocked on sendmsg.
> > > > They could not be terminated. So the main thread is blocked on qemu_thread_join
> > > > to wait for the threads terminated.
> > > Isn't the right answer here to ensure we've called 'shutdown' on
> > > all the FDs, so that the threads get kicked out of sendmsg, before
> > > trying to join the thread ?
> > I agree a timeout is wrong here; there is no way to get a good timeout
> > value.
> > However, I'm a bit confused - we should be able to try a shutdown on the
> > receive side using the 'yank' command. - that's what it's there for; Li
> > does this solve your problem?
> 
> No, I tried to register 'yank' on the receive side, the receive threads are
> still waiting there.
> 
> It seems that on send side, 'yank' doesn't work either when the send threads
> are blocked.
> 
> This may be not the case to call yank. I am not quite sure about it.

We need to fix that; 'yank' should be able to recover from any network
issue.  If it's not working we need to understand why.

> > 
> > multifd_load_cleanup already kicks sem_sync before trying to do a
> > thread_join - so have we managed to trigger that on the receive side?
> 
> There is no problem with sem_sync in function multifd_load_cleanup.
> 
> But it is not called in my case, because no errors are detected on the
> receive side.

If you're getting TCP errors why aren't you seeing any errors on the
receive side?

> The problem is here:
> 
> void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
> {
>     MigrationIncomingState *mis = migration_incoming_get_current();
>     Error *local_err = NULL;
>     bool start_migration;
> 
>    ...
> 
>     if (!mis->from_src_file) {
> 
>     ...
> 
>      } else {
>         /* Multiple connections */
>         assert(migrate_use_multifd());
>         start_migration = multifd_recv_new_channel(ioc, &local_err);
>         if (local_err) {
>             error_propagate(errp, local_err);
>             return;
>         }
>     }
>    if (start_migration) {
>         migration_incoming_process();
>     }
> }
> 
> start_migration is always 0, and migration is not started because some
> receive threads are not created.
> 
> No errors are detected here and the main process works well but receive
> threads are all waiting for semaphore.
> 
> It's hard to know if the receive threads are not created. If we can find a
> way to check if any receive threads

So is this only a problem for network issues that happen during startup,
before all the threads have been created?

Dave

> are not created, we can kick the sem_sync and do cleanup.
> 
> From the source code, the thread will be created when QIO channel detects
> something by GIO watch if I understand correctly.
> 
> If nothing is detected, socket_accept_icoming_migration won't be called, the
> thread will not be created.
> 
> socket_start_incoming_migration_internal ->
> 
>     qio_net_listener_set_client_func_full(listener,
> socket_accept_incoming_migration,
>                                           NULL, NULL,
> g_main_context_get_thread_default());
> 
>    qio_net_listener_set_client_func_full ->
> 
>                qio_channel_add_watch_source(
>                 QIO_CHANNEL(listener->sioc[i]), G_IO_IN,
>                 qio_net_listener_channel_func,
>                 listener, (GDestroyNotify)object_unref, context);
> 
>   socket_accept_incoming_migration ->
> 
>        migration_channel_process_incoming ->
> 
>                migration_ioc_process_incoming ->
> 
>                      multifd_recv_new_channel ->
> 
>                             qemu_thread_create(&p->thread, p->name,
> multifd_recv_thread, p,
> QEMU_THREAD_JOINABLE);
> 
> > 
> > Dave
> > 
> > > Regards,
> > > Daniel
> > > -- 
> > > |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> > > |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> > > |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
> > > 
>
Daniel P. Berrangé Nov. 29, 2021, 2:58 p.m. UTC | #12
On Mon, Nov 29, 2021 at 11:20:08AM +0000, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
> > > When doing live migration with multifd channels 8, 16 or larger number,
> > > the guest hangs in the presence of the network errors such as missing TCP ACKs.
> > > 
> > > At sender's side:
> > > The main thread is blocked on qemu_thread_join, migration_fd_cleanup
> > > is called because one thread fails on qio_channel_write_all when
> > > the network problem happens and other send threads are blocked on sendmsg.
> > > They could not be terminated. So the main thread is blocked on qemu_thread_join
> > > to wait for the threads terminated.
> > 
> > Isn't the right answer here to ensure we've called 'shutdown' on
> > all the FDs, so that the threads get kicked out of sendmsg, before
> > trying to join the thread ?
> 
> I agree a timeout is wrong here; there is no way to get a good timeout
> value.
> However, I'm a bit confused - we should be able to try a shutdown on the
> receive side using the 'yank' command. - that's what it's there for; Li
> does this solve your problem?

Why do we even need to use 'yank' on the receive side ? Until migration
has switched over from src to dst, the receive side is discardable and
the whole process can just be terminated with kill(SIGTERM/SIGKILL).

On the source side 'yank' is needed, because the QEMU process is still
running the live workload and thus is precious and mustn't be killed.

Regards,
Daniel
Li Zhang Nov. 29, 2021, 3:34 p.m. UTC | #13
On 11/29/21 3:50 PM, Dr. David Alan Gilbert wrote:
> * Li Zhang (lizhang@suse.de) wrote:
>> On 11/29/21 12:20 PM, Dr. David Alan Gilbert wrote:
>>> * Daniel P. Berrangé (berrange@redhat.com) wrote:
>>>> On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
>>>>> When doing live migration with multifd channels 8, 16 or larger number,
>>>>> the guest hangs in the presence of the network errors such as missing TCP ACKs.
>>>>>
>>>>> At sender's side:
>>>>> The main thread is blocked on qemu_thread_join, migration_fd_cleanup
>>>>> is called because one thread fails on qio_channel_write_all when
>>>>> the network problem happens and other send threads are blocked on sendmsg.
>>>>> They could not be terminated. So the main thread is blocked on qemu_thread_join
>>>>> to wait for the threads terminated.
>>>> Isn't the right answer here to ensure we've called 'shutdown' on
>>>> all the FDs, so that the threads get kicked out of sendmsg, before
>>>> trying to join the thread ?
>>> I agree a timeout is wrong here; there is no way to get a good timeout
>>> value.
>>> However, I'm a bit confused - we should be able to try a shutdown on the
>>> receive side using the 'yank' command. - that's what it's there for; Li
>>> does this solve your problem?
>> No, I tried to register 'yank' on the receive side, the receive threads are
>> still waiting there.
>>
>> It seems that on send side, 'yank' doesn't work either when the send threads
>> are blocked.
>>
>> This may be not the case to call yank. I am not quite sure about it.
> We need to fix that; 'yank' should be able to recover from any network
> issue.  If it's not working we need to understand why.

OK, I will look into it.

>
>>> multifd_load_cleanup already kicks sem_sync before trying to do a
>>> thread_join - so have we managed to trigger that on the receive side?
>> There is no problem with sem_sync in function multifd_load_cleanup.
>>
>> But it is not called in my case, because no errors are detected on the
>> receive side.
> If you're getting TCP errors why aren't you seeing any errors on the
> receive side?

That's a good point. I need to find out why.

>
>> The problem is here:
>>
>> void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
>> {
>>      MigrationIncomingState *mis = migration_incoming_get_current();
>>      Error *local_err = NULL;
>>      bool start_migration;
>>
>>     ...
>>
>>      if (!mis->from_src_file) {
>>
>>      ...
>>
>>       } else {
>>          /* Multiple connections */
>>          assert(migrate_use_multifd());
>>          start_migration = multifd_recv_new_channel(ioc, &local_err);
>>          if (local_err) {
>>              error_propagate(errp, local_err);
>>              return;
>>          }
>>      }
>>     if (start_migration) {
>>          migration_incoming_process();
>>      }
>> }
>>
>> start_migration is always 0, and migration is not started because some
>> receive threads are not created.
>>
>> No errors are detected here and the main process works well but receive
>> threads are all waiting for semaphore.
>>
>> It's hard to know if the receive threads are not created. If we can find a
>> way to check if any receive threads
> So is this only a problem for network issues that happen during startup,
> before all the threads have been created?

Yes, it is.

>
> Dave
>
>> are not created, we can kick the sem_sync and do cleanup.
>>
>>  From the source code, the thread will be created when QIO channel detects
>> something by GIO watch if I understand correctly.
>>
>> If nothing is detected, socket_accept_incoming_migration won't be called, the
>> thread will not be created.
>>
>> socket_start_incoming_migration_internal ->
>>
>>      qio_net_listener_set_client_func_full(listener,
>> socket_accept_incoming_migration,
>>                                            NULL, NULL,
>> g_main_context_get_thread_default());
>>
>>     qio_net_listener_set_client_func_full ->
>>
>>                 qio_channel_add_watch_source(
>>                  QIO_CHANNEL(listener->sioc[i]), G_IO_IN,
>>                  qio_net_listener_channel_func,
>>                  listener, (GDestroyNotify)object_unref, context);
>>
>>    socket_accept_incoming_migration ->
>>
>>         migration_channel_process_incoming ->
>>
>>                 migration_ioc_process_incoming ->
>>
>>                       multifd_recv_new_channel ->
>>
>>                              qemu_thread_create(&p->thread, p->name,
>> multifd_recv_thread, p,
>> QEMU_THREAD_JOINABLE);
>>
>>> Dave
>>>
>>>> Regards,
>>>> Daniel
>>>> -- 
>>>> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
>>>> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
>>>> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>>>>
Dr. David Alan Gilbert Nov. 29, 2021, 3:49 p.m. UTC | #14
* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Mon, Nov 29, 2021 at 11:20:08AM +0000, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
> > > > When doing live migration with multifd channels 8, 16 or larger number,
> > > > the guest hangs in the presence of the network errors such as missing TCP ACKs.
> > > > 
> > > > At sender's side:
> > > > The main thread is blocked on qemu_thread_join, migration_fd_cleanup
> > > > is called because one thread fails on qio_channel_write_all when
> > > > the network problem happens and other send threads are blocked on sendmsg.
> > > > They could not be terminated. So the main thread is blocked on qemu_thread_join
> > > > to wait for the threads terminated.
> > > 
> > > Isn't the right answer here to ensure we've called 'shutdown' on
> > > all the FDs, so that the threads get kicked out of sendmsg, before
> > > trying to join the thread ?
> > 
> > I agree a timeout is wrong here; there is no way to get a good timeout
> > value.
> > However, I'm a bit confused - we should be able to try a shutdown on the
> > receive side using the 'yank' command. - that's what it's there for; Li
> > does this solve your problem?
> 
> Why do we even need to use 'yank' on the receive side ? Until migration
> has switched over from src to dst, the receive side is discardable and
> > the whole process can just be terminated with kill(SIGTERM/SIGKILL).

True, although it's nice to be able to quit cleanly.

> On the source side 'yank' is needed, because the QEMU process is still
> running the live workload and thus is precious and mustn't be killed.

True.

Dave

> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>
Li Zhang Dec. 1, 2021, 12:11 p.m. UTC | #15
On 11/29/21 3:50 PM, Dr. David Alan Gilbert wrote:
> * Li Zhang (lizhang@suse.de) wrote:
>> On 11/29/21 12:20 PM, Dr. David Alan Gilbert wrote:
>>> * Daniel P. Berrangé (berrange@redhat.com) wrote:
>>>> On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
>>>>> When doing live migration with multifd channels 8, 16 or larger number,
>>>>> the guest hangs in the presence of the network errors such as missing TCP ACKs.
>>>>>
>>>>> At sender's side:
>>>>> The main thread is blocked on qemu_thread_join, migration_fd_cleanup
>>>>> is called because one thread fails on qio_channel_write_all when
>>>>> the network problem happens and other send threads are blocked on sendmsg.
>>>>> They could not be terminated. So the main thread is blocked on qemu_thread_join
>>>>> to wait for the threads terminated.
>>>> Isn't the right answer here to ensure we've called 'shutdown' on
>>>> all the FDs, so that the threads get kicked out of sendmsg, before
>>>> trying to join the thread ?
>>> I agree a timeout is wrong here; there is no way to get a good timeout
>>> value.
>>> However, I'm a bit confused - we should be able to try a shutdown on the
>>> receive side using the 'yank' command. - that's what it's there for; Li
>>> does this solve your problem?
>> No, I tried to register 'yank' on the receive side, the receive threads are
>> still waiting there.
>>
>> It seems that on send side, 'yank' doesn't work either when the send threads
>> are blocked.
>>
>> This may be not the case to call yank. I am not quite sure about it.
> We need to fix that; 'yank' should be able to recover from any network
> issue.  If it's not working we need to understand why.

Hi Dr. David,

On the receive side, I registered 'yank' and it is called. But it only
shuts down the channels; it cannot fix the problem of the receive threads
which are waiting for the semaphore. So the receive threads are still
waiting there.

On the send side, the main process is blocked on qemu_thread_join(). When
I tried the 'yank' command with QMP, it was not handled, so QMP doesn't
work and yank doesn't work either.

I think it's necessary to shut down the channels before terminating the
threads, which can prevent the send threads from being blocked on sendmsg.

Looking into the source code of yank, it only shuts down the channels, so
live migration may recover when something goes wrong in the I/O channels.
But if the threads are blocked on semaphores, locks or something else, it
cannot be recovered by the yank command.

>
>>> multifd_load_cleanup already kicks sem_sync before trying to do a
>>> thread_join - so have we managed to trigger that on the receive side?
>> There is no problem with sem_sync in function multifd_load_cleanup.
>>
>> But it is not called in my case, because no errors are detected on the
>> receive side.
> If you're getting TCP errors why aren't you seeing any errors on the
> receive side?

From the kernel log, TCP SYN flooding is detected. This causes the TCP
ACKs to go missing, and the receive side just sends a RST to forcibly
reset the connection, without reporting any error.

If TCP SYN flood detection is disabled, the problem can be avoided and
the data transfer continues, so live migration works. But I don't think
TCP SYN flood detection should be disabled.

On the send side, this causes a failure when writing to the QIO channels,
and migration_save_cleanup is called.

Thanks,

Li

>
>> The problem is here:
>>
>> void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
>> {
>>      MigrationIncomingState *mis = migration_incoming_get_current();
>>      Error *local_err = NULL;
>>      bool start_migration;
>>
>>     ...
>>
>>      if (!mis->from_src_file) {
>>
>>      ...
>>
>>       } else {
>>          /* Multiple connections */
>>          assert(migrate_use_multifd());
>>          start_migration = multifd_recv_new_channel(ioc, &local_err);
>>          if (local_err) {
>>              error_propagate(errp, local_err);
>>              return;
>>          }
>>      }
>>     if (start_migration) {
>>          migration_incoming_process();
>>      }
>> }
>>
>> start_migration is always 0, and migration is not started because some
>> receive threads are not created.
>>
>> No errors are detected here and the main process works well but receive
>> threads are all waiting for semaphore.
>>
>> It's hard to know if the receive threads are not created. If we can find a
>> way to check if any receive threads
> So is this only a problem for network issues that happen during startup,
> before all the threads have been created?
>
> Dave
>
>> are not created, we can kick the sem_sync and do cleanup.
>>
>>  From the source code, the thread will be created when QIO channel detects
>> something by GIO watch if I understand correctly.
>>
>> If nothing is detected, socket_accept_incoming_migration won't be called, the
>> thread will not be created.
>>
>> socket_start_incoming_migration_internal ->
>>
>>      qio_net_listener_set_client_func_full(listener,
>> socket_accept_incoming_migration,
>>                                            NULL, NULL,
>> g_main_context_get_thread_default());
>>
>>     qio_net_listener_set_client_func_full ->
>>
>>                 qio_channel_add_watch_source(
>>                  QIO_CHANNEL(listener->sioc[i]), G_IO_IN,
>>                  qio_net_listener_channel_func,
>>                  listener, (GDestroyNotify)object_unref, context);
>>
>>    socket_accept_incoming_migration ->
>>
>>         migration_channel_process_incoming ->
>>
>>                 migration_ioc_process_incoming ->
>>
>>                       multifd_recv_new_channel ->
>>
>>                              qemu_thread_create(&p->thread, p->name,
>> multifd_recv_thread, p,
>> QEMU_THREAD_JOINABLE);
>>
>>> Dave
>>>
>>>> Regards,
>>>> Daniel
>>>> -- 
>>>> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
>>>> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
>>>> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>>>>
Daniel P. Berrangé Dec. 1, 2021, 12:22 p.m. UTC | #16
On Wed, Dec 01, 2021 at 01:11:13PM +0100, Li Zhang wrote:
> 
> On 11/29/21 3:50 PM, Dr. David Alan Gilbert wrote:
> > * Li Zhang (lizhang@suse.de) wrote:
> > > On 11/29/21 12:20 PM, Dr. David Alan Gilbert wrote:
> > > > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > > > On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
> > > > > > When doing live migration with multifd channels 8, 16 or larger number,
> > > > > > the guest hangs in the presence of the network errors such as missing TCP ACKs.
> > > > > > 
> > > > > > At sender's side:
> > > > > > The main thread is blocked on qemu_thread_join, migration_fd_cleanup
> > > > > > is called because one thread fails on qio_channel_write_all when
> > > > > > the network problem happens and other send threads are blocked on sendmsg.
> > > > > > They could not be terminated. So the main thread is blocked on qemu_thread_join
> > > > > > to wait for the threads terminated.
> > > > > Isn't the right answer here to ensure we've called 'shutdown' on
> > > > > all the FDs, so that the threads get kicked out of sendmsg, before
> > > > > trying to join the thread ?
> > > > I agree a timeout is wrong here; there is no way to get a good timeout
> > > > value.
> > > > However, I'm a bit confused - we should be able to try a shutdown on the
> > > > receive side using the 'yank' command. - that's what it's there for; Li
> > > > does this solve your problem?
> > > No, I tried to register 'yank' on the receive side, the receive threads are
> > > still waiting there.
> > > 
> > > It seems that on send side, 'yank' doesn't work either when the send threads
> > > are blocked.
> > > 
> > > This may be not the case to call yank. I am not quite sure about it.
> > We need to fix that; 'yank' should be able to recover from any network
> > issue.  If it's not working we need to understand why.
> 
> Hi Dr. David,
> 
> On the receive side, I register 'yank' and it is called. But it is just to
> shut down the channels,
> 
> it couldn't fix the problem of the receive threads which are waiting for the
> semaphore.
> 
> So the receive threads are still waiting there.
> 
> On the send side,  the main process is blocked on qemu_thread_join(), when I
> tried the 'yank'
> 
> command with QMP,  it is not handled. So the QMP doesn't work and yank
> doesn't work.

IOW, there is a bug in QEMU on the send side. It should not be calling
qemu_thread_join() from the main thread, unless it is extremely
confident that the thread in question has already finished.

You seem to be showing that the thread(s) are still running, so we
need to understand why that is the case, and why the main thread
still decided to try to join these threads which haven't finished.


Regards,
Daniel
Li Zhang Dec. 1, 2021, 1:42 p.m. UTC | #17
On 12/1/21 1:22 PM, Daniel P. Berrangé wrote:
> On Wed, Dec 01, 2021 at 01:11:13PM +0100, Li Zhang wrote:
>> On 11/29/21 3:50 PM, Dr. David Alan Gilbert wrote:
>>> * Li Zhang (lizhang@suse.de) wrote:
>>>> On 11/29/21 12:20 PM, Dr. David Alan Gilbert wrote:
>>>>> * Daniel P. Berrangé (berrange@redhat.com) wrote:
>>>>>> On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
>>>>>>> When doing live migration with multifd channels 8, 16 or larger number,
>>>>>>> the guest hangs in the presence of the network errors such as missing TCP ACKs.
>>>>>>>
>>>>>>> At sender's side:
>>>>>>> The main thread is blocked on qemu_thread_join, migration_fd_cleanup
>>>>>>> is called because one thread fails on qio_channel_write_all when
>>>>>>> the network problem happens and other send threads are blocked on sendmsg.
>>>>>>> They could not be terminated. So the main thread is blocked on qemu_thread_join
>>>>>>> to wait for the threads terminated.
>>>>>> Isn't the right answer here to ensure we've called 'shutdown' on
>>>>>> all the FDs, so that the threads get kicked out of sendmsg, before
>>>>>> trying to join the thread ?
>>>>> I agree a timeout is wrong here; there is no way to get a good timeout
>>>>> value.
>>>>> However, I'm a bit confused - we should be able to try a shutdown on the
>>>>> receive side using the 'yank' command. - that's what it's there for; Li
>>>>> does this solve your problem?
>>>> No, I tried to register 'yank' on the receive side, the receive threads are
>>>> still waiting there.
>>>>
>>>> It seems that on send side, 'yank' doesn't work either when the send threads
>>>> are blocked.
>>>>
>>>> This may be not the case to call yank. I am not quite sure about it.
>>> We need to fix that; 'yank' should be able to recover from any network
>>> issue.  If it's not working we need to understand why.
>> Hi Dr. David,
>>
>> On the receive side, I register 'yank' and it is called. But it is just to
>> shut down the channels,
>>
>> it couldn't fix the problem of the receive threads which are waiting for the
>> semaphore.
>>
>> So the receive threads are still waiting there.
>>
>> On the send side,  the main process is blocked on qemu_thread_join(), when I
>> tried the 'yank'
>>
>> command with QMP,  it is not handled. So the QMP doesn't work and yank
>> doesn't work.
> IOW, there is a bug in QEMU on the send side. It should not be calling
> qemu_thread_join() from the main thread, unless it is extremely
> confident that the thread in question has already finished.
>
> You seem to be showing that the thread(s) are still running, so we
> need to understand why that is the case, and why the main thread
> still decided to try to join these threads which haven't finished.

Some threads are still running, but one thread fails in
qio_channel_write_all.

In migration_thread(), it detects an error here:

    thr_error = migration_detect_error(s);
         if (thr_error == MIG_THR_ERR_FATAL) {
             /* Stop migration */
             break;

It will stop migration and clean up.

>
> Regards,
> Daniel
Daniel P. Berrangé Dec. 1, 2021, 2:09 p.m. UTC | #18
On Wed, Dec 01, 2021 at 02:42:04PM +0100, Li Zhang wrote:
> 
> On 12/1/21 1:22 PM, Daniel P. Berrangé wrote:
> > On Wed, Dec 01, 2021 at 01:11:13PM +0100, Li Zhang wrote:
> > > On 11/29/21 3:50 PM, Dr. David Alan Gilbert wrote:
> > > > * Li Zhang (lizhang@suse.de) wrote:
> > > > > On 11/29/21 12:20 PM, Dr. David Alan Gilbert wrote:
> > > > > > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > > > > > On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
> > > > > > > > When doing live migration with multifd channels 8, 16 or larger number,
> > > > > > > > the guest hangs in the presence of the network errors such as missing TCP ACKs.
> > > > > > > > 
> > > > > > > > At sender's side:
> > > > > > > > The main thread is blocked on qemu_thread_join, migration_fd_cleanup
> > > > > > > > is called because one thread fails on qio_channel_write_all when
> > > > > > > > the network problem happens and other send threads are blocked on sendmsg.
> > > > > > > > They could not be terminated. So the main thread is blocked on qemu_thread_join
> > > > > > > > to wait for the threads terminated.
> > > > > > > Isn't the right answer here to ensure we've called 'shutdown' on
> > > > > > > all the FDs, so that the threads get kicked out of sendmsg, before
> > > > > > > trying to join the thread ?
> > > > > > I agree a timeout is wrong here; there is no way to get a good timeout
> > > > > > value.
> > > > > > However, I'm a bit confused - we should be able to try a shutdown on the
> > > > > > receive side using the 'yank' command. - that's what it's there for; Li
> > > > > > does this solve your problem?
> > > > > No, I tried to register 'yank' on the receive side, the receive threads are
> > > > > still waiting there.
> > > > > 
> > > > > It seems that on send side, 'yank' doesn't work either when the send threads
> > > > > are blocked.
> > > > > 
> > > > > This may be not the case to call yank. I am not quite sure about it.
> > > > We need to fix that; 'yank' should be able to recover from any network
> > > > issue.  If it's not working we need to understand why.
> > > Hi Dr. David,
> > > 
> > > On the receive side, I register 'yank' and it is called. But it is just to
> > > shut down the channels,
> > > 
> > > it couldn't fix the problem of the receive threads which are waiting for the
> > > semaphore.
> > > 
> > > So the receive threads are still waiting there.
> > > 
> > > On the send side,  the main process is blocked on qemu_thread_join(), when I
> > > tried the 'yank'
> > > 
> > > command with QMP,  it is not handled. So the QMP doesn't work and yank
> > > doesn't work.
> > IOW, there is a bug in QEMU on the send side. It should not be calling
> > qemu_thread_join() from the main thread, unless it is extremely
> > confident that the thread in question has already finished.
> > 
> > You seem to be showing that the thread(s) are still running, so we
> > need to understand why that is the case, and why the main thread
> > still decided to try to join these threads which haven't finished.
> 
> Some threads are running. But there is one thread fails to
> qio_channel_write_all.
> 
> In migration_thread(), it detects an error here:
> 
>    thr_error = migration_detect_error(s);
>         if (thr_error == MIG_THR_ERR_FATAL) {
>             /* Stop migration */
>             break;
> 
> It will stop migration and cleanup.

Those threads which are still running need to be made to
terminate before trying to join them.

A quick glance at multifd_send_terminate_threads() makes me
suspect multifd shutdown is not reliable.

It is merely setting some boolean flags and posting to a
semaphore. It is doing nothing to shutdown the socket
associated with each thread, so the threads can still be
waiting in an I/O call. IMHO multifd_send_terminate_threads
needs to call qio_channel_shutdown(p->c, QIO_CHANNEL_SHUTDOWN_BOTH)


Regards,
Daniel
Li Zhang Dec. 1, 2021, 2:15 p.m. UTC | #19
On 12/1/21 3:09 PM, Daniel P. Berrangé wrote:
> On Wed, Dec 01, 2021 at 02:42:04PM +0100, Li Zhang wrote:
>> On 12/1/21 1:22 PM, Daniel P. Berrangé wrote:
>>> On Wed, Dec 01, 2021 at 01:11:13PM +0100, Li Zhang wrote:
>>>> On 11/29/21 3:50 PM, Dr. David Alan Gilbert wrote:
>>>>> * Li Zhang (lizhang@suse.de) wrote:
>>>>>> On 11/29/21 12:20 PM, Dr. David Alan Gilbert wrote:
>>>>>>> * Daniel P. Berrangé (berrange@redhat.com) wrote:
>>>>>>>> On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
>>>>>>>>> When doing live migration with multifd channels 8, 16 or larger number,
>>>>>>>>> the guest hangs in the presence of the network errors such as missing TCP ACKs.
>>>>>>>>>
>>>>>>>>> At sender's side:
>>>>>>>>> The main thread is blocked on qemu_thread_join, migration_fd_cleanup
>>>>>>>>> is called because one thread fails on qio_channel_write_all when
>>>>>>>>> the network problem happens and other send threads are blocked on sendmsg.
>>>>>>>>> They could not be terminated. So the main thread is blocked on qemu_thread_join
>>>>>>>>> to wait for the threads terminated.
>>>>>>>> Isn't the right answer here to ensure we've called 'shutdown' on
>>>>>>>> all the FDs, so that the threads get kicked out of sendmsg, before
>>>>>>>> trying to join the thread ?
>>>>>>> I agree a timeout is wrong here; there is no way to get a good timeout
>>>>>>> value.
>>>>>>> However, I'm a bit confused - we should be able to try a shutdown on the
>>>>>>> receive side using the 'yank' command. - that's what it's there for; Li
>>>>>>> does this solve your problem?
>>>>>> No, I tried to register 'yank' on the receive side, the receive threads are
>>>>>> still waiting there.
>>>>>>
>>>>>> It seems that on send side, 'yank' doesn't work either when the send threads
>>>>>> are blocked.
>>>>>>
>>>>>> This may be not the case to call yank. I am not quite sure about it.
>>>>> We need to fix that; 'yank' should be able to recover from any network
>>>>> issue.  If it's not working we need to understand why.
>>>> Hi Dr. David,
>>>>
>>>> On the receive side, I register 'yank' and it is called. But it is just to
>>>> shut down the channels,
>>>>
>>>> it couldn't fix the problem of the receive threads which are waiting for the
>>>> semaphore.
>>>>
>>>> So the receive threads are still waiting there.
>>>>
>>>> On the send side,  the main process is blocked on qemu_thread_join(), when I
>>>> tried the 'yank'
>>>>
>>>> command with QMP,  it is not handled. So the QMP doesn't work and yank
>>>> doesn't work.
>>> IOW, there is a bug in QEMU on the send side. It should not be calling
>>> qemu_thread_join() from the main thread, unless it is extremely
>>> confident that the thread in question has already finished.
>>>
>>> You seem to be showing that the thread(s) are still running, so we
>>> need to understand why that is the case, and why the main thread
>>> still decided to try to join these threads which haven't finished.
>> Some threads are running. But there is one thread fails to
>> qio_channel_write_all.
>>
>> In migration_thread(), it detects an error here:
>>
>>     thr_error = migration_detect_error(s);
>>          if (thr_error == MIG_THR_ERR_FATAL) {
>>              /* Stop migration */
>>              break;
>>
>> It will stop migration and cleanup.
> Those threads which are still running need to be made to
> terminate before trying to join them
>
> A quick glance at multifd_send_terminate_threads() makes me
> suspect multifd shutdown is not reliable.
>
> It is merely setting some boolean flags and posting to a
> semaphore. It is doing nothing to shutdown the socket
> associated with each thread, so the threads can still be
> waiting in an I/O call. IMHO multifd_send_terminate_threads
> needs to call qio_channel_shutdown(p->c, QIO_CHANNEL_SHUTDOWN_BOTH)

Agree with you.


>
> Regards,
> Daniel
Li Zhang Dec. 6, 2021, 9:28 a.m. UTC | #20
On 11/29/21 4:49 PM, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
>> On Mon, Nov 29, 2021 at 11:20:08AM +0000, Dr. David Alan Gilbert wrote:
>>> * Daniel P. Berrangé (berrange@redhat.com) wrote:
>>>> On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
>>>>> When doing live migration with multifd channels 8, 16 or larger number,
>>>>> the guest hangs in the presence of the network errors such as missing TCP ACKs.
>>>>>
>>>>> At sender's side:
>>>>> The main thread is blocked on qemu_thread_join, migration_fd_cleanup
>>>>> is called because one thread fails on qio_channel_write_all when
>>>>> the network problem happens and other send threads are blocked on sendmsg.
>>>>> They could not be terminated. So the main thread is blocked on qemu_thread_join
>>>>> to wait for the threads terminated.
>>>> Isn't the right answer here to ensure we've called 'shutdown' on
>>>> all the FDs, so that the threads get kicked out of sendmsg, before
>>>> trying to join the thread ?
>>> I agree a timeout is wrong here; there is no way to get a good timeout
>>> value.
>>> However, I'm a bit confused - we should be able to try a shutdown on the
>>> receive side using the 'yank' command. - that's what it's there for; Li
>>> does this solve your problem?
>> Why do we even need to use 'yank' on the receive side ? Until migration
>> has switched over from src to dst, the receive side is discardable and
>> the whole process can just be terminated with kill(SIGTERM/SIGKILL).
> True, although it's nice to be able to quit cleanly.

I found that the 'yank' function is actually already registered on the
receive side; this is different from the send side.

It is registered in this function:

void migration_channel_process_incoming(QIOChannel *ioc)
{
     MigrationState *s = migrate_get_current();
     Error *local_err = NULL;

     trace_migration_set_incoming_channel(
         ioc, object_get_typename(OBJECT(ioc)));

     if (s->parameters.tls_creds &&
         *s->parameters.tls_creds &&
         !object_dynamic_cast(OBJECT(ioc),
                              TYPE_QIO_CHANNEL_TLS)) {
         migration_tls_channel_process_incoming(s, ioc, &local_err);
     } else {
         migration_ioc_register_yank(ioc);
         migration_ioc_process_incoming(ioc, &local_err);
     }

     if (local_err) {
         error_report_err(local_err);
     }
}


>
>> On the source side 'yank' is needed, because the QEMU process is still
>> running the live workload and thus is precious and mustn't be killed.
> True.
>
> Dave
>
>> Regards,
>> Daniel
>> -- 
>> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
>> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
>> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>>
Patch

diff --git a/migration/multifd.c b/migration/multifd.c
index 7c9deb1921..656239ca2a 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -1109,7 +1109,7 @@  static void *multifd_recv_thread(void *opaque)
 
         if (flags & MULTIFD_FLAG_SYNC) {
             qemu_sem_post(&multifd_recv_state->sem_sync);
-            qemu_sem_wait(&p->sem_sync);
+            qemu_sem_timedwait(&p->sem_sync, 1000);
         }
     }