Message ID | 37febc26060949f891aedea01de724fc@h3c.com (mailing list archive)
---|---
State | New, archived
Series | migration/multifd: receive channel socket needs to be set to non-blocking
On Fri, Sep 20, 2024 at 10:05:42AM +0000, Yuchen wrote:
> When the migration network is disconnected, the source
> qemu can exit normally with an error, but the destination
> qemu is always blocked in recvmsg(), which causes the destination
> qemu main thread to be blocked.
>
> The destination qemu block stack:
> Thread 13 (Thread 0x7f0178bfa640 (LWP 1895906) "multifdrecv_6"):
> #0  0x00007f041b5af56f in recvmsg ()
> #1  0x000055573ebd0b42 in qio_channel_socket_readv
> #2  0x000055573ebce83f in qio_channel_readv
> #3  qio_channel_readv_all_eof
> #4  0x000055573ebce909 in qio_channel_readv_all
> #5  0x000055573eaa1b1f in multifd_recv_thread
> #6  0x000055573ec2f0b9 in qemu_thread_start
> #7  0x00007f041b52bf7a in start_thread
> #8  0x00007f041b5ae600 in clone3
>
> Thread 1 (Thread 0x7f0410c62240 (LWP 1895156) "kvm"):
> #0  0x00007f041b528ae2 in __futex_abstimed_wait_common ()
> #1  0x00007f041b5338b8 in __new_sem_wait_slow64.constprop.0
> #2  0x000055573ec2fd34 in qemu_sem_wait (sem=0x555742b5a4e0)
> #3  0x000055573eaa2f09 in multifd_recv_sync_main ()
> #4  0x000055573e7d590d in ram_load_precopy (f=f@entry=0x555742291c20)
> #5  0x000055573e7d5cbf in ram_load (opaque=<optimized out>, version_id=<optimized out>, f=0x555742291c20)
> #6  ram_load_entry (f=0x555742291c20, opaque=<optimized out>, version_id=<optimized out>)
> #7  0x000055573ea932e7 in qemu_loadvm_section_part_end (mis=0x555741136c00, f=0x555742291c20)
> #8  qemu_loadvm_state_main (f=f@entry=0x555742291c20, mis=mis@entry=0x555741136c00)
> #9  0x000055573ea94418 in qemu_loadvm_state (f=0x555742291c20, mode=mode@entry=VMS_MIGRATE)
> #10 0x000055573ea88be1 in process_incoming_migration_co (opaque=<optimized out>)
> #11 0x000055573ec43d13 in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>)
> #12 0x00007f041b4f5d90 in ?? () from target:/usr/lib64/libc.so.6
> #13 0x00007ffc11890270 in ?? ()
> #14 0x0000000000000000 in ?? ()
>
> Setting the receive channel to non-blocking can solve the problem.

Multifd threads are real threads and there's no coroutine, so I'm slightly confused about why it needs to use nonblock.

Why didn't recvmsg() get kicked out on disconnect? Is it a generic Linux kernel you are using? I wonder whether that's the expected behavior for sockets. E.g., we do have a multifd/cancel test (test_multifd_tcp_cancel), and I think that runs this path too, with it always in block mode as of now..

> Signed-off-by: YuChen <Yu.Chen@h3c.com>
> ---
>  migration/multifd.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 9b200f4ad9..7b2a768f05 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -1318,6 +1318,8 @@ void multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
>          id = qatomic_read(&multifd_recv_state->count);
>      }
>
> +    qio_channel_set_blocking(ioc, false, NULL);
> +
>      p = &multifd_recv_state->params[id];
>      if (p->c != NULL) {
>          error_setg(&local_err, "multifd: received id '%d' already setup'",
> --
> 2.30.2
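An illustrative aside on the question above: with blocking sockets, the usual way a thread stuck in recvmsg() gets "kicked out" is for another thread to shut the socket down, which makes the pending receive return immediately. A minimal POSIX sketch, assuming a plain TCP fd; this is only a sketch of the generic mechanism, not a claim about how test_multifd_tcp_cancel or QEMU's cancel path is implemented:

#include <sys/socket.h>

/*
 * Sketch: wake a thread blocked in recvmsg()/recv() on 'fd' from another
 * thread without switching the socket to non-blocking mode. After
 * shutdown(), the pending receive returns 0 (EOF) or an error, so a
 * well-behaved reader loop can notice and exit.
 */
static void kick_blocked_reader(int fd)
{
    shutdown(fd, SHUT_RDWR);
}

This only works when some other thread knows the channel should be torn down; it does not help when the peer simply goes silent and nobody on the receiving side decides to give up.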
> -----Original Message-----
> From: Peter Xu <peterx@redhat.com>
> Sent: September 20, 2024 23:53
> To: yuchen (CCSPL) <yu.chen@h3c.com>
> Cc: farosas@suse.de; qemu-devel@nongnu.org
> Subject: Re: [PATCH] migration/multifd: receive channel socket needs to be set to non-blocking
>
> On Fri, Sep 20, 2024 at 10:05:42AM +0000, Yuchen wrote:
> > When the migration network is disconnected, the source qemu can exit
> > normally with an error, but the destination qemu is always blocked in
> > recvmsg(), which causes the destination qemu main thread to be blocked.
> >
> > [...]
> >
> > Setting the receive channel to non-blocking can solve the problem.
>
> Multifd threads are real threads and there's no coroutine, so I'm slightly
> confused about why it needs to use nonblock.
>
> Why didn't recvmsg() get kicked out on disconnect? Is it a generic Linux
> kernel you are using?
>
My steps to reproduce: ifdown the migration network, or disable the migration network using iptables. Both methods reproduce the problem with very high probability.

My test environment uses linux-5.10.136.

The multifd thread blocks in the kernel:

# cat /proc/3416190/stack
[<0>] wait_woken+0x43/0x80
[<0>] sk_wait_data+0x123/0x140
[<0>] tcp_recvmsg+0x4f8/0xa50
[<0>] inet6_recvmsg+0x5e/0x120
[<0>] ____sys_recvmsg+0x87/0x180
[<0>] ___sys_recvmsg+0x82/0x110
[<0>] __sys_recvmsg+0x56/0xa0
[<0>] do_syscall_64+0x3d/0x80
[<0>] entry_SYSCALL_64_after_hwframe+0x61/0xc6

> I wonder whether that's the expected behavior for sockets. E.g., we do have
> a multifd/cancel test (test_multifd_tcp_cancel), and I think that runs this path
> too, with it always in block mode as of now..
>
My previous statement may not be accurate. The migration network socket is not disconnected; I use ifdown or iptables to simulate a network card failure. Because the TCP connection was not torn down, recvmsg() stayed blocked.

In ordinary precopy migration, the destination also uses non-blocking; I think that is to avoid blocking. The latest QEMU master code:

/**
 * migration_incoming_setup: Setup incoming migration
 * @f: file for main migration channel
 */
static void migration_incoming_setup(QEMUFile *f)
{
    MigrationIncomingState *mis = migration_incoming_get_current();

    if (!mis->from_src_file) {
        mis->from_src_file = f;
    }
    qemu_file_set_blocking(f, false);
}

> > Signed-off-by: YuChen <Yu.Chen@h3c.com>
> > ---
> > [...]
> > --
> > 2.30.2
>
> --
> Peter Xu
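To make the blocking/non-blocking distinction in this exchange concrete: a blocking recvmsg() on an idle TCP socket sleeps inside the kernel (the sk_wait_data frame in the stack above) until data, EOF, or an error arrives, whereas a non-blocking socket returns -1 with EAGAIN right away and leaves the waiting policy to the caller. A minimal POSIX-level sketch, not QEMU code; the helper name and timeout handling are made up for illustration:

#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Sketch: read from a TCP socket without sleeping forever. With O_NONBLOCK
 * set, recv() returns -1/EAGAIN instead of blocking in the kernel, and
 * poll() lets the caller bound how long it is willing to wait for the peer.
 */
static ssize_t recv_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);

    for (;;) {
        ssize_t n = recv(fd, buf, len, 0);
        if (n >= 0) {
            return n;                   /* data, or 0 on orderly EOF */
        }
        if (errno != EAGAIN && errno != EWOULDBLOCK) {
            return -1;                  /* real socket error */
        }

        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int r = poll(&pfd, 1, timeout_ms);
        if (r == 0) {
            errno = ETIMEDOUT;          /* peer has been silent too long */
            return -1;
        }
        if (r < 0 && errno != EINTR) {
            return -1;
        }
    }
}

Note that non-blocking mode by itself adds no timeout: it only moves the wait from recvmsg() into whatever the caller does next (here poll()), so an explicit timeout or cancellation is still needed before the receiver gives up.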
On Mon, Sep 23, 2024 at 01:33:13AM +0000, Yuchen wrote:
> > -----Original Message-----
> > From: Peter Xu <peterx@redhat.com>
> > Sent: September 20, 2024 23:53
> > To: yuchen (CCSPL) <yu.chen@h3c.com>
> > Cc: farosas@suse.de; qemu-devel@nongnu.org
> > Subject: Re: [PATCH] migration/multifd: receive channel socket needs to be set to non-blocking
> >
> > [...]
> >
> > Why didn't recvmsg() get kicked out on disconnect? Is it a generic Linux
> > kernel you are using?
> >
> My steps to reproduce: ifdown the migration network, or disable the migration
> network using iptables. Both methods reproduce the problem with very high
> probability.
>
> My test environment uses linux-5.10.136.
>
> The multifd thread blocks in the kernel:
> # cat /proc/3416190/stack
> [<0>] wait_woken+0x43/0x80
> [<0>] sk_wait_data+0x123/0x140
> [<0>] tcp_recvmsg+0x4f8/0xa50
> [<0>] inet6_recvmsg+0x5e/0x120
> [<0>] ____sys_recvmsg+0x87/0x180
> [<0>] ___sys_recvmsg+0x82/0x110
> [<0>] __sys_recvmsg+0x56/0xa0
> [<0>] do_syscall_64+0x3d/0x80
> [<0>] entry_SYSCALL_64_after_hwframe+0x61/0xc6
>
> > I wonder whether that's the expected behavior for sockets. E.g., we do have
> > a multifd/cancel test (test_multifd_tcp_cancel), and I think that runs this
> > path too, with it always in block mode as of now..
>
> My previous statement may not be accurate. The migration network socket is
> not disconnected; I use ifdown or iptables to simulate a network card failure.
> Because the TCP connection was not torn down, recvmsg() stayed blocked.

How long did you wait after doing ifdown? TCP is intended to wait if there is an interruption; only eventually, after the relevant TCP timeouts are hit, will it terminate the connection. QEMU shouldn't proactively give up if the TCP connection is still in an active state as reported by the kernel, even if traffic isn't currently flowing.

With regards,
Daniel
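A concrete note on the "relevant TCP timeouts" Daniel mentions: a receiver that only reads and never sends generally has no retransmission timer running, so unless keepalive is enabled the kernel has no reason to ever abort a connection whose peer has silently vanished. A hedged sketch of the Linux keepalive knobs (the values and function name are illustrative, and this is not a suggestion that QEMU or libvirt configure them this way):

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Sketch: enable TCP keepalive so that a peer that silently disappears
 * (ifdown, traffic dropped by iptables) eventually makes the kernel fail
 * the connection, which wakes up a blocked recvmsg() with an error
 * instead of leaving it asleep indefinitely.
 */
static int enable_keepalive(int fd)
{
    int on = 1;
    int idle = 60;   /* seconds of idle time before the first probe */
    int intvl = 10;  /* seconds between probes */
    int cnt = 6;     /* unanswered probes before the connection is reset */

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) < 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)) < 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) < 0) {
        return -1;
    }
    return 0;
}

With settings like these, a receiver blocked on a connection whose peer has gone silent would see the connection fail after roughly idle + intvl * cnt seconds.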
> -----Original Message-----
> From: Daniel P. Berrangé <berrange@redhat.com>
> Sent: September 24, 2024 0:59
> To: yuchen (CCSPL) <yu.chen@h3c.com>
> Cc: Peter Xu <peterx@redhat.com>; farosas@suse.de; qemu-devel@nongnu.org
> Subject: Re: Re: [PATCH] migration/multifd: receive channel socket needs to be set to non-blocking
>
> On Mon, Sep 23, 2024 at 01:33:13AM +0000, Yuchen wrote:
> > [...]
> >
> > My previous statement may not be accurate. The migration network socket is
> > not disconnected; I use ifdown or iptables to simulate a network card failure.
> > Because the TCP connection was not torn down, recvmsg() stayed blocked.
>
> How long did you wait after doing ifdown? TCP is intended to wait if

I waited about 15 minutes; the source qemu migration threads quit, but the destination qemu migration threads are still there.

> there is an interruption; only eventually, after the relevant TCP timeouts are
> hit, will it terminate the connection. QEMU shouldn't proactively give up if the
> TCP connection is still in an active state as reported by the kernel, even if
> traffic isn't currently flowing.
>
Daniel, I agree with what you said. But in fact, the destination migration connection is not disconnected and is in the CLOSE_WAIT state.

The source qemu process lsof and top:

# lsof -p 384509
...
kvm 384509 root 112u sock 0,8 0t0 157321811 protocol: TCP
kvm 384509 root 113u sock 0,8 0t0 157321813 protocol: TCP
kvm 384509 root 114u sock 0,8 0t0 157321815 protocol: TCP
kvm 384509 root 115u sock 0,8 0t0 157321817 protocol: TCP
kvm 384509 root 116u sock 0,8 0t0 157321819 protocol: TCP
kvm 384509 root 117u sock 0,8 0t0 157321821 protocol: TCP
kvm 384509 root 118u sock 0,8 0t0 157321823 protocol: TCP
kvm 384509 root 119u sock 0,8 0t0 157321825 protocol: TCP

# top -H -p 384509
top - 15:10:22 up 5 days, 18:54, 3 users, load average: 5.16, 4.61, 4.50
Threads: 8 total, 3 running, 5 sleeping, 0 stopped, 0 zombie
%Cpu(s): 5.2 us, 5.2 sy, 0.0 ni, 89.3 id, 0.0 wa, 0.1 hi, 0.1 si, 0.0 st
MiB Mem : 128298.7 total, 41490.2 free, 89470.2 used, 2168.0 buff/cache
MiB Swap: 42922.0 total, 42910.4 free, 11.6 used. 38828.5 avail Mem

    PID USER  PR NI  VIRT   RES   SHR S  %CPU %MEM     TIME+ COMMAND
 384596 root  20  0 11.9g 93516 40112 R  98.7  0.1 261:13.24 CPU 1/KVM
 384595 root  20  0 11.9g 93516 40112 R  98.0  0.1  56:44.31 CPU 0/KVM
 384509 root  20  0 11.9g 93516 40112 R   1.3  0.1   7:38.73 kvm
 384563 root  20  0 11.9g 93516 40112 S   0.0  0.1   0:00.05 kvm
 384598 root  20  0 11.9g 93516 40112 S   0.0  0.1   0:01.00 vnc_worker
1544593 root  20  0 11.9g 93516 40112 S   0.0  0.1   0:00.00 worker

The destination qemu process lsof and top:

# lsof -p 3236693
kvm 3236693 root 29u IPv6 159227758 0t0 TCP node18:49156->2.2.2.6:41880 (CLOSE_WAIT)
kvm 3236693 root 30u IPv6 159227759 0t0 TCP node18:49156->2.2.2.6:41890 (ESTABLISHED)
kvm 3236693 root 31u IPv6 159227760 0t0 TCP node18:49156->2.2.2.6:41902 (ESTABLISHED)
kvm 3236693 root 32u IPv6 159227762 0t0 TCP node18:49156->2.2.2.6:41912 (ESTABLISHED)
kvm 3236693 root 33u IPv6 159227761 0t0 TCP node18:49156->2.2.2.6:41904 (ESTABLISHED)
kvm 3236693 root 34u IPv6 159227763 0t0 TCP node18:49156->2.2.2.6:41918 (ESTABLISHED)
kvm 3236693 root 35u IPv6 159227764 0t0 TCP node18:49156->2.2.2.6:41924 (ESTABLISHED)
kvm 3236693 root 36u IPv6 159227765 0t0 TCP node18:49156->2.2.2.6:41934 (ESTABLISHED)
kvm 3236693 root 37u IPv6 159227766 0t0 TCP node18:49156->2.2.2.6:41942 (ESTABLISHED)

# top -H -p 3236693
top - 15:09:25 up 5 days, 19:12, 2 users, load average: 0.63, 0.68, 0.89
Threads: 15 total, 0 running, 15 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.3 us, 0.5 sy, 0.0 ni, 98.1 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st
MiB Mem : 128452.1 total, 43515.4 free, 87291.7 used, 2527.4 buff/cache
MiB Swap: 42973.0 total, 42968.4 free, 4.6 used. 41160.4 avail Mem

    PID USER  PR NI  VIRT    RES   SHR S  %CPU %MEM   TIME+ COMMAND
3236693 root  20  0 11.3g 100192 38508 S   0.0  0.1 0:00.41 kvm
3236714 root  20  0 11.3g 100192 38508 S   0.0  0.1 0:00.00 kvm
3236745 root  20  0 11.3g 100192 38508 S   0.0  0.1 0:00.00 CPU 0/KVM
3236746 root  20  0 11.3g 100192 38508 S   0.0  0.1 0:00.00 CPU 1/KVM
3236748 root  20  0 11.3g 100192 38508 S   0.0  0.1 0:00.00 vnc_worker
3236750 root  20  0 11.3g 100192 38508 S   0.0  0.1 0:01.45 multifdrecv_4
3236751 root  20  0 11.3g 100192 38508 S   0.0  0.1 0:01.42 multifdrecv_5
3236752 root  20  0 11.3g 100192 38508 S   0.0  0.1 0:01.35 multifdrecv_6
3236753 root  20  0 11.3g 100192 38508 S   0.0  0.1 0:01.49 multifdrecv_7
3236754 root  20  0 11.3g 100192 38508 S   0.0  0.1 0:01.45 multifdrecv_1
3236755 root  20  0 11.3g 100192 38508 S   0.0  0.1 0:01.43 multifdrecv_2
3236756 root  20  0 11.3g 100192 38508 S   0.0  0.1 0:01.45 multifdrecv_3
3236757 root  20  0 11.3g 100192 38508 S   0.0  0.1 0:01.44 multifdrecv_0

So should we still set the multifd channel socket to non-blocking?

> With regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org       -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o-    https://www.instagram.com/dberrange :|
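One observation about the lsof output above: CLOSE_WAIT means that socket has already received the peer's FIN, so a read on it returns EOF, whereas the ESTABLISHED multifd sockets have seen no FIN at all and give a blocking recvmsg() nothing to wake up on. For completeness, a Linux-specific sketch of how a program can query the same per-socket state that lsof/ss report (the function name is illustrative):

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <sys/socket.h>

/*
 * Sketch: ask the kernel for its view of a TCP connection. tcpi_state
 * holds values such as TCP_ESTABLISHED or TCP_CLOSE_WAIT, i.e. the same
 * states lsof and ss display.
 */
static void print_tcp_state(int fd)
{
    struct tcp_info info;
    socklen_t len = sizeof(info);

    if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len) == 0) {
        printf("fd %d tcp state: %u\n", fd, (unsigned)info.tcpi_state);
    }
}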
On Tue, Sep 24, 2024 at 08:25:22AM +0000, Yuchen wrote:
> > -----Original Message-----
> > From: Daniel P. Berrangé <berrange@redhat.com>
> > Sent: September 24, 2024 0:59
> > To: yuchen (CCSPL) <yu.chen@h3c.com>
> > Cc: Peter Xu <peterx@redhat.com>; farosas@suse.de; qemu-devel@nongnu.org
> > Subject: Re: Re: [PATCH] migration/multifd: receive channel socket needs to be set to non-blocking
> >
> > [...]
> >
> > How long did you wait after doing ifdown? TCP is intended to wait if
>
> I waited about 15 minutes; the source qemu migration threads quit, but
> the destination qemu migration threads are still there.
>
> > there is an interruption; only eventually, after the relevant TCP timeouts are
> > hit, will it terminate the connection. QEMU shouldn't proactively give up if the
> > TCP connection is still in an active state as reported by the kernel, even if
> > traffic isn't currently flowing.
>
> Daniel, I agree with what you said. But in fact, the destination migration
> connection is not disconnected and is in the CLOSE_WAIT state.
>
> The source qemu process lsof and top:
> [...]
>
> The destination qemu process lsof and top:
> [...]
>
> So should we still set the multifd channel socket to non-blocking?

Have you looked at why the timeout didn't work? After all, QEMU is not the only application that uses recvmsg() like this, so I wonder whether it's intended, or whether it's a kernel bug that recvmsg() didn't get kicked out.

> > With regards,
> > Daniel
> > [...]
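On the question of why no timeout fired: by default a TCP endpoint that has nothing to send runs no timer at all; retransmission timeouts only cover unacknowledged outgoing data, and keepalive is off unless explicitly enabled, so a blocking recvmsg() can sleep indefinitely. Independent of the patch under discussion, a plain-POSIX sketch of bounding such a wait with a receive timeout (the value is illustrative):

#include <sys/socket.h>
#include <sys/time.h>

/*
 * Sketch: bound how long a blocking recvmsg()/recv() may sleep. Once the
 * timeout expires, the blocked call fails with EAGAIN/EWOULDBLOCK and the
 * caller can decide whether to retry or treat the channel as dead.
 */
static int set_recv_timeout(int fd, int seconds)
{
    struct timeval tv = { .tv_sec = seconds, .tv_usec = 0 };

    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}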
diff --git a/migration/multifd.c b/migration/multifd.c
index 9b200f4ad9..7b2a768f05 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -1318,6 +1318,8 @@ void multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
         id = qatomic_read(&multifd_recv_state->count);
     }
 
+    qio_channel_set_blocking(ioc, false, NULL);
+
     p = &multifd_recv_state->params[id];
     if (p->c != NULL) {
         error_setg(&local_err, "multifd: received id '%d' already setup'",
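For readers unfamiliar with the QIOChannel API: for socket channels, qio_channel_set_blocking(ioc, false, NULL) boils down to putting the underlying file descriptor into non-blocking mode, roughly the same effect as the classic fcntl() idiom below. This is only an approximation of what the one-line change does at the fd level, not QEMU's actual implementation:

#include <fcntl.h>

/*
 * Rough fd-level equivalent of switching a socket channel to non-blocking
 * mode: set O_NONBLOCK so read/recvmsg return -1/EAGAIN instead of
 * sleeping when no data is available.
 */
static int set_fd_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}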
When the migration network is disconnected, the source
qemu can exit normally with an error, but the destination
qemu is always blocked in recvmsg(), which causes the destination
qemu main thread to be blocked.

The destination qemu block stack:

Thread 13 (Thread 0x7f0178bfa640 (LWP 1895906) "multifdrecv_6"):
#0  0x00007f041b5af56f in recvmsg ()
#1  0x000055573ebd0b42 in qio_channel_socket_readv
#2  0x000055573ebce83f in qio_channel_readv
#3  qio_channel_readv_all_eof
#4  0x000055573ebce909 in qio_channel_readv_all
#5  0x000055573eaa1b1f in multifd_recv_thread
#6  0x000055573ec2f0b9 in qemu_thread_start
#7  0x00007f041b52bf7a in start_thread
#8  0x00007f041b5ae600 in clone3

Thread 1 (Thread 0x7f0410c62240 (LWP 1895156) "kvm"):
#0  0x00007f041b528ae2 in __futex_abstimed_wait_common ()
#1  0x00007f041b5338b8 in __new_sem_wait_slow64.constprop.0
#2  0x000055573ec2fd34 in qemu_sem_wait (sem=0x555742b5a4e0)
#3  0x000055573eaa2f09 in multifd_recv_sync_main ()
#4  0x000055573e7d590d in ram_load_precopy (f=f@entry=0x555742291c20)
#5  0x000055573e7d5cbf in ram_load (opaque=<optimized out>, version_id=<optimized out>, f=0x555742291c20)
#6  ram_load_entry (f=0x555742291c20, opaque=<optimized out>, version_id=<optimized out>)
#7  0x000055573ea932e7 in qemu_loadvm_section_part_end (mis=0x555741136c00, f=0x555742291c20)
#8  qemu_loadvm_state_main (f=f@entry=0x555742291c20, mis=mis@entry=0x555741136c00)
#9  0x000055573ea94418 in qemu_loadvm_state (f=0x555742291c20, mode=mode@entry=VMS_MIGRATE)
#10 0x000055573ea88be1 in process_incoming_migration_co (opaque=<optimized out>)
#11 0x000055573ec43d13 in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>)
#12 0x00007f041b4f5d90 in ?? () from target:/usr/lib64/libc.so.6
#13 0x00007ffc11890270 in ?? ()
#14 0x0000000000000000 in ?? ()

Setting the receive channel to non-blocking can solve the problem.

Signed-off-by: YuChen <Yu.Chen@h3c.com>
---
 migration/multifd.c | 2 ++
 1 file changed, 2 insertions(+)

--
2.30.2