From patchwork Thu Apr 28 10:15:27 2016
From: "Daniel P. Berrange"
Date: Thu, 28 Apr 2016 11:15:27 +0100
To: qemu-devel@nongnu.org
Cc: "Li, Liang Z", "Dr. David Alan Gilbert"
Subject: Re: [Qemu-devel] Hang with migration multi-thread compression under high load
Message-ID: <20160428101527.GG1797@redhat.com>
In-Reply-To: <20160427142023.GC17937@redhat.com>
References: <20160427142023.GC17937@redhat.com>

On Wed, Apr 27, 2016 at 03:20:23PM +0100, Daniel P.
Berrange wrote:
> I've been testing various features of migration and have hit a problem
> with the multi-thread compression. It works fine when I have 2 or more
> threads, but if I tell it to only use a single thread, then it almost
> always hangs

[snip]

> Now the target QEMU shows this stack trace:

[snip]

> Thread 2 (Thread 0x7f8677bed700 (LWP 4657)):
> #0  0x00007f86eb035b10 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
> #1  0x0000560ecd8a2709 in qemu_cond_wait (cond=cond@entry=0x560ed03918a0, mutex=mutex@entry=0x560ed0391878) at util/qemu-thread-posix.c:123
> #2  0x0000560ecd5b422d in do_data_decompress (opaque=0x560ed0391870) at /home/berrange/src/virt/qemu/migration/ram.c:2195
> #3  0x00007f86eb03060a in start_thread () at /lib64/libpthread.so.0
> #4  0x00007f86ead6aa4d in clone () at /lib64/libc.so.6
>
> Thread 1 (Thread 0x7f86f767dc80 (LWP 4651)):
> #0  0x0000560ecd5b744f in ram_load (len=711, host=0x7f8677e06000, f=0x560ed03a7950) at /home/berrange/src/virt/qemu/migration/ram.c:2263
> #1  0x0000560ecd5b744f in ram_load (f=0x560ed03a7950, opaque=<optimized out>, version_id=<optimized out>)
>     at /home/berrange/src/virt/qemu/migration/ram.c:2513
> #2  0x0000560ecd5b8b87 in vmstate_load (f=0x560ed03a7950, se=0x560ece731f90, version_id=4)
>     at /home/berrange/src/virt/qemu/migration/savevm.c:643
> #3  0x0000560ecd5b8fc3 in qemu_loadvm_state_main (mis=0x560ece75c330, f=0x560ed03a7950) at /home/berrange/src/virt/qemu/migration/savevm.c:1819
> #4  0x0000560ecd5b8fc3 in qemu_loadvm_state_main (f=f@entry=0x560ed03a7950, mis=mis@entry=0x560ece75c330)
>     at /home/berrange/src/virt/qemu/migration/savevm.c:1850
> #5  0x0000560ecd5bbd36 in qemu_loadvm_state (f=f@entry=0x560ed03a7950) at /home/berrange/src/virt/qemu/migration/savevm.c:1911
> #6  0x0000560ecd7b1b2f in process_incoming_migration_co (opaque=0x560ed03a7950) at migration/migration.c:384
> #7  0x0000560ecd8b1eba in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:78
> #8  0x00007f86eacb0f30 in __start_context () at /lib64/libc.so.6
> #9  0x00007ffc877e49c0 in  ()
>
> For some reason it isn't shown in the stack trace for thread
> 1 above, but when initially connecting GDB it says the main thread
> is at:
>
>   decompress_data_with_multi_threads (len=702, host=0x7fd78fe06000, f=0x55901af09950) at /home/berrange/src/virt/qemu/migration/ram.c:2254
>   2254        for (idx = 0; idx < thread_count; idx++) {
>
> Looking at the target QEMU, we see the do_data_decompress method
> is waiting on a condition variable:
>
>     while (!param->start && !quit_decomp_thread) {
>         qemu_cond_wait(&param->cond, &param->mutex);
>         ....do stuff..
>         param->start = false;
>     }
>
> Now decompress_data_with_multi_threads is checking param->start without
> holding the param->mutex lock.
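To spell out the handshake those two pieces of code implement, here is a
minimal standalone sketch in plain pthreads (the names task, worker and
kick are invented for illustration; this is not the actual QEMU code):

    #include <pthread.h>
    #include <stdbool.h>

    struct task {
        pthread_mutex_t mutex;
        pthread_cond_t cond;
        bool start;     /* set by the producer, cleared by the worker */
        bool quit;
    };

    /* Worker thread: sleep until kicked, process, mark itself free. */
    static void *worker(void *opaque)
    {
        struct task *t = opaque;

        pthread_mutex_lock(&t->mutex);
        while (!t->quit) {
            while (!t->start && !t->quit) {
                pthread_cond_wait(&t->cond, &t->mutex);
            }
            if (t->quit) {
                break;
            }
            /* ... decompress one buffer here ... */
            t->start = false;
        }
        pthread_mutex_unlock(&t->mutex);
        return NULL;
    }

    /* Producer: hand work to the worker, updating 'start' under the
     * same mutex so the waiter is guaranteed to observe the change. */
    static void kick(struct task *t)
    {
        pthread_mutex_lock(&t->mutex);
        t->start = true;
        pthread_cond_signal(&t->cond);
        pthread_mutex_unlock(&t->mutex);
    }

The point of the pattern is that *both* sides are supposed to touch
'start' only with the mutex held; as we'll see, the producer side in
ram.c peeks at the flag without it.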
Ok, so I've investigated this further, and decompress_data_with_multi_threads
is where I believe the flaw is. Its code is:

    static void decompress_data_with_multi_threads(QEMUFile *f,
                                                   void *host, int len)
    {
        int idx, thread_count;

        thread_count = migrate_decompress_threads();
        while (true) {
            for (idx = 0; idx < thread_count; idx++) {
                if (!decomp_param[idx].start) {
                    qemu_get_buffer(f, decomp_param[idx].compbuf, len);
                    decomp_param[idx].des = host;
                    decomp_param[idx].len = len;
                    start_decompression(&decomp_param[idx]);
                    break;
                }
            }
            if (idx < thread_count) {
                break;
            }
        }
    }

Looking at the disassembly of the start of that method, we have:

        for (idx = 0; idx < thread_count; idx++) {
        358a:       45 85 f6                test   %r14d,%r14d
        358d:       7e fb                   jle    358a
            if (!decomp_param[idx].start) {
        358f:       45 31 e4                xor    %r12d,%r12d
        3592:       40 84 ff                test   %dil,%dil
        3595:       48 8d 42 78             lea    0x78(%rdx),%rax
        3599:       0f 84 b1 04 00 00       je     3a50
        359f:       90                      nop

Now asm is not my strong suit, but IIUC, that shows the read of
'decomp_param[idx].start' is *not* loading from the struct's actual
memory, but instead testing a copy cached in the %dil register. This is
a valid optimization, because we've not told the compiler that this
struct field can be modified by another thread. So when the
corresponding do_data_decompress() method sets
'decomp_param[idx].start = false', the store is never seen by
decompress_data_with_multi_threads().

If decompress_data_with_multi_threads() took the mutex around its access
to 'decomp_param[idx].start', that would imply a memory barrier and the
code would stop testing the stale value cached in %dil. Now we don't
want a mutex lock in this code, but we do still need a memory barrier
here, so I think we need this patch:

Running my test case with that applied certainly makes migration run as
normal, without the hangs I see currently.

Regards,
Daniel

diff --git a/migration/ram.c b/migration/ram.c
index be0233f..fddc136 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2260,6 +2260,7 @@ static void decompress_data_with_multi_threads(QEMUFile *f,
     thread_count = migrate_decompress_threads();
     while (true) {
         for (idx = 0; idx < thread_count; idx++) {
+            smp_mb();
             if (!decomp_param[idx].start) {
                 qemu_get_buffer(f, decomp_param[idx].compbuf, len);
                 decomp_param[idx].des = host;
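For anyone who wants to see the failure mode outside QEMU, here is a
minimal sketch (invented names; it calls the GCC __sync_synchronize()
builtin directly, whereas QEMU's smp_mb() is a portability wrapper
around the same kind of full barrier):

    #include <stdbool.h>

    extern bool busy;   /* flag cleared by another thread */

    /* Buggy: built with -O2 the compiler is free to load 'busy' into
     * a register once and spin on the cached value forever -- the same
     * effect as the %dil test in the disassembly above. */
    void wait_until_free_buggy(void)
    {
        while (busy) {
            /* spin */
        }
    }

    /* Fixed: a full barrier in each iteration forbids the compiler
     * from caching the flag across iterations (and orders the CPU's
     * memory accesses), so every pass re-reads 'busy' from memory. */
    void wait_until_free_fixed(void)
    {
        while (busy) {
            __sync_synchronize();
        }
    }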