From patchwork Wed Oct 9 08:02:38 2019
X-Patchwork-Submitter: Jan Glauber
X-Patchwork-Id: 11181769
Date: Wed, 09 Oct 2019 08:02:38 -0000
From: Jan Glauber
To: qemu-devel@nongnu.org
References: <154327283728.15443.11625169757714443608.malonedeb@soybean.canonical.com>
Message-Id: <20191009080220.GA2905@hc>
Subject: [Bug 1805256] Re: [Qemu-devel] qemu_futex_wait() lockups in ARM64: 2 possible issues
Reply-To: Bug 1805256 <1805256@bugs.launchpad.net>

On Mon, Oct 07, 2019 at 04:58:30PM +0200, Paolo Bonzini wrote:
> On 07/10/19 16:44, dann frazier wrote:
> > On Mon, Oct 07, 2019 at 01:06:20PM +0200, Paolo Bonzini wrote:
> >> On 02/10/19 11:23, Jan Glauber wrote:
> >>> I've looked into this on ThunderX2. The arm64 code generated for the
> >>> atomic_[add|sub] accesses of ctx->notify_me doesn't contain any
> >>> memory barriers. It is just plain ldaxr/stlxr.
> >>>
> >>> From my understanding this is not sufficient for SMP sync.
> >>>
> >>> If I read this comment correct:
> >>>
> >>>     void aio_notify(AioContext *ctx)
> >>>     {
> >>>         /* Write e.g. bh->scheduled before reading ctx->notify_me.  Pairs
> >>>          * with atomic_or in aio_ctx_prepare or atomic_add in aio_poll.
> >>>          */
> >>>         smp_mb();
> >>>         if (ctx->notify_me) {
> >>>
> >>> it points out that the smp_mb() should be paired. But as
> >>> I said the used atomics don't generate any barriers at all.
> >>
> >> Based on the rest of the thread, this patch should also fix the bug:
> >>
> >> diff --git a/util/async.c b/util/async.c
> >> index 47dcbfa..721ea53 100644
> >> --- a/util/async.c
> >> +++ b/util/async.c
> >> @@ -249,7 +249,7 @@ aio_ctx_check(GSource *source)
> >>      aio_notify_accept(ctx);
> >>
> >>      for (bh = ctx->first_bh; bh; bh = bh->next) {
> >> -        if (bh->scheduled) {
> >> +        if (atomic_mb_read(&bh->scheduled)) {
> >>              return true;
> >>          }
> >>      }
> >>
> >> And also the memory barrier in aio_notify can actually be replaced
> >> with a SEQ_CST load:
> >>
> >> diff --git a/util/async.c b/util/async.c
> >> index 47dcbfa..721ea53 100644
> >> --- a/util/async.c
> >> +++ b/util/async.c
> >> @@ -349,11 +349,11 @@ LinuxAioState *aio_get_linux_aio(AioContext *ctx)
> >>
> >>  void aio_notify(AioContext *ctx)
> >>  {
> >> -    /* Write e.g. bh->scheduled before reading ctx->notify_me.  Pairs
> >> -     * with atomic_or in aio_ctx_prepare or atomic_add in aio_poll.
> >> +    /* Using atomic_mb_read ensures that e.g. bh->scheduled is written before
> >> +     * ctx->notify_me is read.  Pairs with atomic_or in aio_ctx_prepare or
> >> +     * atomic_add in aio_poll.
> >>       */
> >> -    smp_mb();
> >> -    if (ctx->notify_me) {
> >> +    if (atomic_mb_read(&ctx->notify_me)) {
> >>          event_notifier_set(&ctx->notifier);
> >>          atomic_mb_set(&ctx->notified, true);
> >>      }
> >>
> >> Would you be able to test these (one by one possibly)?
> >
> > Paolo,
> > I tried them both separately and together on a Hi1620 system, each
> > time it hung in the first iteration. Here's a backtrace of a run with
> > both patches applied:
>
> Ok, not a great start... I'll find myself an aarch64 machine and look
> at it more closely. I'd like the patch to be something we can
> understand and document, since this is probably the second most-used
> memory barrier idiom in QEMU.
>
> Paolo

I'm still not sure what the actual issue is here, but could it be some
bad interaction between the notify_me and the list_lock?
They are both 4 bytes and side-by-side:

address notify_me: 0xaaaadb528aa0   sizeof notify_me: 4
address list_lock: 0xaaaadb528aa4   sizeof list_lock: 4

AFAICS the generated code looks OK (all load/store exclusive done with
32 bit size):

     e6c:       885ffc01        ldaxr   w1, [x0]
     e70:       11000821        add     w1, w1, #0x2
     e74:       8802fc01        stlxr   w2, w1, [x0]

...but if I bump notify_me size to uint64_t the issue goes away.

BTW, the image file I convert in the testcase is ~20 GB.

--Jan

diff --git a/include/block/aio.h b/include/block/aio.h
index a1d6b9e24939..e8a5ea3860bb 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -83,7 +83,7 @@ struct AioContext {
      * Instead, the aio_poll calls include both the prepare and the
      * dispatch phase, hence a simple counter is enough for them.
      */
-    uint32_t notify_me;
+    uint64_t notify_me;

     /* A lock to protect between QEMUBH and AioHandler adders and deleter,
      * and to ensure that no callbacks are removed while we're walking and