From patchwork Wed Oct 2 09:23:00 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Glauber
X-Patchwork-Id: 11171005
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
 by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B7B121747
 for ; Wed, 2 Oct 2019 12:28:51 +0000 (UTC)
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mail.kernel.org (Postfix) with ESMTPS id 9819A21A4A
 for ; Wed, 2 Oct 2019 12:28:51 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9819A21A4A
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=marvell.com
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org
Received: from localhost ([::1]:54833 helo=lists1p.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from )
 id 1iFdkQ-0000rZ-JB
 for patchwork-qemu-devel@patchwork.kernel.org; Wed, 02 Oct 2019 08:28:50 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:46558)
 by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from )
 id 1iFb2y-0000Nk-2l
 for qemu-devel@nongnu.org; Wed, 02 Oct 2019 05:35:49 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1iFb2w-0006jl-B3
 for qemu-devel@nongnu.org; Wed, 02 Oct 2019 05:35:47 -0400
Received: from indium.canonical.com ([91.189.90.7]:40492)
 by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71)
 (envelope-from ) id 1iFb2w-0006j0-56
 for qemu-devel@nongnu.org; Wed, 02 Oct 2019 05:35:46 -0400
Received: from loganberry.canonical.com ([91.189.90.37])
 by indium.canonical.com with esmtp (Exim 4.86_2 #2 (Debian))
 id 1iFb2t-0005jU-FD
 for ; Wed, 02 Oct 2019 09:35:43 +0000
Received: from loganberry.canonical.com (localhost [127.0.0.1])
 by loganberry.canonical.com (Postfix) with ESMTP id 70F972E80D0
 for ; Wed, 2 Oct 2019 09:35:43 +0000 (UTC)
MIME-Version: 1.0
Date: Wed, 02 Oct 2019 09:23:00 -0000
From: Jan Glauber
To: qemu-devel@nongnu.org
X-Launchpad-Notification-Type: bug
X-Launchpad-Bug: product=qemu; status=In Progress; importance=Undecided; assignee=rafaeldtinoco@kernelpath.com;
X-Launchpad-Bug: distribution=ubuntu; sourcepackage=qemu; component=main; status=In Progress; importance=Medium; assignee=rafaeldtinoco@kernelpath.com;
X-Launchpad-Bug: distribution=ubuntu; distroseries=bionic; sourcepackage=qemu; component=main; status=New; importance=Undecided; assignee=None;
X-Launchpad-Bug: distribution=ubuntu; distroseries=disco; sourcepackage=qemu; component=main; status=New; importance=Undecided; assignee=None;
X-Launchpad-Bug: distribution=ubuntu; distroseries=eoan; sourcepackage=qemu; component=main; status=In Progress; importance=Medium; assignee=rafaeldtinoco@kernelpath.com;
X-Launchpad-Bug: distribution=ubuntu; distroseries=ff-series; sourcepackage=qemu; component=None; status=New; importance=Undecided; assignee=None;
X-Launchpad-Bug-Tags: qemu-img
X-Launchpad-Bug-Information-Type: Public
X-Launchpad-Bug-Private: no
X-Launchpad-Bug-Security-Vulnerability: no
X-Launchpad-Bug-Commenters: dannf jan-glauber-i jnsnow lizhengui rafaeldtinoco
X-Launchpad-Bug-Reporter: dann frazier (dannf)
X-Launchpad-Bug-Modifier: Jan Glauber (jan-glauber-i)
References: <154327283728.15443.11625169757714443608.malonedeb@soybean.canonical.com>
 <1864070a-2f84-1d98-341e-f01ddf74ec4b@ubuntu.com>
 <20190924202517.GA21422@xps13.dannf>
Message-Id: <20191002092253.GA3857@hc>
Subject: [Bug 1805256] Re: [Qemu-devel] qemu_futex_wait() lockups in ARM64: 2 possible issues
X-Launchpad-Message-Rationale: Subscriber (QEMU) @qemu-devel-ml
X-Launchpad-Message-For: qemu-devel-ml
Precedence: bulk
X-Generated-By: Launchpad (canonical.com); Revision="19066"; Instance="production-secrets-lazr.conf"
X-Launchpad-Hash: 2e3ae6c15255563c05797f29c21e59ec758a4d2e
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy]
X-Received-From: 91.189.90.7
X-Mailman-Approved-At: Wed, 02 Oct 2019 08:27:28 -0400
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.23
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: Bug 1805256 <1805256@bugs.launchpad.net>
Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org
Sender: "Qemu-devel"

I've looked into this on ThunderX2. The arm64 code generated for the
atomic_[add|sub] accesses of ctx->notify_me doesn't contain any
memory barriers. It is just plain ldaxr/stlxr.

From my understanding this is not sufficient for SMP sync.

If I read this comment correctly:

    void aio_notify(AioContext *ctx)
    {
        /* Write e.g. bh->scheduled before reading ctx->notify_me.  Pairs
         * with atomic_or in aio_ctx_prepare or atomic_add in aio_poll.
         */
        smp_mb();
        if (ctx->notify_me) {

it points out that the smp_mb() should be paired. But as I said, the
atomics used on the other side don't generate any barriers at all.

I've tried to verify my theory with this patch and didn't run into the
issue for ~500 iterations (usually I trigger the issue within ~20
iterations).
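To illustrate the pairing the comment asks for, here is a minimal
standalone sketch using C11 atomics and pthreads. It is not QEMU code
and makes several assumptions: the globals notify_me and scheduled are
stand-ins for ctx->notify_me and bh->scheduled,
atomic_thread_fence(memory_order_seq_cst) stands in for smp_mb(), and
count_lost_wakeups() is a hypothetical helper. Each side writes its own
flag, issues a full fence, then reads the other side's flag.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Stand-ins (assumptions) for ctx->notify_me and bh->scheduled. */
static atomic_int notify_me;
static atomic_int scheduled;

/* Values each side observed; read by the caller only after pthread_join(). */
static int poller_saw_scheduled;
static int notifier_saw_notify_me;

/* Models the aio_poll() side: announce the wait, full fence, check for work. */
static void *poller(void *arg)
{
    (void)arg;
    atomic_fetch_add_explicit(&notify_me, 2, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* stand-in for smp_mb() */
    poller_saw_scheduled =
        atomic_load_explicit(&scheduled, memory_order_relaxed);
    return NULL;
}

/* Models the aio_notify() side: publish work, full fence, check for a waiter. */
static void *notifier(void *arg)
{
    (void)arg;
    atomic_store_explicit(&scheduled, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* stand-in for smp_mb() */
    notifier_saw_notify_me =
        atomic_load_explicit(&notify_me, memory_order_relaxed);
    return NULL;
}

/*
 * Runs the race `rounds` times and returns how many rounds ended with
 * both sides reading zero, i.e. the poller missed the work and the
 * notifier missed the waiter (a lost wakeup).
 */
int count_lost_wakeups(int rounds)
{
    int lost = 0;

    for (int i = 0; i < rounds; i++) {
        pthread_t a, b;
        int rc;

        atomic_store(&notify_me, 0);
        atomic_store(&scheduled, 0);

        rc = pthread_create(&a, NULL, poller, NULL);
        assert(rc == 0);
        rc = pthread_create(&b, NULL, notifier, NULL);
        assert(rc == 0);
        (void)rc;

        pthread_join(a, NULL);
        pthread_join(b, NULL);

        if (poller_saw_scheduled == 0 && notifier_saw_notify_me == 0) {
            lost++;
        }
    }
    return lost;
}
```

With both fences in place, the C11 memory model rules out a round in
which both sides read zero, so count_lost_wakeups() should return 0.
Dropping either fence permits (but does not guarantee) such a round,
which is the kind of missed notification I suspect here.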
--Jan

diff --git a/util/aio-posix.c b/util/aio-posix.c
index d8f0cb4af8dd..d07dcd4e9993 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -591,6 +591,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
      */
     if (blocking) {
         atomic_add(&ctx->notify_me, 2);
+        smp_mb();
     }
 
     qemu_lockcnt_inc(&ctx->list_lock);
@@ -632,6 +633,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
 
     if (blocking) {
         atomic_sub(&ctx->notify_me, 2);
+        smp_mb();
     }
 
     /* Adjust polling time */
diff --git a/util/async.c b/util/async.c
index 4dd9d95a9e73..92ac209c4615 100644
--- a/util/async.c
+++ b/util/async.c
@@ -222,6 +222,7 @@ aio_ctx_prepare(GSource *source, gint *timeout)
     AioContext *ctx = (AioContext *) source;
 
     atomic_or(&ctx->notify_me, 1);
+    smp_mb();
 
     /* We assume there is no timeout already supplied */
     *timeout = qemu_timeout_ns_to_ms(aio_compute_timeout(ctx));
@@ -240,6 +241,7 @@ aio_ctx_check(GSource *source)
     QEMUBH *bh;
 
     atomic_and(&ctx->notify_me, ~1);
+    smp_mb();
     aio_notify_accept(ctx);
 
     for (bh = ctx->first_bh; bh; bh = bh->next) {