From patchwork Tue Nov 7 15:09:37 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sergio Lopez Pascual X-Patchwork-Id: 10046895 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 18FD36031B for ; Tue, 7 Nov 2017 15:13:54 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 09D312A1BE for ; Tue, 7 Nov 2017 15:13:54 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id F27AD2A1D2; Tue, 7 Nov 2017 15:13:53 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 81F782A1BE for ; Tue, 7 Nov 2017 15:13:53 +0000 (UTC) Received: from localhost ([::1]:53883 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eC5ZY-0002gn-4A for patchwork-qemu-devel@patchwork.kernel.org; Tue, 07 Nov 2017 10:13:52 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36397) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eC5XO-0001kV-D2 for qemu-devel@nongnu.org; Tue, 07 Nov 2017 10:11:44 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eC5XK-0002O6-CS for qemu-devel@nongnu.org; Tue, 07 Nov 2017 10:11:38 -0500 Received: from mx1.redhat.com ([209.132.183.28]:6950) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1eC5XA-0002FE-CG; Tue, 07 Nov 2017 10:11:24 -0500 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id B8331624DA; Tue, 7 Nov 2017 15:11:21 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com B8331624DA Authentication-Results: ext-mx10.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx10.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=slp@redhat.com Received: from kthompson.redhat.com (ovpn-116-186.ams2.redhat.com [10.36.116.186]) by smtp.corp.redhat.com (Postfix) with ESMTP id C69D35C886; Tue, 7 Nov 2017 15:11:11 +0000 (UTC) From: Sergio Lopez To: qemu-devel@nongnu.org Date: Tue, 7 Nov 2017 16:09:37 +0100 Message-Id: <20171107150937.23188-1-slp@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Tue, 07 Nov 2017 15:11:21 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH] util/async: use atomic_mb_set in qemu_bh_cancel X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Sergio Lopez , pbonzini@redhat.com, famz@redhat.com, stefanha@redhat.com, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP Commit b7a745d added a qemu_bh_cancel call to the completion function as an optimization to prevent it from unnecessarily rescheduling itself. This completion function is scheduled from worker_thread, after setting the state of a ThreadPoolElement to THREAD_DONE. This was considered to be safe, as the completion function restarts the loop just after the call to qemu_bh_cancel. But, under certain access patterns and scheduling conditions, the loop may wrongly use a pre-fetched elem->state value, reading it as THREAD_QUEUED, and ending the completion function without having processed a pending TPE linked at pool->head. In some situations, if there are no other independent requests in the same aio context that could eventually trigger the scheduling of the completion function, the omitted TPE and all operations pending on it will get stuck forever. Signed-off-by: Sergio Lopez Reviewed-by: Stefan Hajnoczi --- util/async.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/util/async.c b/util/async.c index 355af73ee7..0e1bd8780a 100644 --- a/util/async.c +++ b/util/async.c @@ -174,7 +174,7 @@ void qemu_bh_schedule(QEMUBH *bh) */ void qemu_bh_cancel(QEMUBH *bh) { - bh->scheduled = 0; + atomic_mb_set(&bh->scheduled, 0); } /* This func is async.The bottom half will do the delete action at the finial