From patchwork Tue Nov 21 17:03:47 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Cody
X-Patchwork-Id: 10068621
From: Jeff Cody
To: qemu-block@nongnu.org
Date: Tue, 21 Nov 2017 12:03:47 -0500
Message-Id: <20171121170350.31290-2-jcody@redhat.com>
In-Reply-To: <20171121170350.31290-1-jcody@redhat.com>
References: <20171121170350.31290-1-jcody@redhat.com>
Subject: [Qemu-devel] [PULL 1/4] blockjob: do not allow coroutine double entry
 or entry-after-completion
Cc: peter.maydell@linaro.org, jcody@redhat.com, qemu-devel@nongnu.org,
 stefanha@redhat.com, pbonzini@redhat.com

When block_job_sleep_ns() is called, the co-routine is scheduled for
future execution. If we allow the job to be re-entered prior to the
scheduled time, we present a race condition in which a coroutine can be
entered recursively, or even entered after the coroutine is deleted.

The job->busy flag is used by blockjobs when a coroutine is busy
executing. The function 'block_job_enter()' obeys the busy flag,
and will not enter a coroutine if set. If we sleep a job, we need to
leave the busy flag set, so that subsequent calls to block_job_enter()
are prevented.

This changes the prior behavior of block_job_cancel() being able to
immediately wake up and cancel a job; in practice, this should not be an
issue, as the coroutine sleep times are generally very small, and the
cancel will occur the next time the coroutine wakes up.

This fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1508708

Signed-off-by: Jeff Cody
Reviewed-by: Stefan Hajnoczi
---
 blockjob.c                   | 7 +++++--
 include/block/blockjob_int.h | 3 ++-
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 3a0c491..ff9a614 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -797,11 +797,14 @@ void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns)
         return;
     }
 
-    job->busy = false;
+    /* We need to leave job->busy set here, because when we have
+     * put a coroutine to 'sleep', we have scheduled it to run in
+     * the future. We cannot enter that same coroutine again before
+     * it wakes and runs, otherwise we risk double-entry or entry after
+     * completion. */
     if (!block_job_should_pause(job)) {
         co_aio_sleep_ns(blk_get_aio_context(job->blk), type, ns);
     }
-    job->busy = true;
 
     block_job_pause_point(job);
 }
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index f13ad05..43f3be2 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -143,7 +143,8 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
  * @ns: How many nanoseconds to stop for.
  *
  * Put the job to sleep (assuming that it wasn't canceled) for @ns
- * nanoseconds. Canceling the job will interrupt the wait immediately.
+ * nanoseconds. Canceling the job will not interrupt the wait, so the
+ * cancel will not process until the coroutine wakes up.
  */
 void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns);
 
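
For illustration only, the race described in the commit message can be
modelled with a small standalone program. This is a sketch under stated
assumptions, not QEMU code: ToyJob, toy_job_sleep(), toy_job_enter(),
toy_timer_fire() and toy_race() are hypothetical names, and the scheduled
wakeup is reduced to a single boolean. It only shows why clearing the busy
flag during a sleep lets a second entry slip in before the scheduled wakeup,
and why leaving the flag set blocks that entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a block job; NOT the QEMU BlockJob struct. */
typedef struct ToyJob {
    bool busy;          /* entry is refused while this is set */
    bool timer_pending; /* a wakeup has been scheduled and has not fired yet */
    int early_entries;  /* entries that happened while a wakeup was pending */
} ToyJob;

/* Models the sleep: schedule a wakeup, then either clear busy (old
 * behaviour) or leave it set (behaviour after the patch). */
static void toy_job_sleep(ToyJob *job, bool keep_busy)
{
    assert(job->busy); /* only a running job puts itself to sleep */
    job->timer_pending = true;
    if (!keep_busy) {
        job->busy = false;
    }
}

/* Models an external entry attempt that obeys the busy flag.
 * Returns true if the job was entered. */
static bool toy_job_enter(ToyJob *job)
{
    if (job->busy) {
        return false;
    }
    if (job->timer_pending) {
        /* Entered although a wakeup is still scheduled: the real
         * coroutine would later be entered a second time. */
        job->early_entries++;
    }
    job->busy = true;
    return true;
}

/* Models the scheduled wakeup firing and resuming the job. */
static void toy_timer_fire(ToyJob *job)
{
    job->timer_pending = false;
    job->busy = true;
}

/* Runs one sleep, one entry attempt during the sleep, then the wakeup.
 * Returns the number of entries that raced with the pending wakeup. */
static int toy_race(bool keep_busy)
{
    ToyJob job = { .busy = true };

    toy_job_sleep(&job, keep_busy);
    (void)toy_job_enter(&job); /* e.g. a cancel request arriving mid-sleep */
    toy_timer_fire(&job);
    return job.early_entries;
}
```

In this model, toy_race(false) counts one early entry and toy_race(true)
counts none. The cost is the one the commit message names: the entry
attempt made during the sleep is simply refused, so a cancel is only seen
once the wakeup has fired.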