From patchwork Mon Feb 18 16:18:13 2019
X-Patchwork-Submitter: Kevin Wolf
X-Patchwork-Id: 10818441
From: Kevin Wolf <kwolf@redhat.com>
To: qemu-block@nongnu.org
Cc: kwolf@redhat.com, qemu-devel@nongnu.org, mreitz@redhat.com, stefanha@redhat.com, pbonzini@redhat.com
Date: Mon, 18 Feb 2019 17:18:13 +0100
Message-Id: <20190218161822.3573-4-kwolf@redhat.com>
In-Reply-To: <20190218161822.3573-1-kwolf@redhat.com>
References: <20190218161822.3573-1-kwolf@redhat.com>
Subject: [Qemu-devel] [PATCH 03/12] nbd: Restrict connection_co reentrance

nbd_client_attach_aio_context() schedules connection_co in the new
AioContext and this
way reenters it in any arbitrary place that has yielded. We can restrict
this a bit to the function call where the coroutine actually sits waiting
when it's idle.

This doesn't solve any bug yet, but it shows where in the code we need to
support this random reentrance and where we don't have to care.

Add FIXME comments for the existing bugs that the rest of this series
will fix.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/nbd-client.h |  1 +
 block/nbd-client.c | 23 +++++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/block/nbd-client.h b/block/nbd-client.h
index d990207a5c..09e03013d2 100644
--- a/block/nbd-client.h
+++ b/block/nbd-client.h
@@ -35,6 +35,7 @@ typedef struct NBDClientSession {
 
     NBDClientRequest requests[MAX_NBD_REQUESTS];
     NBDReply reply;
+    BlockDriverState *bs;
 
     bool quit;
 } NBDClientSession;
diff --git a/block/nbd-client.c b/block/nbd-client.c
index f0ad54ce21..e776785325 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -76,8 +76,24 @@ static coroutine_fn void nbd_connection_entry(void *opaque)
     Error *local_err = NULL;
 
     while (!s->quit) {
+        /*
+         * The NBD client can only really be considered idle when it has
+         * yielded from qio_channel_readv_all_eof(), waiting for data. This is
+         * the point where the additional scheduled coroutine entry happens
+         * after nbd_client_attach_aio_context().
+         *
+         * Therefore we keep an additional in_flight reference all the time and
+         * only drop it temporarily here.
+         *
+         * FIXME This is not safe because the QIOChannel could wake up the
+         * coroutine for a second time; it is not prepared for coroutine
+         * resumption from external code.
+         */
+        bdrv_dec_in_flight(s->bs);
         assert(s->reply.handle == 0);
         ret = nbd_receive_reply(s->ioc, &s->reply, &local_err);
+        bdrv_inc_in_flight(s->bs);
+
         if (local_err) {
             trace_nbd_read_reply_entry_fail(ret, error_get_pretty(local_err));
             error_free(local_err);
@@ -116,6 +132,8 @@ static coroutine_fn void nbd_connection_entry(void *opaque)
 
     s->quit = true;
     nbd_recv_coroutines_wake_all(s);
+    bdrv_dec_in_flight(s->bs);
+
     s->connection_co = NULL;
     aio_wait_kick();
 }
@@ -970,6 +988,9 @@ void nbd_client_attach_aio_context(BlockDriverState *bs,
 {
     NBDClientSession *client = nbd_get_client_session(bs);
     qio_channel_attach_aio_context(QIO_CHANNEL(client->ioc), new_context);
+
+    /* FIXME Really need a bdrv_inc_in_flight() here, but the corresponding
+     * bdrv_dec_in_flight() would have to be in QIOChannel code :-/ */
     aio_co_schedule(new_context, client->connection_co);
 }
 
@@ -1076,6 +1097,7 @@ static int nbd_client_connect(BlockDriverState *bs,
      * kick the reply mechanism. */
     qio_channel_set_blocking(QIO_CHANNEL(sioc), false, NULL);
     client->connection_co = qemu_coroutine_create(nbd_connection_entry, client);
+    bdrv_inc_in_flight(bs);
     nbd_client_attach_aio_context(bs, bdrv_get_aio_context(bs));
 
     logout("Established connection with NBD server\n");
@@ -1108,6 +1130,7 @@ int nbd_client_init(BlockDriverState *bs,
 {
     NBDClientSession *client = nbd_get_client_session(bs);
 
+    client->bs = bs;
     qemu_co_mutex_init(&client->send_mutex);
     qemu_co_queue_init(&client->free_sema);
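
For readers less familiar with the block layer's in_flight counting, here
is a minimal, stand-alone sketch of the pattern the hunks above implement
(plain C, not part of this patch; names like in_flight, wait_for_reply()
and worker_loop() are invented for illustration). The idea is: take one
long-lived in-flight reference when the connection coroutine is created,
drop it only around the single blocking call where the coroutine is
considered idle and may legitimately be re-entered, and drop the
long-lived reference for good when the coroutine terminates.

/*
 * Stand-alone sketch (not QEMU code): a worker loop holds an "in flight"
 * reference the whole time and drops it only around the single call where
 * it is idle. All names here are hypothetical.
 */
#include <stdbool.h>
#include <stdio.h>

static int in_flight;                    /* stands in for bs->in_flight */

static void inc_in_flight(void) { in_flight++; }   /* ~ bdrv_inc_in_flight() */
static void dec_in_flight(void) { in_flight--; }   /* ~ bdrv_dec_in_flight() */

/* Stand-in for the blocking wait in nbd_receive_reply() /
 * qio_channel_readv_all_eof(); returns -1 to simulate EOF. */
static int wait_for_reply(int i)
{
    printf("waiting for reply %d (in_flight=%d, reentry allowed here)\n",
           i, in_flight);
    return i < 3 ? 0 : -1;
}

static void worker_loop(void)
{
    bool quit = false;
    int i = 0;

    inc_in_flight();                     /* like nbd_client_connect() */
    while (!quit) {
        /* Idle point: drop the reference only while actually waiting. */
        dec_in_flight();
        int ret = wait_for_reply(i++);
        inc_in_flight();

        if (ret < 0) {
            quit = true;                 /* connection closed */
        } else {
            printf("processing reply %d (in_flight=%d, no reentry here)\n",
                   i, in_flight);
        }
    }
    dec_in_flight();           /* like the end of nbd_connection_entry() */
}

int main(void)
{
    worker_loop();
    return 0;
}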