From patchwork Wed Aug 30 18:07:02 2017
From: Eric Blake
To: qemu-devel@nongnu.org
Date: Wed, 30 Aug 2017 13:07:02 -0500
Message-Id: <20170830180711.7974-3-eblake@redhat.com>
In-Reply-To: <20170830180711.7974-1-eblake@redhat.com>
References: <20170830180711.7974-1-eblake@redhat.com>
Subject: [Qemu-devel] [PULL 02/11] nbd-client: avoid read_reply_co entry if send failed
Cc: Kevin Wolf, Paolo Bonzini, "open list:Network Block Dev...", Stefan Hajnoczi, Max Reitz

From: Stefan Hajnoczi

The following segfault is encountered if the NBD server closes the UNIX
domain socket immediately after negotiation:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441
441         QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines,
(gdb) bt
#0  0x000000d3c01a50f8 in aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441
#1  0x000000d3c012fa90 in nbd_coroutine_end (bs=bs@entry=0xd3c0fec650, request=<optimized out>) at block/nbd-client.c:207
#2  0x000000d3c012fb58 in nbd_client_co_preadv (bs=0xd3c0fec650, offset=0, bytes=<optimized out>, qiov=0x7ffc10a91b20, flags=0) at block/nbd-client.c:237
#3  0x000000d3c0128e63 in bdrv_driver_preadv (bs=bs@entry=0xd3c0fec650, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=0) at block/io.c:836
#4  0x000000d3c012c3e0 in bdrv_aligned_preadv (child=child@entry=0xd3c0ff51d0, req=req@entry=0x7f31885d6e90, offset=offset@entry=0, bytes=bytes@entry=512, align=align@entry=1, qiov=qiov@entry=0x7ffc10a91b20, flags=0) at block/io.c:1086
#5  0x000000d3c012c6b8 in bdrv_co_preadv (child=0xd3c0ff51d0, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=flags@entry=0) at block/io.c:1182
#6  0x000000d3c011cc17 in blk_co_preadv (blk=0xd3c0ff4f80, offset=0, bytes=512, qiov=0x7ffc10a91b20, flags=0) at block/block-backend.c:1032
#7  0x000000d3c011ccec in blk_read_entry (opaque=0x7ffc10a91b40) at block/block-backend.c:1079
#8  0x000000d3c01bbb96 in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:79
#9  0x00007f3196cb8600 in __start_context () at /lib64/libc.so.6

The problem is that nbd_client_init() uses
nbd_client_attach_aio_context() -> aio_co_schedule(new_context,
client->read_reply_co).  Execution of read_reply_co is deferred to a BH
which doesn't run until later.

In the meantime blk_co_preadv() can be called and nbd_coroutine_end()
calls aio_co_wake() on read_reply_co.  At this point in time
read_reply_co's ctx isn't set because it has never been entered yet.

This patch simplifies the nbd_co_send_request() ->
nbd_co_receive_reply() -> nbd_coroutine_end() lifecycle to just
nbd_co_send_request() -> nbd_co_receive_reply().  The request is
"ended" if an error occurs at any point.  Callers no longer have to
invoke nbd_coroutine_end().

This cleanup also eliminates the segfault because we don't call
aio_co_schedule() to wake up s->read_reply_co if sending the request
failed.  It is only necessary to wake up s->read_reply_co if a reply
was received.

Note that this only happens with UNIX domain sockets on Linux.  It
doesn't seem possible to reproduce this with TCP sockets.
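
As a side note for reviewers, here is a minimal, self-contained C sketch of
the lifecycle this patch leaves behind.  It is not QEMU code: Session,
send_request(), receive_reply() and do_read() are made-up stand-ins, and the
real coroutine machinery, send_mutex and free_sema are omitted.  The point it
shows is that a send failure releases the request slot inside the send path
itself, so callers only ever do send -> receive and never a separate "end"
step, and nothing is woken up when no reply can arrive.

/*
 * Standalone sketch of the request lifecycle after this patch.
 * Session and its fields are simplified stand-ins for NBDClientSession.
 */
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_NBD_REQUESTS 16

typedef struct {
    bool quit;                        /* an earlier error poisoned the connection */
    bool ioc;                         /* stand-in for "socket is still open" */
    int in_flight;
    bool request_in_use[MAX_NBD_REQUESTS];
} Session;

/*
 * Send a request.  On failure the slot is released right here, mirroring
 * the new err: path in nbd_co_send_request(), so the caller has nothing
 * to clean up and nobody tries to wake a reply reader that will never run.
 */
static int send_request(Session *s, int i, bool send_fails)
{
    int rc = 0;

    s->request_in_use[i] = true;
    s->in_flight++;

    if (s->quit) {
        rc = -EIO;
        goto err;
    }
    if (!s->ioc) {
        rc = -EPIPE;
        goto err;
    }
    if (send_fails) {
        rc = -EIO;
    }

err:
    if (rc < 0) {
        s->quit = true;
        s->request_in_use[i] = false;
        s->in_flight--;
    }
    return rc;
}

/* Receive the reply and always release the slot, success or failure. */
static int receive_reply(Session *s, int i)
{
    int error = s->quit ? EIO : 0;

    s->request_in_use[i] = false;
    s->in_flight--;
    return error;
}

/* Caller pattern after the patch: send -> receive, no separate "end" call. */
static int do_read(Session *s, bool send_fails)
{
    int i = 0;                        /* pretend slot 0 is free */
    int rc = send_request(s, i, send_fails);

    if (rc < 0) {
        return rc;
    }
    return -receive_reply(s, i);
}

int main(void)
{
    Session ok = { .ioc = true };
    Session hung_up = { .ioc = false };   /* server closed the socket */

    printf("healthy connection: %d (in_flight=%d)\n",
           do_read(&ok, false), ok.in_flight);
    printf("send failed:        %d (in_flight=%d)\n",
           do_read(&hung_up, false), hung_up.in_flight);
    return 0;
}

In the "send failed" case this returns -EPIPE with in_flight back at 0, which
is the invariant the new err: label in nbd_co_send_request() preserves even
though no reply will ever arrive.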
Suggested-by: Paolo Bonzini
Signed-off-by: Stefan Hajnoczi
Message-Id: <20170829122745.14309-2-stefanha@redhat.com>
Signed-off-by: Eric Blake
---
 block/nbd-client.c | 25 +++++++++----------------
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index 25bcaa2346..ea728fffc8 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -144,12 +144,12 @@ static int nbd_co_send_request(BlockDriverState *bs,
     request->handle = INDEX_TO_HANDLE(s, i);
 
     if (s->quit) {
-        qemu_co_mutex_unlock(&s->send_mutex);
-        return -EIO;
+        rc = -EIO;
+        goto err;
     }
     if (!s->ioc) {
-        qemu_co_mutex_unlock(&s->send_mutex);
-        return -EPIPE;
+        rc = -EPIPE;
+        goto err;
     }
 
     if (qiov) {
@@ -166,8 +166,13 @@ static int nbd_co_send_request(BlockDriverState *bs,
     } else {
         rc = nbd_send_request(s->ioc, request);
     }
+
+err:
     if (rc < 0) {
         s->quit = true;
+        s->requests[i].coroutine = NULL;
+        s->in_flight--;
+        qemu_co_queue_next(&s->free_sema);
     }
     qemu_co_mutex_unlock(&s->send_mutex);
     return rc;
@@ -201,13 +206,6 @@ static void nbd_co_receive_reply(NBDClientSession *s,
         /* Tell the read handler to read another header. */
         s->reply.handle = 0;
     }
-}
-
-static void nbd_coroutine_end(BlockDriverState *bs,
-                              NBDRequest *request)
-{
-    NBDClientSession *s = nbd_get_client_session(bs);
-    int i = HANDLE_TO_INDEX(s, request->handle);
 
     s->requests[i].coroutine = NULL;
 
@@ -243,7 +241,6 @@ int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
     } else {
         nbd_co_receive_reply(client, &request, &reply, qiov);
     }
-    nbd_coroutine_end(bs, &request);
     return -reply.error;
 }
 
@@ -272,7 +269,6 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
     } else {
         nbd_co_receive_reply(client, &request, &reply, NULL);
     }
-    nbd_coroutine_end(bs, &request);
     return -reply.error;
 }
 
@@ -306,7 +302,6 @@ int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
     } else {
         nbd_co_receive_reply(client, &request, &reply, NULL);
     }
-    nbd_coroutine_end(bs, &request);
     return -reply.error;
 }
 
@@ -330,7 +325,6 @@ int nbd_client_co_flush(BlockDriverState *bs)
     } else {
         nbd_co_receive_reply(client, &request, &reply, NULL);
     }
-    nbd_coroutine_end(bs, &request);
     return -reply.error;
 }
 
@@ -355,7 +349,6 @@ int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes)
     } else {
         nbd_co_receive_reply(client, &request, &reply, NULL);
     }
-    nbd_coroutine_end(bs, &request);
     return -reply.error;
 }
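
Worth noting about the error path above: the new err: label sits before
qemu_co_mutex_unlock(&s->send_mutex), so the failed request's slot is released
(s->requests[i].coroutine = NULL, s->in_flight--,
qemu_co_queue_next(&s->free_sema)) under the same mutex that allocated it, and
nothing kicks s->read_reply_co on that path; the read_reply_co wake-up is now
confined to nbd_co_receive_reply(), i.e. to the case where a reply was
actually received.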