From patchwork Mon Sep 27 21:55:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 12520849 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8A0F3C433EF for ; Mon, 27 Sep 2021 21:57:53 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 150AB610A2 for ; Mon, 27 Sep 2021 21:57:53 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 150AB610A2 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:58826 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mUydE-0005Zx-46 for qemu-devel@archiver.kernel.org; Mon, 27 Sep 2021 17:57:52 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54214) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybO-0002W5-Ek for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:55:58 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:25158) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybL-0002i3-Sp for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:55:57 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632779755; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=i7dmGNPUnoMVpFUrIEkhEnvvqaBiBXE6XZxgi4AXxyI=; b=CzJ6zcwL/sreVXEqDYIFqDRKpVLQRT5JObC4TbdI6y17Dx/0IDj4rXdtwFCxto7uD/EZDk rg9zxtQE1wATbd9dS8vCpdL7Y3PjbEq826S2lameckiP4LIQO4lQAbVHD23mLx5BU33dZP 54iATNWQL0eJ19V2dUZqwadIMCSHa14= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-573-JlbCgeWQPJGZ4p3iTY0uHQ-1; Mon, 27 Sep 2021 17:55:53 -0400 X-MC-Unique: JlbCgeWQPJGZ4p3iTY0uHQ-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 341C45074E; Mon, 27 Sep 2021 21:55:52 +0000 (UTC) Received: from blue.redhat.com (ovpn-113-87.phx2.redhat.com [10.3.113.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 308D118A6A; Mon, 27 Sep 2021 21:55:51 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Subject: [PULL 01/20] qemu-nbd: Change default cache mode to writeback Date: Mon, 27 Sep 2021 16:55:26 -0500 Message-Id: <20210927215545.3930309-2-eblake@redhat.com> In-Reply-To: <20210927215545.3930309-1-eblake@redhat.com> References: <20210927215545.3930309-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eblake@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=216.205.24.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Nir Soffer , "open list:Network Block Dev..." , Vladimir Sementsov-Ogievskiy , qemu-stable@nongnu.org, Nir Soffer Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Nir Soffer Both qemu and qemu-img use writeback cache mode by default, which is already documented in qemu(1). qemu-nbd uses writethrough cache mode by default, and the default cache mode is not documented. According to the qemu-nbd(8): --cache=CACHE The cache mode to be used with the file. See the documentation of the emulator's -drive cache=... option for allowed values. qemu(1) says: The default mode is cache=writeback. So users have no reason to assume that qemu-nbd is using writethough cache mode. The only hint is the painfully slow writing when using the defaults. Looking in git history, it seems that qemu used writethrough in the past to support broken guests that did not flush data properly, or could not flush due to limitations in qemu. But qemu-nbd clients can use NBD_CMD_FLUSH to flush data, so using writethrough does not help anyone. Change the default cache mode to writback, and document the default and available values properly in the online help and manual. With this change converting image via qemu-nbd is 3.5 times faster. $ qemu-img create dst.img 50g $ qemu-nbd -t -f raw -k /tmp/nbd.sock dst.img Before this change: $ hyperfine -r3 "./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock" Benchmark #1: ./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock Time (mean ± σ): 83.639 s ± 5.970 s [User: 2.733 s, System: 6.112 s] Range (min … max): 76.749 s … 87.245 s 3 runs After this change: $ hyperfine -r3 "./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock" Benchmark #1: ./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock Time (mean ± σ): 23.522 s ± 0.433 s [User: 2.083 s, System: 5.475 s] Range (min … max): 23.234 s … 24.019 s 3 runs Users can avoid the issue by using --cache=writeback[1] but the defaults should give good performance for the common use case. [1] https://bugzilla.redhat.com/1990656 Signed-off-by: Nir Soffer Message-Id: <20210813205519.50518-1-nsoffer@redhat.com> Reviewed-by: Eric Blake CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake --- docs/tools/qemu-nbd.rst | 6 ++++-- qemu-nbd.c | 6 ++++-- 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/docs/tools/qemu-nbd.rst b/docs/tools/qemu-nbd.rst index e39a9f4b1a67..56e54cd44114 100644 --- a/docs/tools/qemu-nbd.rst +++ b/docs/tools/qemu-nbd.rst @@ -99,8 +99,10 @@ driver options if ``--image-opts`` is specified. .. option:: --cache=CACHE - The cache mode to be used with the file. See the documentation of - the emulator's ``-drive cache=...`` option for allowed values. + The cache mode to be used with the file. Valid values are: + ``none``, ``writeback`` (the default), ``writethrough``, + ``directsync`` and ``unsafe``. See the documentation of + the emulator's ``-drive cache=...`` option for more info. .. option:: -n, --nocache diff --git a/qemu-nbd.c b/qemu-nbd.c index 65ebec598f88..9d895ba24b1e 100644 --- a/qemu-nbd.c +++ b/qemu-nbd.c @@ -135,7 +135,9 @@ static void usage(const char *name) " 'snapshot.id=[ID],snapshot.name=[NAME]', or\n" " '[ID_OR_NAME]'\n" " -n, --nocache disable host cache\n" -" --cache=MODE set cache mode (none, writeback, ...)\n" +" --cache=MODE set cache mode used to access the disk image, the\n" +" valid options are: 'none', 'writeback' (default),\n" +" 'writethrough', 'directsync' and 'unsafe'\n" " --aio=MODE set AIO mode (native, io_uring or threads)\n" " --discard=MODE set discard mode (ignore, unmap)\n" " --detect-zeroes=MODE set detect-zeroes mode (off, on, unmap)\n" @@ -552,7 +554,7 @@ int main(int argc, char **argv) bool alloc_depth = false; const char *tlscredsid = NULL; bool imageOpts = false; - bool writethrough = true; + bool writethrough = false; /* Client will flush as needed. */ bool fork_process = false; bool list = false; int old_stderr = -1; From patchwork Mon Sep 27 21:55:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 12520865 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C60AAC433EF for ; Mon, 27 Sep 2021 22:01:54 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8FCCF60E54 for ; Mon, 27 Sep 2021 22:01:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 8FCCF60E54 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:40534 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mUyh7-0003tS-OR for qemu-devel@archiver.kernel.org; Mon, 27 Sep 2021 18:01:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54252) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybS-0002XQ-Hh for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:02 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:26181) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybR-0002mm-4t for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:02 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632779760; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=VHU1nNS3v78i1CuTDSq3suMz9ooy1pEi/CgwaRKAHuw=; b=eE1lmdMzOh66Flkcn4i/Y4RuR2yUMMf9+J2kw5EoEOK8NJcsUhe+0wcquV//yQNJwUm3eg t+t4VkHgOBOSuhL7LUpO3wb6fQOGv+bWoT0nA4IZSVfwJXU9js9Z1Ig4txtks3RYjFxMAs vO/xq9vbTLqRtp49vzMpflta8yUUe0s= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-463-S1kkCzM9N9iUNC79vYnNUA-1; Mon, 27 Sep 2021 17:55:58 -0400 X-MC-Unique: S1kkCzM9N9iUNC79vYnNUA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id CA544824FA9; Mon, 27 Sep 2021 21:55:57 +0000 (UTC) Received: from blue.redhat.com (ovpn-113-87.phx2.redhat.com [10.3.113.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5BC7F5D9D5; Mon, 27 Sep 2021 21:55:52 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Subject: [PULL 02/20] block/io: bring request check to bdrv_co_(read, write)v_vmstate Date: Mon, 27 Sep 2021 16:55:27 -0500 Message-Id: <20210927215545.3930309-3-eblake@redhat.com> In-Reply-To: <20210927215545.3930309-1-eblake@redhat.com> References: <20210927215545.3930309-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eblake@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=216.205.24.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -8 X-Spam_score: -0.9 X-Spam_bar: / X-Spam_report: (-0.9 / 5.0 requ) DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Fam Zheng , Kevin Wolf , Vladimir Sementsov-Ogievskiy , "open list:Block I/O path" , Hanna Reitz , Stefan Hajnoczi Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Vladimir Sementsov-Ogievskiy Only qcow2 driver supports vmstate. In qcow2 these requests go through .bdrv_co_p{read,write}v_part handlers. So, let's do our basic check for the request on vmstate generic handlers. Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Eric Blake Message-Id: <20210903102807.27127-2-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake --- block/io.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/block/io.c b/block/io.c index 99ee182ca449..58602f84dbf0 100644 --- a/block/io.c +++ b/block/io.c @@ -2810,7 +2810,12 @@ bdrv_co_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos) { BlockDriver *drv = bs->drv; BlockDriverState *child_bs = bdrv_primary_bs(bs); - int ret = -ENOTSUP; + int ret; + + ret = bdrv_check_qiov_request(pos, qiov->size, qiov, 0, NULL); + if (ret < 0) { + return ret; + } if (!drv) { return -ENOMEDIUM; @@ -2822,6 +2827,8 @@ bdrv_co_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos) ret = drv->bdrv_load_vmstate(bs, qiov, pos); } else if (child_bs) { ret = bdrv_co_readv_vmstate(child_bs, qiov, pos); + } else { + ret = -ENOTSUP; } bdrv_dec_in_flight(bs); @@ -2834,7 +2841,12 @@ bdrv_co_writev_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos) { BlockDriver *drv = bs->drv; BlockDriverState *child_bs = bdrv_primary_bs(bs); - int ret = -ENOTSUP; + int ret; + + ret = bdrv_check_qiov_request(pos, qiov->size, qiov, 0, NULL); + if (ret < 0) { + return ret; + } if (!drv) { return -ENOMEDIUM; @@ -2846,6 +2858,8 @@ bdrv_co_writev_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos) ret = drv->bdrv_save_vmstate(bs, qiov, pos); } else if (child_bs) { ret = bdrv_co_writev_vmstate(child_bs, qiov, pos); + } else { + ret = -ENOTSUP; } bdrv_dec_in_flight(bs); From patchwork Mon Sep 27 21:55:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 12520847 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9CD5DC433F5 for ; Mon, 27 Sep 2021 21:57:47 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1B27E60EE9 for ; Mon, 27 Sep 2021 21:57:47 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 1B27E60EE9 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:58306 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mUyd8-0005E9-9I for qemu-devel@archiver.kernel.org; Mon, 27 Sep 2021 17:57:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54280) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybU-0002YH-O4 for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:06 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:49860) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybR-0002nK-Pv for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:04 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632779761; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=CnaTtIFlu8h3fzy0uvKMYWkFyYzCckWHm+jXBjvCJiA=; b=awHQPG2V3jDloSdXjheAZRfAYuVRrSepXo5mDbg766aBVtpw1gtbjF+5J1sX1lg0E3+HH/ bDd4WtXg7riZ/3D1A6mURpdE7CYb3AlgrxdSR90D4ojO1UFogsj3YaWeqRDJJM+VNJiw/k fRwywC/izaoEE3jPPsxP6s4J1dvApKk= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-375--Q6iT1euM3uVxt8eraOJkw-1; Mon, 27 Sep 2021 17:55:59 -0400 X-MC-Unique: -Q6iT1euM3uVxt8eraOJkw-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 93227108468A; Mon, 27 Sep 2021 21:55:58 +0000 (UTC) Received: from blue.redhat.com (ovpn-113-87.phx2.redhat.com [10.3.113.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 07D515D9D5; Mon, 27 Sep 2021 21:55:57 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Subject: [PULL 03/20] qcow2: check request on vmstate save/load path Date: Mon, 27 Sep 2021 16:55:28 -0500 Message-Id: <20210927215545.3930309-4-eblake@redhat.com> In-Reply-To: <20210927215545.3930309-1-eblake@redhat.com> References: <20210927215545.3930309-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eblake@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=216.205.24.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Fam Zheng , Kevin Wolf , Vladimir Sementsov-Ogievskiy , "open list:Block I/O path" , Hanna Reitz , Stefan Hajnoczi Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Vladimir Sementsov-Ogievskiy We modify the request by adding an offset to vmstate. Let's check the modified request. It will help us to safely move .bdrv_co_preadv_part and .bdrv_co_pwritev_part to int64_t type of offset and bytes. Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Eric Blake Message-Id: <20210903102807.27127-3-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake --- include/block/block_int.h | 3 +++ block/io.c | 6 +++--- block/qcow2.c | 43 +++++++++++++++++++++++++++++++++------ 3 files changed, 43 insertions(+), 9 deletions(-) diff --git a/include/block/block_int.h b/include/block/block_int.h index 5451f89b8df9..ed60495938a6 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -94,6 +94,9 @@ typedef struct BdrvTrackedRequest { struct BdrvTrackedRequest *waiting_for; } BdrvTrackedRequest; +int bdrv_check_qiov_request(int64_t offset, int64_t bytes, + QEMUIOVector *qiov, size_t qiov_offset, + Error **errp); int bdrv_check_request(int64_t offset, int64_t bytes, Error **errp); struct BlockDriver { diff --git a/block/io.c b/block/io.c index 58602f84dbf0..a4f124f75577 100644 --- a/block/io.c +++ b/block/io.c @@ -956,9 +956,9 @@ bool coroutine_fn bdrv_make_request_serialising(BdrvTrackedRequest *req, return waited; } -static int bdrv_check_qiov_request(int64_t offset, int64_t bytes, - QEMUIOVector *qiov, size_t qiov_offset, - Error **errp) +int bdrv_check_qiov_request(int64_t offset, int64_t bytes, + QEMUIOVector *qiov, size_t qiov_offset, + Error **errp) { /* * Check generic offset/bytes correctness diff --git a/block/qcow2.c b/block/qcow2.c index 02f9f3e63679..1c3cf7f91d86 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -5227,24 +5227,55 @@ static int qcow2_has_zero_init(BlockDriverState *bs) } } +/* + * Check the request to vmstate. On success return + * qcow2_vm_state_offset(bs) + @pos + */ +static int64_t qcow2_check_vmstate_request(BlockDriverState *bs, + QEMUIOVector *qiov, int64_t pos) +{ + BDRVQcow2State *s = bs->opaque; + int64_t vmstate_offset = qcow2_vm_state_offset(s); + int ret; + + /* Incoming requests must be OK */ + bdrv_check_qiov_request(pos, qiov->size, qiov, 0, &error_abort); + + if (INT64_MAX - pos < vmstate_offset) { + return -EIO; + } + + pos += vmstate_offset; + ret = bdrv_check_qiov_request(pos, qiov->size, qiov, 0, NULL); + if (ret < 0) { + return ret; + } + + return pos; +} + static int qcow2_save_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos) { - BDRVQcow2State *s = bs->opaque; + int64_t offset = qcow2_check_vmstate_request(bs, qiov, pos); + if (offset < 0) { + return offset; + } BLKDBG_EVENT(bs->file, BLKDBG_VMSTATE_SAVE); - return bs->drv->bdrv_co_pwritev_part(bs, qcow2_vm_state_offset(s) + pos, - qiov->size, qiov, 0, 0); + return bs->drv->bdrv_co_pwritev_part(bs, offset, qiov->size, qiov, 0, 0); } static int qcow2_load_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos) { - BDRVQcow2State *s = bs->opaque; + int64_t offset = qcow2_check_vmstate_request(bs, qiov, pos); + if (offset < 0) { + return offset; + } BLKDBG_EVENT(bs->file, BLKDBG_VMSTATE_LOAD); - return bs->drv->bdrv_co_preadv_part(bs, qcow2_vm_state_offset(s) + pos, - qiov->size, qiov, 0, 0); + return bs->drv->bdrv_co_preadv_part(bs, offset, qiov->size, qiov, 0, 0); } /* From patchwork Mon Sep 27 21:55:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 12521079 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B7E06C4332F for ; Mon, 27 Sep 2021 22:05:47 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2475F610A2 for ; Mon, 27 Sep 2021 22:05:47 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 2475F610A2 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:49158 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mUyks-0001H2-Bo for qemu-devel@archiver.kernel.org; Mon, 27 Sep 2021 18:05:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54326) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUyba-0002aq-6n for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:10 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:36239) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybW-0002rH-Bk for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:09 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632779765; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9oDVb2Be5KoMlJSXYL0jlY9mZmT4doJahB/5Pv9hrpw=; b=HpW864gXQu5bw/A89LuHMnPH8N864E/qVP1Nz9WXbUBBofzJo4uyfu7QBM8L8p4KYIo5DB KG7V5/4sitQrD+HQ7F2X2SR4rQ5rtEZk76MLjHAplRChXlpK1VfkWzKMYjHqDDKk1dpeI1 AKUXlSAmH2Ik4JSB1Vss1l53JmUPcag= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-368-r5UAJaQCNMisFmxKfqMvlw-1; Mon, 27 Sep 2021 17:56:04 -0400 X-MC-Unique: r5UAJaQCNMisFmxKfqMvlw-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 7131E18125C0; Mon, 27 Sep 2021 21:56:01 +0000 (UTC) Received: from blue.redhat.com (ovpn-113-87.phx2.redhat.com [10.3.113.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 30EE65D9D5; Mon, 27 Sep 2021 21:55:59 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Subject: [PULL 04/20] block: use int64_t instead of uint64_t in driver read handlers Date: Mon, 27 Sep 2021 16:55:29 -0500 Message-Id: <20210927215545.3930309-5-eblake@redhat.com> In-Reply-To: <20210927215545.3930309-1-eblake@redhat.com> References: <20210927215545.3930309-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eblake@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=170.10.133.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , Vladimir Sementsov-Ogievskiy , Alberto Garcia , Stefan Hajnoczi , "open list:blkdebug" , Stefan Weil , Peter Lieven , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Hanna Reitz , Pavel Dovgalyuk , Paolo Bonzini , Ilya Dryomov , John Snow , Ari Sundholm Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Vladimir Sementsov-Ogievskiy We are generally moving to int64_t for both offset and bytes parameters on all io paths. Main motivation is realization of 64-bit write_zeroes operation for fast zeroing large disk chunks, up to the whole disk. We chose signed type, to be consistent with off_t (which is signed) and with possibility for signed return type (where negative value means error). So, convert driver read handlers parameters which are already 64bit to signed type. While being here, convert also flags parameter to be BdrvRequestFlags. Now let's consider all callers. Simple git grep '\->bdrv_\(aio\|co\)_preadv\(_part\)\?' shows that's there three callers of driver function: bdrv_driver_preadv() in block/io.c, passes int64_t, checked by bdrv_check_qiov_request() to be non-negative. qcow2_load_vmstate() does bdrv_check_qiov_request(). do_perform_cow_read() has uint64_t argument. And a lot of things in qcow2 driver are uint64_t, so converting it is big job. But we must not work with requests that don't satisfy bdrv_check_qiov_request(), so let's just assert it here. Still, the functions may be called directly, not only by drv->... Let's check: git grep '\.bdrv_\(aio\|co\)_preadv\(_part\)\?\s*=' | \ awk '{print $4}' | sed 's/,//' | sed 's/&//' | sort | uniq | \ while read func; do git grep "$func(" | \ grep -v "$func(BlockDriverState"; done The only one such caller: QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, &data, 1); ... ret = bdrv_replace_test_co_preadv(bs, 0, 1, &qiov, 0); in tests/unit/test-bdrv-drain.c, and it's OK obviously. Signed-off-by: Vladimir Sementsov-Ogievskiy Message-Id: <20210903102807.27127-4-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake [eblake: fix typos] Signed-off-by: Eric Blake --- include/block/block_int.h | 11 ++++++----- block/blkdebug.c | 4 ++-- block/blklogwrites.c | 4 ++-- block/blkreplay.c | 2 +- block/blkverify.c | 4 ++-- block/bochs.c | 4 ++-- block/cloop.c | 4 ++-- block/commit.c | 2 +- block/copy-before-write.c | 4 ++-- block/copy-on-read.c | 4 ++-- block/crypto.c | 4 ++-- block/curl.c | 3 ++- block/dmg.c | 4 ++-- block/file-posix.c | 6 +++--- block/file-win32.c | 4 ++-- block/filter-compress.c | 4 ++-- block/mirror.c | 2 +- block/nbd.c | 5 +++-- block/nfs.c | 6 +++--- block/null.c | 9 +++++---- block/nvme.c | 5 +++-- block/preallocate.c | 4 ++-- block/qcow.c | 6 +++--- block/qcow2-cluster.c | 14 +++++++++++++- block/qcow2.c | 5 +++-- block/quorum.c | 4 ++-- block/raw-format.c | 20 ++++++++++---------- block/rbd.c | 6 +++--- block/throttle.c | 5 +++-- block/vdi.c | 4 ++-- block/vmdk.c | 4 ++-- block/vpc.c | 4 ++-- block/vvfat.c | 4 ++-- tests/unit/test-bdrv-drain.c | 16 +++++++++------- tests/unit/test-block-iothread.c | 19 ++++++++++++++----- 35 files changed, 120 insertions(+), 90 deletions(-) diff --git a/include/block/block_int.h b/include/block/block_int.h index ed60495938a6..9b1a276fa1c9 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -235,8 +235,8 @@ struct BlockDriver { /* aio */ BlockAIOCB *(*bdrv_aio_preadv)(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags, - BlockCompletionFunc *cb, void *opaque); + int64_t offset, int64_t bytes, QEMUIOVector *qiov, + BdrvRequestFlags flags, BlockCompletionFunc *cb, void *opaque); BlockAIOCB *(*bdrv_aio_pwritev)(BlockDriverState *bs, uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags, BlockCompletionFunc *cb, void *opaque); @@ -265,10 +265,11 @@ struct BlockDriver { * The buffer in @qiov may point directly to guest memory. */ int coroutine_fn (*bdrv_co_preadv)(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags); + int64_t offset, int64_t bytes, QEMUIOVector *qiov, + BdrvRequestFlags flags); int coroutine_fn (*bdrv_co_preadv_part)(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, size_t qiov_offset, int flags); + int64_t offset, int64_t bytes, + QEMUIOVector *qiov, size_t qiov_offset, BdrvRequestFlags flags); int coroutine_fn (*bdrv_co_writev)(BlockDriverState *bs, int64_t sector_num, int nb_sectors, QEMUIOVector *qiov, int flags); /** diff --git a/block/blkdebug.c b/block/blkdebug.c index 8b67554bec91..12b84190656e 100644 --- a/block/blkdebug.c +++ b/block/blkdebug.c @@ -631,8 +631,8 @@ static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes, } static int coroutine_fn -blkdebug_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) +blkdebug_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { int err; diff --git a/block/blklogwrites.c b/block/blklogwrites.c index b7579370a30a..de3d4ba2b6dc 100644 --- a/block/blklogwrites.c +++ b/block/blklogwrites.c @@ -301,8 +301,8 @@ static void blk_log_writes_refresh_limits(BlockDriverState *bs, Error **errp) } static int coroutine_fn -blk_log_writes_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) +blk_log_writes_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags); } diff --git a/block/blkreplay.c b/block/blkreplay.c index 4a247752fd8f..13ea2166bb69 100644 --- a/block/blkreplay.c +++ b/block/blkreplay.c @@ -72,7 +72,7 @@ static void block_request_create(uint64_t reqid, BlockDriverState *bs, } static int coroutine_fn blkreplay_co_preadv(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags) + int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags) { uint64_t reqid = blkreplay_next_id(); int ret = bdrv_co_preadv(bs->file, offset, bytes, qiov, flags); diff --git a/block/blkverify.c b/block/blkverify.c index 188d7632faec..5e35744b8aff 100644 --- a/block/blkverify.c +++ b/block/blkverify.c @@ -221,8 +221,8 @@ blkverify_co_prwv(BlockDriverState *bs, BlkverifyRequest *r, uint64_t offset, } static int coroutine_fn -blkverify_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) +blkverify_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { BlkverifyRequest r; QEMUIOVector raw_qiov; diff --git a/block/bochs.c b/block/bochs.c index 2f010ab40a1c..4d68658087bd 100644 --- a/block/bochs.c +++ b/block/bochs.c @@ -238,8 +238,8 @@ static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num) } static int coroutine_fn -bochs_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) +bochs_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { BDRVBochsState *s = bs->opaque; uint64_t sector_num = offset >> BDRV_SECTOR_BITS; diff --git a/block/cloop.c b/block/cloop.c index c99192a57f45..b8c6d0eccdba 100644 --- a/block/cloop.c +++ b/block/cloop.c @@ -245,8 +245,8 @@ static inline int cloop_read_block(BlockDriverState *bs, int block_num) } static int coroutine_fn -cloop_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) +cloop_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { BDRVCloopState *s = bs->opaque; uint64_t sector_num = offset >> BDRV_SECTOR_BITS; diff --git a/block/commit.c b/block/commit.c index 42792b45562a..10cc5ff4518a 100644 --- a/block/commit.c +++ b/block/commit.c @@ -207,7 +207,7 @@ static const BlockJobDriver commit_job_driver = { }; static int coroutine_fn bdrv_commit_top_preadv(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags) + int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags) { return bdrv_co_preadv(bs->backing, offset, bytes, qiov, flags); } diff --git a/block/copy-before-write.c b/block/copy-before-write.c index 2a5e57decaa1..591cc3ac7519 100644 --- a/block/copy-before-write.c +++ b/block/copy-before-write.c @@ -40,8 +40,8 @@ typedef struct BDRVCopyBeforeWriteState { } BDRVCopyBeforeWriteState; static coroutine_fn int cbw_co_preadv( - BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) + BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags); } diff --git a/block/copy-on-read.c b/block/copy-on-read.c index c42868227233..d34add447622 100644 --- a/block/copy-on-read.c +++ b/block/copy-on-read.c @@ -128,10 +128,10 @@ static int64_t cor_getlength(BlockDriverState *bs) static int coroutine_fn cor_co_preadv_part(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, + int64_t offset, int64_t bytes, QEMUIOVector *qiov, size_t qiov_offset, - int flags) + BdrvRequestFlags flags) { int64_t n; int local_flags; diff --git a/block/crypto.c b/block/crypto.c index 1d30fde38e55..a732a36d1001 100644 --- a/block/crypto.c +++ b/block/crypto.c @@ -397,8 +397,8 @@ static int block_crypto_reopen_prepare(BDRVReopenState *state, #define BLOCK_CRYPTO_MAX_IO_SIZE (1024 * 1024) static coroutine_fn int -block_crypto_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) +block_crypto_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { BlockCrypto *crypto = bs->opaque; uint64_t cur_bytes; /* number of bytes in current iteration */ diff --git a/block/curl.c b/block/curl.c index 50e741a0d7a5..4a8ae2b26986 100644 --- a/block/curl.c +++ b/block/curl.c @@ -896,7 +896,8 @@ out: } static int coroutine_fn curl_co_preadv(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags) + int64_t offset, int64_t bytes, QEMUIOVector *qiov, + BdrvRequestFlags flags) { CURLAIOCB acb = { .co = qemu_coroutine_self(), diff --git a/block/dmg.c b/block/dmg.c index ef35a505f266..447901fbb875 100644 --- a/block/dmg.c +++ b/block/dmg.c @@ -689,8 +689,8 @@ static inline int dmg_read_chunk(BlockDriverState *bs, uint64_t sector_num) } static int coroutine_fn -dmg_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) +dmg_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { BDRVDMGState *s = bs->opaque; uint64_t sector_num = offset >> BDRV_SECTOR_BITS; diff --git a/block/file-posix.c b/block/file-posix.c index d81e15efa475..df4a67f8b247 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -2077,9 +2077,9 @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset, return raw_thread_pool_submit(bs, handle_aiocb_rw, &acb); } -static int coroutine_fn raw_co_preadv(BlockDriverState *bs, uint64_t offset, - uint64_t bytes, QEMUIOVector *qiov, - int flags) +static int coroutine_fn raw_co_preadv(BlockDriverState *bs, int64_t offset, + int64_t bytes, QEMUIOVector *qiov, + BdrvRequestFlags flags) { return raw_co_prw(bs, offset, bytes, qiov, QEMU_AIO_READ); } diff --git a/block/file-win32.c b/block/file-win32.c index b97c58d642f8..4e8947009bcc 100644 --- a/block/file-win32.c +++ b/block/file-win32.c @@ -440,8 +440,8 @@ fail: } static BlockAIOCB *raw_aio_preadv(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags, + int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags, BlockCompletionFunc *cb, void *opaque) { BDRVRawState *s = bs->opaque; diff --git a/block/filter-compress.c b/block/filter-compress.c index 5136371bf8b9..54a16c6c64ad 100644 --- a/block/filter-compress.c +++ b/block/filter-compress.c @@ -63,10 +63,10 @@ static int64_t compress_getlength(BlockDriverState *bs) static int coroutine_fn compress_co_preadv_part(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, + int64_t offset, int64_t bytes, QEMUIOVector *qiov, size_t qiov_offset, - int flags) + BdrvRequestFlags flags) { return bdrv_co_preadv_part(bs->file, offset, bytes, qiov, qiov_offset, flags); diff --git a/block/mirror.c b/block/mirror.c index 85b781bc2169..e24d0aedd5f3 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -1402,7 +1402,7 @@ static void coroutine_fn active_write_settle(MirrorOp *op) } static int coroutine_fn bdrv_mirror_top_preadv(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags) + int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags) { return bdrv_co_preadv(bs->backing, offset, bytes, qiov, flags); } diff --git a/block/nbd.c b/block/nbd.c index f6ff1c4fb472..c816933d7da5 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -1322,8 +1322,9 @@ static int nbd_co_request(BlockDriverState *bs, NBDRequest *request, return ret ? ret : request_ret; } -static int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset, - uint64_t bytes, QEMUIOVector *qiov, int flags) +static int nbd_client_co_preadv(BlockDriverState *bs, int64_t offset, + int64_t bytes, QEMUIOVector *qiov, + BdrvRequestFlags flags) { int ret, request_ret; Error *local_err = NULL; diff --git a/block/nfs.c b/block/nfs.c index 9aeaefb3644f..27f9ab8307f6 100644 --- a/block/nfs.c +++ b/block/nfs.c @@ -262,9 +262,9 @@ nfs_co_generic_cb(int ret, struct nfs_context *nfs, void *data, nfs_co_generic_bh_cb, task); } -static int coroutine_fn nfs_co_preadv(BlockDriverState *bs, uint64_t offset, - uint64_t bytes, QEMUIOVector *iov, - int flags) +static int coroutine_fn nfs_co_preadv(BlockDriverState *bs, int64_t offset, + int64_t bytes, QEMUIOVector *iov, + BdrvRequestFlags flags) { NFSClient *client = bs->opaque; NFSRPC task; diff --git a/block/null.c b/block/null.c index cc9b1d4ea720..343dbb580d46 100644 --- a/block/null.c +++ b/block/null.c @@ -116,8 +116,9 @@ static coroutine_fn int null_co_common(BlockDriverState *bs) } static coroutine_fn int null_co_preadv(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) + int64_t offset, int64_t bytes, + QEMUIOVector *qiov, + BdrvRequestFlags flags) { BDRVNullState *s = bs->opaque; @@ -187,8 +188,8 @@ static inline BlockAIOCB *null_aio_common(BlockDriverState *bs, } static BlockAIOCB *null_aio_preadv(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags, + int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags, BlockCompletionFunc *cb, void *opaque) { diff --git a/block/nvme.c b/block/nvme.c index abfe305baf2f..f812eb1cc2a9 100644 --- a/block/nvme.c +++ b/block/nvme.c @@ -1251,8 +1251,9 @@ static int nvme_co_prw(BlockDriverState *bs, uint64_t offset, uint64_t bytes, } static coroutine_fn int nvme_co_preadv(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) + int64_t offset, int64_t bytes, + QEMUIOVector *qiov, + BdrvRequestFlags flags) { return nvme_co_prw(bs, offset, bytes, qiov, false, flags); } diff --git a/block/preallocate.c b/block/preallocate.c index b6192063046d..57094436124a 100644 --- a/block/preallocate.c +++ b/block/preallocate.c @@ -227,8 +227,8 @@ static void preallocate_reopen_abort(BDRVReopenState *state) } static coroutine_fn int preallocate_co_preadv_part( - BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, size_t qiov_offset, int flags) + BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, size_t qiov_offset, BdrvRequestFlags flags) { return bdrv_co_preadv_part(bs->file, offset, bytes, qiov, qiov_offset, flags); diff --git a/block/qcow.c b/block/qcow.c index f8919a44d193..1151b8d79bd7 100644 --- a/block/qcow.c +++ b/block/qcow.c @@ -617,9 +617,9 @@ static void qcow_refresh_limits(BlockDriverState *bs, Error **errp) bs->bl.request_alignment = BDRV_SECTOR_SIZE; } -static coroutine_fn int qcow_co_preadv(BlockDriverState *bs, uint64_t offset, - uint64_t bytes, QEMUIOVector *qiov, - int flags) +static coroutine_fn int qcow_co_preadv(BlockDriverState *bs, int64_t offset, + int64_t bytes, QEMUIOVector *qiov, + BdrvRequestFlags flags) { BDRVQcowState *s = bs->opaque; int offset_in_cluster; diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index 4ebb49a087df..5727f92dcb39 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -505,7 +505,19 @@ static int coroutine_fn do_perform_cow_read(BlockDriverState *bs, return -ENOMEDIUM; } - /* Call .bdrv_co_readv() directly instead of using the public block-layer + /* + * We never deal with requests that don't satisfy + * bdrv_check_qiov_request(), and aligning requests to clusters never + * breaks this condition. So, do some assertions before calling + * bs->drv->bdrv_co_preadv_part() which has int64_t arguments. + */ + assert(src_cluster_offset <= INT64_MAX); + assert(src_cluster_offset + offset_in_cluster <= INT64_MAX); + assert(qiov->size <= INT64_MAX); + bdrv_check_qiov_request(src_cluster_offset + offset_in_cluster, qiov->size, + qiov, 0, &error_abort); + /* + * Call .bdrv_co_readv() directly instead of using the public block-layer * interface. This avoids double I/O throttling and request tracking, * which can lead to deadlock when block layer copy-on-read is enabled. */ diff --git a/block/qcow2.c b/block/qcow2.c index 1c3cf7f91d86..680ef705323e 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -2310,9 +2310,10 @@ static coroutine_fn int qcow2_co_preadv_task_entry(AioTask *task) } static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, + int64_t offset, int64_t bytes, QEMUIOVector *qiov, - size_t qiov_offset, int flags) + size_t qiov_offset, + BdrvRequestFlags flags) { BDRVQcow2State *s = bs->opaque; int ret = 0; diff --git a/block/quorum.c b/block/quorum.c index f2c08050000e..57c73b215643 100644 --- a/block/quorum.c +++ b/block/quorum.c @@ -663,8 +663,8 @@ static int read_fifo_child(QuorumAIOCB *acb) return ret; } -static int quorum_co_preadv(BlockDriverState *bs, uint64_t offset, - uint64_t bytes, QEMUIOVector *qiov, int flags) +static int quorum_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { BDRVQuorumState *s = bs->opaque; QuorumAIOCB *acb = quorum_aio_get(bs, qiov, offset, bytes, flags); diff --git a/block/raw-format.c b/block/raw-format.c index c26f493688fc..a5728e7b0ca7 100644 --- a/block/raw-format.c +++ b/block/raw-format.c @@ -181,8 +181,8 @@ static void raw_reopen_abort(BDRVReopenState *state) } /* Check and adjust the offset, against 'offset' and 'size' options. */ -static inline int raw_adjust_offset(BlockDriverState *bs, uint64_t *offset, - uint64_t bytes, bool is_write) +static inline int raw_adjust_offset(BlockDriverState *bs, int64_t *offset, + int64_t bytes, bool is_write) { BDRVRawState *s = bs->opaque; @@ -201,9 +201,9 @@ static inline int raw_adjust_offset(BlockDriverState *bs, uint64_t *offset, return 0; } -static int coroutine_fn raw_co_preadv(BlockDriverState *bs, uint64_t offset, - uint64_t bytes, QEMUIOVector *qiov, - int flags) +static int coroutine_fn raw_co_preadv(BlockDriverState *bs, int64_t offset, + int64_t bytes, QEMUIOVector *qiov, + BdrvRequestFlags flags) { int ret; @@ -259,7 +259,7 @@ static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, uint64_t offset, qiov = &local_qiov; } - ret = raw_adjust_offset(bs, &offset, bytes, true); + ret = raw_adjust_offset(bs, (int64_t *)&offset, bytes, true); if (ret) { goto fail; } @@ -294,7 +294,7 @@ static int coroutine_fn raw_co_pwrite_zeroes(BlockDriverState *bs, { int ret; - ret = raw_adjust_offset(bs, (uint64_t *)&offset, bytes, true); + ret = raw_adjust_offset(bs, &offset, bytes, true); if (ret) { return ret; } @@ -306,7 +306,7 @@ static int coroutine_fn raw_co_pdiscard(BlockDriverState *bs, { int ret; - ret = raw_adjust_offset(bs, (uint64_t *)&offset, bytes, true); + ret = raw_adjust_offset(bs, &offset, bytes, true); if (ret) { return ret; } @@ -541,7 +541,7 @@ static int coroutine_fn raw_co_copy_range_from(BlockDriverState *bs, { int ret; - ret = raw_adjust_offset(bs, &src_offset, bytes, false); + ret = raw_adjust_offset(bs, (int64_t *)&src_offset, bytes, false); if (ret) { return ret; } @@ -560,7 +560,7 @@ static int coroutine_fn raw_co_copy_range_to(BlockDriverState *bs, { int ret; - ret = raw_adjust_offset(bs, &dst_offset, bytes, true); + ret = raw_adjust_offset(bs, (int64_t *)&dst_offset, bytes, true); if (ret) { return ret; } diff --git a/block/rbd.c b/block/rbd.c index dcf82b15b802..21438dfb7c4f 100644 --- a/block/rbd.c +++ b/block/rbd.c @@ -1164,9 +1164,9 @@ static int coroutine_fn qemu_rbd_start_co(BlockDriverState *bs, } static int -coroutine_fn qemu_rbd_co_preadv(BlockDriverState *bs, uint64_t offset, - uint64_t bytes, QEMUIOVector *qiov, - int flags) +coroutine_fn qemu_rbd_co_preadv(BlockDriverState *bs, int64_t offset, + int64_t bytes, QEMUIOVector *qiov, + BdrvRequestFlags flags) { return qemu_rbd_start_co(bs, offset, bytes, qiov, flags, RBD_AIO_READ); } diff --git a/block/throttle.c b/block/throttle.c index b685166ad4a9..20362b5fe58f 100644 --- a/block/throttle.c +++ b/block/throttle.c @@ -112,8 +112,9 @@ static int64_t throttle_getlength(BlockDriverState *bs) } static int coroutine_fn throttle_co_preadv(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) + int64_t offset, int64_t bytes, + QEMUIOVector *qiov, + BdrvRequestFlags flags) { ThrottleGroupMember *tgm = bs->opaque; diff --git a/block/vdi.c b/block/vdi.c index 548f8a057b87..b394cf6ca638 100644 --- a/block/vdi.c +++ b/block/vdi.c @@ -544,8 +544,8 @@ static int coroutine_fn vdi_co_block_status(BlockDriverState *bs, } static int coroutine_fn -vdi_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) +vdi_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { BDRVVdiState *s = bs->opaque; QEMUIOVector local_qiov; diff --git a/block/vmdk.c b/block/vmdk.c index 4499f136bdff..78592160d039 100644 --- a/block/vmdk.c +++ b/block/vmdk.c @@ -1888,8 +1888,8 @@ static int vmdk_read_extent(VmdkExtent *extent, int64_t cluster_offset, } static int coroutine_fn -vmdk_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) +vmdk_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { BDRVVmdkState *s = bs->opaque; int ret; diff --git a/block/vpc.c b/block/vpc.c index 17a705b482af..29c8517ff83f 100644 --- a/block/vpc.c +++ b/block/vpc.c @@ -608,8 +608,8 @@ static int vpc_get_info(BlockDriverState *bs, BlockDriverInfo *bdi) } static int coroutine_fn -vpc_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) +vpc_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { BDRVVPCState *s = bs->opaque; int ret; diff --git a/block/vvfat.c b/block/vvfat.c index 34bf1e3a86e1..9c53c2d0a4f1 100644 --- a/block/vvfat.c +++ b/block/vvfat.c @@ -1522,8 +1522,8 @@ static int vvfat_read(BlockDriverState *bs, int64_t sector_num, } static int coroutine_fn -vvfat_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) +vvfat_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { int ret; BDRVVVFATState *s = bs->opaque; diff --git a/tests/unit/test-bdrv-drain.c b/tests/unit/test-bdrv-drain.c index ce071b5fc5ab..2d3c17e566f1 100644 --- a/tests/unit/test-bdrv-drain.c +++ b/tests/unit/test-bdrv-drain.c @@ -65,8 +65,9 @@ static void co_reenter_bh(void *opaque) } static int coroutine_fn bdrv_test_co_preadv(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) + int64_t offset, int64_t bytes, + QEMUIOVector *qiov, + BdrvRequestFlags flags) { BDRVTestState *s = bs->opaque; @@ -1106,8 +1107,9 @@ static void bdrv_test_top_close(BlockDriverState *bs) } static int coroutine_fn bdrv_test_top_co_preadv(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) + int64_t offset, int64_t bytes, + QEMUIOVector *qiov, + BdrvRequestFlags flags) { BDRVTestTopState *tts = bs->opaque; return bdrv_co_preadv(tts->wait_child, offset, bytes, qiov, flags); @@ -1855,10 +1857,10 @@ static void bdrv_replace_test_close(BlockDriverState *bs) * Set .has_read to true and return success. */ static int coroutine_fn bdrv_replace_test_co_preadv(BlockDriverState *bs, - uint64_t offset, - uint64_t bytes, + int64_t offset, + int64_t bytes, QEMUIOVector *qiov, - int flags) + BdrvRequestFlags flags) { BDRVReplaceTestState *s = bs->opaque; diff --git a/tests/unit/test-block-iothread.c b/tests/unit/test-block-iothread.c index c39e70b2f515..f18fa6e0fba2 100644 --- a/tests/unit/test-block-iothread.c +++ b/tests/unit/test-block-iothread.c @@ -31,9 +31,18 @@ #include "qemu/main-loop.h" #include "iothread.h" -static int coroutine_fn bdrv_test_co_prwv(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) +static int coroutine_fn bdrv_test_co_preadv(BlockDriverState *bs, + int64_t offset, int64_t bytes, + QEMUIOVector *qiov, + BdrvRequestFlags flags) +{ + return 0; +} + +static int coroutine_fn bdrv_test_co_pwritev(BlockDriverState *bs, + uint64_t offset, uint64_t bytes, + QEMUIOVector *qiov, + int flags) { return 0; } @@ -66,8 +75,8 @@ static BlockDriver bdrv_test = { .format_name = "test", .instance_size = 1, - .bdrv_co_preadv = bdrv_test_co_prwv, - .bdrv_co_pwritev = bdrv_test_co_prwv, + .bdrv_co_preadv = bdrv_test_co_preadv, + .bdrv_co_pwritev = bdrv_test_co_pwritev, .bdrv_co_pdiscard = bdrv_test_co_pdiscard, .bdrv_co_truncate = bdrv_test_co_truncate, .bdrv_co_block_status = bdrv_test_co_block_status, From patchwork Mon Sep 27 21:55:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 12521077 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0057EC433F5 for ; Mon, 27 Sep 2021 22:05:03 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 62E6961058 for ; Mon, 27 Sep 2021 22:05:02 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 62E6961058 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:48394 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mUyk8-0000kq-CI for qemu-devel@archiver.kernel.org; Mon, 27 Sep 2021 18:05:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54382) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybc-0002dQ-UN for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:14 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:58262) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybW-0002rR-Ve for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:12 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632779766; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Sscteo6jh3KrzgAMHYZDwXQ2tB+Qdlw6bmtddzUmkoI=; b=WmSx0o5aJh+njlsazKm91UDltYfrZiJwOx/LXcb9E/jP6JzcMZ33ktNw9GiNUxnL1RMEWM LveMBZK8516xxi8dCCiLMlta/8i2TW3RRrE2+9eJaYOy8lG1FDYAX1SOP8SlC8ykuWeuJQ Sjb0Xx7CA+r5COWirgvACgAw14oFkls= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-364-Mz7Ooj-1OpKn4OTSaLa-OA-1; Mon, 27 Sep 2021 17:56:04 -0400 X-MC-Unique: Mz7Ooj-1OpKn4OTSaLa-OA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 3E8AC1084684; Mon, 27 Sep 2021 21:56:03 +0000 (UTC) Received: from blue.redhat.com (ovpn-113-87.phx2.redhat.com [10.3.113.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1A3F61850E; Mon, 27 Sep 2021 21:56:02 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Subject: [PULL 05/20] block: use int64_t instead of uint64_t in driver write handlers Date: Mon, 27 Sep 2021 16:55:30 -0500 Message-Id: <20210927215545.3930309-6-eblake@redhat.com> In-Reply-To: <20210927215545.3930309-1-eblake@redhat.com> References: <20210927215545.3930309-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eblake@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=170.10.133.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , Vladimir Sementsov-Ogievskiy , Alberto Garcia , Stefan Hajnoczi , "open list:blkdebug" , Stefan Weil , Peter Lieven , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Hanna Reitz , Pavel Dovgalyuk , Paolo Bonzini , Ilya Dryomov , John Snow , Ari Sundholm Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Vladimir Sementsov-Ogievskiy We are generally moving to int64_t for both offset and bytes parameters on all io paths. Main motivation is realization of 64-bit write_zeroes operation for fast zeroing large disk chunks, up to the whole disk. We chose signed type, to be consistent with off_t (which is signed) and with possibility for signed return type (where negative value means error). So, convert driver write handlers parameters which are already 64bit to signed type. While being here, convert also flags parameter to be BdrvRequestFlags. Now let's consider all callers. Simple git grep '\->bdrv_\(aio\|co\)_pwritev\(_part\)\?' shows that's there three callers of driver function: bdrv_driver_pwritev() and bdrv_driver_pwritev_compressed() in block/io.c, both pass int64_t, checked by bdrv_check_qiov_request() to be non-negative. qcow2_save_vmstate() does bdrv_check_qiov_request(). Still, the functions may be called directly, not only by drv->... Let's check: git grep '\.bdrv_\(aio\|co\)_pwritev\(_part\)\?\s*=' | \ awk '{print $4}' | sed 's/,//' | sed 's/&//' | sort | uniq | \ while read func; do git grep "$func(" | \ grep -v "$func(BlockDriverState"; done shows several callers: qcow2: qcow2_co_truncate() write at most up to @offset, which is checked in generic qcow2_co_truncate() by bdrv_check_request(). qcow2_co_pwritev_compressed_task() pass the request (or part of the request) that already went through normal write path, so it should be OK qcow: qcow_co_pwritev_compressed() pass int64_t, it's updated by this patch quorum: quorum_co_pwrite_zeroes() pass int64_t and int - OK throttle: throttle_co_pwritev_compressed() pass int64_t, it's updated by this patch vmdk: vmdk_co_pwritev_compressed() pass int64_t, it's updated by this patch Signed-off-by: Vladimir Sementsov-Ogievskiy Message-Id: <20210903102807.27127-5-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake Signed-off-by: Eric Blake --- include/block/block_int.h | 16 ++++++++-------- block/io.c | 6 ++++-- block/blkdebug.c | 4 ++-- block/blklogwrites.c | 4 ++-- block/blkreplay.c | 2 +- block/blkverify.c | 4 ++-- block/copy-before-write.c | 7 ++++--- block/copy-on-read.c | 11 ++++++----- block/crypto.c | 4 ++-- block/file-posix.c | 6 +++--- block/file-win32.c | 4 ++-- block/filter-compress.c | 7 ++++--- block/mirror.c | 2 +- block/nbd.c | 5 +++-- block/nfs.c | 6 +++--- block/null.c | 9 +++++---- block/nvme.c | 5 +++-- block/preallocate.c | 6 +++--- block/qcow.c | 10 +++++----- block/qcow2.c | 6 +++--- block/quorum.c | 5 +++-- block/raw-format.c | 8 ++++---- block/rbd.c | 6 +++--- block/throttle.c | 9 +++++---- block/vdi.c | 4 ++-- block/vmdk.c | 8 ++++---- block/vpc.c | 4 ++-- block/vvfat.c | 4 ++-- tests/unit/test-block-iothread.c | 4 ++-- block/trace-events | 2 +- 30 files changed, 94 insertions(+), 84 deletions(-) diff --git a/include/block/block_int.h b/include/block/block_int.h index 9b1a276fa1c9..2cf5f1722a7f 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -238,8 +238,8 @@ struct BlockDriver { int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags, BlockCompletionFunc *cb, void *opaque); BlockAIOCB *(*bdrv_aio_pwritev)(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags, - BlockCompletionFunc *cb, void *opaque); + int64_t offset, int64_t bytes, QEMUIOVector *qiov, + BdrvRequestFlags flags, BlockCompletionFunc *cb, void *opaque); BlockAIOCB *(*bdrv_aio_flush)(BlockDriverState *bs, BlockCompletionFunc *cb, void *opaque); BlockAIOCB *(*bdrv_aio_pdiscard)(BlockDriverState *bs, @@ -288,10 +288,11 @@ struct BlockDriver { * The buffer in @qiov may point directly to guest memory. */ int coroutine_fn (*bdrv_co_pwritev)(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags); + int64_t offset, int64_t bytes, QEMUIOVector *qiov, + BdrvRequestFlags flags); int coroutine_fn (*bdrv_co_pwritev_part)(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, size_t qiov_offset, int flags); + int64_t offset, int64_t bytes, QEMUIOVector *qiov, size_t qiov_offset, + BdrvRequestFlags flags); /* * Efficiently zero a region of the disk image. Typically an image format @@ -438,10 +439,9 @@ struct BlockDriver { Error **errp); int coroutine_fn (*bdrv_co_pwritev_compressed)(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, QEMUIOVector *qiov); + int64_t offset, int64_t bytes, QEMUIOVector *qiov); int coroutine_fn (*bdrv_co_pwritev_compressed_part)(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, - size_t qiov_offset); + int64_t offset, int64_t bytes, QEMUIOVector *qiov, size_t qiov_offset); int (*bdrv_snapshot_create)(BlockDriverState *bs, QEMUSnapshotInfo *sn_info); diff --git a/block/io.c b/block/io.c index a4f124f75577..aa6f7b075e78 100644 --- a/block/io.c +++ b/block/io.c @@ -1230,7 +1230,8 @@ out: static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes, QEMUIOVector *qiov, - size_t qiov_offset, int flags) + size_t qiov_offset, + BdrvRequestFlags flags) { BlockDriver *drv = bs->drv; int64_t sector_num; @@ -2073,7 +2074,8 @@ bdrv_co_write_req_finish(BdrvChild *child, int64_t offset, int64_t bytes, */ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child, BdrvTrackedRequest *req, int64_t offset, int64_t bytes, - int64_t align, QEMUIOVector *qiov, size_t qiov_offset, int flags) + int64_t align, QEMUIOVector *qiov, size_t qiov_offset, + BdrvRequestFlags flags) { BlockDriverState *bs = child->bs; BlockDriver *drv = bs->drv; diff --git a/block/blkdebug.c b/block/blkdebug.c index 12b84190656e..e686cd97993d 100644 --- a/block/blkdebug.c +++ b/block/blkdebug.c @@ -652,8 +652,8 @@ blkdebug_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, } static int coroutine_fn -blkdebug_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) +blkdebug_co_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { int err; diff --git a/block/blklogwrites.c b/block/blklogwrites.c index de3d4ba2b6dc..ca174ab135a0 100644 --- a/block/blklogwrites.c +++ b/block/blklogwrites.c @@ -460,8 +460,8 @@ blk_log_writes_co_do_file_pdiscard(BlkLogWritesFileReq *fr) } static int coroutine_fn -blk_log_writes_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) +blk_log_writes_co_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { return blk_log_writes_co_log(bs, offset, bytes, qiov, flags, blk_log_writes_co_do_file_pwritev, 0, false); diff --git a/block/blkreplay.c b/block/blkreplay.c index 13ea2166bb69..7ba62dcac12a 100644 --- a/block/blkreplay.c +++ b/block/blkreplay.c @@ -83,7 +83,7 @@ static int coroutine_fn blkreplay_co_preadv(BlockDriverState *bs, } static int coroutine_fn blkreplay_co_pwritev(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags) + int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags) { uint64_t reqid = blkreplay_next_id(); int ret = bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags); diff --git a/block/blkverify.c b/block/blkverify.c index 5e35744b8aff..d1facf5ba90b 100644 --- a/block/blkverify.c +++ b/block/blkverify.c @@ -250,8 +250,8 @@ blkverify_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, } static int coroutine_fn -blkverify_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) +blkverify_co_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { BlkverifyRequest r; return blkverify_co_prwv(bs, &r, offset, bytes, qiov, qiov, flags, true); diff --git a/block/copy-before-write.c b/block/copy-before-write.c index 591cc3ac7519..74360b48536f 100644 --- a/block/copy-before-write.c +++ b/block/copy-before-write.c @@ -86,9 +86,10 @@ static int coroutine_fn cbw_co_pwrite_zeroes(BlockDriverState *bs, } static coroutine_fn int cbw_co_pwritev(BlockDriverState *bs, - uint64_t offset, - uint64_t bytes, - QEMUIOVector *qiov, int flags) + int64_t offset, + int64_t bytes, + QEMUIOVector *qiov, + BdrvRequestFlags flags) { int ret = cbw_do_copy_before_write(bs, offset, bytes, flags); if (ret < 0) { diff --git a/block/copy-on-read.c b/block/copy-on-read.c index d34add447622..b2ec36b6fc7f 100644 --- a/block/copy-on-read.c +++ b/block/copy-on-read.c @@ -181,10 +181,11 @@ static int coroutine_fn cor_co_preadv_part(BlockDriverState *bs, static int coroutine_fn cor_co_pwritev_part(BlockDriverState *bs, - uint64_t offset, - uint64_t bytes, + int64_t offset, + int64_t bytes, QEMUIOVector *qiov, - size_t qiov_offset, int flags) + size_t qiov_offset, + BdrvRequestFlags flags) { return bdrv_co_pwritev_part(bs->file, offset, bytes, qiov, qiov_offset, flags); @@ -207,8 +208,8 @@ static int coroutine_fn cor_co_pdiscard(BlockDriverState *bs, static int coroutine_fn cor_co_pwritev_compressed(BlockDriverState *bs, - uint64_t offset, - uint64_t bytes, + int64_t offset, + int64_t bytes, QEMUIOVector *qiov) { return bdrv_co_pwritev(bs->file, offset, bytes, qiov, diff --git a/block/crypto.c b/block/crypto.c index a732a36d1001..c8ba4681e200 100644 --- a/block/crypto.c +++ b/block/crypto.c @@ -460,8 +460,8 @@ block_crypto_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, static coroutine_fn int -block_crypto_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) +block_crypto_co_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { BlockCrypto *crypto = bs->opaque; uint64_t cur_bytes; /* number of bytes in current iteration */ diff --git a/block/file-posix.c b/block/file-posix.c index df4a67f8b247..994f1c26ca7b 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -2084,9 +2084,9 @@ static int coroutine_fn raw_co_preadv(BlockDriverState *bs, int64_t offset, return raw_co_prw(bs, offset, bytes, qiov, QEMU_AIO_READ); } -static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, uint64_t offset, - uint64_t bytes, QEMUIOVector *qiov, - int flags) +static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, int64_t offset, + int64_t bytes, QEMUIOVector *qiov, + BdrvRequestFlags flags) { assert(flags == 0); return raw_co_prw(bs, offset, bytes, qiov, QEMU_AIO_WRITE); diff --git a/block/file-win32.c b/block/file-win32.c index 4e8947009bcc..ec9d64d0e4e2 100644 --- a/block/file-win32.c +++ b/block/file-win32.c @@ -455,8 +455,8 @@ static BlockAIOCB *raw_aio_preadv(BlockDriverState *bs, } static BlockAIOCB *raw_aio_pwritev(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags, + int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags, BlockCompletionFunc *cb, void *opaque) { BDRVRawState *s = bs->opaque; diff --git a/block/filter-compress.c b/block/filter-compress.c index 54a16c6c64ad..505822a44fe2 100644 --- a/block/filter-compress.c +++ b/block/filter-compress.c @@ -74,10 +74,11 @@ static int coroutine_fn compress_co_preadv_part(BlockDriverState *bs, static int coroutine_fn compress_co_pwritev_part(BlockDriverState *bs, - uint64_t offset, - uint64_t bytes, + int64_t offset, + int64_t bytes, QEMUIOVector *qiov, - size_t qiov_offset, int flags) + size_t qiov_offset, + BdrvRequestFlags flags) { return bdrv_co_pwritev_part(bs->file, offset, bytes, qiov, qiov_offset, flags | BDRV_REQ_WRITE_COMPRESSED); diff --git a/block/mirror.c b/block/mirror.c index e24d0aedd5f3..c4c623ceed85 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -1456,7 +1456,7 @@ out: } static int coroutine_fn bdrv_mirror_top_pwritev(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags) + int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags) { MirrorBDSOpaque *s = bs->opaque; QEMUIOVector bounce_qiov; diff --git a/block/nbd.c b/block/nbd.c index c816933d7da5..caee396525ab 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -1381,8 +1381,9 @@ static int nbd_client_co_preadv(BlockDriverState *bs, int64_t offset, return ret ? ret : request_ret; } -static int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset, - uint64_t bytes, QEMUIOVector *qiov, int flags) +static int nbd_client_co_pwritev(BlockDriverState *bs, int64_t offset, + int64_t bytes, QEMUIOVector *qiov, + BdrvRequestFlags flags) { BDRVNBDState *s = (BDRVNBDState *)bs->opaque; NBDRequest request = { diff --git a/block/nfs.c b/block/nfs.c index 27f9ab8307f6..577aea1d222a 100644 --- a/block/nfs.c +++ b/block/nfs.c @@ -296,9 +296,9 @@ static int coroutine_fn nfs_co_preadv(BlockDriverState *bs, int64_t offset, return 0; } -static int coroutine_fn nfs_co_pwritev(BlockDriverState *bs, uint64_t offset, - uint64_t bytes, QEMUIOVector *iov, - int flags) +static int coroutine_fn nfs_co_pwritev(BlockDriverState *bs, int64_t offset, + int64_t bytes, QEMUIOVector *iov, + BdrvRequestFlags flags) { NFSClient *client = bs->opaque; NFSRPC task; diff --git a/block/null.c b/block/null.c index 343dbb580d46..75f7d0db40c4 100644 --- a/block/null.c +++ b/block/null.c @@ -130,8 +130,9 @@ static coroutine_fn int null_co_preadv(BlockDriverState *bs, } static coroutine_fn int null_co_pwritev(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) + int64_t offset, int64_t bytes, + QEMUIOVector *qiov, + BdrvRequestFlags flags) { return null_co_common(bs); } @@ -203,8 +204,8 @@ static BlockAIOCB *null_aio_preadv(BlockDriverState *bs, } static BlockAIOCB *null_aio_pwritev(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags, + int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags, BlockCompletionFunc *cb, void *opaque) { diff --git a/block/nvme.c b/block/nvme.c index f812eb1cc2a9..c44db1893909 100644 --- a/block/nvme.c +++ b/block/nvme.c @@ -1259,8 +1259,9 @@ static coroutine_fn int nvme_co_preadv(BlockDriverState *bs, } static coroutine_fn int nvme_co_pwritev(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) + int64_t offset, int64_t bytes, + QEMUIOVector *qiov, + BdrvRequestFlags flags) { return nvme_co_prw(bs, offset, bytes, qiov, true, flags); } diff --git a/block/preallocate.c b/block/preallocate.c index 57094436124a..c19885af17a8 100644 --- a/block/preallocate.c +++ b/block/preallocate.c @@ -349,11 +349,11 @@ static int coroutine_fn preallocate_co_pwrite_zeroes(BlockDriverState *bs, } static coroutine_fn int preallocate_co_pwritev_part(BlockDriverState *bs, - uint64_t offset, - uint64_t bytes, + int64_t offset, + int64_t bytes, QEMUIOVector *qiov, size_t qiov_offset, - int flags) + BdrvRequestFlags flags) { handle_write(bs, offset, bytes, false); diff --git a/block/qcow.c b/block/qcow.c index 1151b8d79bd7..c39940f33ebe 100644 --- a/block/qcow.c +++ b/block/qcow.c @@ -714,9 +714,9 @@ static coroutine_fn int qcow_co_preadv(BlockDriverState *bs, int64_t offset, return ret; } -static coroutine_fn int qcow_co_pwritev(BlockDriverState *bs, uint64_t offset, - uint64_t bytes, QEMUIOVector *qiov, - int flags) +static coroutine_fn int qcow_co_pwritev(BlockDriverState *bs, int64_t offset, + int64_t bytes, QEMUIOVector *qiov, + BdrvRequestFlags flags) { BDRVQcowState *s = bs->opaque; int offset_in_cluster; @@ -1047,8 +1047,8 @@ static int qcow_make_empty(BlockDriverState *bs) /* XXX: put compressed sectors first, then all the cluster aligned tables to avoid losing bytes in alignment */ static coroutine_fn int -qcow_co_pwritev_compressed(BlockDriverState *bs, uint64_t offset, - uint64_t bytes, QEMUIOVector *qiov) +qcow_co_pwritev_compressed(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov) { BDRVQcowState *s = bs->opaque; z_stream strm; diff --git a/block/qcow2.c b/block/qcow2.c index 680ef705323e..bb5455ed3dee 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -2597,8 +2597,8 @@ static coroutine_fn int qcow2_co_pwritev_task_entry(AioTask *task) } static coroutine_fn int qcow2_co_pwritev_part( - BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, size_t qiov_offset, int flags) + BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, size_t qiov_offset, BdrvRequestFlags flags) { BDRVQcow2State *s = bs->opaque; int offset_in_cluster; @@ -4631,7 +4631,7 @@ static coroutine_fn int qcow2_co_pwritev_compressed_task_entry(AioTask *task) */ static coroutine_fn int qcow2_co_pwritev_compressed_part(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, + int64_t offset, int64_t bytes, QEMUIOVector *qiov, size_t qiov_offset) { BDRVQcow2State *s = bs->opaque; diff --git a/block/quorum.c b/block/quorum.c index 57c73b215643..f4b76ea010c0 100644 --- a/block/quorum.c +++ b/block/quorum.c @@ -714,8 +714,9 @@ static void write_quorum_entry(void *opaque) } } -static int quorum_co_pwritev(BlockDriverState *bs, uint64_t offset, - uint64_t bytes, QEMUIOVector *qiov, int flags) +static int quorum_co_pwritev(BlockDriverState *bs, int64_t offset, + int64_t bytes, QEMUIOVector *qiov, + BdrvRequestFlags flags) { BDRVQuorumState *s = bs->opaque; QuorumAIOCB *acb = quorum_aio_get(bs, qiov, offset, bytes, flags); diff --git a/block/raw-format.c b/block/raw-format.c index a5728e7b0ca7..d223298dfc26 100644 --- a/block/raw-format.c +++ b/block/raw-format.c @@ -216,9 +216,9 @@ static int coroutine_fn raw_co_preadv(BlockDriverState *bs, int64_t offset, return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags); } -static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, uint64_t offset, - uint64_t bytes, QEMUIOVector *qiov, - int flags) +static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, int64_t offset, + int64_t bytes, QEMUIOVector *qiov, + BdrvRequestFlags flags) { void *buf = NULL; BlockDriver *drv; @@ -259,7 +259,7 @@ static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, uint64_t offset, qiov = &local_qiov; } - ret = raw_adjust_offset(bs, (int64_t *)&offset, bytes, true); + ret = raw_adjust_offset(bs, &offset, bytes, true); if (ret) { goto fail; } diff --git a/block/rbd.c b/block/rbd.c index 21438dfb7c4f..efc0835ee7b0 100644 --- a/block/rbd.c +++ b/block/rbd.c @@ -1172,9 +1172,9 @@ coroutine_fn qemu_rbd_co_preadv(BlockDriverState *bs, int64_t offset, } static int -coroutine_fn qemu_rbd_co_pwritev(BlockDriverState *bs, uint64_t offset, - uint64_t bytes, QEMUIOVector *qiov, - int flags) +coroutine_fn qemu_rbd_co_pwritev(BlockDriverState *bs, int64_t offset, + int64_t bytes, QEMUIOVector *qiov, + BdrvRequestFlags flags) { BDRVRBDState *s = bs->opaque; /* diff --git a/block/throttle.c b/block/throttle.c index 20362b5fe58f..1330e844c391 100644 --- a/block/throttle.c +++ b/block/throttle.c @@ -124,8 +124,9 @@ static int coroutine_fn throttle_co_preadv(BlockDriverState *bs, } static int coroutine_fn throttle_co_pwritev(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) + int64_t offset, int64_t bytes, + QEMUIOVector *qiov, + BdrvRequestFlags flags) { ThrottleGroupMember *tgm = bs->opaque; throttle_group_co_io_limits_intercept(tgm, bytes, true); @@ -153,8 +154,8 @@ static int coroutine_fn throttle_co_pdiscard(BlockDriverState *bs, } static int coroutine_fn throttle_co_pwritev_compressed(BlockDriverState *bs, - uint64_t offset, - uint64_t bytes, + int64_t offset, + int64_t bytes, QEMUIOVector *qiov) { return throttle_co_pwritev(bs, offset, bytes, qiov, diff --git a/block/vdi.c b/block/vdi.c index b394cf6ca638..bdc58d726ee1 100644 --- a/block/vdi.c +++ b/block/vdi.c @@ -600,8 +600,8 @@ vdi_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, } static int coroutine_fn -vdi_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) +vdi_co_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { BDRVVdiState *s = bs->opaque; QEMUIOVector local_qiov; diff --git a/block/vmdk.c b/block/vmdk.c index 78592160d039..8d49e54bdd90 100644 --- a/block/vmdk.c +++ b/block/vmdk.c @@ -2068,8 +2068,8 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset, } static int coroutine_fn -vmdk_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) +vmdk_co_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { int ret; BDRVVmdkState *s = bs->opaque; @@ -2080,8 +2080,8 @@ vmdk_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes, } static int coroutine_fn -vmdk_co_pwritev_compressed(BlockDriverState *bs, uint64_t offset, - uint64_t bytes, QEMUIOVector *qiov) +vmdk_co_pwritev_compressed(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov) { if (bytes == 0) { /* The caller will write bytes 0 to signal EOF. diff --git a/block/vpc.c b/block/vpc.c index 29c8517ff83f..1b4c7333af0e 100644 --- a/block/vpc.c +++ b/block/vpc.c @@ -658,8 +658,8 @@ fail: } static int coroutine_fn -vpc_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) +vpc_co_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { BDRVVPCState *s = bs->opaque; int64_t image_offset; diff --git a/block/vvfat.c b/block/vvfat.c index 9c53c2d0a4f1..05e78e3c2756 100644 --- a/block/vvfat.c +++ b/block/vvfat.c @@ -3061,8 +3061,8 @@ DLOG(checkpoint()); } static int coroutine_fn -vvfat_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes, - QEMUIOVector *qiov, int flags) +vvfat_co_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) { int ret; BDRVVVFATState *s = bs->opaque; diff --git a/tests/unit/test-block-iothread.c b/tests/unit/test-block-iothread.c index f18fa6e0fba2..c86954c7baa1 100644 --- a/tests/unit/test-block-iothread.c +++ b/tests/unit/test-block-iothread.c @@ -40,9 +40,9 @@ static int coroutine_fn bdrv_test_co_preadv(BlockDriverState *bs, } static int coroutine_fn bdrv_test_co_pwritev(BlockDriverState *bs, - uint64_t offset, uint64_t bytes, + int64_t offset, int64_t bytes, QEMUIOVector *qiov, - int flags) + BdrvRequestFlags flags) { return 0; } diff --git a/block/trace-events b/block/trace-events index f4f1267c8c01..983dd54830a5 100644 --- a/block/trace-events +++ b/block/trace-events @@ -75,7 +75,7 @@ luring_resubmit_short_read(void *s, void *luringcb, int nread) "LuringState %p l # qcow2.c qcow2_add_task(void *co, void *bs, void *pool, const char *action, int cluster_type, uint64_t host_offset, uint64_t offset, uint64_t bytes, void *qiov, size_t qiov_offset) "co %p bs %p pool %p: %s: cluster_type %d file_cluster_offset %" PRIu64 " offset %" PRIu64 " bytes %" PRIu64 " qiov %p qiov_offset %zu" -qcow2_writev_start_req(void *co, int64_t offset, int bytes) "co %p offset 0x%" PRIx64 " bytes %d" +qcow2_writev_start_req(void *co, int64_t offset, int64_t bytes) "co %p offset 0x%" PRIx64 " bytes %" PRId64 qcow2_writev_done_req(void *co, int ret) "co %p ret %d" qcow2_writev_start_part(void *co) "co %p" qcow2_writev_done_part(void *co, int cur_bytes) "co %p cur_bytes %d" From patchwork Mon Sep 27 21:55:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 12520861 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A8186C433FE for ; Mon, 27 Sep 2021 22:00:50 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4F5DA61157 for ; Mon, 27 Sep 2021 22:00:50 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 4F5DA61157 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:38012 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mUyg3-00025R-TC for qemu-devel@archiver.kernel.org; Mon, 27 Sep 2021 18:00:47 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54378) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybc-0002dO-Gt for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:14 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:38524) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybX-0002ry-F2 for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:12 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632779766; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xd6BD/x/sTKuOfEvZBRNF9VFjqcuZKbWj3CEpGn+E8Y=; b=MndqAmpCIXI4tMKD+RAabgB1JpupdW49/EO3s8A5q947lhbSha9XaT87cJKtPpMF1/aE5j 7esJ/BeEtE+0VZOsgbSmZBc9BkN77PqfTNpUno7KtL0NvhO88EDFagM0G7zEVNdRWiGuZs tB40XNDKcS3qjBLBH+/j0HKE33nNoV8= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-193-_KrnYuJJMsGntjd6dXbuPA-1; Mon, 27 Sep 2021 17:56:05 -0400 X-MC-Unique: _KrnYuJJMsGntjd6dXbuPA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 1CED018125C1; Mon, 27 Sep 2021 21:56:04 +0000 (UTC) Received: from blue.redhat.com (ovpn-113-87.phx2.redhat.com [10.3.113.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 775A25D9D5; Mon, 27 Sep 2021 21:56:03 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Subject: [PULL 06/20] block: use int64_t instead of uint64_t in copy_range driver handlers Date: Mon, 27 Sep 2021 16:55:31 -0500 Message-Id: <20210927215545.3930309-7-eblake@redhat.com> In-Reply-To: <20210927215545.3930309-1-eblake@redhat.com> References: <20210927215545.3930309-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eblake@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=216.205.24.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Vladimir Sementsov-Ogievskiy , "open list:raw" , Peter Lieven , Hanna Reitz , Ronnie Sahlberg , Paolo Bonzini Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Vladimir Sementsov-Ogievskiy We are generally moving to int64_t for both offset and bytes parameters on all io paths. Main motivation is realization of 64-bit write_zeroes operation for fast zeroing large disk chunks, up to the whole disk. We chose signed type, to be consistent with off_t (which is signed) and with possibility for signed return type (where negative value means error). So, convert driver copy_range handlers parameters which are already 64bit to signed type. Now let's consider all callers. Simple git grep '\->bdrv_co_copy_range' shows the only caller: bdrv_co_copy_range_internal(), which does bdrv_check_request32(), so everything is OK. Still, the functions may be called directly, not only by drv->... Let's check: git grep '\.bdrv_co_copy_range_\(from\|to\)\s*=' | \ awk '{print $4}' | sed 's/,//' | sed 's/&//' | sort | uniq | \ while read func; do git grep "$func(" | \ grep -v "$func(BlockDriverState"; done shows no more callers. So, we are done. Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Eric Blake Message-Id: <20210903102807.27127-6-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake --- include/block/block_int.h | 12 ++++++------ block/file-posix.c | 10 +++++----- block/iscsi.c | 12 ++++++------ block/qcow2.c | 12 ++++++------ block/raw-format.c | 16 ++++++++-------- 5 files changed, 31 insertions(+), 31 deletions(-) diff --git a/include/block/block_int.h b/include/block/block_int.h index 2cf5f1722a7f..5536f49bc67c 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -314,10 +314,10 @@ struct BlockDriver { */ int coroutine_fn (*bdrv_co_copy_range_from)(BlockDriverState *bs, BdrvChild *src, - uint64_t offset, + int64_t offset, BdrvChild *dst, - uint64_t dst_offset, - uint64_t bytes, + int64_t dst_offset, + int64_t bytes, BdrvRequestFlags read_flags, BdrvRequestFlags write_flags); @@ -331,10 +331,10 @@ struct BlockDriver { */ int coroutine_fn (*bdrv_co_copy_range_to)(BlockDriverState *bs, BdrvChild *src, - uint64_t src_offset, + int64_t src_offset, BdrvChild *dst, - uint64_t dst_offset, - uint64_t bytes, + int64_t dst_offset, + int64_t bytes, BdrvRequestFlags read_flags, BdrvRequestFlags write_flags); diff --git a/block/file-posix.c b/block/file-posix.c index 994f1c26ca7b..ed71e8d2dfee 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -3203,8 +3203,8 @@ static void raw_abort_perm_update(BlockDriverState *bs) } static int coroutine_fn raw_co_copy_range_from( - BlockDriverState *bs, BdrvChild *src, uint64_t src_offset, - BdrvChild *dst, uint64_t dst_offset, uint64_t bytes, + BlockDriverState *bs, BdrvChild *src, int64_t src_offset, + BdrvChild *dst, int64_t dst_offset, int64_t bytes, BdrvRequestFlags read_flags, BdrvRequestFlags write_flags) { return bdrv_co_copy_range_to(src, src_offset, dst, dst_offset, bytes, @@ -3213,10 +3213,10 @@ static int coroutine_fn raw_co_copy_range_from( static int coroutine_fn raw_co_copy_range_to(BlockDriverState *bs, BdrvChild *src, - uint64_t src_offset, + int64_t src_offset, BdrvChild *dst, - uint64_t dst_offset, - uint64_t bytes, + int64_t dst_offset, + int64_t bytes, BdrvRequestFlags read_flags, BdrvRequestFlags write_flags) { diff --git a/block/iscsi.c b/block/iscsi.c index 852384086b61..01fdd1775f12 100644 --- a/block/iscsi.c +++ b/block/iscsi.c @@ -2169,10 +2169,10 @@ static void coroutine_fn iscsi_co_invalidate_cache(BlockDriverState *bs, static int coroutine_fn iscsi_co_copy_range_from(BlockDriverState *bs, BdrvChild *src, - uint64_t src_offset, + int64_t src_offset, BdrvChild *dst, - uint64_t dst_offset, - uint64_t bytes, + int64_t dst_offset, + int64_t bytes, BdrvRequestFlags read_flags, BdrvRequestFlags write_flags) { @@ -2310,10 +2310,10 @@ static void iscsi_xcopy_data(struct iscsi_data *data, static int coroutine_fn iscsi_co_copy_range_to(BlockDriverState *bs, BdrvChild *src, - uint64_t src_offset, + int64_t src_offset, BdrvChild *dst, - uint64_t dst_offset, - uint64_t bytes, + int64_t dst_offset, + int64_t bytes, BdrvRequestFlags read_flags, BdrvRequestFlags write_flags) { diff --git a/block/qcow2.c b/block/qcow2.c index bb5455ed3dee..520ae37a296a 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -4026,9 +4026,9 @@ static coroutine_fn int qcow2_co_pdiscard(BlockDriverState *bs, static int coroutine_fn qcow2_co_copy_range_from(BlockDriverState *bs, - BdrvChild *src, uint64_t src_offset, - BdrvChild *dst, uint64_t dst_offset, - uint64_t bytes, BdrvRequestFlags read_flags, + BdrvChild *src, int64_t src_offset, + BdrvChild *dst, int64_t dst_offset, + int64_t bytes, BdrvRequestFlags read_flags, BdrvRequestFlags write_flags) { BDRVQcow2State *s = bs->opaque; @@ -4109,9 +4109,9 @@ out: static int coroutine_fn qcow2_co_copy_range_to(BlockDriverState *bs, - BdrvChild *src, uint64_t src_offset, - BdrvChild *dst, uint64_t dst_offset, - uint64_t bytes, BdrvRequestFlags read_flags, + BdrvChild *src, int64_t src_offset, + BdrvChild *dst, int64_t dst_offset, + int64_t bytes, BdrvRequestFlags read_flags, BdrvRequestFlags write_flags) { BDRVQcow2State *s = bs->opaque; diff --git a/block/raw-format.c b/block/raw-format.c index d223298dfc26..345137813e5c 100644 --- a/block/raw-format.c +++ b/block/raw-format.c @@ -532,16 +532,16 @@ static int raw_probe_geometry(BlockDriverState *bs, HDGeometry *geo) static int coroutine_fn raw_co_copy_range_from(BlockDriverState *bs, BdrvChild *src, - uint64_t src_offset, + int64_t src_offset, BdrvChild *dst, - uint64_t dst_offset, - uint64_t bytes, + int64_t dst_offset, + int64_t bytes, BdrvRequestFlags read_flags, BdrvRequestFlags write_flags) { int ret; - ret = raw_adjust_offset(bs, (int64_t *)&src_offset, bytes, false); + ret = raw_adjust_offset(bs, &src_offset, bytes, false); if (ret) { return ret; } @@ -551,16 +551,16 @@ static int coroutine_fn raw_co_copy_range_from(BlockDriverState *bs, static int coroutine_fn raw_co_copy_range_to(BlockDriverState *bs, BdrvChild *src, - uint64_t src_offset, + int64_t src_offset, BdrvChild *dst, - uint64_t dst_offset, - uint64_t bytes, + int64_t dst_offset, + int64_t bytes, BdrvRequestFlags read_flags, BdrvRequestFlags write_flags) { int ret; - ret = raw_adjust_offset(bs, (int64_t *)&dst_offset, bytes, true); + ret = raw_adjust_offset(bs, &dst_offset, bytes, true); if (ret) { return ret; } From patchwork Mon Sep 27 21:55:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 12521085 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 614E7C433F5 for ; Mon, 27 Sep 2021 22:07:40 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2307560F3A for ; Mon, 27 Sep 2021 22:07:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 2307560F3A Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:56660 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mUymh-0006Q3-7Q for qemu-devel@archiver.kernel.org; Mon, 27 Sep 2021 18:07:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54418) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybe-0002f5-IJ for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:16 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:34377) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybZ-0002tH-TN for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:14 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632779768; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=NxO4OylQBnHRwBakwTpZ1GdVAMFwP5cZPLLTPb9c9fg=; b=aYk7Hvg4/8fG/8cQXEdvNFGwyJapDPFhHkbP0as3/bpP5qGofG0WTm7he63+/vLEqU94ph MOfQ20CCEstQsVoMECBek9LDt+rsOkivXHTSEXuESE8kxtsQ4k16ejs/ufWXLqAWNmCy8F N41UQyg6Z8VG1Y3V/xjpYkux5l2cToA= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-534-xa-DHGLCMxmZkFJ78-BGjA-1; Mon, 27 Sep 2021 17:56:07 -0400 X-MC-Unique: xa-DHGLCMxmZkFJ78-BGjA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 0B24518125C2; Mon, 27 Sep 2021 21:56:06 +0000 (UTC) Received: from blue.redhat.com (ovpn-113-87.phx2.redhat.com [10.3.113.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4EB4B5D9D5; Mon, 27 Sep 2021 21:56:04 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Subject: [PULL 07/20] block: make BlockLimits::max_pwrite_zeroes 64bit Date: Mon, 27 Sep 2021 16:55:32 -0500 Message-Id: <20210927215545.3930309-8-eblake@redhat.com> In-Reply-To: <20210927215545.3930309-1-eblake@redhat.com> References: <20210927215545.3930309-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eblake@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=170.10.133.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Fam Zheng , Kevin Wolf , Vladimir Sementsov-Ogievskiy , "open list:Block I/O path" , Hanna Reitz , Stefan Hajnoczi Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Vladimir Sementsov-Ogievskiy We are going to support 64 bit write-zeroes requests. Now update the limit variable. It's absolutely safe. The variable is set in some drivers, and used in bdrv_co_do_pwrite_zeroes(). Update also max_write_zeroes variable in bdrv_co_do_pwrite_zeroes(), so that bdrv_co_do_pwrite_zeroes() is now prepared to 64bit requests. The remaining logic including num, offset and bytes variables is already supporting 64bit requests. So the only thing that prevents 64 bit requests is limiting max_write_zeroes variable to INT_MAX in bdrv_co_do_pwrite_zeroes(). We'll drop this limitation after updating all block drivers. Ah, we also have bdrv_check_request32() in bdrv_co_pwritev_part(). It will be modified to do bdrv_check_request() for write-zeroes path. Signed-off-by: Vladimir Sementsov-Ogievskiy Message-Id: <20210903102807.27127-7-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake Signed-off-by: Eric Blake --- include/block/block_int.h | 9 +++++---- block/io.c | 2 +- 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/include/block/block_int.h b/include/block/block_int.h index 5536f49bc67c..24958acd33f4 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -686,10 +686,11 @@ typedef struct BlockLimits { * that is set. May be 0 if bl.request_alignment is good enough */ uint32_t pdiscard_alignment; - /* Maximum number of bytes that can zeroized at once (since it is - * signed, it must be < 2G, if set). Must be multiple of - * pwrite_zeroes_alignment. May be 0 if no inherent 32-bit limit */ - int32_t max_pwrite_zeroes; + /* + * Maximum number of bytes that can zeroized at once. Must be multiple of + * pwrite_zeroes_alignment. 0 means no limit. + */ + int64_t max_pwrite_zeroes; /* Optimal alignment for write zeroes requests in bytes. A power * of 2 is best but not mandatory. Must be a multiple of diff --git a/block/io.c b/block/io.c index aa6f7b075e78..0090224603f5 100644 --- a/block/io.c +++ b/block/io.c @@ -1869,7 +1869,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, int head = 0; int tail = 0; - int max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, INT_MAX); + int64_t max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, INT_MAX); int alignment = MAX(bs->bl.pwrite_zeroes_alignment, bs->bl.request_alignment); int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer, MAX_BOUNCE_BUFFER); From patchwork Mon Sep 27 21:55:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 12521083 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CBC52C433EF for ; Mon, 27 Sep 2021 22:07:34 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 60DF960F4F for ; Mon, 27 Sep 2021 22:07:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 60DF960F4F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:56238 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mUymb-00068d-HM for qemu-devel@archiver.kernel.org; Mon, 27 Sep 2021 18:07:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54550) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybj-0002hw-2e for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:19 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:54386) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybb-0002u3-H9 for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:18 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632779770; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=EK+JkXMDrCeU92/HHiVItUB5YHQ8TjLB1xHQ9ZGLLxI=; b=EPGj/KUpJbJH7o8Nj8YrKqjTDBzhEJ7IpfFwQxZyQgm1zBOug8o/PElh4TnLtndBOXRk7+ nK4xIZ1wn6x56uMucD3OnXe94T6G/GdjMOv6uJKZt4HQjVc/Y0sP7jSYJ6xgmsToJ69BW9 zLA0Ish7CVgR8DTjPjjJpRfo6UjG0m8= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-298-E06j1wQfOCeZPQT-jCBkNQ-1; Mon, 27 Sep 2021 17:56:09 -0400 X-MC-Unique: E06j1wQfOCeZPQT-jCBkNQ-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id B757A802C87; Mon, 27 Sep 2021 21:56:07 +0000 (UTC) Received: from blue.redhat.com (ovpn-113-87.phx2.redhat.com [10.3.113.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 782855D9D5; Mon, 27 Sep 2021 21:56:06 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Subject: [PULL 08/20] block: use int64_t instead of int in driver write_zeroes handlers Date: Mon, 27 Sep 2021 16:55:33 -0500 Message-Id: <20210927215545.3930309-9-eblake@redhat.com> In-Reply-To: <20210927215545.3930309-1-eblake@redhat.com> References: <20210927215545.3930309-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eblake@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=170.10.133.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -8 X-Spam_score: -0.9 X-Spam_bar: / X-Spam_report: (-0.9 / 5.0 requ) DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , Vladimir Sementsov-Ogievskiy , Alberto Garcia , Ronnie Sahlberg , "open list:blkdebug" , =?utf-8?q?Philippe_Mathieu-Da?= =?utf-8?q?ud=C3=A9?= , Peter Lieven , Stefan Hajnoczi , Hanna Reitz , "open list:GLUSTER" , Pavel Dovgalyuk , Paolo Bonzini , Ilya Dryomov , John Snow , Ari Sundholm Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Vladimir Sementsov-Ogievskiy We are generally moving to int64_t for both offset and bytes parameters on all io paths. Main motivation is realization of 64-bit write_zeroes operation for fast zeroing large disk chunks, up to the whole disk. We chose signed type, to be consistent with off_t (which is signed) and with possibility for signed return type (where negative value means error). So, convert driver write_zeroes handlers bytes parameter to int64_t. The only caller of all updated function is bdrv_co_do_pwrite_zeroes(). bdrv_co_do_pwrite_zeroes() itself is of course OK with widening of callee parameter type. Also, bdrv_co_do_pwrite_zeroes()'s max_write_zeroes is limited to INT_MAX. So, updated functions all are safe, they will not get "bytes" larger than before. Still, let's look through all updated functions, and add assertions to the ones which are actually unprepared to values larger than INT_MAX. For these drivers also set explicit max_pwrite_zeroes limit. Let's go: blkdebug: calculations can't overflow, thanks to bdrv_check_qiov_request() in generic layer. rule_check() and bdrv_co_pwrite_zeroes() both have 64bit argument. blklogwrites: pass to blk_log_writes_co_log() with 64bit argument. blkreplay, copy-on-read, filter-compress: pass to bdrv_co_pwrite_zeroes() which is OK copy-before-write: Calls cbw_do_copy_before_write() and bdrv_co_pwrite_zeroes, both have 64bit argument. file-posix: both handler calls raw_do_pwrite_zeroes, which is updated. In raw_do_pwrite_zeroes() calculations are OK due to bdrv_check_qiov_request(), bytes go to RawPosixAIOData::aio_nbytes which is uint64_t. Check also where that uint64_t gets handed: handle_aiocb_write_zeroes_block() passes a uint64_t[2] to ioctl(BLKZEROOUT), handle_aiocb_write_zeroes() calls do_fallocate() which takes off_t (and we compile to always have 64-bit off_t), as does handle_aiocb_write_zeroes_unmap. All look safe. gluster: bytes go to GlusterAIOCB::size which is int64_t and to glfs_zerofill_async works with off_t. iscsi: Aha, here we deal with iscsi_writesame16_task() that has uint32_t num_blocks argument and iscsi_writesame16_task() has uint16_t argument. Make comments, add assertions and clarify max_pwrite_zeroes calculation. iscsi_allocmap_() functions already has int64_t argument is_byte_request_lun_aligned is simple to update, do it. mirror_top: pass to bdrv_mirror_top_do_write which has uint64_t argument nbd: Aha, here we have protocol limitation, and NBDRequest::len is uint32_t. max_pwrite_zeroes is cleanly set to 32bit value, so we are OK for now. nvme: Again, protocol limitation. And no inherent limit for write-zeroes at all. But from code that calculates cdw12 it's obvious that we do have limit and alignment. Let's clarify it. Also, obviously the code is not prepared to handle bytes=0. Let's handle this case too. trace events already 64bit preallocate: pass to handle_write() and bdrv_co_pwrite_zeroes(), both 64bit. rbd: pass to qemu_rbd_start_co() which is 64bit. qcow2: offset + bytes and alignment still works good (thanks to bdrv_check_qiov_request()), so tail calculation is OK qcow2_subcluster_zeroize() has 64bit argument, should be OK trace events updated qed: qed_co_request wants int nb_sectors. Also in code we have size_t used for request length which may be 32bit. So, let's just keep INT_MAX as a limit (aligning it down to pwrite_zeroes_alignment) and don't care. raw-format: Is OK. raw_adjust_offset and bdrv_co_pwrite_zeroes are both 64bit. throttle: Both throttle_group_co_io_limits_intercept() and bdrv_co_pwrite_zeroes() are 64bit. vmdk: pass to vmdk_pwritev which is 64bit quorum: pass to quorum_co_pwritev() which is 64bit Hooray! At this point all block drivers are prepared to support 64bit write-zero requests, or have explicitly set max_pwrite_zeroes. Signed-off-by: Vladimir Sementsov-Ogievskiy Message-Id: <20210903102807.27127-8-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake [eblake: use <= rather than < in assertions relying on max_pwrite_zeroes] Signed-off-by: Eric Blake --- include/block/block_int.h | 2 +- block/blkdebug.c | 2 +- block/blklogwrites.c | 4 ++-- block/blkreplay.c | 2 +- block/copy-before-write.c | 2 +- block/copy-on-read.c | 2 +- block/file-posix.c | 6 +++--- block/filter-compress.c | 2 +- block/gluster.c | 6 +++--- block/iscsi.c | 30 ++++++++++++++++++++---------- block/mirror.c | 2 +- block/nbd.c | 6 ++++-- block/nvme.c | 24 +++++++++++++++++++++--- block/preallocate.c | 2 +- block/qcow2.c | 2 +- block/qed.c | 9 ++++++++- block/quorum.c | 2 +- block/raw-format.c | 2 +- block/rbd.c | 4 ++-- block/throttle.c | 2 +- block/vmdk.c | 2 +- block/trace-events | 4 ++-- 22 files changed, 78 insertions(+), 41 deletions(-) diff --git a/include/block/block_int.h b/include/block/block_int.h index 24958acd33f4..d518703e3e59 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -301,7 +301,7 @@ struct BlockDriver { * will be called instead. */ int coroutine_fn (*bdrv_co_pwrite_zeroes)(BlockDriverState *bs, - int64_t offset, int bytes, BdrvRequestFlags flags); + int64_t offset, int64_t bytes, BdrvRequestFlags flags); int coroutine_fn (*bdrv_co_pdiscard)(BlockDriverState *bs, int64_t offset, int bytes); diff --git a/block/blkdebug.c b/block/blkdebug.c index e686cd97993d..742b4a3834d8 100644 --- a/block/blkdebug.c +++ b/block/blkdebug.c @@ -684,7 +684,7 @@ static int blkdebug_co_flush(BlockDriverState *bs) } static int coroutine_fn blkdebug_co_pwrite_zeroes(BlockDriverState *bs, - int64_t offset, int bytes, + int64_t offset, int64_t bytes, BdrvRequestFlags flags) { uint32_t align = MAX(bs->bl.request_alignment, diff --git a/block/blklogwrites.c b/block/blklogwrites.c index ca174ab135a0..d7ae64c22d81 100644 --- a/block/blklogwrites.c +++ b/block/blklogwrites.c @@ -468,8 +468,8 @@ blk_log_writes_co_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes, } static int coroutine_fn -blk_log_writes_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int bytes, - BdrvRequestFlags flags) +blk_log_writes_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, + int64_t bytes, BdrvRequestFlags flags) { return blk_log_writes_co_log(bs, offset, bytes, NULL, flags, blk_log_writes_co_do_file_pwrite_zeroes, 0, diff --git a/block/blkreplay.c b/block/blkreplay.c index 7ba62dcac12a..89d74a3cca29 100644 --- a/block/blkreplay.c +++ b/block/blkreplay.c @@ -94,7 +94,7 @@ static int coroutine_fn blkreplay_co_pwritev(BlockDriverState *bs, } static int coroutine_fn blkreplay_co_pwrite_zeroes(BlockDriverState *bs, - int64_t offset, int bytes, BdrvRequestFlags flags) + int64_t offset, int64_t bytes, BdrvRequestFlags flags) { uint64_t reqid = blkreplay_next_id(); int ret = bdrv_co_pwrite_zeroes(bs->file, offset, bytes, flags); diff --git a/block/copy-before-write.c b/block/copy-before-write.c index 74360b48536f..d210e87a4511 100644 --- a/block/copy-before-write.c +++ b/block/copy-before-write.c @@ -75,7 +75,7 @@ static int coroutine_fn cbw_co_pdiscard(BlockDriverState *bs, } static int coroutine_fn cbw_co_pwrite_zeroes(BlockDriverState *bs, - int64_t offset, int bytes, BdrvRequestFlags flags) + int64_t offset, int64_t bytes, BdrvRequestFlags flags) { int ret = cbw_do_copy_before_write(bs, offset, bytes, flags); if (ret < 0) { diff --git a/block/copy-on-read.c b/block/copy-on-read.c index b2ec36b6fc7f..f83dd83f14ed 100644 --- a/block/copy-on-read.c +++ b/block/copy-on-read.c @@ -193,7 +193,7 @@ static int coroutine_fn cor_co_pwritev_part(BlockDriverState *bs, static int coroutine_fn cor_co_pwrite_zeroes(BlockDriverState *bs, - int64_t offset, int bytes, + int64_t offset, int64_t bytes, BdrvRequestFlags flags) { return bdrv_co_pwrite_zeroes(bs->file, offset, bytes, flags); diff --git a/block/file-posix.c b/block/file-posix.c index ed71e8d2dfee..f375070f25bb 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -2972,7 +2972,7 @@ raw_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes) } static int coroutine_fn -raw_do_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int bytes, +raw_do_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int64_t bytes, BdrvRequestFlags flags, bool blkdev) { BDRVRawState *s = bs->opaque; @@ -3040,7 +3040,7 @@ raw_do_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int bytes, static int coroutine_fn raw_co_pwrite_zeroes( BlockDriverState *bs, int64_t offset, - int bytes, BdrvRequestFlags flags) + int64_t bytes, BdrvRequestFlags flags) { return raw_do_pwrite_zeroes(bs, offset, bytes, flags, false); } @@ -3605,7 +3605,7 @@ hdev_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes) } static coroutine_fn int hdev_co_pwrite_zeroes(BlockDriverState *bs, - int64_t offset, int bytes, BdrvRequestFlags flags) + int64_t offset, int64_t bytes, BdrvRequestFlags flags) { int rc; diff --git a/block/filter-compress.c b/block/filter-compress.c index 505822a44fe2..fb85686b6930 100644 --- a/block/filter-compress.c +++ b/block/filter-compress.c @@ -86,7 +86,7 @@ static int coroutine_fn compress_co_pwritev_part(BlockDriverState *bs, static int coroutine_fn compress_co_pwrite_zeroes(BlockDriverState *bs, - int64_t offset, int bytes, + int64_t offset, int64_t bytes, BdrvRequestFlags flags) { return bdrv_co_pwrite_zeroes(bs->file, offset, bytes, flags); diff --git a/block/gluster.c b/block/gluster.c index d51938e447c0..4e3c9cd14fd9 100644 --- a/block/gluster.c +++ b/block/gluster.c @@ -1003,19 +1003,19 @@ static void qemu_gluster_reopen_abort(BDRVReopenState *state) #ifdef CONFIG_GLUSTERFS_ZEROFILL static coroutine_fn int qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, - int size, + int64_t bytes, BdrvRequestFlags flags) { int ret; GlusterAIOCB acb; BDRVGlusterState *s = bs->opaque; - acb.size = size; + acb.size = bytes; acb.ret = 0; acb.coroutine = qemu_coroutine_self(); acb.aio_context = bdrv_get_aio_context(bs); - ret = glfs_zerofill_async(s->fd, offset, size, gluster_finish_aiocb, &acb); + ret = glfs_zerofill_async(s->fd, offset, bytes, gluster_finish_aiocb, &acb); if (ret < 0) { return -errno; } diff --git a/block/iscsi.c b/block/iscsi.c index 01fdd1775f12..74ff7e307ece 100644 --- a/block/iscsi.c +++ b/block/iscsi.c @@ -427,14 +427,14 @@ static int64_t sector_qemu2lun(int64_t sector, IscsiLun *iscsilun) return sector * BDRV_SECTOR_SIZE / iscsilun->block_size; } -static bool is_byte_request_lun_aligned(int64_t offset, int count, +static bool is_byte_request_lun_aligned(int64_t offset, int64_t bytes, IscsiLun *iscsilun) { - if (offset % iscsilun->block_size || count % iscsilun->block_size) { + if (offset % iscsilun->block_size || bytes % iscsilun->block_size) { error_report("iSCSI misaligned request: " "iscsilun->block_size %u, offset %" PRIi64 - ", count %d", - iscsilun->block_size, offset, count); + ", bytes %" PRIi64, + iscsilun->block_size, offset, bytes); return false; } return true; @@ -1202,12 +1202,12 @@ out_unlock: static int coroutine_fn iscsi_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, - int bytes, BdrvRequestFlags flags) + int64_t bytes, BdrvRequestFlags flags) { IscsiLun *iscsilun = bs->opaque; struct IscsiTask iTask; uint64_t lba; - uint32_t nb_blocks; + uint64_t nb_blocks; bool use_16_for_ws = iscsilun->use_16_for_rw; int r = 0; @@ -1247,11 +1247,21 @@ coroutine_fn iscsi_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, iscsi_co_init_iscsitask(iscsilun, &iTask); retry: if (use_16_for_ws) { + /* + * iscsi_writesame16_task num_blocks argument is uint32_t. We rely here + * on our max_pwrite_zeroes limit. + */ + assert(nb_blocks <= UINT32_MAX); iTask.task = iscsi_writesame16_task(iscsilun->iscsi, iscsilun->lun, lba, iscsilun->zeroblock, iscsilun->block_size, nb_blocks, 0, !!(flags & BDRV_REQ_MAY_UNMAP), 0, 0, iscsi_co_generic_cb, &iTask); } else { + /* + * iscsi_writesame10_task num_blocks argument is uint16_t. We rely here + * on our max_pwrite_zeroes limit. + */ + assert(nb_blocks <= UINT16_MAX); iTask.task = iscsi_writesame10_task(iscsilun->iscsi, iscsilun->lun, lba, iscsilun->zeroblock, iscsilun->block_size, nb_blocks, 0, !!(flags & BDRV_REQ_MAY_UNMAP), @@ -2071,10 +2081,10 @@ static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp) bs->bl.pdiscard_alignment = iscsilun->block_size; } - if (iscsilun->bl.max_ws_len < 0xffffffff / block_size) { - bs->bl.max_pwrite_zeroes = - iscsilun->bl.max_ws_len * iscsilun->block_size; - } + bs->bl.max_pwrite_zeroes = + MIN_NON_ZERO(iscsilun->bl.max_ws_len * iscsilun->block_size, + max_xfer_len * iscsilun->block_size); + if (iscsilun->lbp.lbpws) { bs->bl.pwrite_zeroes_alignment = iscsilun->bl.opt_unmap_gran * iscsilun->block_size; diff --git a/block/mirror.c b/block/mirror.c index c4c623ceed85..fab75087b0ba 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -1501,7 +1501,7 @@ static int coroutine_fn bdrv_mirror_top_flush(BlockDriverState *bs) } static int coroutine_fn bdrv_mirror_top_pwrite_zeroes(BlockDriverState *bs, - int64_t offset, int bytes, BdrvRequestFlags flags) + int64_t offset, int64_t bytes, BdrvRequestFlags flags) { return bdrv_mirror_top_do_write(bs, MIRROR_METHOD_ZERO, offset, bytes, NULL, flags); diff --git a/block/nbd.c b/block/nbd.c index caee396525ab..c0c479abe9ce 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -1407,15 +1407,17 @@ static int nbd_client_co_pwritev(BlockDriverState *bs, int64_t offset, } static int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, - int bytes, BdrvRequestFlags flags) + int64_t bytes, BdrvRequestFlags flags) { BDRVNBDState *s = (BDRVNBDState *)bs->opaque; NBDRequest request = { .type = NBD_CMD_WRITE_ZEROES, .from = offset, - .len = bytes, + .len = bytes, /* .len is uint32_t actually */ }; + assert(bytes <= UINT32_MAX); /* rely on max_pwrite_zeroes */ + assert(!(s->info.flags & NBD_FLAG_READ_ONLY)); if (!(s->info.flags & NBD_FLAG_SEND_WRITE_ZEROES)) { return -ENOTSUP; diff --git a/block/nvme.c b/block/nvme.c index c44db1893909..2e0fd9e76a2d 100644 --- a/block/nvme.c +++ b/block/nvme.c @@ -1296,19 +1296,29 @@ static coroutine_fn int nvme_co_flush(BlockDriverState *bs) static coroutine_fn int nvme_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, - int bytes, + int64_t bytes, BdrvRequestFlags flags) { BDRVNVMeState *s = bs->opaque; NVMeQueuePair *ioq = s->queues[INDEX_IO(0)]; NVMeRequest *req; - - uint32_t cdw12 = ((bytes >> s->blkshift) - 1) & 0xFFFF; + uint32_t cdw12; if (!s->supports_write_zeroes) { return -ENOTSUP; } + if (bytes == 0) { + return 0; + } + + cdw12 = ((bytes >> s->blkshift) - 1) & 0xFFFF; + /* + * We should not lose information. pwrite_zeroes_alignment and + * max_pwrite_zeroes guarantees it. + */ + assert(((cdw12 + 1) << s->blkshift) == bytes); + NvmeCmd cmd = { .opcode = NVME_CMD_WRITE_ZEROES, .nsid = cpu_to_le32(s->nsid), @@ -1472,6 +1482,14 @@ static void nvme_refresh_limits(BlockDriverState *bs, Error **errp) bs->bl.opt_mem_alignment = s->page_size; bs->bl.request_alignment = s->page_size; bs->bl.max_transfer = s->max_transfer; + + /* + * Look at nvme_co_pwrite_zeroes: after shift and decrement we should get + * at most 0xFFFF + */ + bs->bl.max_pwrite_zeroes = 1ULL << (s->blkshift + 16); + bs->bl.pwrite_zeroes_alignment = MAX(bs->bl.request_alignment, + 1UL << s->blkshift); } static void nvme_detach_aio_context(BlockDriverState *bs) diff --git a/block/preallocate.c b/block/preallocate.c index c19885af17a8..99e28d9f086e 100644 --- a/block/preallocate.c +++ b/block/preallocate.c @@ -337,7 +337,7 @@ static bool coroutine_fn handle_write(BlockDriverState *bs, int64_t offset, } static int coroutine_fn preallocate_co_pwrite_zeroes(BlockDriverState *bs, - int64_t offset, int bytes, BdrvRequestFlags flags) + int64_t offset, int64_t bytes, BdrvRequestFlags flags) { bool want_merge_zero = !(flags & ~(BDRV_REQ_ZERO_WRITE | BDRV_REQ_NO_FALLBACK)); diff --git a/block/qcow2.c b/block/qcow2.c index 520ae37a296a..4b2e8694952f 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -3941,7 +3941,7 @@ static bool is_zero(BlockDriverState *bs, int64_t offset, int64_t bytes) } static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs, - int64_t offset, int bytes, BdrvRequestFlags flags) + int64_t offset, int64_t bytes, BdrvRequestFlags flags) { int ret; BDRVQcow2State *s = bs->opaque; diff --git a/block/qed.c b/block/qed.c index f45c640513f5..558d3646c4b2 100644 --- a/block/qed.c +++ b/block/qed.c @@ -582,6 +582,7 @@ static void bdrv_qed_refresh_limits(BlockDriverState *bs, Error **errp) BDRVQEDState *s = bs->opaque; bs->bl.pwrite_zeroes_alignment = s->header.cluster_size; + bs->bl.max_pwrite_zeroes = QEMU_ALIGN_DOWN(INT_MAX, s->header.cluster_size); } /* We have nothing to do for QED reopen, stubs just return @@ -1397,7 +1398,7 @@ static int coroutine_fn bdrv_qed_co_writev(BlockDriverState *bs, static int coroutine_fn bdrv_qed_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, - int bytes, + int64_t bytes, BdrvRequestFlags flags) { BDRVQEDState *s = bs->opaque; @@ -1408,6 +1409,12 @@ static int coroutine_fn bdrv_qed_co_pwrite_zeroes(BlockDriverState *bs, */ QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, NULL, bytes); + /* + * QED is not prepared for 63bit write-zero requests, so rely on + * max_pwrite_zeroes. + */ + assert(bytes <= INT_MAX); + /* Fall back if the request is not aligned */ if (qed_offset_into_cluster(s, offset) || qed_offset_into_cluster(s, bytes)) { diff --git a/block/quorum.c b/block/quorum.c index f4b76ea010c0..c28dda7baac7 100644 --- a/block/quorum.c +++ b/block/quorum.c @@ -746,7 +746,7 @@ static int quorum_co_pwritev(BlockDriverState *bs, int64_t offset, } static int quorum_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, - int bytes, BdrvRequestFlags flags) + int64_t bytes, BdrvRequestFlags flags) { return quorum_co_pwritev(bs, offset, bytes, NULL, diff --git a/block/raw-format.c b/block/raw-format.c index 345137813e5c..a2485926b8cd 100644 --- a/block/raw-format.c +++ b/block/raw-format.c @@ -289,7 +289,7 @@ static int coroutine_fn raw_co_block_status(BlockDriverState *bs, } static int coroutine_fn raw_co_pwrite_zeroes(BlockDriverState *bs, - int64_t offset, int bytes, + int64_t offset, int64_t bytes, BdrvRequestFlags flags) { int ret; diff --git a/block/rbd.c b/block/rbd.c index efc0835ee7b0..053eb8e48f44 100644 --- a/block/rbd.c +++ b/block/rbd.c @@ -1205,9 +1205,9 @@ static int coroutine_fn qemu_rbd_co_pdiscard(BlockDriverState *bs, #ifdef LIBRBD_SUPPORTS_WRITE_ZEROES static int coroutine_fn qemu_rbd_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, - int count, BdrvRequestFlags flags) + int64_t bytes, BdrvRequestFlags flags) { - return qemu_rbd_start_co(bs, offset, count, NULL, flags, + return qemu_rbd_start_co(bs, offset, bytes, NULL, flags, RBD_AIO_WRITE_ZEROES); } #endif diff --git a/block/throttle.c b/block/throttle.c index 1330e844c391..c13fe9067f4e 100644 --- a/block/throttle.c +++ b/block/throttle.c @@ -135,7 +135,7 @@ static int coroutine_fn throttle_co_pwritev(BlockDriverState *bs, } static int coroutine_fn throttle_co_pwrite_zeroes(BlockDriverState *bs, - int64_t offset, int bytes, + int64_t offset, int64_t bytes, BdrvRequestFlags flags) { ThrottleGroupMember *tgm = bs->opaque; diff --git a/block/vmdk.c b/block/vmdk.c index 8d49e54bdd90..fb4cc9da9077 100644 --- a/block/vmdk.c +++ b/block/vmdk.c @@ -2109,7 +2109,7 @@ vmdk_co_pwritev_compressed(BlockDriverState *bs, int64_t offset, int64_t bytes, static int coroutine_fn vmdk_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, - int bytes, + int64_t bytes, BdrvRequestFlags flags) { int ret; diff --git a/block/trace-events b/block/trace-events index 983dd54830a5..d8a08563f1ca 100644 --- a/block/trace-events +++ b/block/trace-events @@ -80,8 +80,8 @@ qcow2_writev_done_req(void *co, int ret) "co %p ret %d" qcow2_writev_start_part(void *co) "co %p" qcow2_writev_done_part(void *co, int cur_bytes) "co %p cur_bytes %d" qcow2_writev_data(void *co, uint64_t offset) "co %p offset 0x%" PRIx64 -qcow2_pwrite_zeroes_start_req(void *co, int64_t offset, int count) "co %p offset 0x%" PRIx64 " count %d" -qcow2_pwrite_zeroes(void *co, int64_t offset, int count) "co %p offset 0x%" PRIx64 " count %d" +qcow2_pwrite_zeroes_start_req(void *co, int64_t offset, int64_t bytes) "co %p offset 0x%" PRIx64 " bytes %" PRId64 +qcow2_pwrite_zeroes(void *co, int64_t offset, int64_t bytes) "co %p offset 0x%" PRIx64 " bytes %" PRId64 qcow2_skip_cow(void *co, uint64_t offset, int nb_clusters) "co %p offset 0x%" PRIx64 " nb_clusters %d" # qcow2-cluster.c From patchwork Mon Sep 27 21:55:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 12520851 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BA65BC433EF for ; Mon, 27 Sep 2021 21:58:33 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6E32B610E8 for ; Mon, 27 Sep 2021 21:58:33 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 6E32B610E8 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:33294 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mUyds-0007N3-LS for qemu-devel@archiver.kernel.org; Mon, 27 Sep 2021 17:58:32 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54498) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybi-0002hR-4J for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:19 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:46259) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybd-0002wF-GU for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:17 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632779773; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Xq5R/IEFf+WSLrp1HO4kNmCEVBlv23Tmmn2Mi5BJVbA=; b=YXxnsCCmrpiL4r1bRZkY71ibIY2bo1NQ1auOLbtVrWtx57h1MdYbkHyZB/KAIhgWIP9ofu iGCF0WIA7zYlZPCzTuVIA6hkuskLHo+GDQ/2iIPM2pQXtliKkVkh2yIuzNyrsxVWOQiQgm l4t7suUD5y2c4Cakfrm3Q7fd+Rt3iD8= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-516-mAYTR6Z6M7SlHW945MfH_w-1; Mon, 27 Sep 2021 17:56:09 -0400 X-MC-Unique: mAYTR6Z6M7SlHW945MfH_w-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 7937E5074B; Mon, 27 Sep 2021 21:56:08 +0000 (UTC) Received: from blue.redhat.com (ovpn-113-87.phx2.redhat.com [10.3.113.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id E41F317B5B; Mon, 27 Sep 2021 21:56:07 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Subject: [PULL 09/20] block/io: allow 64bit write-zeroes requests Date: Mon, 27 Sep 2021 16:55:34 -0500 Message-Id: <20210927215545.3930309-10-eblake@redhat.com> In-Reply-To: <20210927215545.3930309-1-eblake@redhat.com> References: <20210927215545.3930309-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eblake@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=170.10.133.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Fam Zheng , Kevin Wolf , Vladimir Sementsov-Ogievskiy , "open list:Block I/O path" , Hanna Reitz , Stefan Hajnoczi Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Vladimir Sementsov-Ogievskiy Now that all drivers are updated by previous commit, we can drop two last limiters on write-zeroes path: INT_MAX in bdrv_co_do_pwrite_zeroes() and bdrv_check_request32() in bdrv_co_pwritev_part(). Now everything is prepared for implementing incredibly cool and fast big-write-zeroes in NBD and qcow2. And any other driver which wants it of course. Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Eric Blake Message-Id: <20210903102807.27127-9-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake --- block/io.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/block/io.c b/block/io.c index 0090224603f5..e40462742ea1 100644 --- a/block/io.c +++ b/block/io.c @@ -1869,7 +1869,8 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, int head = 0; int tail = 0; - int64_t max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, INT_MAX); + int64_t max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, + INT64_MAX); int alignment = MAX(bs->bl.pwrite_zeroes_alignment, bs->bl.request_alignment); int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer, MAX_BOUNCE_BUFFER); @@ -2248,7 +2249,11 @@ int coroutine_fn bdrv_co_pwritev_part(BdrvChild *child, return -ENOMEDIUM; } - ret = bdrv_check_request32(offset, bytes, qiov, qiov_offset); + if (flags & BDRV_REQ_ZERO_WRITE) { + ret = bdrv_check_qiov_request(offset, bytes, qiov, qiov_offset, NULL); + } else { + ret = bdrv_check_request32(offset, bytes, qiov, qiov_offset); + } if (ret < 0) { return ret; } From patchwork Mon Sep 27 21:55:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 12520869 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3DEDFC433EF for ; Mon, 27 Sep 2021 22:03:58 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C8F1B610A2 for ; Mon, 27 Sep 2021 22:03:57 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org C8F1B610A2 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:47268 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mUyj6-0008RJ-Sc for qemu-devel@archiver.kernel.org; Mon, 27 Sep 2021 18:03:56 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54450) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybf-0002fI-Mu for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:16 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:52561) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybc-0002vd-Ah for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:15 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632779771; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Db7uj7BlWuB4tMfhYSbJArnzVFeptnWsZIkoFWOBWos=; b=IKjA2WyvhjuUdxibD1KrVaBa6kuwTzufevlJVvelbVKsqkSo8lBp0t+E9FeImgJu52j6Yl Ww7GjWr0rIR43CJHuLVS+JsLb1oQinhYpBchrf/H39O2Qz87VRBzU9CUyXUOb7baKTtakt 9o1k592GaUS12TFwMHmsDhGJX1JR3UE= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-38-6pfP5iedON2e1J4bvQaZEg-1; Mon, 27 Sep 2021 17:56:10 -0400 X-MC-Unique: 6pfP5iedON2e1J4bvQaZEg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 4351B18125C3; Mon, 27 Sep 2021 21:56:09 +0000 (UTC) Received: from blue.redhat.com (ovpn-113-87.phx2.redhat.com [10.3.113.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id AB22B5D9D5; Mon, 27 Sep 2021 21:56:08 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Subject: [PULL 10/20] block: make BlockLimits::max_pdiscard 64bit Date: Mon, 27 Sep 2021 16:55:35 -0500 Message-Id: <20210927215545.3930309-11-eblake@redhat.com> In-Reply-To: <20210927215545.3930309-1-eblake@redhat.com> References: <20210927215545.3930309-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eblake@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=170.10.133.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -8 X-Spam_score: -0.9 X-Spam_bar: / X-Spam_report: (-0.9 / 5.0 requ) DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Fam Zheng , Kevin Wolf , Vladimir Sementsov-Ogievskiy , "open list:Block I/O path" , Hanna Reitz , Stefan Hajnoczi Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Vladimir Sementsov-Ogievskiy We are going to support 64 bit discard requests. Now update the limit variable. It's absolutely safe. The variable is set in some drivers, and used in bdrv_co_pdiscard(). Update also max_pdiscard variable in bdrv_co_pdiscard(), so that bdrv_co_pdiscard() is now prepared for 64bit requests. The remaining logic including num, offset and bytes variables is already supporting 64bit requests. So the only thing that prevents 64 bit requests is limiting max_pdiscard variable to INT_MAX in bdrv_co_pdiscard(). We'll drop this limitation after updating all block drivers. Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Eric Blake Message-Id: <20210903102807.27127-10-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake --- include/block/block_int.h | 11 ++++++----- block/io.c | 3 ++- 2 files changed, 8 insertions(+), 6 deletions(-) diff --git a/include/block/block_int.h b/include/block/block_int.h index d518703e3e59..9b4e0748bc86 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -674,11 +674,12 @@ typedef struct BlockLimits { * otherwise. */ uint32_t request_alignment; - /* Maximum number of bytes that can be discarded at once (since it - * is signed, it must be < 2G, if set). Must be multiple of - * pdiscard_alignment, but need not be power of 2. May be 0 if no - * inherent 32-bit limit */ - int32_t max_pdiscard; + /* + * Maximum number of bytes that can be discarded at once. Must be multiple + * of pdiscard_alignment, but need not be power of 2. May be 0 if no + * inherent 64-bit limit. + */ + int64_t max_pdiscard; /* Optimal alignment for discard requests in bytes. A power of 2 * is best but not mandatory. Must be a multiple of diff --git a/block/io.c b/block/io.c index e40462742ea1..3846e2ed961b 100644 --- a/block/io.c +++ b/block/io.c @@ -3056,7 +3056,8 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes) { BdrvTrackedRequest req; - int max_pdiscard, ret; + int ret; + int64_t max_pdiscard; int head, tail, align; BlockDriverState *bs = child->bs; From patchwork Mon Sep 27 21:55:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 12521163 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D3D20C433EF for ; Mon, 27 Sep 2021 22:11:03 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 589916101A for ; Mon, 27 Sep 2021 22:11:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 589916101A Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:35950 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mUypy-000337-GL for qemu-devel@archiver.kernel.org; Mon, 27 Sep 2021 18:11:02 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54574) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybj-0002iN-Sl for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:21 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:52990) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybg-0002ym-GJ for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:19 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632779776; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Vdh1hb7y00u4QhopaI/tHMIt6EpwFydgNaJ0WXDkmX0=; b=bRn2qMt0BgopuXSru3TkD399VKVV+lQgWQ1eGoeEqEHK1FxHQe60dlfL8xV/Y2pE4LTLhm UgOLrC84P1FnAFSHwtrJEad4yC0TZ4UdrjonWL7tThMHWvZWILdh8Vudqk2gz5XKkKk0ux L1g8SmKkgjqTWjyjJnM09XOXsdl5CC8= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-399-RQG-ffpIMyGgUfZoe8Gr7w-1; Mon, 27 Sep 2021 17:56:12 -0400 X-MC-Unique: RQG-ffpIMyGgUfZoe8Gr7w-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id CAC235074B; Mon, 27 Sep 2021 21:56:10 +0000 (UTC) Received: from blue.redhat.com (ovpn-113-87.phx2.redhat.com [10.3.113.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id AA2575D9D5; Mon, 27 Sep 2021 21:56:09 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Subject: [PULL 11/20] block: use int64_t instead of int in driver discard handlers Date: Mon, 27 Sep 2021 16:55:36 -0500 Message-Id: <20210927215545.3930309-12-eblake@redhat.com> In-Reply-To: <20210927215545.3930309-1-eblake@redhat.com> References: <20210927215545.3930309-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eblake@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=216.205.24.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , Vladimir Sementsov-Ogievskiy , Ronnie Sahlberg , "open list:blkdebug" , =?utf-8?q?Philippe_Mathieu-Da?= =?utf-8?q?ud=C3=A9?= , Peter Lieven , Stefan Hajnoczi , Hanna Reitz , "open list:GLUSTER" , Pavel Dovgalyuk , Paolo Bonzini , Ilya Dryomov , John Snow , Ari Sundholm Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Vladimir Sementsov-Ogievskiy We are generally moving to int64_t for both offset and bytes parameters on all io paths. Main motivation is realization of 64-bit write_zeroes operation for fast zeroing large disk chunks, up to the whole disk. We chose signed type, to be consistent with off_t (which is signed) and with possibility for signed return type (where negative value means error). So, convert driver discard handlers bytes parameter to int64_t. The only caller of all updated function is bdrv_co_pdiscard in block/io.c. It is already prepared to work with 64bit requests, but pass at most max(bs->bl.max_pdiscard, INT_MAX) to the driver. Let's look at all updated functions: blkdebug: all calculations are still OK, thanks to bdrv_check_qiov_request(). both rule_check and bdrv_co_pdiscard are 64bit blklogwrites: pass to blk_loc_writes_co_log which is 64bit blkreplay, copy-on-read, filter-compress: pass to bdrv_co_pdiscard, OK copy-before-write: pass to bdrv_co_pdiscard which is 64bit and to cbw_do_copy_before_write which is 64bit file-posix: one handler calls raw_account_discard() is 64bit and both handlers calls raw_do_pdiscard(). Update raw_do_pdiscard, which pass to RawPosixAIOData::aio_nbytes, which is 64bit (and calls raw_account_discard()) gluster: somehow, third argument of glfs_discard_async is size_t. Let's set max_pdiscard accordingly. iscsi: iscsi_allocmap_set_invalid is 64bit, !is_byte_request_lun_aligned is 64bit. list.num is uint32_t. Let's clarify max_pdiscard and pdiscard_alignment. mirror_top: pass to bdrv_mirror_top_do_write() which is 64bit nbd: protocol limitation. max_pdiscard is alredy set strict enough, keep it as is for now. nvme: buf.nlb is uint32_t and we do shift. So, add corresponding limits to nvme_refresh_limits(). preallocate: pass to bdrv_co_pdiscard() which is 64bit. rbd: pass to qemu_rbd_start_co() which is 64bit. qcow2: calculations are still OK, thanks to bdrv_check_qiov_request(), qcow2_cluster_discard() is 64bit. raw-format: raw_adjust_offset() is 64bit, bdrv_co_pdiscard too. throttle: pass to bdrv_co_pdiscard() which is 64bit and to throttle_group_co_io_limits_intercept() which is 64bit as well. test-block-iothread: bytes argument is unused Great! Now all drivers are prepared to handle 64bit discard requests, or else have explicit max_pdiscard limits. Signed-off-by: Vladimir Sementsov-Ogievskiy Message-Id: <20210903102807.27127-11-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake Signed-off-by: Eric Blake --- include/block/block_int.h | 2 +- block/blkdebug.c | 2 +- block/blklogwrites.c | 4 ++-- block/blkreplay.c | 2 +- block/copy-before-write.c | 2 +- block/copy-on-read.c | 2 +- block/file-posix.c | 7 ++++--- block/filter-compress.c | 2 +- block/gluster.c | 7 +++++-- block/iscsi.c | 16 +++++++++++----- block/mirror.c | 2 +- block/nbd.c | 6 ++++-- block/nvme.c | 14 +++++++++++++- block/preallocate.c | 2 +- block/qcow2.c | 2 +- block/raw-format.c | 2 +- block/rbd.c | 4 ++-- block/throttle.c | 2 +- tests/unit/test-block-iothread.c | 2 +- block/trace-events | 4 ++-- 20 files changed, 55 insertions(+), 31 deletions(-) diff --git a/include/block/block_int.h b/include/block/block_int.h index 9b4e0748bc86..ffe86068d4d5 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -303,7 +303,7 @@ struct BlockDriver { int coroutine_fn (*bdrv_co_pwrite_zeroes)(BlockDriverState *bs, int64_t offset, int64_t bytes, BdrvRequestFlags flags); int coroutine_fn (*bdrv_co_pdiscard)(BlockDriverState *bs, - int64_t offset, int bytes); + int64_t offset, int64_t bytes); /* Map [offset, offset + nbytes) range onto a child of @bs to copy from, * and invoke bdrv_co_copy_range_from(child, ...), or invoke diff --git a/block/blkdebug.c b/block/blkdebug.c index 742b4a3834d8..bbf294870308 100644 --- a/block/blkdebug.c +++ b/block/blkdebug.c @@ -717,7 +717,7 @@ static int coroutine_fn blkdebug_co_pwrite_zeroes(BlockDriverState *bs, } static int coroutine_fn blkdebug_co_pdiscard(BlockDriverState *bs, - int64_t offset, int bytes) + int64_t offset, int64_t bytes) { uint32_t align = bs->bl.pdiscard_alignment; int err; diff --git a/block/blklogwrites.c b/block/blklogwrites.c index d7ae64c22d81..f7a251e91f9e 100644 --- a/block/blklogwrites.c +++ b/block/blklogwrites.c @@ -484,9 +484,9 @@ static int coroutine_fn blk_log_writes_co_flush_to_disk(BlockDriverState *bs) } static int coroutine_fn -blk_log_writes_co_pdiscard(BlockDriverState *bs, int64_t offset, int count) +blk_log_writes_co_pdiscard(BlockDriverState *bs, int64_t offset, int64_t bytes) { - return blk_log_writes_co_log(bs, offset, count, NULL, 0, + return blk_log_writes_co_log(bs, offset, bytes, NULL, 0, blk_log_writes_co_do_file_pdiscard, LOG_DISCARD_FLAG, false); } diff --git a/block/blkreplay.c b/block/blkreplay.c index 89d74a3cca29..dcbe780ddbd3 100644 --- a/block/blkreplay.c +++ b/block/blkreplay.c @@ -105,7 +105,7 @@ static int coroutine_fn blkreplay_co_pwrite_zeroes(BlockDriverState *bs, } static int coroutine_fn blkreplay_co_pdiscard(BlockDriverState *bs, - int64_t offset, int bytes) + int64_t offset, int64_t bytes) { uint64_t reqid = blkreplay_next_id(); int ret = bdrv_co_pdiscard(bs->file, offset, bytes); diff --git a/block/copy-before-write.c b/block/copy-before-write.c index d210e87a4511..c30a5ff8dea9 100644 --- a/block/copy-before-write.c +++ b/block/copy-before-write.c @@ -64,7 +64,7 @@ static coroutine_fn int cbw_do_copy_before_write(BlockDriverState *bs, } static int coroutine_fn cbw_co_pdiscard(BlockDriverState *bs, - int64_t offset, int bytes) + int64_t offset, int64_t bytes) { int ret = cbw_do_copy_before_write(bs, offset, bytes, 0); if (ret < 0) { diff --git a/block/copy-on-read.c b/block/copy-on-read.c index f83dd83f14ed..1fc7fb3333b1 100644 --- a/block/copy-on-read.c +++ b/block/copy-on-read.c @@ -201,7 +201,7 @@ static int coroutine_fn cor_co_pwrite_zeroes(BlockDriverState *bs, static int coroutine_fn cor_co_pdiscard(BlockDriverState *bs, - int64_t offset, int bytes) + int64_t offset, int64_t bytes) { return bdrv_co_pdiscard(bs->file, offset, bytes); } diff --git a/block/file-posix.c b/block/file-posix.c index f375070f25bb..c62e42743de1 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -2942,7 +2942,8 @@ static void raw_account_discard(BDRVRawState *s, uint64_t nbytes, int ret) } static coroutine_fn int -raw_do_pdiscard(BlockDriverState *bs, int64_t offset, int bytes, bool blkdev) +raw_do_pdiscard(BlockDriverState *bs, int64_t offset, int64_t bytes, + bool blkdev) { BDRVRawState *s = bs->opaque; RawPosixAIOData acb; @@ -2966,7 +2967,7 @@ raw_do_pdiscard(BlockDriverState *bs, int64_t offset, int bytes, bool blkdev) } static coroutine_fn int -raw_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes) +raw_co_pdiscard(BlockDriverState *bs, int64_t offset, int64_t bytes) { return raw_do_pdiscard(bs, offset, bytes, false); } @@ -3591,7 +3592,7 @@ hdev_co_ioctl(BlockDriverState *bs, unsigned long int req, void *buf) #endif /* linux */ static coroutine_fn int -hdev_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes) +hdev_co_pdiscard(BlockDriverState *bs, int64_t offset, int64_t bytes) { BDRVRawState *s = bs->opaque; int ret; diff --git a/block/filter-compress.c b/block/filter-compress.c index fb85686b6930..d5be538619ae 100644 --- a/block/filter-compress.c +++ b/block/filter-compress.c @@ -94,7 +94,7 @@ static int coroutine_fn compress_co_pwrite_zeroes(BlockDriverState *bs, static int coroutine_fn compress_co_pdiscard(BlockDriverState *bs, - int64_t offset, int bytes) + int64_t offset, int64_t bytes) { return bdrv_co_pdiscard(bs->file, offset, bytes); } diff --git a/block/gluster.c b/block/gluster.c index 4e3c9cd14fd9..398976bc66d2 100644 --- a/block/gluster.c +++ b/block/gluster.c @@ -891,6 +891,7 @@ out: static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp) { bs->bl.max_transfer = GLUSTER_MAX_TRANSFER; + bs->bl.max_pdiscard = SIZE_MAX; } static int qemu_gluster_reopen_prepare(BDRVReopenState *state, @@ -1297,18 +1298,20 @@ error: #ifdef CONFIG_GLUSTERFS_DISCARD static coroutine_fn int qemu_gluster_co_pdiscard(BlockDriverState *bs, - int64_t offset, int size) + int64_t offset, int64_t bytes) { int ret; GlusterAIOCB acb; BDRVGlusterState *s = bs->opaque; + assert(bytes <= SIZE_MAX); /* rely on max_pdiscard */ + acb.size = 0; acb.ret = 0; acb.coroutine = qemu_coroutine_self(); acb.aio_context = bdrv_get_aio_context(bs); - ret = glfs_discard_async(s->fd, offset, size, gluster_finish_aiocb, &acb); + ret = glfs_discard_async(s->fd, offset, bytes, gluster_finish_aiocb, &acb); if (ret < 0) { return -errno; } diff --git a/block/iscsi.c b/block/iscsi.c index 74ff7e307ece..57aa07a40d7f 100644 --- a/block/iscsi.c +++ b/block/iscsi.c @@ -1138,7 +1138,8 @@ iscsi_getlength(BlockDriverState *bs) } static int -coroutine_fn iscsi_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes) +coroutine_fn iscsi_co_pdiscard(BlockDriverState *bs, int64_t offset, + int64_t bytes) { IscsiLun *iscsilun = bs->opaque; struct IscsiTask iTask; @@ -1154,6 +1155,12 @@ coroutine_fn iscsi_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes) return 0; } + /* + * We don't want to overflow list.num which is uint32_t. + * We rely on our max_pdiscard. + */ + assert(bytes / iscsilun->block_size <= UINT32_MAX); + list.lba = offset / iscsilun->block_size; list.num = bytes / iscsilun->block_size; @@ -2071,10 +2078,9 @@ static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp) } if (iscsilun->lbp.lbpu) { - if (iscsilun->bl.max_unmap < 0xffffffff / block_size) { - bs->bl.max_pdiscard = - iscsilun->bl.max_unmap * iscsilun->block_size; - } + bs->bl.max_pdiscard = + MIN_NON_ZERO(iscsilun->bl.max_unmap * iscsilun->block_size, + (uint64_t)UINT32_MAX * iscsilun->block_size); bs->bl.pdiscard_alignment = iscsilun->bl.opt_unmap_gran * iscsilun->block_size; } else { diff --git a/block/mirror.c b/block/mirror.c index fab75087b0ba..c962e8b47128 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -1508,7 +1508,7 @@ static int coroutine_fn bdrv_mirror_top_pwrite_zeroes(BlockDriverState *bs, } static int coroutine_fn bdrv_mirror_top_pdiscard(BlockDriverState *bs, - int64_t offset, int bytes) + int64_t offset, int64_t bytes) { return bdrv_mirror_top_do_write(bs, MIRROR_METHOD_DISCARD, offset, bytes, NULL, 0); diff --git a/block/nbd.c b/block/nbd.c index c0c479abe9ce..a66b2c282dc3 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -1457,15 +1457,17 @@ static int nbd_client_co_flush(BlockDriverState *bs) } static int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, - int bytes) + int64_t bytes) { BDRVNBDState *s = (BDRVNBDState *)bs->opaque; NBDRequest request = { .type = NBD_CMD_TRIM, .from = offset, - .len = bytes, + .len = bytes, /* len is uint32_t */ }; + assert(bytes <= UINT32_MAX); /* rely on max_pdiscard */ + assert(!(s->info.flags & NBD_FLAG_READ_ONLY)); if (!(s->info.flags & NBD_FLAG_SEND_TRIM) || !bytes) { return 0; diff --git a/block/nvme.c b/block/nvme.c index 2e0fd9e76a2d..1cc7b62bb4b6 100644 --- a/block/nvme.c +++ b/block/nvme.c @@ -1360,7 +1360,7 @@ static coroutine_fn int nvme_co_pwrite_zeroes(BlockDriverState *bs, static int coroutine_fn nvme_co_pdiscard(BlockDriverState *bs, int64_t offset, - int bytes) + int64_t bytes) { BDRVNVMeState *s = bs->opaque; NVMeQueuePair *ioq = s->queues[INDEX_IO(0)]; @@ -1387,6 +1387,14 @@ static int coroutine_fn nvme_co_pdiscard(BlockDriverState *bs, assert(s->queue_count > 1); + /* + * Filling the @buf requires @offset and @bytes to satisfy restrictions + * defined in nvme_refresh_limits(). + */ + assert(QEMU_IS_ALIGNED(bytes, 1UL << s->blkshift)); + assert(QEMU_IS_ALIGNED(offset, 1UL << s->blkshift)); + assert((bytes >> s->blkshift) <= UINT32_MAX); + buf = qemu_try_memalign(s->page_size, s->page_size); if (!buf) { return -ENOMEM; @@ -1490,6 +1498,10 @@ static void nvme_refresh_limits(BlockDriverState *bs, Error **errp) bs->bl.max_pwrite_zeroes = 1ULL << (s->blkshift + 16); bs->bl.pwrite_zeroes_alignment = MAX(bs->bl.request_alignment, 1UL << s->blkshift); + + bs->bl.max_pdiscard = (uint64_t)UINT32_MAX << s->blkshift; + bs->bl.pdiscard_alignment = MAX(bs->bl.request_alignment, + 1UL << s->blkshift); } static void nvme_detach_aio_context(BlockDriverState *bs) diff --git a/block/preallocate.c b/block/preallocate.c index 99e28d9f086e..1d4233f73008 100644 --- a/block/preallocate.c +++ b/block/preallocate.c @@ -235,7 +235,7 @@ static coroutine_fn int preallocate_co_preadv_part( } static int coroutine_fn preallocate_co_pdiscard(BlockDriverState *bs, - int64_t offset, int bytes) + int64_t offset, int64_t bytes) { return bdrv_co_pdiscard(bs->file, offset, bytes); } diff --git a/block/qcow2.c b/block/qcow2.c index 4b2e8694952f..d50901675699 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -3996,7 +3996,7 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs, } static coroutine_fn int qcow2_co_pdiscard(BlockDriverState *bs, - int64_t offset, int bytes) + int64_t offset, int64_t bytes) { int ret; BDRVQcow2State *s = bs->opaque; diff --git a/block/raw-format.c b/block/raw-format.c index a2485926b8cd..bda757fd1954 100644 --- a/block/raw-format.c +++ b/block/raw-format.c @@ -302,7 +302,7 @@ static int coroutine_fn raw_co_pwrite_zeroes(BlockDriverState *bs, } static int coroutine_fn raw_co_pdiscard(BlockDriverState *bs, - int64_t offset, int bytes) + int64_t offset, int64_t bytes) { int ret; diff --git a/block/rbd.c b/block/rbd.c index 053eb8e48f44..701fbf2b0cf0 100644 --- a/block/rbd.c +++ b/block/rbd.c @@ -1197,9 +1197,9 @@ static int coroutine_fn qemu_rbd_co_flush(BlockDriverState *bs) } static int coroutine_fn qemu_rbd_co_pdiscard(BlockDriverState *bs, - int64_t offset, int count) + int64_t offset, int64_t bytes) { - return qemu_rbd_start_co(bs, offset, count, NULL, 0, RBD_AIO_DISCARD); + return qemu_rbd_start_co(bs, offset, bytes, NULL, 0, RBD_AIO_DISCARD); } #ifdef LIBRBD_SUPPORTS_WRITE_ZEROES diff --git a/block/throttle.c b/block/throttle.c index c13fe9067f4e..6e8d52fa2451 100644 --- a/block/throttle.c +++ b/block/throttle.c @@ -145,7 +145,7 @@ static int coroutine_fn throttle_co_pwrite_zeroes(BlockDriverState *bs, } static int coroutine_fn throttle_co_pdiscard(BlockDriverState *bs, - int64_t offset, int bytes) + int64_t offset, int64_t bytes) { ThrottleGroupMember *tgm = bs->opaque; throttle_group_co_io_limits_intercept(tgm, bytes, true); diff --git a/tests/unit/test-block-iothread.c b/tests/unit/test-block-iothread.c index c86954c7baa1..aea660aeed86 100644 --- a/tests/unit/test-block-iothread.c +++ b/tests/unit/test-block-iothread.c @@ -48,7 +48,7 @@ static int coroutine_fn bdrv_test_co_pwritev(BlockDriverState *bs, } static int coroutine_fn bdrv_test_co_pdiscard(BlockDriverState *bs, - int64_t offset, int bytes) + int64_t offset, int64_t bytes) { return 0; } diff --git a/block/trace-events b/block/trace-events index d8a08563f1ca..f2d0a9b62a7e 100644 --- a/block/trace-events +++ b/block/trace-events @@ -152,8 +152,8 @@ nvme_write_zeroes(void *s, uint64_t offset, uint64_t bytes, int flags) "s %p off nvme_qiov_unaligned(const void *qiov, int n, void *base, size_t size, int align) "qiov %p n %d base %p size 0x%zx align 0x%x" nvme_prw_buffered(void *s, uint64_t offset, uint64_t bytes, int niov, int is_write) "s %p offset 0x%"PRIx64" bytes %"PRId64" niov %d is_write %d" nvme_rw_done(void *s, int is_write, uint64_t offset, uint64_t bytes, int ret) "s %p is_write %d offset 0x%"PRIx64" bytes %"PRId64" ret %d" -nvme_dsm(void *s, uint64_t offset, uint64_t bytes) "s %p offset 0x%"PRIx64" bytes %"PRId64"" -nvme_dsm_done(void *s, uint64_t offset, uint64_t bytes, int ret) "s %p offset 0x%"PRIx64" bytes %"PRId64" ret %d" +nvme_dsm(void *s, int64_t offset, int64_t bytes) "s %p offset 0x%"PRIx64" bytes %"PRId64"" +nvme_dsm_done(void *s, int64_t offset, int64_t bytes, int ret) "s %p offset 0x%"PRIx64" bytes %"PRId64" ret %d" nvme_dma_map_flush(void *s) "s %p" nvme_free_req_queue_wait(void *s, unsigned q_index) "s %p q #%u" nvme_create_queue_pair(unsigned q_index, void *q, size_t size, void *aio_context, int fd) "index %u q %p size %zu aioctx %p fd %d" From patchwork Mon Sep 27 21:55:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 12521161 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B5F29C433F5 for ; Mon, 27 Sep 2021 22:10:43 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6264460FF2 for ; Mon, 27 Sep 2021 22:10:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 6264460FF2 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:35654 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mUype-0002rT-Cr for qemu-devel@archiver.kernel.org; Mon, 27 Sep 2021 18:10:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54628) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybl-0002j7-Qp for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:23 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:34631) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybh-0002yy-6o for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:20 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632779776; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nYcY8/gPpGTrFbLOdoGvl9L1U7nOty+5ARQWlo+lWf0=; b=ERpVzBVSzluS3HmPVTfVk3oVYmHHJJPqpR1l/N/6wvHcQb+wvoDa53XcR9tH2HGcxvPkex 7lUYiKq5kUQ/qRMfzF32RM/2UVEgRIPf0JGr7vNRaR/aZKPTe2dDSn2gMb1CZ9k0wd+aFg qHj03RjqCD9MrfF8NaOz5Q2LbIWWbXY= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-97-EFklq3HmMlGbWrNtvPru_w-1; Mon, 27 Sep 2021 17:56:15 -0400 X-MC-Unique: EFklq3HmMlGbWrNtvPru_w-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 02ED6801E72; Mon, 27 Sep 2021 21:56:14 +0000 (UTC) Received: from blue.redhat.com (ovpn-113-87.phx2.redhat.com [10.3.113.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id F2F2E5D9D5; Mon, 27 Sep 2021 21:56:10 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Subject: [PULL 12/20] block/io: allow 64bit discard requests Date: Mon, 27 Sep 2021 16:55:37 -0500 Message-Id: <20210927215545.3930309-13-eblake@redhat.com> In-Reply-To: <20210927215545.3930309-1-eblake@redhat.com> References: <20210927215545.3930309-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eblake@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=216.205.24.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Fam Zheng , Kevin Wolf , Vladimir Sementsov-Ogievskiy , "open list:Block I/O path" , Hanna Reitz , Stefan Hajnoczi Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Vladimir Sementsov-Ogievskiy Now that all drivers are updated by the previous commit, we can drop the last limiter on pdiscard path: INT_MAX in bdrv_co_pdiscard(). Now everything is prepared for implementing incredibly cool and fast big-discard requests in NBD and qcow2. And any other driver which wants it of course. Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Eric Blake Message-Id: <20210903102807.27127-12-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake --- block/io.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/io.c b/block/io.c index 3846e2ed961b..18d345a87af3 100644 --- a/block/io.c +++ b/block/io.c @@ -3104,7 +3104,7 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, goto out; } - max_pdiscard = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_pdiscard, INT_MAX), + max_pdiscard = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_pdiscard, INT64_MAX), align); assert(max_pdiscard >= bs->bl.request_alignment); From patchwork Mon Sep 27 21:55:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 12521177 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3F267C433F5 for ; Mon, 27 Sep 2021 22:17:43 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E38A560F94 for ; Mon, 27 Sep 2021 22:17:42 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org E38A560F94 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:50370 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mUywQ-0004fU-22 for qemu-devel@archiver.kernel.org; Mon, 27 Sep 2021 18:17:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54662) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybo-0002lK-1Y for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:24 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:29455) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybj-00031j-0a for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:23 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632779778; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=EQQanGm4fTPSC6uaDARC04N2wLZ+oPBky28ifjvlJpo=; b=K5sdzpam0yJXLLIbWtEy3DOc2nJVHWB6yTmB6oakreoTfI75Mv4uklPZdyemEgeC3TnMm8 6yI3xg1UycLciTzSZRiS9K1qagE9VTybB7hoLvILQBt5XCw2MRItRhhdDklMwsRjBMPKCG TkZtU7s52VD8iH5KGskVOL9gNTweTpY= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-154-8F6jCiSsOrS5K4ii4W9odg-1; Mon, 27 Sep 2021 17:56:15 -0400 X-MC-Unique: 8F6jCiSsOrS5K4ii4W9odg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 88C5E18125C2; Mon, 27 Sep 2021 21:56:14 +0000 (UTC) Received: from blue.redhat.com (ovpn-113-87.phx2.redhat.com [10.3.113.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 37F875D9D5; Mon, 27 Sep 2021 21:56:14 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Subject: [PULL 13/20] nbd/server: Allow LIST_META_CONTEXT without STRUCTURED_REPLY Date: Mon, 27 Sep 2021 16:55:38 -0500 Message-Id: <20210927215545.3930309-14-eblake@redhat.com> In-Reply-To: <20210927215545.3930309-1-eblake@redhat.com> References: <20210927215545.3930309-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eblake@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=170.10.133.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Vladimir Sementsov-Ogievskiy , "open list:Network Block Dev..." Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" The NBD protocol just relaxed the requirements on NBD_OPT_LIST_META_CONTEXT: https://github.com/NetworkBlockDevice/nbd/commit/13a4e33a87 Since listing is not stateful (unlike SET_META_CONTEXT), we don't care if a client asks for meta contexts without first requesting structured replies. Well-behaved clients will still ask for structured reply first (if for no other reason than for back-compat to older servers), but that's no reason to avoid this change. Signed-off-by: Eric Blake Message-Id: <20210907173505.1499709-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy --- nbd/server.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/nbd/server.c b/nbd/server.c index 3927f7789dcf..6d03e8a4b436 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -980,7 +980,7 @@ static int nbd_negotiate_meta_queries(NBDClient *client, size_t i; size_t count = 0; - if (!client->structured_reply) { + if (client->opt == NBD_OPT_SET_META_CONTEXT && !client->structured_reply) { return nbd_opt_invalid(client, errp, "request option '%s' when structured reply " "is not negotiated", From patchwork Mon Sep 27 21:55:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 12521165 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 820DDC433EF for ; Mon, 27 Sep 2021 22:13:01 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id F1E0F60F94 for ; Mon, 27 Sep 2021 22:13:00 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org F1E0F60F94 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:42362 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mUyrs-0007WL-1B for qemu-devel@archiver.kernel.org; Mon, 27 Sep 2021 18:13:00 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54582) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybk-0002iQ-55 for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:21 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:40447) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybi-00030x-Bk for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:19 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632779777; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5qJkNWxgCdSvtca0kiCyehbqTU5dpRbBVer+mbjjHqg=; b=NfxMIYXDdD9PTPAcc8HYgfRRH0uoluQ27gIL9I2RifpY3TugBTxu7JnipHSJTO88PW/p/Y wRLhWViZevuMhND7lgkRr3GlkuXeQyTGEctkG6STMGBkT0Lx1kvJe2maeu8RD7wYKpI3ol wnvITaAWeYm0RWV2AwzoUycB062CypU= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-444-3Ber0sAXNgmYLp18541dRw-1; Mon, 27 Sep 2021 17:56:16 -0400 X-MC-Unique: 3Ber0sAXNgmYLp18541dRw-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 13B6B824FA6; Mon, 27 Sep 2021 21:56:15 +0000 (UTC) Received: from blue.redhat.com (ovpn-113-87.phx2.redhat.com [10.3.113.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id B7F7C5D9D5; Mon, 27 Sep 2021 21:56:14 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Subject: [PULL 14/20] nbd/client-connection: nbd_co_establish_connection(): fix non set errp Date: Mon, 27 Sep 2021 16:55:39 -0500 Message-Id: <20210927215545.3930309-15-eblake@redhat.com> In-Reply-To: <20210927215545.3930309-1-eblake@redhat.com> References: <20210927215545.3930309-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eblake@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=170.10.133.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Vladimir Sementsov-Ogievskiy , "open list:Network Block Dev..." Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Vladimir Sementsov-Ogievskiy When we don't have a connection and blocking is false, we return NULL but don't set errp. That's wrong. We have two paths for calling nbd_co_establish_connection(): 1. nbd_open() -> nbd_do_establish_connection() -> ... but that will never set blocking=false 2. nbd_reconnect_attempt() -> nbd_co_do_establish_connection() -> ... but that uses errp=NULL So, we are safe with our wrong errp policy in nbd_co_establish_connection(). Still let's fix it. Signed-off-by: Vladimir Sementsov-Ogievskiy Message-Id: <20210906190654.183421-2-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake Signed-off-by: Eric Blake --- nbd/client-connection.c | 1 + 1 file changed, 1 insertion(+) diff --git a/nbd/client-connection.c b/nbd/client-connection.c index 7123b1e18900..695f85575414 100644 --- a/nbd/client-connection.c +++ b/nbd/client-connection.c @@ -318,6 +318,7 @@ nbd_co_establish_connection(NBDClientConnection *conn, NBDExportInfo *info, } if (!blocking) { + error_setg(errp, "No connection at the moment"); return NULL; } From patchwork Mon Sep 27 21:55:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 12521167 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 32070C433EF for ; Mon, 27 Sep 2021 22:13:31 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CE71F610A2 for ; Mon, 27 Sep 2021 22:13:30 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org CE71F610A2 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:43148 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mUysM-00082D-22 for qemu-devel@archiver.kernel.org; Mon, 27 Sep 2021 18:13:30 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54706) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybp-0002nu-PK for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:26 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:30488) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybl-00033g-Ir for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:25 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632779780; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=EakKonmquLM+MKGqgB7Pn8AV0UXhkZK4cNuUbpxiS+c=; b=g58qZ2MYwLShfvwgn+94DquYSsz4WuAgdb5uHFNTzx5+jS6tTmn81DZZR+ee4HbrkkvPnc jm4V2PDdBbWfYmq3M4hhftlmOaUuXVINf7JRi7Fa5Y2/8HIQJJOLCsr8bdfy53hSp87eO6 0wlQs0xlYrIsI9My4fQCBhAGgae01Hs= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-587-3xRWWIlWNGKltM34DqFu0g-1; Mon, 27 Sep 2021 17:56:16 -0400 X-MC-Unique: 3xRWWIlWNGKltM34DqFu0g-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id A9C075074C; Mon, 27 Sep 2021 21:56:15 +0000 (UTC) Received: from blue.redhat.com (ovpn-113-87.phx2.redhat.com [10.3.113.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3FC615D9D5; Mon, 27 Sep 2021 21:56:15 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Subject: [PULL 15/20] block/nbd: nbd_channel_error() shutdown channel unconditionally Date: Mon, 27 Sep 2021 16:55:40 -0500 Message-Id: <20210927215545.3930309-16-eblake@redhat.com> In-Reply-To: <20210927215545.3930309-1-eblake@redhat.com> References: <20210927215545.3930309-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eblake@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=216.205.24.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Hanna Reitz , Vladimir Sementsov-Ogievskiy , "open list:Network Block Dev..." Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Vladimir Sementsov-Ogievskiy Don't rely on connection being totally broken in case of -EIO. Safer and more correct is to just shut down the channel anyway, since we change the state and plan on reconnecting. Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Eric Blake Message-Id: <20210902103805.25686-2-vsementsov@virtuozzo.com> [eblake: grammar tweaks] Signed-off-by: Eric Blake --- block/nbd.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/block/nbd.c b/block/nbd.c index a66b2c282dc3..de59e76378ab 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -129,15 +129,16 @@ static bool nbd_client_connected(BDRVNBDState *s) static void nbd_channel_error(BDRVNBDState *s, int ret) { + if (nbd_client_connected(s)) { + qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL); + } + if (ret == -EIO) { if (nbd_client_connected(s)) { s->state = s->reconnect_delay ? NBD_CLIENT_CONNECTING_WAIT : NBD_CLIENT_CONNECTING_NOWAIT; } } else { - if (nbd_client_connected(s)) { - qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL); - } s->state = NBD_CLIENT_QUIT; } } From patchwork Mon Sep 27 21:55:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 12521081 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DC296C433EF for ; Mon, 27 Sep 2021 22:05:57 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A32A060EB6 for ; Mon, 27 Sep 2021 22:05:57 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org A32A060EB6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:49912 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mUyl2-0001n6-SG for qemu-devel@archiver.kernel.org; Mon, 27 Sep 2021 18:05:56 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54646) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybn-0002jp-FP for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:23 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:60444) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybj-00032P-Ln for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:23 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632779778; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ljiF9o/BxZLMVY9v+EznM1MHruhp9Mf/QYChNFexN94=; b=DIFV3MM2wrYN0O+kkJtFkgEQ5amiaU1eWbBcoudea4/1LGpU65eEyL0e3mZSvCBhPCEtbS u1pT/d40vsroxRKMB8d/ebbqrmZ6b03NGbj7VQ6akO+FrBP07nvD5mGpblnODuQKF6qL3f 6c7Eih4xQvOfViQ+nw0Iw9yLLMpsFpc= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-369-Ltlq3K3oOKuxtKKI6jK57w-1; Mon, 27 Sep 2021 17:56:17 -0400 X-MC-Unique: Ltlq3K3oOKuxtKKI6jK57w-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 4CA3B824FA8; Mon, 27 Sep 2021 21:56:16 +0000 (UTC) Received: from blue.redhat.com (ovpn-113-87.phx2.redhat.com [10.3.113.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id D6B2A5D9D5; Mon, 27 Sep 2021 21:56:15 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Subject: [PULL 16/20] block/nbd: move nbd_recv_coroutines_wake_all() up Date: Mon, 27 Sep 2021 16:55:41 -0500 Message-Id: <20210927215545.3930309-17-eblake@redhat.com> In-Reply-To: <20210927215545.3930309-1-eblake@redhat.com> References: <20210927215545.3930309-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eblake@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=216.205.24.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Hanna Reitz , Vladimir Sementsov-Ogievskiy , "open list:Network Block Dev..." Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Vladimir Sementsov-Ogievskiy We are going to use it in nbd_channel_error(), so move it up. Note, that we are going also refactor and rename nbd_recv_coroutines_wake_all() in future anyway, so keeping it where it is and making forward declaration doesn't make real sense. Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Eric Blake Message-Id: <20210902103805.25686-3-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake --- block/nbd.c | 32 ++++++++++++++++---------------- 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/block/nbd.c b/block/nbd.c index de59e76378ab..2842e6263fe4 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -127,22 +127,6 @@ static bool nbd_client_connected(BDRVNBDState *s) return qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTED; } -static void nbd_channel_error(BDRVNBDState *s, int ret) -{ - if (nbd_client_connected(s)) { - qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL); - } - - if (ret == -EIO) { - if (nbd_client_connected(s)) { - s->state = s->reconnect_delay ? NBD_CLIENT_CONNECTING_WAIT : - NBD_CLIENT_CONNECTING_NOWAIT; - } - } else { - s->state = NBD_CLIENT_QUIT; - } -} - static void nbd_recv_coroutines_wake_all(BDRVNBDState *s) { int i; @@ -157,6 +141,22 @@ static void nbd_recv_coroutines_wake_all(BDRVNBDState *s) } } +static void nbd_channel_error(BDRVNBDState *s, int ret) +{ + if (nbd_client_connected(s)) { + qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL); + } + + if (ret == -EIO) { + if (nbd_client_connected(s)) { + s->state = s->reconnect_delay ? NBD_CLIENT_CONNECTING_WAIT : + NBD_CLIENT_CONNECTING_NOWAIT; + } + } else { + s->state = NBD_CLIENT_QUIT; + } +} + static void reconnect_delay_timer_del(BDRVNBDState *s) { if (s->reconnect_delay_timer) { From patchwork Mon Sep 27 21:55:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 12521159 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E168DC433F5 for ; Mon, 27 Sep 2021 22:09:21 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6A37360F3A for ; Mon, 27 Sep 2021 22:09:21 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 6A37360F3A Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:33322 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mUyoK-0001Gj-H4 for qemu-devel@archiver.kernel.org; Mon, 27 Sep 2021 18:09:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54676) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybo-0002nf-HI for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:26 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:46402) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybk-000333-1k for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:24 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632779779; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=tv592j8nE0eDGBLj7h5GztvPkezcCSs+Fyejz5ZWsGQ=; b=bma45o2MufA6w+9g2mVHFZDPblZkZP4WfiSSJP0++CVlsXU1VcSVZ8NoL9eHJQMoiR2FWS 2X7zamsMaZppwIYSe+mdl1kXVLorMPPd7hecEO5xkaE705QqvKF1w9cvA9JxjMg4OyEwS9 KjpCBJ0qRufRAKp1GYjGURWU52MTkF4= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-285-EUlYHKVHNI-sfoxyj-LKWg-1; Mon, 27 Sep 2021 17:56:18 -0400 X-MC-Unique: EUlYHKVHNI-sfoxyj-LKWg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id ED1C81006AA2; Mon, 27 Sep 2021 21:56:16 +0000 (UTC) Received: from blue.redhat.com (ovpn-113-87.phx2.redhat.com [10.3.113.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 829CE5D9D5; Mon, 27 Sep 2021 21:56:16 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Subject: [PULL 17/20] block/nbd: refactor nbd_recv_coroutines_wake_all() Date: Mon, 27 Sep 2021 16:55:42 -0500 Message-Id: <20210927215545.3930309-18-eblake@redhat.com> In-Reply-To: <20210927215545.3930309-1-eblake@redhat.com> References: <20210927215545.3930309-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eblake@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=216.205.24.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Hanna Reitz , Vladimir Sementsov-Ogievskiy , "open list:Network Block Dev..." Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Vladimir Sementsov-Ogievskiy Split out nbd_recv_coroutine_wake_one(), as it will be used separately. Rename the function and add a possibility to wake only first found sleeping coroutine. Signed-off-by: Vladimir Sementsov-Ogievskiy Message-Id: <20210902103805.25686-4-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake [eblake: grammar tweak] Signed-off-by: Eric Blake --- block/nbd.c | 26 +++++++++++++++++--------- 1 file changed, 17 insertions(+), 9 deletions(-) diff --git a/block/nbd.c b/block/nbd.c index 2842e6263fe4..709c2499e33c 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -127,16 +127,24 @@ static bool nbd_client_connected(BDRVNBDState *s) return qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTED; } -static void nbd_recv_coroutines_wake_all(BDRVNBDState *s) +static bool nbd_recv_coroutine_wake_one(NBDClientRequest *req) +{ + if (req->receiving) { + req->receiving = false; + aio_co_wake(req->coroutine); + return true; + } + + return false; +} + +static void nbd_recv_coroutines_wake(BDRVNBDState *s, bool all) { int i; for (i = 0; i < MAX_NBD_REQUESTS; i++) { - NBDClientRequest *req = &s->requests[i]; - - if (req->coroutine && req->receiving) { - req->receiving = false; - aio_co_wake(req->coroutine); + if (nbd_recv_coroutine_wake_one(&s->requests[i]) && !all) { + return; } } } @@ -415,7 +423,7 @@ static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s) while (s->in_flight > 0) { qemu_co_mutex_unlock(&s->send_mutex); - nbd_recv_coroutines_wake_all(s); + nbd_recv_coroutines_wake(s, true); s->wait_in_flight = true; qemu_coroutine_yield(); s->wait_in_flight = false; @@ -558,7 +566,7 @@ static coroutine_fn void nbd_connection_entry(void *opaque) } qemu_co_queue_restart_all(&s->free_sema); - nbd_recv_coroutines_wake_all(s); + nbd_recv_coroutines_wake(s, true); bdrv_dec_in_flight(s->bs); s->connection_co = NULL; @@ -1035,7 +1043,7 @@ static coroutine_fn int nbd_co_receive_one_chunk( if (s->connection_co && !s->wait_in_flight) { /* * We must check s->wait_in_flight, because we may entered by - * nbd_recv_coroutines_wake_all(), in this case we should not + * nbd_recv_coroutines_wake(), in this case we should not * wake connection_co here, it will woken by last request. */ aio_co_wake(s->connection_co); From patchwork Mon Sep 27 21:55:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 12521175 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2A3B6C433F5 for ; Mon, 27 Sep 2021 22:17:38 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 89B466101A for ; Mon, 27 Sep 2021 22:17:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 89B466101A Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:50084 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mUywK-0004RM-Oq for qemu-devel@archiver.kernel.org; Mon, 27 Sep 2021 18:17:36 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54718) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybr-0002sF-SS for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:27 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:49526) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybl-00033l-IN for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:26 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632779780; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sWh/NqJhRcBUVvkh7Zehi9ZskwbaruG3J+YmQkqUh8A=; b=W8oFc1WY3rBj3pE4cBm0LGhghfsONrQWshnkKOR1TVomwEp3Qk+7BseZCGcTj9KjIHMSt9 ruAY6mnkkgOyNPCIg1Vpo8BV1p2wiWASVhZltRsX7iV4/H9trE3hNcv7pFSlsgWkDw9flX beqDxHAYknEJfrqEHeN6uSiDMUE+7gM= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-485-DC-RuJXzNP-CsnQ40XhF0A-1; Mon, 27 Sep 2021 17:56:18 -0400 X-MC-Unique: DC-RuJXzNP-CsnQ40XhF0A-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 9ECAB1006AA3; Mon, 27 Sep 2021 21:56:17 +0000 (UTC) Received: from blue.redhat.com (ovpn-113-87.phx2.redhat.com [10.3.113.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 2F6B45D9D5; Mon, 27 Sep 2021 21:56:17 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Subject: [PULL 18/20] block/nbd: drop connection_co Date: Mon, 27 Sep 2021 16:55:43 -0500 Message-Id: <20210927215545.3930309-19-eblake@redhat.com> In-Reply-To: <20210927215545.3930309-1-eblake@redhat.com> References: <20210927215545.3930309-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eblake@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=216.205.24.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Hanna Reitz , Vladimir Sementsov-Ogievskiy , "open list:Network Block Dev..." Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Vladimir Sementsov-Ogievskiy OK, that's a big rewrite of the logic. Pre-patch we have an always running coroutine - connection_co. It does reply receiving and reconnecting. And it leads to a lot of difficult and unobvious code around drained sections and context switch. We also abuse bs->in_flight counter which is increased for connection_co and temporary decreased in points where we want to allow drained section to begin. One of these place is in another file: in nbd_read_eof() in nbd/client.c. We also cancel reconnect and requests waiting for reconnect on drained begin which is not correct. And this patch fixes that. Let's finally drop this always running coroutine and go another way: do both reconnect and receiving in request coroutines. The detailed list of changes below (in the sequence of diff hunks). 1. receiving coroutines are woken directly from nbd_channel_error, when we change s->state 2. nbd_co_establish_connection_cancel(): we don't have drain_begin now, and in nbd_teardown_connection() all requests should already be finished (and reconnect is done from request). So nbd_co_establish_connection_cancel() is called from nbd_cancel_in_flight() (to cancel the request that is doing nbd_co_establish_connection()) and from reconnect_delay_timer_cb() (previously we didn't need it, as reconnect delay only should cancel active requests not the reconnection itself). But now reconnection itself is done in the separate thread (we now call nbd_client_connection_enable_retry() in nbd_open()), and we need to cancel the requests that wait in nbd_co_establish_connection() now). 2A. We do receive headers in request coroutine. But we also should dispatch replies for other pending requests. So, nbd_connection_entry() is turned into nbd_receive_replies(), which does reply dispatching while it receives other request headers, and returns when it receives the requested header. 3. All old staff around drained sections and context switch is dropped. In details: - we don't need to move connection_co to new aio context, as we don't have connection_co anymore - we don't have a fake "request" of connection_co (extra increasing in_flight), so don't care with it in drain_begin/end - we don't stop reconnection during drained section anymore. This means that drain_begin may wait for a long time (up to reconnect_delay). But that's an improvement and more correct behavior see below[*] 4. In nbd_teardown_connection() we don't have to wait for connection_co, as it is dropped. And cleanup for s->ioc and nbd_yank is moved here from removed connection_co. 5. In nbd_co_do_establish_connection() we now should handle NBD_CLIENT_CONNECTING_NOWAIT: if new request comes when we are in NBD_CLIENT_CONNECTING_NOWAIT, it still should call nbd_co_establish_connection() (who knows, maybe the connection was already established by another thread in the background). But we shouldn't wait: if nbd_co_establish_connection() can't return new channel immediately the request should fail (we are in NBD_CLIENT_CONNECTING_NOWAIT state). 6. nbd_reconnect_attempt() is simplified: it's now easier to wait for other requests in the caller, so here we just assert that fact. Also delay time is now initialized here: we can easily detect first attempt and start a timer. 7. nbd_co_reconnect_loop() is dropped, we don't need it. Reconnect retries are fully handle by thread (nbd/client-connection.c), delay timer we initialize in nbd_reconnect_attempt(), we don't have to bother with s->drained and friends. nbd_reconnect_attempt() now called from nbd_co_send_request(). 8. nbd_connection_entry is dropped: reconnect is now handled by nbd_co_send_request(), receiving reply is now handled by nbd_receive_replies(): all handled from request coroutines. 9. So, welcome new nbd_receive_replies() called from request coroutine, that receives reply header instead of nbd_connection_entry(). Like with sending requests, only one coroutine may receive in a moment. So we introduce receive_mutex, which is locked around nbd_receive_reply(). It also protects some related fields. Still, full audit of thread-safety in nbd driver is a separate task. New function waits for a reply with specified handle being received and works rather simple: Under mutex: - if current handle is 0, do receive by hand. If another handle received - switch to other request coroutine, release mutex and yield. Otherwise return success - if current handle == requested handle, we are done - otherwise, release mutex and yield 10: in nbd_co_send_request() we now do nbd_reconnect_attempt() if needed. Also waiting in free_sema queue we now wait for one of two conditions: - connectED, in_flight < MAX_NBD_REQUESTS (so we can start new one) - connectING, in_flight == 0, so we can call nbd_reconnect_attempt() And this logic is protected by s->send_mutex Also, on failure we don't have to care of removed s->connection_co 11. nbd_co_do_receive_one_chunk(): now instead of yield() and wait for s->connection_co we just call new nbd_receive_replies(). 12. nbd_co_receive_one_chunk(): place where s->reply.handle becomes 0, which means that handling of the whole reply is finished. Here we need to wake one of coroutines sleeping in nbd_receive_replies(). If none are sleeping - do nothing. That's another behavior change: we don't have endless recv() in the idle time. It may be considered as a drawback. If so, it may be fixed later. 13. nbd_reply_chunk_iter_receive(): don't care about removed connection_co, just ping in_flight waiters. 14. Don't create connection_co, enable retry in the connection thread (we don't have own reconnect loop anymore) 15. We now need to add a nbd_co_establish_connection_cancel() call in nbd_cancel_in_flight(), to cancel the request that is doing a connection attempt. [*], ok, now we don't cancel reconnect on drain begin. That's correct: reconnect feature leads to possibility of long-running requests (up to reconnect delay). Still, drain begin is not a reason to kill long requests. We should wait for them. This also means, that we can again reproduce a dead-lock, described in 8c517de24a8a1dcbeb54e7e12b5b0fda42a90ace. Why we are OK with it: 1. Now this is not absolutely-dead dead-lock: the vm is unfrozen after reconnect delay. Actually 8c517de24a8a1dc fixed a bug in NBD logic, that was not described in 8c517de24a8a1dc and led to forever dead-lock. The problem was that nobody woke the free_sema queue, but drain_begin can't finish until there is a request in free_sema queue. Now we have a reconnect delay timer that works well. 2. It's not a problem of the NBD driver, but of the ide code, because it does drain_begin under the global mutex; the problem doesn't reproduce when using scsi instead of ide. Signed-off-by: Vladimir Sementsov-Ogievskiy Message-Id: <20210902103805.25686-5-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake [eblake: grammar and comment tweaks] Signed-off-by: Eric Blake --- block/nbd.c | 385 ++++++++++++++------------------------------------- nbd/client.c | 2 - 2 files changed, 105 insertions(+), 282 deletions(-) diff --git a/block/nbd.c b/block/nbd.c index 709c2499e33c..8ff6daf43d46 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -57,7 +57,7 @@ typedef struct { Coroutine *coroutine; uint64_t offset; /* original offset of the request */ - bool receiving; /* waiting for connection_co? */ + bool receiving; /* sleeping in the yield in nbd_receive_replies */ } NBDClientRequest; typedef enum NBDClientState { @@ -73,14 +73,10 @@ typedef struct BDRVNBDState { CoMutex send_mutex; CoQueue free_sema; - Coroutine *connection_co; - Coroutine *teardown_co; - QemuCoSleep reconnect_sleep; - bool drained; - bool wait_drained_end; + + CoMutex receive_mutex; int in_flight; NBDClientState state; - bool wait_in_flight; QEMUTimer *reconnect_delay_timer; @@ -163,6 +159,8 @@ static void nbd_channel_error(BDRVNBDState *s, int ret) } else { s->state = NBD_CLIENT_QUIT; } + + nbd_recv_coroutines_wake(s, true); } static void reconnect_delay_timer_del(BDRVNBDState *s) @@ -179,6 +177,7 @@ static void reconnect_delay_timer_cb(void *opaque) if (qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTING_WAIT) { s->state = NBD_CLIENT_CONNECTING_NOWAIT; + nbd_co_establish_connection_cancel(s->conn); while (qemu_co_enter_next(&s->free_sema, NULL)) { /* Resume all queued requests */ } @@ -201,113 +200,21 @@ static void reconnect_delay_timer_init(BDRVNBDState *s, uint64_t expire_time_ns) timer_mod(s->reconnect_delay_timer, expire_time_ns); } -static void nbd_client_detach_aio_context(BlockDriverState *bs) -{ - BDRVNBDState *s = (BDRVNBDState *)bs->opaque; - - /* Timer is deleted in nbd_client_co_drain_begin() */ - assert(!s->reconnect_delay_timer); - /* - * If reconnect is in progress we may have no ->ioc. It will be - * re-instantiated in the proper aio context once the connection is - * reestablished. - */ - if (s->ioc) { - qio_channel_detach_aio_context(QIO_CHANNEL(s->ioc)); - } -} - -static void nbd_client_attach_aio_context_bh(void *opaque) -{ - BlockDriverState *bs = opaque; - BDRVNBDState *s = (BDRVNBDState *)bs->opaque; - - if (s->connection_co) { - /* - * The node is still drained, so we know the coroutine has yielded in - * nbd_read_eof(), the only place where bs->in_flight can reach 0, or - * it is entered for the first time. Both places are safe for entering - * the coroutine. - */ - qemu_aio_coroutine_enter(bs->aio_context, s->connection_co); - } - bdrv_dec_in_flight(bs); -} - -static void nbd_client_attach_aio_context(BlockDriverState *bs, - AioContext *new_context) -{ - BDRVNBDState *s = (BDRVNBDState *)bs->opaque; - - /* - * s->connection_co is either yielded from nbd_receive_reply or from - * nbd_co_reconnect_loop() - */ - if (nbd_client_connected(s)) { - qio_channel_attach_aio_context(QIO_CHANNEL(s->ioc), new_context); - } - - bdrv_inc_in_flight(bs); - - /* - * Need to wait here for the BH to run because the BH must run while the - * node is still drained. - */ - aio_wait_bh_oneshot(new_context, nbd_client_attach_aio_context_bh, bs); -} - -static void coroutine_fn nbd_client_co_drain_begin(BlockDriverState *bs) -{ - BDRVNBDState *s = (BDRVNBDState *)bs->opaque; - - s->drained = true; - qemu_co_sleep_wake(&s->reconnect_sleep); - - nbd_co_establish_connection_cancel(s->conn); - - reconnect_delay_timer_del(s); - - if (qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTING_WAIT) { - s->state = NBD_CLIENT_CONNECTING_NOWAIT; - qemu_co_queue_restart_all(&s->free_sema); - } -} - -static void coroutine_fn nbd_client_co_drain_end(BlockDriverState *bs) -{ - BDRVNBDState *s = (BDRVNBDState *)bs->opaque; - - s->drained = false; - if (s->wait_drained_end) { - s->wait_drained_end = false; - aio_co_wake(s->connection_co); - } -} - - static void nbd_teardown_connection(BlockDriverState *bs) { BDRVNBDState *s = (BDRVNBDState *)bs->opaque; + assert(!s->in_flight); + if (s->ioc) { - /* finish any pending coroutines */ qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL); + yank_unregister_function(BLOCKDEV_YANK_INSTANCE(s->bs->node_name), + nbd_yank, s->bs); + object_unref(OBJECT(s->ioc)); + s->ioc = NULL; } s->state = NBD_CLIENT_QUIT; - if (s->connection_co) { - qemu_co_sleep_wake(&s->reconnect_sleep); - nbd_co_establish_connection_cancel(s->conn); - } - if (qemu_in_coroutine()) { - s->teardown_co = qemu_coroutine_self(); - /* connection_co resumes us when it terminates */ - qemu_coroutine_yield(); - s->teardown_co = NULL; - } else { - BDRV_POLL_WHILE(bs, s->connection_co); - } - assert(!s->connection_co); } static bool nbd_client_connecting(BDRVNBDState *s) @@ -372,10 +279,11 @@ int coroutine_fn nbd_co_do_establish_connection(BlockDriverState *bs, { BDRVNBDState *s = (BDRVNBDState *)bs->opaque; int ret; + bool blocking = nbd_client_connecting_wait(s); assert(!s->ioc); - s->ioc = nbd_co_establish_connection(s->conn, &s->info, true, errp); + s->ioc = nbd_co_establish_connection(s->conn, &s->info, blocking, errp); if (!s->ioc) { return -ECONNREFUSED; } @@ -411,29 +319,22 @@ int coroutine_fn nbd_co_do_establish_connection(BlockDriverState *bs, return 0; } +/* called under s->send_mutex */ static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s) { - if (!nbd_client_connecting(s)) { - return; - } - - /* Wait for completion of all in-flight requests */ - - qemu_co_mutex_lock(&s->send_mutex); - - while (s->in_flight > 0) { - qemu_co_mutex_unlock(&s->send_mutex); - nbd_recv_coroutines_wake(s, true); - s->wait_in_flight = true; - qemu_coroutine_yield(); - s->wait_in_flight = false; - qemu_co_mutex_lock(&s->send_mutex); - } - - qemu_co_mutex_unlock(&s->send_mutex); - - if (!nbd_client_connecting(s)) { - return; + assert(nbd_client_connecting(s)); + assert(s->in_flight == 0); + + if (nbd_client_connecting_wait(s) && s->reconnect_delay && + !s->reconnect_delay_timer) + { + /* + * It's first reconnect attempt after switching to + * NBD_CLIENT_CONNECTING_WAIT + */ + reconnect_delay_timer_init(s, + qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + + s->reconnect_delay * NANOSECONDS_PER_SECOND); } /* @@ -453,135 +354,80 @@ static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s) nbd_co_do_establish_connection(s->bs, NULL); } -static coroutine_fn void nbd_co_reconnect_loop(BDRVNBDState *s) +static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t handle) { - uint64_t timeout = 1 * NANOSECONDS_PER_SECOND; - uint64_t max_timeout = 16 * NANOSECONDS_PER_SECOND; - - if (qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTING_WAIT) { - reconnect_delay_timer_init(s, qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + - s->reconnect_delay * NANOSECONDS_PER_SECOND); - } - - nbd_reconnect_attempt(s); - - while (nbd_client_connecting(s)) { - if (s->drained) { - bdrv_dec_in_flight(s->bs); - s->wait_drained_end = true; - while (s->drained) { - /* - * We may be entered once from nbd_client_attach_aio_context_bh - * and then from nbd_client_co_drain_end. So here is a loop. - */ - qemu_coroutine_yield(); - } - bdrv_inc_in_flight(s->bs); - } else { - qemu_co_sleep_ns_wakeable(&s->reconnect_sleep, - QEMU_CLOCK_REALTIME, timeout); - if (s->drained) { - continue; - } - if (timeout < max_timeout) { - timeout *= 2; - } - } - - nbd_reconnect_attempt(s); - } - - reconnect_delay_timer_del(s); -} - -static coroutine_fn void nbd_connection_entry(void *opaque) -{ - BDRVNBDState *s = opaque; - uint64_t i; - int ret = 0; - Error *local_err = NULL; - - while (qatomic_load_acquire(&s->state) != NBD_CLIENT_QUIT) { - /* - * The NBD client can only really be considered idle when it has - * yielded from qio_channel_readv_all_eof(), waiting for data. This is - * the point where the additional scheduled coroutine entry happens - * after nbd_client_attach_aio_context(). - * - * Therefore we keep an additional in_flight reference all the time and - * only drop it temporarily here. - */ - - if (nbd_client_connecting(s)) { - nbd_co_reconnect_loop(s); + int ret; + uint64_t ind = HANDLE_TO_INDEX(s, handle), ind2; + QEMU_LOCK_GUARD(&s->receive_mutex); + + while (true) { + if (s->reply.handle == handle) { + /* We are done */ + return 0; } if (!nbd_client_connected(s)) { + return -EIO; + } + + if (s->reply.handle != 0) { + /* + * Some other request is being handled now. It should already be + * woken by whoever set s->reply.handle (or never wait in this + * yield). So, we should not wake it here. + */ + ind2 = HANDLE_TO_INDEX(s, s->reply.handle); + assert(!s->requests[ind2].receiving); + + s->requests[ind].receiving = true; + qemu_co_mutex_unlock(&s->receive_mutex); + + qemu_coroutine_yield(); + /* + * We may be woken for 3 reasons: + * 1. From this function, executing in parallel coroutine, when our + * handle is received. + * 2. From nbd_channel_error(), when connection is lost. + * 3. From nbd_co_receive_one_chunk(), when previous request is + * finished and s->reply.handle set to 0. + * Anyway, it's OK to lock the mutex and go to the next iteration. + */ + + qemu_co_mutex_lock(&s->receive_mutex); + assert(!s->requests[ind].receiving); continue; } + /* We are under mutex and handle is 0. We have to do the dirty work. */ assert(s->reply.handle == 0); - ret = nbd_receive_reply(s->bs, s->ioc, &s->reply, &local_err); - - if (local_err) { - trace_nbd_read_reply_entry_fail(ret, error_get_pretty(local_err)); - error_free(local_err); - local_err = NULL; - } + ret = nbd_receive_reply(s->bs, s->ioc, &s->reply, NULL); if (ret <= 0) { - nbd_channel_error(s, ret ? ret : -EIO); - continue; + ret = ret ? ret : -EIO; + nbd_channel_error(s, ret); + return ret; } - - /* - * There's no need for a mutex on the receive side, because the - * handler acts as a synchronization point and ensures that only - * one coroutine is called until the reply finishes. - */ - i = HANDLE_TO_INDEX(s, s->reply.handle); - if (i >= MAX_NBD_REQUESTS || - !s->requests[i].coroutine || - !s->requests[i].receiving || - (nbd_reply_is_structured(&s->reply) && !s->info.structured_reply)) - { + if (nbd_reply_is_structured(&s->reply) && !s->info.structured_reply) { nbd_channel_error(s, -EINVAL); - continue; + return -EINVAL; } - - /* - * We're woken up again by the request itself. Note that there - * is no race between yielding and reentering connection_co. This - * is because: - * - * - if the request runs on the same AioContext, it is only - * entered after we yield - * - * - if the request runs on a different AioContext, reentering - * connection_co happens through a bottom half, which can only - * run after we yield. - */ - s->requests[i].receiving = false; - aio_co_wake(s->requests[i].coroutine); - qemu_coroutine_yield(); - } - - qemu_co_queue_restart_all(&s->free_sema); - nbd_recv_coroutines_wake(s, true); - bdrv_dec_in_flight(s->bs); - - s->connection_co = NULL; - if (s->ioc) { - qio_channel_detach_aio_context(QIO_CHANNEL(s->ioc)); - yank_unregister_function(BLOCKDEV_YANK_INSTANCE(s->bs->node_name), - nbd_yank, s->bs); - object_unref(OBJECT(s->ioc)); - s->ioc = NULL; - } - - if (s->teardown_co) { - aio_co_wake(s->teardown_co); + if (s->reply.handle == handle) { + /* We are done */ + return 0; + } + ind2 = HANDLE_TO_INDEX(s, s->reply.handle); + if (ind2 >= MAX_NBD_REQUESTS || !s->requests[ind2].coroutine) { + /* + * We only check that ind2 request exists. But don't check + * whether it is now waiting for the reply header or + * not. We can't just check s->requests[ind2].receiving: + * ind2 request may wait in trying to lock + * receive_mutex. So that's a TODO. + */ + nbd_channel_error(s, -EINVAL); + return -EINVAL; + } + nbd_recv_coroutine_wake_one(&s->requests[ind2]); } - aio_wait_kick(); } static int nbd_co_send_request(BlockDriverState *bs, @@ -592,10 +438,17 @@ static int nbd_co_send_request(BlockDriverState *bs, int rc, i = -1; qemu_co_mutex_lock(&s->send_mutex); - while (s->in_flight == MAX_NBD_REQUESTS || nbd_client_connecting_wait(s)) { + + while (s->in_flight == MAX_NBD_REQUESTS || + (!nbd_client_connected(s) && s->in_flight > 0)) + { qemu_co_queue_wait(&s->free_sema, &s->send_mutex); } + if (nbd_client_connecting(s)) { + nbd_reconnect_attempt(s); + } + if (!nbd_client_connected(s)) { rc = -EIO; goto err; @@ -642,10 +495,6 @@ err: if (i != -1) { s->requests[i].coroutine = NULL; s->in_flight--; - } - if (s->in_flight == 0 && s->wait_in_flight) { - aio_co_wake(s->connection_co); - } else { qemu_co_queue_next(&s->free_sema); } } @@ -944,10 +793,7 @@ static coroutine_fn int nbd_co_do_receive_one_chunk( } *request_ret = 0; - /* Wait until we're woken up by nbd_connection_entry. */ - s->requests[i].receiving = true; - qemu_coroutine_yield(); - assert(!s->requests[i].receiving); + nbd_receive_replies(s, handle); if (!nbd_client_connected(s)) { error_setg(errp, "Connection closed"); return -EIO; @@ -1040,14 +886,7 @@ static coroutine_fn int nbd_co_receive_one_chunk( } s->reply.handle = 0; - if (s->connection_co && !s->wait_in_flight) { - /* - * We must check s->wait_in_flight, because we may entered by - * nbd_recv_coroutines_wake(), in this case we should not - * wake connection_co here, it will woken by last request. - */ - aio_co_wake(s->connection_co); - } + nbd_recv_coroutines_wake(s, false); return ret; } @@ -1158,11 +997,7 @@ break_loop: qemu_co_mutex_lock(&s->send_mutex); s->in_flight--; - if (s->in_flight == 0 && s->wait_in_flight) { - aio_co_wake(s->connection_co); - } else { - qemu_co_queue_next(&s->free_sema); - } + qemu_co_queue_next(&s->free_sema); qemu_co_mutex_unlock(&s->send_mutex); return false; @@ -1984,6 +1819,7 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags, s->bs = bs; qemu_co_mutex_init(&s->send_mutex); qemu_co_queue_init(&s->free_sema); + qemu_co_mutex_init(&s->receive_mutex); if (!yank_register_instance(BLOCKDEV_YANK_INSTANCE(bs->node_name), errp)) { return -EEXIST; @@ -1998,14 +1834,13 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags, s->x_dirty_bitmap, s->tlscreds); /* TODO: Configurable retry-until-timeout behaviour. */ + s->state = NBD_CLIENT_CONNECTING_WAIT; ret = nbd_do_establish_connection(bs, errp); if (ret < 0) { goto fail; } - s->connection_co = qemu_coroutine_create(nbd_connection_entry, s); - bdrv_inc_in_flight(bs); - aio_co_schedule(bdrv_get_aio_context(bs), s->connection_co); + nbd_client_connection_enable_retry(s->conn); return 0; @@ -2159,6 +1994,8 @@ static void nbd_cancel_in_flight(BlockDriverState *bs) s->state = NBD_CLIENT_CONNECTING_NOWAIT; qemu_co_queue_restart_all(&s->free_sema); } + + nbd_co_establish_connection_cancel(s->conn); } static BlockDriver bdrv_nbd = { @@ -2179,10 +2016,6 @@ static BlockDriver bdrv_nbd = { .bdrv_refresh_limits = nbd_refresh_limits, .bdrv_co_truncate = nbd_co_truncate, .bdrv_getlength = nbd_getlength, - .bdrv_detach_aio_context = nbd_client_detach_aio_context, - .bdrv_attach_aio_context = nbd_client_attach_aio_context, - .bdrv_co_drain_begin = nbd_client_co_drain_begin, - .bdrv_co_drain_end = nbd_client_co_drain_end, .bdrv_refresh_filename = nbd_refresh_filename, .bdrv_co_block_status = nbd_client_co_block_status, .bdrv_dirname = nbd_dirname, @@ -2208,10 +2041,6 @@ static BlockDriver bdrv_nbd_tcp = { .bdrv_refresh_limits = nbd_refresh_limits, .bdrv_co_truncate = nbd_co_truncate, .bdrv_getlength = nbd_getlength, - .bdrv_detach_aio_context = nbd_client_detach_aio_context, - .bdrv_attach_aio_context = nbd_client_attach_aio_context, - .bdrv_co_drain_begin = nbd_client_co_drain_begin, - .bdrv_co_drain_end = nbd_client_co_drain_end, .bdrv_refresh_filename = nbd_refresh_filename, .bdrv_co_block_status = nbd_client_co_block_status, .bdrv_dirname = nbd_dirname, @@ -2237,10 +2066,6 @@ static BlockDriver bdrv_nbd_unix = { .bdrv_refresh_limits = nbd_refresh_limits, .bdrv_co_truncate = nbd_co_truncate, .bdrv_getlength = nbd_getlength, - .bdrv_detach_aio_context = nbd_client_detach_aio_context, - .bdrv_attach_aio_context = nbd_client_attach_aio_context, - .bdrv_co_drain_begin = nbd_client_co_drain_begin, - .bdrv_co_drain_end = nbd_client_co_drain_end, .bdrv_refresh_filename = nbd_refresh_filename, .bdrv_co_block_status = nbd_client_co_block_status, .bdrv_dirname = nbd_dirname, diff --git a/nbd/client.c b/nbd/client.c index 0c2db4bcba9f..30d5383cb195 100644 --- a/nbd/client.c +++ b/nbd/client.c @@ -1434,9 +1434,7 @@ nbd_read_eof(BlockDriverState *bs, QIOChannel *ioc, void *buffer, size_t size, len = qio_channel_readv(ioc, &iov, 1, errp); if (len == QIO_CHANNEL_ERR_BLOCK) { - bdrv_dec_in_flight(bs); qio_channel_yield(ioc, G_IO_IN); - bdrv_inc_in_flight(bs); continue; } else if (len < 0) { return -EIO; From patchwork Mon Sep 27 21:55:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 12520867 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DCE15C43217 for ; Mon, 27 Sep 2021 22:02:09 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id AB89A60F58 for ; Mon, 27 Sep 2021 22:02:09 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org AB89A60F58 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:41522 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mUyhM-0004XI-Pd for qemu-devel@archiver.kernel.org; Mon, 27 Sep 2021 18:02:08 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54686) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybo-0002nr-Tw for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:26 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:21670) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUybl-00034A-Ic for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:24 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632779780; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=i7cYrL1csZev2/4P5tAvdGKtIzV1H0EVC7XYk93Rlo0=; b=eyyYaguWCzP8QRXhEDXXZIks87AW9GnrDmWbdLeBXWyrg7rnXwGaXfOTkpDes+/ciQ1wSL PRK41Z40yjqzL5xTYEfU03dWrdsK1fgETNZ6FKjsnOrz4jS3ZPwrZ491GYgssKtQBx5Ih6 RmvtELPUc6nHgh3OWswLQ/VCx+oZu5o= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-406-AfMgT85BONWmoGEc5pJUUw-1; Mon, 27 Sep 2021 17:56:19 -0400 X-MC-Unique: AfMgT85BONWmoGEc5pJUUw-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 40D741006AA4; Mon, 27 Sep 2021 21:56:18 +0000 (UTC) Received: from blue.redhat.com (ovpn-113-87.phx2.redhat.com [10.3.113.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id CA8295D9D5; Mon, 27 Sep 2021 21:56:17 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Subject: [PULL 19/20] block/nbd: check that received handle is valid Date: Mon, 27 Sep 2021 16:55:44 -0500 Message-Id: <20210927215545.3930309-20-eblake@redhat.com> In-Reply-To: <20210927215545.3930309-1-eblake@redhat.com> References: <20210927215545.3930309-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eblake@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=170.10.133.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Hanna Reitz , Vladimir Sementsov-Ogievskiy , "open list:Network Block Dev..." Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Vladimir Sementsov-Ogievskiy If we don't have active request, that waiting for this handle to be received, we should report an error. Signed-off-by: Vladimir Sementsov-Ogievskiy Message-Id: <20210902103805.25686-6-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake Signed-off-by: Eric Blake --- block/nbd.c | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/block/nbd.c b/block/nbd.c index 8ff6daf43d46..5ef462db1b7f 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -58,6 +58,7 @@ typedef struct { Coroutine *coroutine; uint64_t offset; /* original offset of the request */ bool receiving; /* sleeping in the yield in nbd_receive_replies */ + bool reply_possible; /* reply header not yet received */ } NBDClientRequest; typedef enum NBDClientState { @@ -415,14 +416,7 @@ static coroutine_fn int nbd_receive_replies(BDRVNBDState *s, uint64_t handle) return 0; } ind2 = HANDLE_TO_INDEX(s, s->reply.handle); - if (ind2 >= MAX_NBD_REQUESTS || !s->requests[ind2].coroutine) { - /* - * We only check that ind2 request exists. But don't check - * whether it is now waiting for the reply header or - * not. We can't just check s->requests[ind2].receiving: - * ind2 request may wait in trying to lock - * receive_mutex. So that's a TODO. - */ + if (ind2 >= MAX_NBD_REQUESTS || !s->requests[ind2].reply_possible) { nbd_channel_error(s, -EINVAL); return -EINVAL; } @@ -468,6 +462,7 @@ static int nbd_co_send_request(BlockDriverState *bs, s->requests[i].coroutine = qemu_coroutine_self(); s->requests[i].offset = request->from; s->requests[i].receiving = false; + s->requests[i].reply_possible = true; request->handle = INDEX_TO_HANDLE(s, i); From patchwork Mon Sep 27 21:55:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 12521181 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 77710C433EF for ; Mon, 27 Sep 2021 22:19:40 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 12464604D2 for ; Mon, 27 Sep 2021 22:19:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 12464604D2 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=nongnu.org Received: from localhost ([::1]:56466 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mUyyI-0000L2-4H for qemu-devel@archiver.kernel.org; Mon, 27 Sep 2021 18:19:38 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54792) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUyc1-00039I-MJ for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:37 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:25084) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mUyby-0003Cn-Iy for qemu-devel@nongnu.org; Mon, 27 Sep 2021 17:56:37 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632779793; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Q1PCt1/H4GdHin41jmCy07n+wcB6dFHXh8/1q2mNuEQ=; b=fVo8dGITYCbcbcuWgE9ylX0kKYT6Bb3oeFjkCbMGufl0B0ha8bbpqvSRXwt2JSeKxrznqn f6Eutrhwfuzr2HyVvf/De6HJEW/jhKDBOnOwg0M5QFmJIHVUbND41UiAafFE5jVRaKDx6o XgvIwZbkyZYbsViSb1Db1JWE2VhGBdA= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-236-sTf_bhMcNKSKG1RhMol9Bg-1; Mon, 27 Sep 2021 17:56:32 -0400 X-MC-Unique: sTf_bhMcNKSKG1RhMol9Bg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 63A565074C; Mon, 27 Sep 2021 21:56:31 +0000 (UTC) Received: from blue.redhat.com (ovpn-113-87.phx2.redhat.com [10.3.113.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 99B5A5D9D5; Mon, 27 Sep 2021 21:56:18 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Subject: [PULL 20/20] nbd/server: Add --selinux-label option Date: Mon, 27 Sep 2021 16:55:45 -0500 Message-Id: <20210927215545.3930309-21-eblake@redhat.com> In-Reply-To: <20210927215545.3930309-1-eblake@redhat.com> References: <20210927215545.3930309-1-eblake@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eblake@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=170.10.133.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Thomas Huth , Vladimir Sementsov-Ogievskiy , =?utf-8?q?Daniel_P?= =?utf-8?q?_=2E_Berrang=C3=A9?= , "open list:Network Block Dev..." , "Richard W.M. Jones" , Wainer dos Santos Moschetta , =?utf-8?q?Philippe_Mathi?= =?utf-8?q?eu-Daud=C3=A9?= , Willian Rampazzo , =?utf-8?q?Alex_Benn=C3=A9e?= Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: "Richard W.M. Jones" Under SELinux, Unix domain sockets have two labels. One is on the disk and can be set with commands such as chcon(1). There is a different label stored in memory (called the process label). This can only be set by the process creating the socket. When using SELinux + SVirt and wanting qemu to be able to connect to a qemu-nbd instance, you must set both labels correctly first. For qemu-nbd the options to set the second label are awkward. You can create the socket in a wrapper program and then exec into qemu-nbd. Or you could try something with LD_PRELOAD. This commit adds the ability to set the label straightforwardly on the command line, via the new --selinux-label flag. (The name of the flag is the same as the equivalent nbdkit option.) A worked example showing how to use the new option can be found in this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1984938 Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1984938 Signed-off-by: Richard W.M. Jones Message-Id: <20210723103303.1731437-2-rjones@redhat.com> Reviewed-by: Daniel P. Berrangé [eblake: Fail if option is used when not compiled in] Signed-off-by: Eric Blake --- configure | 8 +++- meson.build | 10 ++++- qemu-nbd.c | 39 +++++++++++++++++++ meson_options.txt | 3 ++ tests/docker/dockerfiles/centos8.docker | 1 + tests/docker/dockerfiles/fedora.docker | 1 + tests/docker/dockerfiles/opensuse-leap.docker | 1 + tests/docker/dockerfiles/ubuntu1804.docker | 1 + tests/docker/dockerfiles/ubuntu2004.docker | 1 + 9 files changed, 63 insertions(+), 2 deletions(-) diff --git a/configure b/configure index 1043ccce4f99..b3211a66eeea 100755 --- a/configure +++ b/configure @@ -445,6 +445,7 @@ fuse="auto" fuse_lseek="auto" multiprocess="auto" slirp_smbd="$default_feature" +selinux="auto" malloc_trim="auto" gio="$default_feature" @@ -1576,6 +1577,10 @@ for opt do ;; --disable-slirp-smbd) slirp_smbd=no ;; + --enable-selinux) selinux="enabled" + ;; + --disable-selinux) selinux="disabled" + ;; *) echo "ERROR: unknown option $opt" echo "Try '$0 --help' for more information" @@ -1963,6 +1968,7 @@ disabled with --disable-FEATURE, default is enabled if available multiprocess Out of process device emulation support gio libgio support slirp-smbd use smbd (at path --smbd=*) in slirp networking + selinux SELinux support in qemu-nbd NOTE: The object files are built at the place where configure is launched EOF @@ -5207,7 +5213,7 @@ if test "$skip_meson" = no; then -Dattr=$attr -Ddefault_devices=$default_devices -Dvirglrenderer=$virglrenderer \ -Ddocs=$docs -Dsphinx_build=$sphinx_build -Dinstall_blobs=$blobs \ -Dvhost_user_blk_server=$vhost_user_blk_server -Dmultiprocess=$multiprocess \ - -Dfuse=$fuse -Dfuse_lseek=$fuse_lseek -Dguest_agent_msi=$guest_agent_msi -Dbpf=$bpf\ + -Dselinux=$selinux \ $(if test "$default_feature" = no; then echo "-Dauto_features=disabled"; fi) \ -Dtcg_interpreter=$tcg_interpreter \ $cross_arg \ diff --git a/meson.build b/meson.build index 15ef4d3c4187..0ded2ac5eb9d 100644 --- a/meson.build +++ b/meson.build @@ -1072,6 +1072,11 @@ keyutils = dependency('libkeyutils', required: false, has_gettid = cc.has_function('gettid') +# libselinux +selinux = dependency('libselinux', + required: get_option('selinux'), + method: 'pkg-config', kwargs: static_kwargs) + # Malloc tests malloc = [] @@ -1300,6 +1305,7 @@ config_host_data.set('CONFIG_FUSE', fuse.found()) config_host_data.set('CONFIG_FUSE_LSEEK', fuse_lseek.found()) config_host_data.set('CONFIG_X11', x11.found()) config_host_data.set('CONFIG_CFI', get_option('cfi')) +config_host_data.set('CONFIG_SELINUX', selinux.found()) config_host_data.set('QEMU_VERSION', '"@0@"'.format(meson.project_version())) config_host_data.set('QEMU_VERSION_MAJOR', meson.project_version().split('.')[0]) config_host_data.set('QEMU_VERSION_MINOR', meson.project_version().split('.')[1]) @@ -2759,7 +2765,8 @@ if have_tools qemu_io = executable('qemu-io', files('qemu-io.c'), dependencies: [block, qemuutil], install: true) qemu_nbd = executable('qemu-nbd', files('qemu-nbd.c'), - dependencies: [blockdev, qemuutil, gnutls], install: true) + dependencies: [blockdev, qemuutil, gnutls, selinux], + install: true) subdir('storage-daemon') subdir('contrib/rdmacm-mux') @@ -3124,6 +3131,7 @@ summary_info += {'libpmem support': libpmem.found()} summary_info += {'libdaxctl support': libdaxctl.found()} summary_info += {'libudev': libudev.found()} summary_info += {'FUSE lseek': fuse_lseek.found()} +summary_info += {'selinux': selinux.found()} summary(summary_info, bool_yn: true, section: 'Dependencies') if not supported_cpus.contains(cpu) diff --git a/qemu-nbd.c b/qemu-nbd.c index 9d895ba24b1e..94f8ec07c064 100644 --- a/qemu-nbd.c +++ b/qemu-nbd.c @@ -47,6 +47,10 @@ #include "trace/control.h" #include "qemu-version.h" +#ifdef CONFIG_SELINUX +#include +#endif + #ifdef __linux__ #define HAVE_NBD_DEVICE 1 #else @@ -64,6 +68,7 @@ #define QEMU_NBD_OPT_FORK 263 #define QEMU_NBD_OPT_TLSAUTHZ 264 #define QEMU_NBD_OPT_PID_FILE 265 +#define QEMU_NBD_OPT_SELINUX_LABEL 266 #define MBR_SIZE 512 @@ -116,6 +121,9 @@ static void usage(const char *name) " --fork fork off the server process and exit the parent\n" " once the server is running\n" " --pid-file=PATH store the server's process ID in the given file\n" +#ifdef CONFIG_SELINUX +" --selinux-label=LABEL set SELinux process label on listening socket\n" +#endif #if HAVE_NBD_DEVICE "\n" "Kernel NBD client support:\n" @@ -534,6 +542,8 @@ int main(int argc, char **argv) { "trace", required_argument, NULL, 'T' }, { "fork", no_argument, NULL, QEMU_NBD_OPT_FORK }, { "pid-file", required_argument, NULL, QEMU_NBD_OPT_PID_FILE }, + { "selinux-label", required_argument, NULL, + QEMU_NBD_OPT_SELINUX_LABEL }, { NULL, 0, NULL, 0 } }; int ch; @@ -560,6 +570,7 @@ int main(int argc, char **argv) int old_stderr = -1; unsigned socket_activation; const char *pid_file_name = NULL; + const char *selinux_label = NULL; BlockExportOptions *export_opts; #ifdef CONFIG_POSIX @@ -749,6 +760,9 @@ int main(int argc, char **argv) case QEMU_NBD_OPT_PID_FILE: pid_file_name = optarg; break; + case QEMU_NBD_OPT_SELINUX_LABEL: + selinux_label = optarg; + break; } } @@ -940,6 +954,19 @@ int main(int argc, char **argv) } else { backlog = MIN(shared, SOMAXCONN); } + if (sockpath && selinux_label) { +#ifdef CONFIG_SELINUX + if (setsockcreatecon_raw(selinux_label) == -1) { + error_report("Cannot set SELinux socket create context " + "to %s: %s", + selinux_label, strerror(errno)); + exit(EXIT_FAILURE); + } +#else + error_report("SELinux support not enabled in this binary"); + exit(EXIT_FAILURE); +#endif + } saddr = nbd_build_socket_address(sockpath, bindto, port); if (qio_net_listener_open_sync(server, saddr, backlog, &local_err) < 0) { @@ -947,6 +974,18 @@ int main(int argc, char **argv) error_report_err(local_err); exit(EXIT_FAILURE); } + if (sockpath && selinux_label) { +#ifdef CONFIG_SELINUX + if (setsockcreatecon_raw(NULL) == -1) { + error_report("Cannot clear SELinux socket create context: %s", + strerror(errno)); + exit(EXIT_FAILURE); + } +#else + error_report("SELinux support not enabled in this binary"); + exit(EXIT_FAILURE); +#endif + } } else { size_t i; /* See comment in check_socket_activation above. */ diff --git a/meson_options.txt b/meson_options.txt index a9a9b8f4c6a8..a5938500a3a4 100644 --- a/meson_options.txt +++ b/meson_options.txt @@ -155,3 +155,6 @@ option('slirp', type: 'combo', value: 'auto', option('fdt', type: 'combo', value: 'auto', choices: ['disabled', 'enabled', 'auto', 'system', 'internal'], description: 'Whether and how to find the libfdt library') + +option('selinux', type: 'feature', value: 'auto', + description: 'SELinux support in qemu-nbd') diff --git a/tests/docker/dockerfiles/centos8.docker b/tests/docker/dockerfiles/centos8.docker index 46398c61eea9..7f135f8e8c00 100644 --- a/tests/docker/dockerfiles/centos8.docker +++ b/tests/docker/dockerfiles/centos8.docker @@ -51,6 +51,7 @@ ENV PACKAGES \ libpng-devel \ librbd-devel \ libseccomp-devel \ + libselinux-devel \ libslirp-devel \ libssh-devel \ libtasn1-devel \ diff --git a/tests/docker/dockerfiles/fedora.docker b/tests/docker/dockerfiles/fedora.docker index eec1add7f620..c6fd7e1113d4 100644 --- a/tests/docker/dockerfiles/fedora.docker +++ b/tests/docker/dockerfiles/fedora.docker @@ -53,6 +53,7 @@ ENV PACKAGES \ libpng-devel \ librbd-devel \ libseccomp-devel \ + libselinux-devel \ libslirp-devel \ libssh-devel \ libtasn1-devel \ diff --git a/tests/docker/dockerfiles/opensuse-leap.docker b/tests/docker/dockerfiles/opensuse-leap.docker index 5a8bee028951..3bbdb67f4fad 100644 --- a/tests/docker/dockerfiles/opensuse-leap.docker +++ b/tests/docker/dockerfiles/opensuse-leap.docker @@ -55,6 +55,7 @@ ENV PACKAGES \ libpulse-devel \ librbd-devel \ libseccomp-devel \ + libselinux-devel \ libspice-server-devel \ libssh-devel \ libtasn1-devel \ diff --git a/tests/docker/dockerfiles/ubuntu1804.docker b/tests/docker/dockerfiles/ubuntu1804.docker index 0880bf3e2928..450fd06d0d57 100644 --- a/tests/docker/dockerfiles/ubuntu1804.docker +++ b/tests/docker/dockerfiles/ubuntu1804.docker @@ -60,6 +60,7 @@ ENV PACKAGES \ libsdl2-dev \ libsdl2-image-dev \ libseccomp-dev \ + libselinux-dev \ libsnappy-dev \ libspice-protocol-dev \ libspice-server-dev \ diff --git a/tests/docker/dockerfiles/ubuntu2004.docker b/tests/docker/dockerfiles/ubuntu2004.docker index 39de63d0129f..15a026be0913 100644 --- a/tests/docker/dockerfiles/ubuntu2004.docker +++ b/tests/docker/dockerfiles/ubuntu2004.docker @@ -60,6 +60,7 @@ ENV PACKAGES \ libsdl2-dev \ libsdl2-image-dev \ libseccomp-dev \ + libselinux-dev \ libslirp-dev \ libsnappy-dev \ libspice-protocol-dev \