From patchwork Wed Jun 1 21:10:02 2016
From: Eric Blake
To: qemu-devel@nongnu.org
Date: Wed, 1 Jun 2016 15:10:02 -0600
Message-Id: <1464815413-613-3-git-send-email-eblake@redhat.com>
In-Reply-To: <1464815413-613-1-git-send-email-eblake@redhat.com>
References: <1464815413-613-1-git-send-email-eblake@redhat.com>
Subject: [Qemu-devel] [PATCH v2 02/13] block: Track write zero limits in bytes
Cc: kwolf@redhat.com, Fam Zheng, Stefan Hajnoczi, qemu-block@nongnu.org,
    Peter Lieven, Max Reitz, Ronnie Sahlberg, Paolo Bonzini

Another step towards removing sector-based interfaces: convert the
maximum write and minimum alignment values from sectors to bytes.
Rename the variables to let the compiler check that all users are
converted to the new semantics.
The maximum remains an int as long as BDRV_REQUEST_MAX_SECTORS is
constrained by INT_MAX (this means that we can't even support a 2G
write_zeroes, but just under it) - changing operation lengths to
unsigned or to 64-bits is a much bigger audit, and debatable if we
even want to do it (since at the core, a 32-bit platform will still
have ssize_t as its underlying limit on write()).  Meanwhile,
alignment is changed to 'uint32_t', since it makes no sense to have
an alignment larger than the maximum write, and less painful to use
an unsigned type with well-defined behavior in bit operations than
to have to worry about what happens if a driver mistakenly supplies
a negative alignment.

Add an assert that no one was trying to use sectors to get a write
zeroes larger than 2G, and therefore that a later conversion to
bytes won't be impacted by keeping the limit at 32 bits.

Signed-off-by: Eric Blake
---
 include/block/block_int.h | 10 ++++++----
 block/io.c                | 22 +++++++++++++---------
 block/iscsi.c             | 13 ++++++-------
 block/qcow2.c             |  2 +-
 block/qed.c               |  2 +-
 block/vmdk.c              |  6 +++---
 6 files changed, 30 insertions(+), 25 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 30a9717..2e9c81f 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -328,11 +328,13 @@ typedef struct BlockLimits {
     /* optimal alignment for discard requests in sectors */
     int64_t discard_alignment;
 
-    /* maximum number of sectors that can zeroized at once */
-    int max_write_zeroes;
+    /* maximum number of bytes that can zeroized at once (since it is
+     * signed, it must be < 2G, if set) */
+    int32_t max_pwrite_zeroes;
 
-    /* optimal alignment for write zeroes requests in sectors */
-    int64_t write_zeroes_alignment;
+    /* optimal alignment for write zeroes requests in bytes, must be
+     * power of 2, and less than max_pwrite_zeroes if that is set */
+    uint32_t pwrite_zeroes_alignment;
 
     /* optimal transfer length in sectors */
     int opt_transfer_length;
diff --git a/block/io.c b/block/io.c
index 26b5845..108cd35 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1121,15 +1121,19 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
     int head = 0;
     int tail = 0;
 
-    int max_write_zeroes = MIN_NON_ZERO(bs->bl.max_write_zeroes,
-                                        BDRV_REQUEST_MAX_SECTORS);
-
-    if (bs->bl.write_zeroes_alignment) {
-        assert(is_power_of_2(bs->bl.write_zeroes_alignment));
-        head = sector_num & (bs->bl.write_zeroes_alignment - 1);
-        tail = (sector_num + nb_sectors) & (bs->bl.write_zeroes_alignment - 1);
-        max_write_zeroes &= ~(bs->bl.write_zeroes_alignment - 1);
+    int max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, INT_MAX);
+    int write_zeroes_sector_align =
+        bs->bl.pwrite_zeroes_alignment >> BDRV_SECTOR_BITS;
+
+    max_write_zeroes >>= BDRV_SECTOR_BITS;
+    if (write_zeroes_sector_align) {
+        assert(is_power_of_2(bs->bl.pwrite_zeroes_alignment));
+        head = sector_num & (write_zeroes_sector_align - 1);
+        tail = (sector_num + nb_sectors) & (write_zeroes_sector_align - 1);
+        max_write_zeroes &= ~(write_zeroes_sector_align - 1);
     }
+    assert(nb_sectors <= BDRV_REQUEST_MAX_SECTORS);
 
     while (nb_sectors > 0 && !ret) {
         int num = nb_sectors;
@@ -1139,9 +1143,9 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
          */
         if (head) {
             /* Make a small request up to the first aligned sector. */
-            num = MIN(nb_sectors, bs->bl.write_zeroes_alignment - head);
+            num = MIN(nb_sectors, write_zeroes_sector_align - head);
             head = 0;
-        } else if (tail && num > bs->bl.write_zeroes_alignment) {
+        } else if (tail && num > write_zeroes_sector_align) {
             /* Shorten the request to the last aligned sector. */
             num -= tail;
         }
diff --git a/block/iscsi.c b/block/iscsi.c
index 94f9974..52ea9d7 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1715,16 +1715,15 @@ static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
         bs->bl.discard_alignment = iscsilun->block_size >> BDRV_SECTOR_BITS;
     }
 
-    if (iscsilun->bl.max_ws_len < 0xffffffff) {
-        bs->bl.max_write_zeroes =
-            sector_limits_lun2qemu(iscsilun->bl.max_ws_len, iscsilun);
+    if (iscsilun->bl.max_ws_len < 0xffffffff / iscsilun->block_size) {
+        bs->bl.max_pwrite_zeroes =
+            iscsilun->bl.max_ws_len * iscsilun->block_size;
     }
     if (iscsilun->lbp.lbpws) {
-        bs->bl.write_zeroes_alignment =
-            sector_limits_lun2qemu(iscsilun->bl.opt_unmap_gran, iscsilun);
+        bs->bl.pwrite_zeroes_alignment =
+            iscsilun->bl.opt_unmap_gran * iscsilun->block_size;
     } else {
-        bs->bl.write_zeroes_alignment =
-            iscsilun->block_size >> BDRV_SECTOR_BITS;
+        bs->bl.pwrite_zeroes_alignment = iscsilun->block_size;
     }
     bs->bl.opt_transfer_length =
         sector_limits_lun2qemu(iscsilun->bl.opt_xfer_len, iscsilun);
diff --git a/block/qcow2.c b/block/qcow2.c
index ecac399..a6ea6cb 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1193,7 +1193,7 @@ static void qcow2_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     BDRVQcow2State *s = bs->opaque;
 
-    bs->bl.write_zeroes_alignment = s->cluster_sectors;
+    bs->bl.pwrite_zeroes_alignment = s->cluster_size;
 }
 
 static int qcow2_set_key(BlockDriverState *bs, const char *key)
diff --git a/block/qed.c b/block/qed.c
index b591d4a..0ab5b40 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -518,7 +518,7 @@ static void bdrv_qed_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     BDRVQEDState *s = bs->opaque;
 
-    bs->bl.write_zeroes_alignment = s->header.cluster_size >> BDRV_SECTOR_BITS;
+    bs->bl.pwrite_zeroes_alignment = s->header.cluster_size;
 }
 
 /* We have nothing to do for QED reopen, stubs just return
diff --git a/block/vmdk.c b/block/vmdk.c
index 372e5ed..8494d63 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -998,9 +998,9 @@ static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)
     for (i = 0; i < s->num_extents; i++) {
         if (!s->extents[i].flat) {
-            bs->bl.write_zeroes_alignment =
-                MAX(bs->bl.write_zeroes_alignment,
-                    s->extents[i].cluster_sectors);
+            bs->bl.pwrite_zeroes_alignment =
+                MAX(bs->bl.pwrite_zeroes_alignment,
+                    s->extents[i].cluster_sectors << BDRV_SECTOR_BITS);
         }
     }
 }
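The head/tail splitting that bdrv_co_do_write_zeroes() performs can be sketched standalone. This is an illustrative simplification, not the patched QEMU code: split_zero_request, MIN, and the printing are invented for the sketch, and the real loop also tracks the return value of each driver call.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

static int is_power_of_2(uint32_t v)
{
    return v && !(v & (v - 1));
}

/* Split a zeroing request [sector_num, sector_num + nb_sectors) into a
 * short unaligned head up to the first aligned sector, full aligned
 * chunks capped at max_write_zeroes, and a short unaligned tail,
 * mirroring the loop shape in the patch.  Returns the number of
 * sub-requests that would be issued. */
static int split_zero_request(int64_t sector_num, int nb_sectors,
                              uint32_t align, int max_write_zeroes)
{
    int head, tail, count = 0;

    assert(is_power_of_2(align));
    head = sector_num & (align - 1);
    tail = (sector_num + nb_sectors) & (align - 1);
    max_write_zeroes &= ~(align - 1);   /* keep chunk boundaries aligned */

    while (nb_sectors > 0) {
        int num = nb_sectors;

        if (head) {
            /* Small request up to the first aligned sector. */
            num = MIN(nb_sectors, (int)align - head);
            head = 0;
        } else if (tail && num > (int)align) {
            /* Shorten the request so it ends on an aligned sector;
             * the remaining tail is written unaligned at the end. */
            num -= tail;
        }
        num = MIN(num, max_write_zeroes);

        printf("zero %lld + %d\n", (long long)sector_num, num);
        sector_num += num;
        nb_sectors -= num;
        count++;
    }
    return count;
}
```

For example, a 10-sector request at sector 1 with a 4-sector alignment splits into a 3-sector head, a 4-sector aligned middle, and a 3-sector tail.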