From patchwork Tue Apr 11 01:17:12 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 9674459 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id A1E2060382 for ; Tue, 11 Apr 2017 01:27:49 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8C55128487 for ; Tue, 11 Apr 2017 01:27:49 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7DAD7284E7; Tue, 11 Apr 2017 01:27:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 7CED428487 for ; Tue, 11 Apr 2017 01:27:48 +0000 (UTC) Received: from localhost ([::1]:36733 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cxkax-0004Qr-KI for patchwork-qemu-devel@patchwork.kernel.org; Mon, 10 Apr 2017 21:27:47 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:46535) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cxkRB-0005MA-Ab for qemu-devel@nongnu.org; Mon, 10 Apr 2017 21:17:43 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cxkR9-0005sr-It for qemu-devel@nongnu.org; Mon, 10 Apr 2017 21:17:41 -0400 Received: from mx1.redhat.com ([209.132.183.28]:59528) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1cxkR3-0005ou-IC; Mon, 10 Apr 2017 21:17:33 -0400 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 9B67D8124F; Tue, 11 Apr 2017 01:17:32 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 9B67D8124F Authentication-Results: ext-mx01.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx01.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=eblake@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 9B67D8124F Received: from red.redhat.com (ovpn-122-58.rdu2.redhat.com [10.10.122.58]) by smtp.corp.redhat.com (Postfix) with ESMTP id E281678DF2; Tue, 11 Apr 2017 01:17:31 +0000 (UTC) From: Eric Blake To: qemu-devel@nongnu.org Date: Mon, 10 Apr 2017 20:17:12 -0500 Message-Id: <20170411011718.9152-8-eblake@redhat.com> In-Reply-To: <20170411011718.9152-1-eblake@redhat.com> References: <20170411011718.9152-1-eblake@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Tue, 11 Apr 2017 01:17:32 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH v9 07/13] qcow2: Discard/zero clusters by byte count X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kwolf@redhat.com, qemu-block@nongnu.org, mreitz@redhat.com Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP Passing a byte offset, but sector count, when we ultimately want to operate on cluster granularity, is madness. Clean up the external interfaces to take both offset and count as bytes, while still keeping the assertion added previously that the caller must align the values to a cluster. Then rename things to make sure backports don't get confused by changed units: instead of qcow2_discard_clusters() and qcow2_zero_clusters(), we now have qcow2_cluster_discard() and qcow2_cluster_zeroize(). The internal functions still operate on clusters at a time, and return an int for number of cleared clusters; but on an image with 2M clusters, a single L2 table holds 256k entries that each represent a 2M cluster, totalling well over INT_MAX bytes if we ever had a request for that many bytes at once. All our callers currently limit themselves to 32-bit bytes (and therefore fewer clusters), but by making this function 64-bit clean, we have one less place to clean up if we later improve the block layer to support 64-bit bytes through all operations (with the block layer auto-fragmenting on behalf of more-limited drivers), rather than the current state where some interfaces are artificially limited to INT_MAX at a time. Signed-off-by: Eric Blake --- v9: rebase to earlier changes, drop R-b v7, v8: only earlier half of series submitted for 2.9 v6: rebase due to context v5: s/count/byte/ to make the units obvious, and rework the math to ensure no 32-bit integer overflow on large clusters v4: improve function names, split assertion additions into earlier patch [no v3 or v2] v1: https://lists.gnu.org/archive/html/qemu-devel/2016-12/msg00339.html --- block/qcow2.h | 9 +++++---- block/qcow2-cluster.c | 39 ++++++++++++++++++++++----------------- block/qcow2-snapshot.c | 7 +++---- block/qcow2.c | 21 +++++++++------------ 4 files changed, 39 insertions(+), 37 deletions(-) diff --git a/block/qcow2.h b/block/qcow2.h index f8aeb08..808104c 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -544,10 +544,11 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs, int compressed_size); int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m); -int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset, - int nb_sectors, enum qcow2_discard_type type, bool full_discard); -int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors, - int flags); +int qcow2_cluster_discard(BlockDriverState *bs, uint64_t offset, + uint64_t bytes, enum qcow2_discard_type type, + bool full_discard); +int qcow2_cluster_zeroize(BlockDriverState *bs, uint64_t offset, + uint64_t bytes, int flags); int qcow2_expand_zero_clusters(BlockDriverState *bs, BlockDriverAmendStatusCB *status_cb, diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index 362a855..595e6e3 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -1509,16 +1509,16 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset, return nb_clusters; } -int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset, - int nb_sectors, enum qcow2_discard_type type, bool full_discard) +int qcow2_cluster_discard(BlockDriverState *bs, uint64_t offset, + uint64_t bytes, enum qcow2_discard_type type, + bool full_discard) { BDRVQcow2State *s = bs->opaque; - uint64_t end_offset; + uint64_t end_offset = offset + bytes; uint64_t nb_clusters; + int64_t cleared; int ret; - end_offset = offset + (nb_sectors << BDRV_SECTOR_BITS); - /* Caller must pass aligned values, except at image end */ assert(QEMU_IS_ALIGNED(offset, s->cluster_size)); assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size) || @@ -1530,13 +1530,15 @@ int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset, /* Each L2 table is handled by its own loop iteration */ while (nb_clusters > 0) { - ret = discard_single_l2(bs, offset, nb_clusters, type, full_discard); - if (ret < 0) { + cleared = discard_single_l2(bs, offset, nb_clusters, type, + full_discard); + if (cleared < 0) { + ret = cleared; goto fail; } - nb_clusters -= ret; - offset += (ret * s->cluster_size); + nb_clusters -= cleared; + offset += (cleared * s->cluster_size); } ret = 0; @@ -1591,20 +1593,22 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset, return nb_clusters; } -int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors, - int flags) +int qcow2_cluster_zeroize(BlockDriverState *bs, uint64_t offset, + uint64_t bytes, int flags) { BDRVQcow2State *s = bs->opaque; uint64_t end_offset; uint64_t nb_clusters; + int64_t cleared; int ret; - end_offset = offset + (nb_sectors << BDRV_SECTOR_BITS); + end_offset = offset + bytes; /* Caller must pass aligned values, except at image end */ assert(QEMU_IS_ALIGNED(offset, s->cluster_size)); assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size) || end_offset == bs->total_sectors << BDRV_SECTOR_BITS); + assert(QEMU_IS_ALIGNED(bytes, s->cluster_size)); /* The zero flag is only supported by version 3 and newer; we * require the use of that flag if there is a backing file or if @@ -1615,18 +1619,19 @@ int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors, } /* Each L2 table is handled by its own loop iteration */ - nb_clusters = size_to_clusters(s, nb_sectors << BDRV_SECTOR_BITS); + nb_clusters = size_to_clusters(s, bytes); s->cache_discards = true; while (nb_clusters > 0) { - ret = zero_single_l2(bs, offset, nb_clusters, flags); - if (ret < 0) { + cleared = zero_single_l2(bs, offset, nb_clusters, flags); + if (cleared < 0) { + ret = cleared; goto fail; } - nb_clusters -= ret; - offset += (ret * s->cluster_size); + nb_clusters -= cleared; + offset += (cleared * s->cluster_size); } ret = 0; diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c index 0324243..44243e0 100644 --- a/block/qcow2-snapshot.c +++ b/block/qcow2-snapshot.c @@ -440,10 +440,9 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info) /* The VM state isn't needed any more in the active L1 table; in fact, it * hurts by causing expensive COW for the next snapshot. */ - qcow2_discard_clusters(bs, qcow2_vm_state_offset(s), - align_offset(sn->vm_state_size, s->cluster_size) - >> BDRV_SECTOR_BITS, - QCOW2_DISCARD_NEVER, false); + qcow2_cluster_discard(bs, qcow2_vm_state_offset(s), + align_offset(sn->vm_state_size, s->cluster_size), + QCOW2_DISCARD_NEVER, false); #ifdef DEBUG_ALLOC { diff --git a/block/qcow2.c b/block/qcow2.c index 8038793..4d34610 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -2508,7 +2508,7 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs, trace_qcow2_pwrite_zeroes(qemu_coroutine_self(), offset, count); /* Whatever is left can use real zero clusters */ - ret = qcow2_zero_clusters(bs, offset, count >> BDRV_SECTOR_BITS, flags); + ret = qcow2_cluster_zeroize(bs, offset, count, flags); qemu_co_mutex_unlock(&s->lock); return ret; @@ -2531,8 +2531,8 @@ static coroutine_fn int qcow2_co_pdiscard(BlockDriverState *bs, } qemu_co_mutex_lock(&s->lock); - ret = qcow2_discard_clusters(bs, offset, count >> BDRV_SECTOR_BITS, - QCOW2_DISCARD_REQUEST, false); + ret = qcow2_cluster_discard(bs, offset, count, QCOW2_DISCARD_REQUEST, + false); qemu_co_mutex_unlock(&s->lock); return ret; } @@ -2839,9 +2839,8 @@ fail: static int qcow2_make_empty(BlockDriverState *bs) { BDRVQcow2State *s = bs->opaque; - uint64_t start_sector; - int sector_step = (QEMU_ALIGN_DOWN(INT_MAX, s->cluster_size) / - BDRV_SECTOR_SIZE); + uint64_t offset, end_offset; + int step = QEMU_ALIGN_DOWN(INT_MAX, s->cluster_size); int l1_clusters, ret = 0; l1_clusters = DIV_ROUND_UP(s->l1_size, s->cluster_size / sizeof(uint64_t)); @@ -2858,18 +2857,16 @@ static int qcow2_make_empty(BlockDriverState *bs) /* This fallback code simply discards every active cluster; this is slow, * but works in all cases */ - for (start_sector = 0; start_sector < bs->total_sectors; - start_sector += sector_step) + end_offset = bs->total_sectors * BDRV_SECTOR_SIZE; + for (offset = 0; offset < end_offset; offset += step) { /* As this function is generally used after committing an external * snapshot, QCOW2_DISCARD_SNAPSHOT seems appropriate. Also, the * default action for this kind of discard is to pass the discard, * which will ideally result in an actually smaller image file, as * is probably desired. */ - ret = qcow2_discard_clusters(bs, start_sector * BDRV_SECTOR_SIZE, - MIN(sector_step, - bs->total_sectors - start_sector), - QCOW2_DISCARD_SNAPSHOT, true); + ret = qcow2_cluster_discard(bs, offset, MIN(step, end_offset - offset), + QCOW2_DISCARD_SNAPSHOT, true); if (ret < 0) { break; }