From patchwork Fri May 5 02:15:00 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eric Blake
X-Patchwork-Id: 9713005
From: Eric Blake
To: qemu-devel@nongnu.org
Date: Thu, 4 May 2017 21:15:00 -0500
Message-Id: <20170505021500.19315-3-eblake@redhat.com>
In-Reply-To: <20170505021500.19315-1-eblake@redhat.com>
References: <20170505021500.19315-1-eblake@redhat.com>
Subject: [Qemu-devel] [RFC PATCH 2/2] block: Exploit BDRV_BLOCK_EOF for larger zero blocks
Cc: kwolf@redhat.com, Fam Zheng, Stefan Hajnoczi, qemu-block@nongnu.org, mreitz@redhat.com

When we have a BDS with unallocated clusters, but asking the status
of its underlying bs->file or backing layer encounters an end-of-file
condition, we know that the rest of the unallocated area
will read as zeroes.  However, pre-patch, this required two separate
calls to bdrv_get_block_status(), as the first call stops at the
point where the underlying file ends.  Thanks to BDRV_BLOCK_EOF, we
can now widen the results of the primary status if the secondary
status already includes BDRV_BLOCK_ZERO.

In turn, this fixes a TODO mentioned in iotest 154, where we can now
see that all sectors in a partial cluster at the end of a file read
as zero when coupling the shorter backing file's status along with
our knowledge that the remaining sectors came from an unallocated
cluster.

Also, note that the loop in bdrv_co_get_block_status_above() had an
inefficient exit: in cases where the active layer sets
BDRV_BLOCK_ZERO but does NOT set BDRV_BLOCK_ALLOCATED (namely, where
we know we read zeroes merely because our unallocated clusters lie
beyond the backing file's shorter length), we still ended up probing
the backing layer even though we already had a good answer.

Signed-off-by: Eric Blake
---
 block/io.c                 | 27 ++++++++++++++++++++++-----
 tests/qemu-iotests/154     |  4 ----
 tests/qemu-iotests/154.out | 12 ++++++------
 3 files changed, 28 insertions(+), 15 deletions(-)

diff --git a/block/io.c b/block/io.c
index a7bc8bd..0ec9c75 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1819,10 +1819,13 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
             /* Ignore errors.  This is just providing extra information, it
              * is useful but not necessary.
              */
-            if (!file_pnum) {
-                /* !file_pnum indicates an offset at or beyond the EOF; it is
-                 * perfectly valid for the format block driver to point to such
-                 * offsets, so catch it and mark everything as zero */
+            if (ret2 & BDRV_BLOCK_EOF &&
+                (!file_pnum || ret2 & BDRV_BLOCK_ZERO)) {
+                /*
+                 * It is valid for the format block driver to read
+                 * beyond the end of the underlying file's current
+                 * size; such areas read as zero.
+                 */
                 ret |= BDRV_BLOCK_ZERO;
             } else {
                 /* Limit request to the range reported by the protocol driver */
@@ -1849,16 +1852,30 @@ static int64_t coroutine_fn bdrv_co_get_block_status_above(BlockDriverState *bs,
 {
     BlockDriverState *p;
     int64_t ret = 0;
+    bool first = true;
 
     assert(bs != base);
     for (p = bs; p != base; p = backing_bs(p)) {
         ret = bdrv_co_get_block_status(p, sector_num, nb_sectors, pnum, file);
-        if (ret < 0 || ret & BDRV_BLOCK_ALLOCATED) {
+        if (ret < 0) {
+            break;
+        }
+        if (ret & BDRV_BLOCK_ZERO && ret & BDRV_BLOCK_EOF && !first) {
+            /*
+             * Reading beyond the end of the file continues to read
+             * zeroes, but we can only widen the result to the
+             * unallocated length we learned from an earlier
+             * iteration.
+             */
+            *pnum = nb_sectors;
+        }
+        if (ret & (BDRV_BLOCK_ZERO | BDRV_BLOCK_DATA)) {
             break;
         }
         /* [sector_num, pnum] unallocated on this layer, which could be only
          * the first part of [sector_num, nb_sectors].  */
         nb_sectors = MIN(nb_sectors, *pnum);
+        first = false;
     }
     return ret;
 }
diff --git a/tests/qemu-iotests/154 b/tests/qemu-iotests/154
index 687b8f3..d4dae83 100755
--- a/tests/qemu-iotests/154
+++ b/tests/qemu-iotests/154
@@ -334,8 +334,6 @@ $QEMU_IO -c "alloc $size 2048" "$TEST_IMG" | _filter_qemu_io
 $QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
 
 # Repeat with backing file holding unallocated cluster.
-# TODO: Note that this forces an allocation, because we aren't yet able to
-# quickly detect that reads beyond EOF of the backing file are always zero
 CLUSTER_SIZE=2048 TEST_IMG="$TEST_IMG.base" _make_test_img $((size + 1024))
 
 # Write at the front: sector-wise, the request is:
@@ -371,8 +369,6 @@ $QEMU_IO -c "alloc $size 2048" "$TEST_IMG" | _filter_qemu_io
 $QEMU_IMG map --output=json "$TEST_IMG" | _filter_qemu_img_map
 
 # Repeat with backing file holding zero'd cluster
-# TODO: Note that this forces an allocation, because we aren't yet able to
-# quickly detect that reads beyond EOF of the backing file are always zero
 $QEMU_IO -c "write -z $size 512" "$TEST_IMG.base" | _filter_qemu_io
 
 # Write at the front: sector-wise, the request is:
diff --git a/tests/qemu-iotests/154.out b/tests/qemu-iotests/154.out
index b86f074..9e7abf2 100644
--- a/tests/qemu-iotests/154.out
+++ b/tests/qemu-iotests/154.out
@@ -310,19 +310,19 @@ wrote 512/512 bytes at offset 134217728
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 2048/2048 bytes allocated at offset 128 MiB
 [{ "start": 0, "length": 134217728, "depth": 1, "zero": true, "data": false},
-{ "start": 134217728, "length": 2048, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
+{ "start": 134217728, "length": 2048, "depth": 0, "zero": true, "data": false}]
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134219776 backing_file=TEST_DIR/t.IMGFMT.base
 wrote 512/512 bytes at offset 134219264
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 2048/2048 bytes allocated at offset 128 MiB
 [{ "start": 0, "length": 134217728, "depth": 1, "zero": true, "data": false},
-{ "start": 134217728, "length": 2048, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
+{ "start": 134217728, "length": 2048, "depth": 0, "zero": true, "data": false}]
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134219776 backing_file=TEST_DIR/t.IMGFMT.base
 wrote 1024/1024 bytes at offset 134218240
 1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 2048/2048 bytes allocated at offset 128 MiB
 [{ "start": 0, "length": 134217728, "depth": 1, "zero": true, "data": false},
-{ "start": 134217728, "length": 2048, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
+{ "start": 134217728, "length": 2048, "depth": 0, "zero": true, "data": false}]
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134219776 backing_file=TEST_DIR/t.IMGFMT.base
 wrote 2048/2048 bytes at offset 134217728
 2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -336,19 +336,19 @@ wrote 512/512 bytes at offset 134217728
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 2048/2048 bytes allocated at offset 128 MiB
 [{ "start": 0, "length": 134217728, "depth": 1, "zero": true, "data": false},
-{ "start": 134217728, "length": 2048, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
+{ "start": 134217728, "length": 2048, "depth": 0, "zero": true, "data": false}]
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134219776 backing_file=TEST_DIR/t.IMGFMT.base
 wrote 512/512 bytes at offset 134219264
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 2048/2048 bytes allocated at offset 128 MiB
 [{ "start": 0, "length": 134217728, "depth": 1, "zero": true, "data": false},
-{ "start": 134217728, "length": 2048, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
+{ "start": 134217728, "length": 2048, "depth": 0, "zero": true, "data": false}]
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134219776 backing_file=TEST_DIR/t.IMGFMT.base
 wrote 1024/1024 bytes at offset 134218240
 1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 2048/2048 bytes allocated at offset 128 MiB
 [{ "start": 0, "length": 134217728, "depth": 1, "zero": true, "data": false},
-{ "start": 134217728, "length": 2048, "depth": 0, "zero": false, "data": true, "offset": OFFSET}]
+{ "start": 134217728, "length": 2048, "depth": 0, "zero": true, "data": false}]
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134219776 backing_file=TEST_DIR/t.IMGFMT.base
 wrote 2048/2048 bytes at offset 134217728
 2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
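
The widening rule described in the commit message can be tried outside
QEMU with a toy model.  What follows is only a rough sketch under stated
assumptions, not the QEMU code: struct layer, layer_status(),
status_above() and the ST_* flags are hypothetical stand-ins for
BlockDriverState, bdrv_co_get_block_status(),
bdrv_co_get_block_status_above() and the BDRV_BLOCK_* bits, and the
model assumes that a layer reports EOF whenever its answer reaches the
end of that layer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags, loosely mirroring BDRV_BLOCK_DATA/ZERO/EOF. */
enum { ST_DATA = 1 << 0, ST_ZERO = 1 << 1, ST_EOF = 1 << 2 };

/* Hypothetical layer: sectors [0, data_end) hold data, while
 * [data_end, len) are unallocated and defer to the backing layer. */
struct layer {
    int64_t len;
    int64_t data_end;
    const struct layer *backing;
};

/* Status of a single layer for up to n sectors starting at start. */
static int layer_status(const struct layer *l, int64_t start, int64_t n,
                        int64_t *pnum)
{
    int64_t end = start + n;
    int ret = 0;

    if (start >= l->len) {
        /* Entirely beyond this layer's end: reads as zero. */
        *pnum = 0;
        return ST_ZERO | ST_EOF;
    }
    if (end > l->len) {
        end = l->len;
    }
    if (start < l->data_end) {
        if (end > l->data_end) {
            end = l->data_end;
        }
        ret = ST_DATA;
    } else if (l->backing == NULL) {
        /* Unallocated with nothing underneath: reads as zero. */
        ret = ST_ZERO;
    }
    *pnum = end - start;
    if (end == l->len) {
        ret |= ST_EOF;
    }
    return ret;
}

/* Walk the chain from top down, widening a zero answer that hit EOF
 * in a lower layer back to the unallocated length seen above it. */
static int status_above(const struct layer *top, int64_t start, int64_t n,
                        int64_t *pnum)
{
    const struct layer *p;
    int ret = 0;
    int first = 1;

    assert(top != NULL);
    for (p = top; p != NULL; p = p->backing) {
        ret = layer_status(p, start, n, pnum);
        if ((ret & ST_ZERO) && (ret & ST_EOF) && !first) {
            *pnum = n;
        }
        if (ret & (ST_ZERO | ST_DATA)) {
            break;
        }
        /* Unallocated here; narrow the request for the next layer. */
        if (*pnum < n) {
            n = *pnum;
        }
        first = 0;
    }
    return ret;
}
```

In this model, an 8-sector overlay with nothing allocated on top of a
6-sector base answers status_above(&top, 4, 4, &pnum) with ST_ZERO and
pnum == 4.  Without the widening step, pnum would stay at 2, the point
where the base ends, and a second query would be needed for the rest.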