From patchwork Wed Jun 1 21:10:03 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 9148137 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 5243F60751 for ; Wed, 1 Jun 2016 21:11:16 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 44500268AE for ; Wed, 1 Jun 2016 21:11:16 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 359DC26C2F; Wed, 1 Jun 2016 21:11:16 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 8D2EF26FA0 for ; Wed, 1 Jun 2016 21:11:15 +0000 (UTC) Received: from localhost ([::1]:44119 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1b8DQ2-0002LY-JU for patchwork-qemu-devel@patchwork.kernel.org; Wed, 01 Jun 2016 17:11:14 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:58941) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1b8DPJ-00027v-Nf for qemu-devel@nongnu.org; Wed, 01 Jun 2016 17:10:31 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1b8DPC-0008BY-DT for qemu-devel@nongnu.org; Wed, 01 Jun 2016 17:10:28 -0400 Received: from mx1.redhat.com ([209.132.183.28]:52529) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1b8DP8-00089J-Kt; Wed, 01 Jun 2016 17:10:18 -0400 Received: from int-mx10.intmail.prod.int.phx2.redhat.com (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 5548D7F093; Wed, 1 Jun 2016 21:10:18 +0000 (UTC) Received: from red.redhat.com (ovpn-116-150.phx2.redhat.com [10.3.116.150]) by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id u51LAF6Q027968; Wed, 1 Jun 2016 17:10:17 -0400 From: Eric Blake To: qemu-devel@nongnu.org Date: Wed, 1 Jun 2016 15:10:03 -0600 Message-Id: <1464815413-613-4-git-send-email-eblake@redhat.com> In-Reply-To: <1464815413-613-1-git-send-email-eblake@redhat.com> References: <1464815413-613-1-git-send-email-eblake@redhat.com> X-Scanned-By: MIMEDefang 2.68 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Wed, 01 Jun 2016 21:10:18 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH v2 03/13] block: Add .bdrv_co_pwrite_zeroes() X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kwolf@redhat.com, Fam Zheng , Stefan Hajnoczi , qemu-block@nongnu.org, Max Reitz Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP Update bdrv_co_do_write_zeroes() to be byte-based, and select between the new byte-based bdrv_co_pwrite_zeroes() or the old bdrv_co_write_zeroes(). The next patches will convert drivers, then remove the old interface. Signed-off-by: Eric Blake --- include/block/block_int.h | 4 ++- block/io.c | 78 ++++++++++++++++++++++++++--------------------- 2 files changed, 46 insertions(+), 36 deletions(-) diff --git a/include/block/block_int.h b/include/block/block_int.h index 2e9c81f..1dfdf92 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -165,6 +165,8 @@ struct BlockDriver { */ int coroutine_fn (*bdrv_co_write_zeroes)(BlockDriverState *bs, int64_t sector_num, int nb_sectors, BdrvRequestFlags flags); + int coroutine_fn (*bdrv_co_pwrite_zeroes)(BlockDriverState *bs, + int64_t offset, int count, BdrvRequestFlags flags); int coroutine_fn (*bdrv_co_discard)(BlockDriverState *bs, int64_t sector_num, int nb_sectors); int64_t coroutine_fn (*bdrv_co_get_block_status)(BlockDriverState *bs, @@ -456,7 +458,7 @@ struct BlockDriverState { unsigned int request_alignment; /* Flags honored during pwrite (so far: BDRV_REQ_FUA) */ unsigned int supported_write_flags; - /* Flags honored during write_zeroes (so far: BDRV_REQ_FUA, + /* Flags honored during pwrite_zeroes (so far: BDRV_REQ_FUA, * BDRV_REQ_MAY_UNMAP) */ unsigned int supported_zero_flags; diff --git a/block/io.c b/block/io.c index 108cd35..3fe7576 100644 --- a/block/io.c +++ b/block/io.c @@ -42,8 +42,8 @@ static BlockAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs, void *opaque, bool is_write); static void coroutine_fn bdrv_co_do_rw(void *opaque); -static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs, - int64_t sector_num, int nb_sectors, BdrvRequestFlags flags); +static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, + int64_t offset, int count, BdrvRequestFlags flags); static void bdrv_parent_drained_begin(BlockDriverState *bs) { @@ -893,10 +893,12 @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BlockDriverState *bs, goto err; } - if (drv->bdrv_co_write_zeroes && + if ((drv->bdrv_co_write_zeroes || drv->bdrv_co_pwrite_zeroes) && buffer_is_zero(bounce_buffer, iov.iov_len)) { - ret = bdrv_co_do_write_zeroes(bs, cluster_sector_num, - cluster_nb_sectors, 0); + ret = bdrv_co_do_pwrite_zeroes(bs, + cluster_sector_num * BDRV_SECTOR_SIZE, + cluster_nb_sectors * BDRV_SECTOR_SIZE, + 0); } else { /* This does not change the data on the disk, it is not necessary * to flush even in cache=writethrough mode. @@ -1110,8 +1112,8 @@ int coroutine_fn bdrv_co_readv(BlockDriverState *bs, int64_t sector_num, #define MAX_WRITE_ZEROES_BOUNCE_BUFFER 32768 -static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs, - int64_t sector_num, int nb_sectors, BdrvRequestFlags flags) +static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, + int64_t offset, int count, BdrvRequestFlags flags) { BlockDriver *drv = bs->drv; QEMUIOVector qiov; @@ -1122,20 +1124,16 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs, int tail = 0; int max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, INT_MAX); - int write_zeroes_sector_align = - bs->bl.pwrite_zeroes_alignment >> BDRV_SECTOR_BITS; + int alignment = MAX(bs->bl.pwrite_zeroes_alignment ?: 1, + bs->request_alignment); - max_write_zeroes >>= BDRV_SECTOR_BITS; - if (write_zeroes_sector_align) { - assert(is_power_of_2(bs->bl.pwrite_zeroes_alignment)); - head = sector_num & (write_zeroes_sector_align - 1); - tail = (sector_num + nb_sectors) & (write_zeroes_sector_align - 1); - max_write_zeroes &= ~(write_zeroes_sector_align - 1); - } + assert(is_power_of_2(alignment)); + head = offset & (alignment - 1); + tail = (offset + count) & (alignment - 1); + max_write_zeroes &= ~(alignment - 1); - assert(nb_sectors <= BDRV_REQUEST_MAX_SECTORS); - while (nb_sectors > 0 && !ret) { - int num = nb_sectors; + while (count > 0 && !ret) { + int num = count; /* Align request. Block drivers can expect the "bulk" of the request * to be aligned, and that unaligned requests do not cross cluster @@ -1143,9 +1141,9 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs, */ if (head) { /* Make a small request up to the first aligned sector. */ - num = MIN(nb_sectors, write_zeroes_sector_align - head); + num = MIN(count, alignment - head); head = 0; - } else if (tail && num > write_zeroes_sector_align) { + } else if (tail && num > alignment) { /* Shorten the request to the last aligned sector. */ num -= tail; } @@ -1157,8 +1155,18 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs, ret = -ENOTSUP; /* First try the efficient write zeroes operation */ - if (drv->bdrv_co_write_zeroes) { - ret = drv->bdrv_co_write_zeroes(bs, sector_num, num, + if (drv->bdrv_co_pwrite_zeroes) { + ret = drv->bdrv_co_pwrite_zeroes(bs, offset, num, + flags & bs->supported_zero_flags); + if (ret != -ENOTSUP && (flags & BDRV_REQ_FUA) && + !(bs->supported_zero_flags & BDRV_REQ_FUA)) { + need_flush = true; + } + } else if (drv->bdrv_co_write_zeroes) { + assert(offset % BDRV_SECTOR_SIZE == 0); + assert(count % BDRV_SECTOR_SIZE == 0); + ret = drv->bdrv_co_write_zeroes(bs, offset >> BDRV_SECTOR_BITS, + num >> BDRV_SECTOR_BITS, flags & bs->supported_zero_flags); if (ret != -ENOTSUP && (flags & BDRV_REQ_FUA) && !(bs->supported_zero_flags & BDRV_REQ_FUA)) { @@ -1181,33 +1189,31 @@ static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs, write_flags &= ~BDRV_REQ_FUA; need_flush = true; } - num = MIN(num, max_xfer_len); - iov.iov_len = num * BDRV_SECTOR_SIZE; + num = MIN(num, max_xfer_len << BDRV_SECTOR_BITS); + iov.iov_len = num; if (iov.iov_base == NULL) { - iov.iov_base = qemu_try_blockalign(bs, num * BDRV_SECTOR_SIZE); + iov.iov_base = qemu_try_blockalign(bs, num); if (iov.iov_base == NULL) { ret = -ENOMEM; goto fail; } - memset(iov.iov_base, 0, num * BDRV_SECTOR_SIZE); + memset(iov.iov_base, 0, num); } qemu_iovec_init_external(&qiov, &iov, 1); - ret = bdrv_driver_pwritev(bs, sector_num * BDRV_SECTOR_SIZE, - num * BDRV_SECTOR_SIZE, &qiov, - write_flags); + ret = bdrv_driver_pwritev(bs, offset, num, &qiov, write_flags); /* Keep bounce buffer around if it is big enough for all * all future requests. */ - if (num < max_xfer_len) { + if (num < max_xfer_len << BDRV_SECTOR_BITS) { qemu_vfree(iov.iov_base); iov.iov_base = NULL; } } - sector_num += num; - nb_sectors -= num; + offset += num; + count -= num; } fail: @@ -1245,7 +1251,8 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs, ret = notifier_with_return_list_notify(&bs->before_write_notifiers, req); if (!ret && bs->detect_zeroes != BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF && - !(flags & BDRV_REQ_ZERO_WRITE) && drv->bdrv_co_write_zeroes && + !(flags & BDRV_REQ_ZERO_WRITE) && + (drv->bdrv_co_pwrite_zeroes || drv->bdrv_co_write_zeroes) && qemu_iovec_is_zero(qiov)) { flags |= BDRV_REQ_ZERO_WRITE; if (bs->detect_zeroes == BLOCKDEV_DETECT_ZEROES_OPTIONS_UNMAP) { @@ -1257,7 +1264,8 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs, /* Do nothing, write notifier decided to fail this request */ } else if (flags & BDRV_REQ_ZERO_WRITE) { bdrv_debug_event(bs, BLKDBG_PWRITEV_ZERO); - ret = bdrv_co_do_write_zeroes(bs, sector_num, nb_sectors, flags); + ret = bdrv_co_do_pwrite_zeroes(bs, sector_num << BDRV_SECTOR_BITS, + nb_sectors << BDRV_SECTOR_BITS, flags); } else { bdrv_debug_event(bs, BLKDBG_PWRITEV); ret = bdrv_driver_pwritev(bs, offset, bytes, qiov, flags);