From patchwork Thu Nov 17 20:13:56 2016
X-Patchwork-Submitter: Eric Blake
X-Patchwork-Id: 9435285
From: Eric Blake
To: qemu-devel@nongnu.org
Date: Thu, 17 Nov 2016 14:13:56 -0600
Message-Id: <1479413642-22463-4-git-send-email-eblake@redhat.com>
In-Reply-To: <1479413642-22463-1-git-send-email-eblake@redhat.com>
References: <1479413642-22463-1-git-send-email-eblake@redhat.com>
Subject: [Qemu-devel] [PATCH v3 3/9] block: Let write zeroes fallback work
 even with small max_transfer
Cc: kwolf@redhat.com, Fam Zheng, qemu-block@nongnu.org,
 qemu-stable@nongnu.org, Max Reitz, Stefan Hajnoczi, "Denis V. Lunev",
 pbonzini@redhat.com
Lunev" , pbonzini@redhat.com Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP Commit 443668ca rewrote the write_zeroes logic to guarantee that an unaligned request never crosses a cluster boundary. But in the rewrite, the new code assumed that at most one iteration would be needed to get to an alignment boundary. However, it is easy to trigger an assertion failure: the Linux kernel limits loopback devices to advertise a max_transfer of only 64k. Any operation that requires falling back to writes rather than more efficient zeroing must obey max_transfer during that fallback, which means an unaligned head may require multiple iterations of the write fallbacks before reaching the aligned boundaries, when layering a format with clusters larger than 64k atop the protocol of file access to a loopback device. Test case: $ qemu-img create -f qcow2 -o cluster_size=1M file 10M $ losetup /dev/loop2 /path/to/file $ qemu-io -f qcow2 /dev/loop2 qemu-io> w 7m 1k qemu-io> w -z 8003584 2093056 In fairness to Denis (as the original listed author of the culprit commit), the faulty logic for at most one iteration is probably all my fault in reworking his idea. But the solution is to restore what was in place prior to that commit: when dealing with an unaligned head or tail, iterate as many times as necessary while fragmenting the operation at max_transfer boundaries. Reported-by: Ed Swierk CC: qemu-stable@nongnu.org CC: Denis V. Lunev Signed-off-by: Eric Blake Reviewed-by: Max Reitz --- block/io.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/block/io.c b/block/io.c index aa532a5..085ac34 100644 --- a/block/io.c +++ b/block/io.c @@ -1214,6 +1214,8 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, int max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, INT_MAX); int alignment = MAX(bs->bl.pwrite_zeroes_alignment, bs->bl.request_alignment); + int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer, + MAX_WRITE_ZEROES_BOUNCE_BUFFER); assert(alignment % bs->bl.request_alignment == 0); head = offset % alignment; @@ -1229,9 +1231,12 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, * boundaries. */ if (head) { - /* Make a small request up to the first aligned sector. */ - num = MIN(count, alignment - head); - head = 0; + /* Make a small request up to the first aligned sector. For + * convenience, limit this request to max_transfer even if + * we don't need to fall back to writes. */ + num = MIN(MIN(count, max_transfer), alignment - head); + head = (head + num) % alignment; + assert(num < max_write_zeroes); } else if (tail && num > alignment) { /* Shorten the request to the last aligned sector. */ num -= tail; @@ -1257,8 +1262,6 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, if (ret == -ENOTSUP) { /* Fall back to bounce buffer if write zeroes is unsupported */ - int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer, - MAX_WRITE_ZEROES_BOUNCE_BUFFER); BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE; if ((flags & BDRV_REQ_FUA) &&