From patchwork Sun May 21 14:21:46 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Denis V. Lunev"
X-Patchwork-Id: 9739087
From: "Denis V. Lunev"
To: qemu-block@nongnu.org, qemu-devel@nongnu.org
Date: Sun, 21 May 2017 17:21:46 +0300
Message-Id: <1495376506-13227-1-git-send-email-den@openvz.org>
X-Mailer: git-send-email 2.7.4
Subject: [Qemu-devel] [PATCH 1/1] qcow2: handle cluster leak happening with a guest TRIM command
Cc: Kevin Wolf, "Denis V. Lunev", Max Reitz

  qemu-img create -f qcow2 1.img 64G
  qemu-io -c "write -P 0x32 0 64k" 1.img

results in

  324 -rw-r--r-- 1 den den 393216 May 21 16:48 1.img

Subsequent

  qemu-io -c "write -z 0 64k" 1.img
  qemu-io -c "write -P 0x32 0 64k" 1.img

results in

  388 -rw-r--r-- 1 den den 458752 May 21 16:50 1.img

which looks like we have 1 cluster leaked.

Indeed, qcow2_co_pwrite_zeroes calls qcow2_zero_clusters/zero_single_l2,
which does not update the refcount of the host cluster and keeps the offset
marked as used. Later on, handle_copied() does not take the
QCOW2_CLUSTER_ZERO type of the cluster into account.

As a result, we end up in a very bad situation after qcow2_zero_clusters.
We have a hole in the host file and we still have the offset allocated for
that guest cluster. Now there are 2 options:
1) allocate a new offset once a write comes into this guest cluster
   (this is what actually happens, but the original cluster offset is leaked)
2) re-use host offset, i.e.
   fix handle_copied() to allow reusing the offset not only for
   QCOW2_CLUSTER_NORMAL but for QCOW2_CLUSTER_ZERO too

Option 2) seems worse to me, as in this case we can easily get host
fragmentation in that cluster if writes come in small pieces. This is not
a very big deal if the filesystem supports PUNCH_HOLE, but without this
feature the cluster is actually leaked forever.

The patch fixes the situation by replacing zero_single_l2 with
discard_single_l2 and removing the now-unused zero_single_l2.

Signed-off-by: Denis V. Lunev
CC: Kevin Wolf
CC: Max Reitz
---
 block/qcow2-cluster.c | 46 ++--------------------------------------------
 1 file changed, 2 insertions(+), 44 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 100398c..1e53a7c 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1548,49 +1548,6 @@ fail:
     return ret;
 }
 
-/*
- * This zeroes as many clusters of nb_clusters as possible at once (i.e.
- * all clusters in the same L2 table) and returns the number of zeroed
- * clusters.
- */
-static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
-                          uint64_t nb_clusters, int flags)
-{
-    BDRVQcow2State *s = bs->opaque;
-    uint64_t *l2_table;
-    int l2_index;
-    int ret;
-    int i;
-
-    ret = get_cluster_table(bs, offset, &l2_table, &l2_index);
-    if (ret < 0) {
-        return ret;
-    }
-
-    /* Limit nb_clusters to one L2 table */
-    nb_clusters = MIN(nb_clusters, s->l2_size - l2_index);
-    assert(nb_clusters <= INT_MAX);
-
-    for (i = 0; i < nb_clusters; i++) {
-        uint64_t old_offset;
-
-        old_offset = be64_to_cpu(l2_table[l2_index + i]);
-
-        /* Update L2 entries */
-        qcow2_cache_entry_mark_dirty(bs, s->l2_table_cache, l2_table);
-        if (old_offset & QCOW_OFLAG_COMPRESSED || flags & BDRV_REQ_MAY_UNMAP) {
-            l2_table[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO);
-            qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
-        } else {
-            l2_table[l2_index + i] |= cpu_to_be64(QCOW_OFLAG_ZERO);
-        }
-    }
-
-    qcow2_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
-
-    return nb_clusters;
-}
-
 int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors,
                         int flags)
 {
@@ -1609,7 +1566,8 @@ int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors,
     s->cache_discards = true;
 
     while (nb_clusters > 0) {
-        ret = zero_single_l2(bs, offset, nb_clusters, flags);
+        ret = discard_single_l2(bs, offset, nb_clusters,
+                                QCOW2_DISCARD_REQUEST, false);
         if (ret < 0) {
             goto fail;
         }
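
Note on the numbers in the description: the "1 cluster leaked" conclusion
follows from the two file sizes reported by ls. The sketch below only checks
that arithmetic. It assumes the image uses the qcow2 default cluster size of
64 KiB (the qemu-img command in the description passes no cluster_size
option), and clusters_grown() is a hypothetical helper written for
illustration, not a QEMU function.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the test image uses the qcow2 default cluster size (64 KiB). */
#define CLUSTER_SIZE UINT64_C(65536)

/*
 * Hypothetical helper, not part of QEMU: number of whole clusters by which
 * an image file grew between two size readings taken with ls -l.
 */
static uint64_t clusters_grown(uint64_t size_before, uint64_t size_after)
{
    /* The file is expected to grow (or stay the same) in whole clusters. */
    assert(size_after >= size_before);
    assert((size_after - size_before) % CLUSTER_SIZE == 0);

    return (size_after - size_before) / CLUSTER_SIZE;
}
```

With the sizes from the description, clusters_grown(393216, 458752) yields 1:
the file went from 6 to 7 clusters although the same guest cluster was
rewritten, so one host cluster is no longer referenced by the guest data.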