From patchwork Mon Oct  9 12:59:38 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ashijeet Acharya <ashijeetacharya@gmail.com>
X-Patchwork-Id: 9992987
From: Ashijeet Acharya <ashijeetacharya@gmail.com>
To: famz@redhat.com
Date: Mon,  9 Oct 2017 18:29:38 +0530
Message-Id: <20171009125940.24220-7-ashijeetacharya@gmail.com>
X-Mailer: git-send-email 2.13.5
In-Reply-To: <20171009125940.24220-1-ashijeetacharya@gmail.com>
References: <20171009125940.24220-1-ashijeetacharya@gmail.com>
Subject: [Qemu-devel] [PATCH v9 6/8] vmdk: New functions to assist
	allocating multiple clusters
Cc: kwolf@redhat.com, qemu-block@nongnu.org, stefanha@gmail.com,
	qemu-devel@nongnu.org, mreitz@redhat.com,
	Ashijeet Acharya <ashijeetacharya@gmail.com>, jsnow@redhat.com

Introduce two new helper functions, vmdk_handle_alloc() and
vmdk_alloc_clusters(). vmdk_handle_alloc() allocates multiple clusters
at once, starting from a given offset on disk, and performs COW if
necessary for the first and last allocated clusters.
vmdk_alloc_clusters() returns the offset of the first of the newly
allocated clusters. Also, document both functions.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
---
 block/vmdk.c | 201 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 191 insertions(+), 10 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 11bc0f09c7..d5dfd21abe 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -136,6 +136,7 @@ typedef struct VmdkMetaData {
     unsigned int l2_offset;
     int valid;
     uint32_t *l2_cache_entry;
+    uint32_t nb_clusters;
 } VmdkMetaData;
 
 typedef struct VmdkGrainMarker {
@@ -1259,6 +1260,183 @@ static int get_cluster_table(VmdkExtent *extent, uint64_t offset,
     return VMDK_OK;
 }
 
+/*
+ * vmdk_handle_alloc
+ *
+ * Allocate new clusters for an area that is either not yet allocated or needs
+ * a copy on write.
+ *
+ * Returns:
+ *   VMDK_OK:       if new clusters were allocated, *bytes may be decreased if
+ *                  the new allocation doesn't cover all of the requested area.
+ *                  *cluster_offset is updated to contain the offset of the
+ *                  first newly allocated cluster.
+ *
+ *   VMDK_UNALLOC:  if the cluster is unallocated (VMDK_ZEROED if zeroed) and
+ *                  @allocate is false. *cluster_offset is left unchanged.
+ *
+ *   VMDK_ERROR:    in error cases
+ */
+static int vmdk_handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
+                             uint64_t offset, uint64_t *cluster_offset,
+                             int64_t *bytes, VmdkMetaData *m_data,
+                             bool allocate, uint32_t *alloc_clusters_counter)
+{
+    int l1_index, l2_offset, l2_index;
+    uint32_t *l2_table;
+    uint64_t cluster_sector;
+    uint32_t nb_clusters;
+    bool zeroed = false;
+    uint64_t skip_start_bytes, skip_end_bytes;
+    int ret;
+
+    ret = get_cluster_table(extent, offset, &l1_index, &l2_offset,
+                            &l2_index, &l2_table);
+    if (ret < 0) {
+        return ret;
+    }
+
+    cluster_sector = le32_to_cpu(l2_table[l2_index]);
+
+    skip_start_bytes = vmdk_find_offset_in_cluster(extent, offset);
+    /* Calculate the number of clusters to allocate. The last cluster is
+     * excluded, i.e. the result is 1 less than the number of clusters the
+     * request touches, as we may need to perform COW for the last one. */
+    nb_clusters = DIV_ROUND_UP(skip_start_bytes + *bytes,
+                               extent->cluster_sectors << BDRV_SECTOR_BITS) - 1;
+
+    nb_clusters = MIN(nb_clusters, extent->l2_size - l2_index);
+    assert(nb_clusters <= INT_MAX);
+
+    /* update bytes according to final nb_clusters value */
+    if (nb_clusters != 0) {
+        *bytes = ((nb_clusters * extent->cluster_sectors) << BDRV_SECTOR_BITS)
+                 - skip_start_bytes;
+    } else {
+        nb_clusters = 1;
+    }
+    *alloc_clusters_counter += nb_clusters;
+
+    /* MIN() is needed to cover the three cases that arise:
+     * 1. alloc very first cluster: here skip_start_bytes >= 0 and
+     *    *bytes <= cluster_size.
+     * 2. alloc middle clusters: here *bytes is an exact multiple of
+     *    cluster_size and skip_start_bytes is 0.
+     * 3. alloc very last cluster: here *bytes <= cluster_size and
+     *    skip_start_bytes is 0.
+     */
+    skip_end_bytes = skip_start_bytes + MIN(*bytes,
+                     extent->cluster_sectors * BDRV_SECTOR_SIZE
+                                    - skip_start_bytes);
+
+    if (extent->has_zero_grain && cluster_sector == VMDK_GTE_ZEROED) {
+        zeroed = true;
+    }
+
+    if (!cluster_sector || zeroed) {
+        if (!allocate) {
+            return zeroed ? VMDK_ZEROED : VMDK_UNALLOC;
+        }
+
+        cluster_sector = extent->next_cluster_sector;
+        extent->next_cluster_sector += extent->cluster_sectors
+                                                * nb_clusters;
+
+        ret = vmdk_perform_cow(bs, extent, cluster_sector * BDRV_SECTOR_SIZE,
+                               offset, skip_start_bytes,
+                               skip_end_bytes);
+        if (ret < 0) {
+            return ret;
+        }
+        if (m_data) {
+            m_data->valid = 1;
+            m_data->l1_index = l1_index;
+            m_data->l2_index = l2_index;
+            m_data->l2_offset = l2_offset;
+            m_data->l2_cache_entry = &l2_table[l2_index];
+            m_data->nb_clusters = nb_clusters;
+        }
+    }
+    *cluster_offset = cluster_sector << BDRV_SECTOR_BITS;
+    return VMDK_OK;
+}
+
+/*
+ * vmdk_alloc_clusters
+ *
+ * For a given offset on the virtual disk, find the cluster offset in the vmdk
+ * file. If the offset is not found, allocate new clusters.
+ *
+ * If the cluster is newly allocated, m_data->nb_clusters is set to the number
+ * of contiguous clusters that have been allocated. In this case, the other
+ * fields of m_data are valid and contain information about the first allocated
+ * cluster.
+ *
+ * Returns:
+ *
+ *   VMDK_OK:           on success and @cluster_offset was set
+ *
+ *   VMDK_UNALLOC:      if no clusters were allocated and @cluster_offset is
+ *                      set to zero
+ *
+ *   VMDK_ERROR:        in error cases
+ */
+static int vmdk_alloc_clusters(BlockDriverState *bs,
+                               VmdkExtent *extent,
+                               VmdkMetaData *m_data, uint64_t offset,
+                               bool allocate, uint64_t *cluster_offset,
+                               int64_t bytes,
+                               uint32_t *alloc_clusters_counter)
+{
+    uint64_t start, remaining;
+    uint64_t new_cluster_offset;
+    int64_t n_bytes;
+    int ret;
+
+    if (extent->flat) {
+        *cluster_offset = extent->flat_start_offset;
+        return VMDK_OK;
+    }
+
+    start = offset;
+    remaining = bytes;
+    new_cluster_offset = 0;
+    *cluster_offset = 0;
+    n_bytes = 0;
+    if (m_data) {
+        m_data->valid = 0;
+    }
+
+    /* Due to L2 table boundaries, not all bytes may be allocated at once. */
+    while (true) {
+
+        if (!*cluster_offset) {
+            *cluster_offset = new_cluster_offset;
+        }
+
+        start              += n_bytes;
+        remaining          -= n_bytes;
+        new_cluster_offset += n_bytes;
+
+        if (remaining == 0) {
+            break;
+        }
+
+        n_bytes = remaining;
+
+        ret = vmdk_handle_alloc(bs, extent, start, &new_cluster_offset,
+                                &n_bytes, m_data, allocate,
+                                alloc_clusters_counter);
+
+        if (ret < 0) {
+            return ret;
+
+        }
+    }
+
+    return VMDK_OK;
+}
+
 /**
  * vmdk_get_cluster_offset
  *
@@ -1642,6 +1820,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
     uint64_t bytes_done = 0;
     VmdkMetaData m_data;
     uint64_t extent_end;
+    uint32_t alloc_clusters_counter = 0;
 
     if (DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE) > bs->total_sectors) {
         error_report("Wrong offset: offset=0x%" PRIx64
@@ -1667,10 +1846,10 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
             n_bytes = MIN(bytes, extent_end - offset);
         }
 
-        ret = vmdk_get_cluster_offset(bs, extent, &m_data, offset,
-                                      !(extent->compressed || zeroed),
-                                      &cluster_offset, offset_in_cluster,
-                                      offset_in_cluster + n_bytes);
+        ret = vmdk_alloc_clusters(bs, extent, &m_data, offset,
+                                  !(extent->compressed || zeroed),
+                                  &cluster_offset, n_bytes,
+                                  &alloc_clusters_counter);
         if (extent->compressed) {
             if (ret == VMDK_OK) {
                 /* Refuse write to allocated cluster for streamOptimized */
@@ -1679,8 +1858,9 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
                 return -EIO;
             } else {
                 /* allocate */
-                ret = vmdk_get_cluster_offset(bs, extent, &m_data, offset,
-                                              true, &cluster_offset, 0, 0);
+                ret = vmdk_alloc_clusters(bs, extent, &m_data, offset,
+                                          true, &cluster_offset, n_bytes,
+                                          &alloc_clusters_counter);
             }
         }
         if (ret == VMDK_ERROR) {
@@ -1688,10 +1868,11 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
         }
         if (zeroed) {
             /* Do zeroed write, buf is ignored */
-            if (extent->has_zero_grain &&
-                    offset_in_cluster == 0 &&
-                    n_bytes >= extent->cluster_sectors * BDRV_SECTOR_SIZE) {
-                n_bytes = extent->cluster_sectors * BDRV_SECTOR_SIZE;
+            if (extent->has_zero_grain && offset_in_cluster == 0 &&
+                    n_bytes >= extent->cluster_sectors * BDRV_SECTOR_SIZE *
+                        alloc_clusters_counter) {
+                n_bytes = extent->cluster_sectors * BDRV_SECTOR_SIZE *
+                                        alloc_clusters_counter;
                 if (!zero_dry_run) {
                     /* update L2 tables */
                     if (vmdk_L2update(extent, &m_data, VMDK_GTE_ZEROED)