diff mbox series

[v2,03/12] block/vmdk: Change extent info type

Message ID 20220620162704.80987-4-hreitz@redhat.com (mailing list archive)
State New, archived
Headers show
Series qemu-img info: Show protocol-level information | expand

Commit Message

Hanna Czenczek June 20, 2022, 4:26 p.m. UTC
VMDK's implementation of .bdrv_get_specific_info() returns information
about its extent files, ostensibly in the form of ImageInfo objects.
However, it does not get this information through
bdrv_query_image_info(), but fills only a select few fields with custom
information that does not always match the fields' purposes.

For example, @format, which is supposed to be a block driver name, is
filled with the extent type, e.g. SPARSE or FLAT.

In ImageInfo, @compressed shows whether the data that can be seen in the
image is stored in compressed form or not.  For example, a compressed
qcow2 image will store compressed data in its data file, but when
accessing the qcow2 node, you will see normal data.  This is not how
VMDK uses the @compressed field for its extent files: Instead, it
signifies whether accessing the extent file will yield compressed data
(which the VMDK driver then (de-)compresses).

Create a new structure to represent the extent information.  This allows
us to clarify the fields' meanings, and it clearly shows that these are
not complete ImageInfo objects.  (That is, if a user wants an extent
file's ImageInfo object, they will need to query it separately, and will
not get it from ImageInfoSpecificVmdk.extents.)

Note that this removes the last use of ['ImageInfo'] (i.e. an array of
ImageInfo objects), so the QAPI generator will no longer generate
ImageInfoList by default.  However, we use it in qemu-img.c, so we need
to create a dummy object to force the generate to create that type,
similarly to DummyForceArrays in machine.json (introduced in commit
9f08c8ec73878122ad4b061ed334f0437afaaa32 ("qapi: Lazy creation of array
types")).

Signed-off-by: Hanna Reitz <hreitz@redhat.com>
---
I'm not sure whether this is an incompatible change.  I'm also not sure
if it even matters whether it's an incompatible change (i.e. whether
anyone cares).

I believe we can get away without this change.  I find it useful to make
it clear that (A) this extent information is not what you would find in
other ImageInfo objects (i.e., I consider this a fix for
ImageInfoSpecificVmdk's @extents field), and (B) that the upcoming
BlockGraphInfo type will not duplicate the extent nodes' ImageInfo
information, because those are actual ImageInfo objects.

We can probably replace this patch by clarifying all of this in
documentation, but if possible I would prefer a syntactic clarification
(as done here).
---
 qapi/block-core.json | 38 +++++++++++++++++++++++++++++++++++++-
 block/vmdk.c         |  8 ++++----
 2 files changed, 41 insertions(+), 5 deletions(-)

Comments

Kevin Wolf Jan. 19, 2023, 3:20 p.m. UTC | #1
Am 20.06.2022 um 18:26 hat Hanna Reitz geschrieben:
> VMDK's implementation of .bdrv_get_specific_info() returns information
> about its extent files, ostensibly in the form of ImageInfo objects.
> However, it does not get this information through
> bdrv_query_image_info(), but fills only a select few fields with custom
> information that does not always match the fields' purposes.
> 
> For example, @format, which is supposed to be a block driver name, is
> filled with the extent type, e.g. SPARSE or FLAT.
> 
> In ImageInfo, @compressed shows whether the data that can be seen in the
> image is stored in compressed form or not.  For example, a compressed
> qcow2 image will store compressed data in its data file, but when
> accessing the qcow2 node, you will see normal data.  This is not how
> VMDK uses the @compressed field for its extent files: Instead, it
> signifies whether accessing the extent file will yield compressed data
> (which the VMDK driver then (de-)compresses).

Actually, VMDK was the only user of the field in ImageInfo. qcow2
doesn't set the field at all because it would have to parse all L2
tables in order to set it.

So I suppose @compressed can now be removed from ImageInfo?

Kevin
Hanna Czenczek Jan. 20, 2023, 1:43 p.m. UTC | #2
On 19.01.23 16:20, Kevin Wolf wrote:
> Am 20.06.2022 um 18:26 hat Hanna Reitz geschrieben:
>> VMDK's implementation of .bdrv_get_specific_info() returns information
>> about its extent files, ostensibly in the form of ImageInfo objects.
>> However, it does not get this information through
>> bdrv_query_image_info(), but fills only a select few fields with custom
>> information that does not always match the fields' purposes.
>>
>> For example, @format, which is supposed to be a block driver name, is
>> filled with the extent type, e.g. SPARSE or FLAT.
>>
>> In ImageInfo, @compressed shows whether the data that can be seen in the
>> image is stored in compressed form or not.  For example, a compressed
>> qcow2 image will store compressed data in its data file, but when
>> accessing the qcow2 node, you will see normal data.  This is not how
>> VMDK uses the @compressed field for its extent files: Instead, it
>> signifies whether accessing the extent file will yield compressed data
>> (which the VMDK driver then (de-)compresses).
> Actually, VMDK was the only user of the field in ImageInfo. qcow2
> doesn't set the field at all because it would have to parse all L2
> tables in order to set it.

Right, makes sense.

> So I suppose @compressed can now be removed from ImageInfo?

I think so.  Looks like it was indeed introduced specifically for VMDK’s extent information (cbe82d7fb32, f4c129a38a5) so that ImageInfo could be used there, which this patch here changes.  So we should probably indeed remove the field – it did lead me to naïvely believe that it would have a meaning on images that actually support compression (like qcow2) after all.

Hanna
Eric Blake Jan. 27, 2023, 11:06 p.m. UTC | #3
On Thu, Jan 19, 2023 at 04:20:16PM +0100, Kevin Wolf wrote:
> Am 20.06.2022 um 18:26 hat Hanna Reitz geschrieben:
> > VMDK's implementation of .bdrv_get_specific_info() returns information
> > about its extent files, ostensibly in the form of ImageInfo objects.
> > However, it does not get this information through
> > bdrv_query_image_info(), but fills only a select few fields with custom
> > information that does not always match the fields' purposes.
> > 
> > For example, @format, which is supposed to be a block driver name, is
> > filled with the extent type, e.g. SPARSE or FLAT.
> > 
> > In ImageInfo, @compressed shows whether the data that can be seen in the
> > image is stored in compressed form or not.  For example, a compressed
> > qcow2 image will store compressed data in its data file, but when
> > accessing the qcow2 node, you will see normal data.  This is not how
> > VMDK uses the @compressed field for its extent files: Instead, it
> > signifies whether accessing the extent file will yield compressed data
> > (which the VMDK driver then (de-)compresses).
> 
> Actually, VMDK was the only user of the field in ImageInfo. qcow2
> doesn't set the field at all because it would have to parse all L2
> tables in order to set it.
> 
> So I suppose @compressed can now be removed from ImageInfo?

I think you are okay for VMDK (the new struct has the same field
names, but better meanings, and the on-the-wire representation for
someone querying a known-VMDK image doesn't change).  For non-VMDK,
any code that was querying @compressed will break, but arguably no one
was doing that since it would have always been false.  If we want to
be super-conservative, we deprecate the field now and only remove it
from ImageInfo in a later release, but I'd rather trust Markus on this
decision.

On a side note - would it be worth adding a bit to the qcow2 header
(one of the compatible_feature bits seems best) which we set when
writing a compressed cluster, so that on newer images, it is an O(1)
probe of whether the image contains (or at least has contained in the
past) a compressed cluster?  Or is that going to add needless overhead
for something we really don't need to know all that often?
Kevin Wolf Jan. 30, 2023, 10:56 a.m. UTC | #4
Am 28.01.2023 um 00:06 hat Eric Blake geschrieben:
> On Thu, Jan 19, 2023 at 04:20:16PM +0100, Kevin Wolf wrote:
> > Am 20.06.2022 um 18:26 hat Hanna Reitz geschrieben:
> > > VMDK's implementation of .bdrv_get_specific_info() returns information
> > > about its extent files, ostensibly in the form of ImageInfo objects.
> > > However, it does not get this information through
> > > bdrv_query_image_info(), but fills only a select few fields with custom
> > > information that does not always match the fields' purposes.
> > > 
> > > For example, @format, which is supposed to be a block driver name, is
> > > filled with the extent type, e.g. SPARSE or FLAT.
> > > 
> > > In ImageInfo, @compressed shows whether the data that can be seen in the
> > > image is stored in compressed form or not.  For example, a compressed
> > > qcow2 image will store compressed data in its data file, but when
> > > accessing the qcow2 node, you will see normal data.  This is not how
> > > VMDK uses the @compressed field for its extent files: Instead, it
> > > signifies whether accessing the extent file will yield compressed data
> > > (which the VMDK driver then (de-)compresses).
> > 
> > Actually, VMDK was the only user of the field in ImageInfo. qcow2
> > doesn't set the field at all because it would have to parse all L2
> > tables in order to set it.
> > 
> > So I suppose @compressed can now be removed from ImageInfo?
> 
> I think you are okay for VMDK (the new struct has the same field
> names, but better meanings, and the on-the-wire representation for
> someone querying a known-VMDK image doesn't change).  For non-VMDK,
> any code that was querying @compressed will break, but arguably no one
> was doing that since it would have always been false.  If we want to
> be super-conservative, we deprecate the field now and only remove it
> from ImageInfo in a later release, but I'd rather trust Markus on this
> decision.

It is an optional field and VMDK is the only driver that ever provided
it, and only in the context of extent information. In this case, the
information on the wire stays the same after this patch.

So I don't think there is any visible difference for a client, apart
from schema introspection?

> On a side note - would it be worth adding a bit to the qcow2 header
> (one of the compatible_feature bits seems best) which we set when
> writing a compressed cluster, so that on newer images, it is an O(1)
> probe of whether the image contains (or at least has contained in the
> past) a compressed cluster?  Or is that going to add needless overhead
> for something we really don't need to know all that often?

I think the only use for it would be displaying it in 'qemu-img info'.

Would you combine two bits then? One for "compressed bit is valid" and
one for "contains compressed clusters"? Because with one bit you can
only distinguish "compressed" from "don't know", but there wouldn't be a
way to say for certain that an image is uncompressed.

Kevin
diff mbox series

Patch

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 40fb307e2d..e0c8f07932 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -124,7 +124,33 @@ 
       'create-type': 'str',
       'cid': 'int',
       'parent-cid': 'int',
-      'extents': ['ImageInfo']
+      'extents': ['VmdkExtentInfo']
+  } }
+
+##
+# @VmdkExtentInfo:
+#
+# Information about a VMDK extent file
+#
+# @filename: Name of the extent file
+#
+# @format: Extent type (e.g. FLAT or SPARSE)
+#
+# @virtual-size: Number of bytes covered by this extent
+#
+# @cluster-size: Cluster size in bytes (for non-flat extents)
+#
+# @compressed: Whether this extent contains compressed data
+#
+# Since: 7.1
+##
+{ 'struct': 'VmdkExtentInfo',
+  'data': {
+      'filename': 'str',
+      'format': 'str',
+      'virtual-size': 'int',
+      '*cluster-size': 'int',
+      '*compressed': 'bool'
   } }
 
 ##
@@ -5638,3 +5664,13 @@ 
   'data': { 'device': 'str', '*id': 'str', '*name': 'str'},
   'returns': 'SnapshotInfo',
   'allow-preconfig': true }
+
+##
+# @DummyBlockCoreForceArrays:
+#
+# Not used by QMP; hack to let us use ImageInfoList internally
+#
+# Since: 7.1
+##
+{ 'struct': 'DummyBlockCoreForceArrays',
+  'data': { 'unused-image-info': ['ImageInfo'] } }
diff --git a/block/vmdk.c b/block/vmdk.c
index 38e5ab3806..35131a916e 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -2908,12 +2908,12 @@  static int vmdk_has_zero_init(BlockDriverState *bs)
     return 1;
 }
 
-static ImageInfo *vmdk_get_extent_info(VmdkExtent *extent)
+static VmdkExtentInfo *vmdk_get_extent_info(VmdkExtent *extent)
 {
-    ImageInfo *info = g_new0(ImageInfo, 1);
+    VmdkExtentInfo *info = g_new0(VmdkExtentInfo, 1);
 
     bdrv_refresh_filename(extent->file->bs);
-    *info = (ImageInfo){
+    *info = (VmdkExtentInfo){
         .filename         = g_strdup(extent->file->bs->filename),
         .format           = g_strdup(extent->type),
         .virtual_size     = extent->sectors * BDRV_SECTOR_SIZE,
@@ -2992,7 +2992,7 @@  static ImageInfoSpecific *vmdk_get_specific_info(BlockDriverState *bs,
     int i;
     BDRVVmdkState *s = bs->opaque;
     ImageInfoSpecific *spec_info = g_new0(ImageInfoSpecific, 1);
-    ImageInfoList **tail;
+    VmdkExtentInfoList **tail;
 
     *spec_info = (ImageInfoSpecific){
         .type = IMAGE_INFO_SPECIFIC_KIND_VMDK,