diff mbox

[for-2.11,5/5] qcow2: Refuse to get unaligned offsets from cache

Message ID 20171110203111.7666-6-mreitz@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Max Reitz Nov. 10, 2017, 8:31 p.m. UTC
Instead of using an assertion, it is better to emit a corruption event
here.  Checking all offsets for correct alignment can be tedious and it
is easily possible to forget to do so.  qcow2_cache_do_get() is a
function every L2 and refblock access has to go through, so this is a
good central point to add such a check.

And for good measure, let us also add an assertion that the offset is
non-zero.  Making this a corruption event is not feasible, because a
zero offset usually means something special (such as the cluster is
unused), so all callers should be checking this anyway.  If they do not,
it is their fault, hence the assertion here.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-cache.c        | 21 +++++++++++++++++++++
 tests/qemu-iotests/060     | 21 +++++++++++++++++++++
 tests/qemu-iotests/060.out | 29 +++++++++++++++++++++++++++++
 3 files changed, 71 insertions(+)

Comments

Eric Blake Nov. 10, 2017, 9:54 p.m. UTC | #1
On 11/10/2017 02:31 PM, Max Reitz wrote:
> Instead of using an assertion, it is better to emit a corruption event
> here.  Checking all offsets for correct alignment can be tedious and it
> is easily possible to forget to do so.  qcow2_cache_do_get() is a
> function every L2 and refblock access has to go through, so this is a
> good central point to add such a check.
> 
> And for good measure, let us also add an assertion that the offset is
> non-zero.  Making this a corruption event is not feasible, because a
> zero offset usually means something special (such as the cluster is
> unused), so all callers should be checking this anyway.  If they do not,
> it is their fault, hence the assertion here.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/qcow2-cache.c        | 21 +++++++++++++++++++++
>  tests/qemu-iotests/060     | 21 +++++++++++++++++++++
>  tests/qemu-iotests/060.out | 29 +++++++++++++++++++++++++++++
>  3 files changed, 71 insertions(+)
> 

> +--- Repairing ---
> +Repairing refcount block 1 is outside image
> +ERROR refcount block 2 is not cluster aligned; refcount table entry corrupted
> +qcow2: Marking image as corrupt: Refblock offset 0x200 unaligned (reftable index: 0x2); further corruption events will be suppressed
> +Can't get refcount for cluster 1048576: Input/output error

Trying to understand this: we have a double corruption, because we
encountered a refblock that points outside of the image, but fixing the
refblock in turn encounters a second refblock that points within the
image but to an unaligned area.

Of course, you should never encounter these bad refblocks in normal
usage, but when it comes to dealing with untrusted images, being robust
is always worth it.

Reviewed-by: Eric Blake <eblake@redhat.com>
Max Reitz Nov. 10, 2017, 10 p.m. UTC | #2
On 2017-11-10 22:54, Eric Blake wrote:
> On 11/10/2017 02:31 PM, Max Reitz wrote:
>> Instead of using an assertion, it is better to emit a corruption event
>> here.  Checking all offsets for correct alignment can be tedious and it
>> is easily possible to forget to do so.  qcow2_cache_do_get() is a
>> function every L2 and refblock access has to go through, so this is a
>> good central point to add such a check.
>>
>> And for good measure, let us also add an assertion that the offset is
>> non-zero.  Making this a corruption event is not feasible, because a
>> zero offset usually means something special (such as the cluster is
>> unused), so all callers should be checking this anyway.  If they do not,
>> it is their fault, hence the assertion here.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>  block/qcow2-cache.c        | 21 +++++++++++++++++++++
>>  tests/qemu-iotests/060     | 21 +++++++++++++++++++++
>>  tests/qemu-iotests/060.out | 29 +++++++++++++++++++++++++++++
>>  3 files changed, 71 insertions(+)
>>
> 
>> +--- Repairing ---
>> +Repairing refcount block 1 is outside image
>> +ERROR refcount block 2 is not cluster aligned; refcount table entry corrupted
>> +qcow2: Marking image as corrupt: Refblock offset 0x200 unaligned (reftable index: 0x2); further corruption events will be suppressed
>> +Can't get refcount for cluster 1048576: Input/output error
> 
> Trying to understand this: we have a double corruption, because we
> encountered a refblock that points outside of the image, but fixing the
> refblock in turn encounters a second refblock that points within the
> image but to an unaligned area.

No, it's the very same.  As far as I've seen it, the repair function
tries to fix the "refblock is outside image" error by resizing the image
so the refblock is inside the image.  However, the subsequent
bdrv_truncate() detects the alignment corruption, too, and thus marks
the image corrupt.

The check function itself never marks the image corrupt because it's
doing its best to fix it. :-)

(And the only point in marking an image corrupt is to force the user to
repair it.)

And that's also the reason why we need to invoke the repair twice: On
the first run the check function notices that the image is so broken we
need to create new refcount structures, so it does that.  But it cannot
free the old structures (or something) because bs->drv == NULL by now.
And then it cannot be run a second time because !bs->drv.

So you need to manually invoke it a second time, and then it goes over
the newly created refcount structures which are then fixed up.

> Of course, you should never encounter these bad refblocks in normal
> usage, but when it comes to dealing with untrusted images, being robust
> is always worth it.
> 
> Reviewed-by: Eric Blake <eblake@redhat.com>

Thanks!

Max
Eric Blake Nov. 10, 2017, 10:15 p.m. UTC | #3
On 11/10/2017 04:00 PM, Max Reitz wrote:
>> Trying to understand this: we have a double corruption, because we
>> encountered a refblock that points outside of the image, but fixing the
>> refblock in turn encounters a second refblock that points within the
>> image but to an unaligned area.
> 
> No, it's the very same.  As far as I've seen it, the repair function
> tries to fix the "refblock is outside image" error by resizing the image
> so the refblock is inside the image.  However, the subsequent
> bdrv_truncate() detects the alignment corruption, too, and thus marks
> the image corrupt.

Is resizing the image to be larger always a wise thing compared to just
rebuilding the refcount?  If I stick a large enough out-of-image value
in the table, can I cause a denial-of-service by making qemu try to
allocate petabytes of storage just to bring it into range?
Max Reitz Nov. 10, 2017, 10:16 p.m. UTC | #4
On 2017-11-10 23:15, Eric Blake wrote:
> On 11/10/2017 04:00 PM, Max Reitz wrote:
>>> Trying to understand this: we have a double corruption, because we
>>> encountered a refblock that points outside of the image, but fixing the
>>> refblock in turn encounters a second refblock that points within the
>>> image but to an unaligned area.
>>
>> No, it's the very same.  As far as I've seen it, the repair function
>> tries to fix the "refblock is outside image" error by resizing the image
>> so the refblock is inside the image.  However, the subsequent
>> bdrv_truncate() detects the alignment corruption, too, and thus marks
>> the image corrupt.
> 
> Is resizing the image to be larger always a wise thing compared to just
> rebuilding the refcount?  If I stick a large enough out-of-image value
> in the table, can I cause a denial-of-service by making qemu try to
> allocate petabytes of storage just to bring it into range?

But it's just a qcow2 resize (with no preallocation), so nothing will be
allocated.

Max
Alberto Garcia Nov. 14, 2017, 3:06 p.m. UTC | #5
On Fri 10 Nov 2017 09:31:11 PM CET, Max Reitz wrote:
> +static inline const char *qcow2_cache_get_name(BDRVQcow2State *s, Qcow2Cache *c)
> +{
> +    if (c == s->refcount_block_cache) {
> +        return "refcount block";
> +    } else if (c == s->l2_table_cache) {
> +        return "L2 table";
> +    } else {
> +        /* Do not abort, because this is not critical */
> +        return "unknown";
> +    }
> +}

Why is an unknown cache not critical?

Berto
Max Reitz Nov. 14, 2017, 3:09 p.m. UTC | #6
On 2017-11-14 16:06, Alberto Garcia wrote:
> On Fri 10 Nov 2017 09:31:11 PM CET, Max Reitz wrote:
>> +static inline const char *qcow2_cache_get_name(BDRVQcow2State *s, Qcow2Cache *c)
>> +{
>> +    if (c == s->refcount_block_cache) {
>> +        return "refcount block";
>> +    } else if (c == s->l2_table_cache) {
>> +        return "L2 table";
>> +    } else {
>> +        /* Do not abort, because this is not critical */
>> +        return "unknown";
>> +    }
>> +}
> 
> Why is an unknown cache not critical?

Because this is debugging information.  I know others disagree with my
opinion that I'd rather not abort qemu just because someone forgot to
add a 'return "foo";' here when adding a new cache, but that's my
opinion so I wanted to at least be told by someone that we should abort
here before doing it.

Max
Alberto Garcia Nov. 14, 2017, 3:31 p.m. UTC | #7
On Tue 14 Nov 2017 04:09:16 PM CET, Max Reitz wrote:
>>> +static inline const char *qcow2_cache_get_name(BDRVQcow2State *s, Qcow2Cache *c)
>>> +{
>>> +    if (c == s->refcount_block_cache) {
>>> +        return "refcount block";
>>> +    } else if (c == s->l2_table_cache) {
>>> +        return "L2 table";
>>> +    } else {
>>> +        /* Do not abort, because this is not critical */
>>> +        return "unknown";
>>> +    }
>>> +}
>> 
>> Why is an unknown cache not critical?
>
> Because this is debugging information.

Ah, right, this is only used for qcow2_signal_corruption().

> I know others disagree with my opinion that I'd rather not abort qemu
> just because someone forgot to add a 'return "foo";' here when adding
> a new cache, but that's my opinion so I wanted to at least be told by
> someone that we should abort here before doing it.

I just checked the code and at least qcow2_cache_entry_flush() assumes
that there can be other caches, so this problem is not new.

Perhaps we should rethink this in the future and add a stronger
assertion somewhere else instead of having dead code, but for now this
is OK I guess.

Reviewed-by: Alberto Garcia <berto@igalia.com>

Berto
diff mbox

Patch

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index 75746a7f43..a5baaba0ff 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -62,6 +62,18 @@  static inline int qcow2_cache_get_table_idx(BlockDriverState *bs,
     return idx;
 }
 
+static inline const char *qcow2_cache_get_name(BDRVQcow2State *s, Qcow2Cache *c)
+{
+    if (c == s->refcount_block_cache) {
+        return "refcount block";
+    } else if (c == s->l2_table_cache) {
+        return "L2 table";
+    } else {
+        /* Do not abort, because this is not critical */
+        return "unknown";
+    }
+}
+
 static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c,
                                       int i, int num_tables)
 {
@@ -314,9 +326,18 @@  static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
     uint64_t min_lru_counter = UINT64_MAX;
     int min_lru_index = -1;
 
+    assert(offset != 0);
+
     trace_qcow2_cache_get(qemu_coroutine_self(), c == s->l2_table_cache,
                           offset, read_from_disk);
 
+    if (offset_into_cluster(s, offset)) {
+        qcow2_signal_corruption(bs, true, -1, -1, "Cannot get entry from %s "
+                                "cache: Offset %#" PRIx64 " is unaligned",
+                                qcow2_cache_get_name(s, c), offset);
+        return -EIO;
+    }
+
     /* Check if the table is already cached */
     i = lookup_index = (offset / s->cluster_size * 4) % c->size;
     do {
diff --git a/tests/qemu-iotests/060 b/tests/qemu-iotests/060
index c230696b3a..1eca09417b 100755
--- a/tests/qemu-iotests/060
+++ b/tests/qemu-iotests/060
@@ -405,6 +405,27 @@  _check_test_img -r all
 $QEMU_IMG resize --shrink "$TEST_IMG" 32M
 _img_info | grep 'virtual size'
 
+echo
+echo "=== Discarding a refblock covered by an unaligned refblock ==="
+echo
+
+IMGOPTS='refcount_bits=1' _make_test_img 64M
+
+# Same as above
+poke_file "$TEST_IMG" "$(($rt_offset+8))" "\x00\x00\x00\x10\x00\x00\x00\x00"
+# But now we actually "create" an unaligned third refblock
+poke_file "$TEST_IMG" "$(($rt_offset+16))" "\x00\x00\x00\x00\x00\x00\x02\x00"
+$QEMU_IMG resize --shrink "$TEST_IMG" 32M
+
+echo '--- Repairing ---'
+# Fails the first repair because the corruption prevents the check
+# function from double-checking
+# (Using -q for the first invocation, because otherwise the
+#  double-check error message appears above the summary for some
+#  reason -- so let's just hide the summary)
+_check_test_img -q -r all
+_check_test_img -r all
+
 # success, all done
 echo "*** done"
 rm -f $seq.full
diff --git a/tests/qemu-iotests/060.out b/tests/qemu-iotests/060.out
index 358e54cdc9..56f5eb15d8 100644
--- a/tests/qemu-iotests/060.out
+++ b/tests/qemu-iotests/060.out
@@ -370,4 +370,33 @@  virtual size: 64M (67108864 bytes)
 No errors were found on the image.
 Image resized.
 virtual size: 32M (33554432 bytes)
+
+=== Discarding a refblock covered by an unaligned refblock ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
+qcow2: Marking image as corrupt: Cannot get entry from refcount block cache: Offset 0x200 is unaligned; further corruption events will be suppressed
+qemu-img: Failed to discard unused refblocks: Input/output error
+--- Repairing ---
+Repairing refcount block 1 is outside image
+ERROR refcount block 2 is not cluster aligned; refcount table entry corrupted
+qcow2: Marking image as corrupt: Refblock offset 0x200 unaligned (reftable index: 0x2); further corruption events will be suppressed
+Can't get refcount for cluster 1048576: Input/output error
+Rebuilding refcount structure
+Repairing cluster 1 refcount=1 reference=0
+Repairing cluster 2 refcount=1 reference=0
+Repairing cluster 1048576 refcount=1 reference=0
+qemu-img: Check failed: No medium found
+Leaked cluster 1 refcount=1 reference=0
+Leaked cluster 2 refcount=1 reference=0
+Leaked cluster 1048576 refcount=1 reference=0
+Repairing cluster 1 refcount=1 reference=0
+Repairing cluster 2 refcount=1 reference=0
+Repairing cluster 1048576 refcount=1 reference=0
+The following inconsistencies were found and repaired:
+
+    3 leaked clusters
+    0 corruptions
+
+Double checking the fixed image now...
+No errors were found on the image.
 *** done