diff mbox series

[3/3] qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation()

Message ID 20191023152620.13166-4-kwolf@redhat.com (mailing list archive)
State New, archived
Headers show
Series qcow2: Fix image corruption bug in 4.1 | expand

Commit Message

Kevin Wolf Oct. 23, 2019, 3:26 p.m. UTC
qcow2_detect_metadata_preallocation() calls qcow2_get_refcount() which
requires s->lock to be taken to protect its accesses to the refcount
table and refcount blocks. However, nothing in this code path actually
took the lock. This could cause the same cache entry to be used by two
requests at the same time, for different tables at different offsets,
resulting in image corruption.

As it would be preferable to base the detection on consistent data (even
though it's just heuristics), let's take the lock not only around the
qcow2_get_refcount() calls, but around the whole function.

This patch takes the lock in qcow2_co_block_status() earlier and asserts
in qcow2_detect_metadata_preallocation() that we hold the lock.

Fixes: 69f47505ee66afaa513305de0c1895a224e52c45
Cc: qemu-stable@nongnu.org
Reported-by: Michael Weiser <michael@weiser.dinsnail.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-refcount.c | 2 ++
 block/qcow2.c          | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

Comments

Vladimir Sementsov-Ogievskiy Oct. 24, 2019, 10:46 a.m. UTC | #1
23.10.2019 18:26, Kevin Wolf wrote:
> qcow2_detect_metadata_preallocation() calls qcow2_get_refcount() which
> requires s->lock to be taken to protect its accesses to the refcount
> table and refcount blocks. However, nothing in this code path actually
> took the lock. This could cause the same cache entry to be used by two
> requests at the same time, for different tables at different offsets,
> resulting in image corruption.
> 
> As it would be preferable to base the detection on consistent data (even
> though it's just heuristics), let's take the lock not only around the
> qcow2_get_refcount() calls, but around the whole function.
> 
> This patch takes the lock in qcow2_co_block_status() earlier and asserts
> in qcow2_detect_metadata_preallocation() that we hold the lock.
> 
> Fixes: 69f47505ee66afaa513305de0c1895a224e52c45
> Cc: qemu-stable@nongnu.org
> Reported-by: Michael Weiser <michael@weiser.dinsnail.net>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Oh, I'm very sorry. I was going to backport this patch, and found that it's
fixed in our downstream long ago, even before last upstream version patch sent :(

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> ---
>   block/qcow2-refcount.c | 2 ++
>   block/qcow2.c          | 3 ++-
>   2 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index ef965d7895..0d64bf5a5e 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -3455,6 +3455,8 @@ int qcow2_detect_metadata_preallocation(BlockDriverState *bs)
>       int64_t i, end_cluster, cluster_count = 0, threshold;
>       int64_t file_length, real_allocation, real_clusters;
>   
> +    qemu_co_mutex_assert_locked(&s->lock);
> +
>       file_length = bdrv_getlength(bs->file->bs);
>       if (file_length < 0) {
>           return file_length;
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 8b05933565..0bc69e6996 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -1916,6 +1916,8 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
>       unsigned int bytes;
>       int status = 0;
>   
> +    qemu_co_mutex_lock(&s->lock);
> +
>       if (!s->metadata_preallocation_checked) {
>           ret = qcow2_detect_metadata_preallocation(bs);
>           s->metadata_preallocation = (ret == 1);
> @@ -1923,7 +1925,6 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
>       }
>   
>       bytes = MIN(INT_MAX, count);
> -    qemu_co_mutex_lock(&s->lock);
>       ret = qcow2_get_cluster_offset(bs, offset, &bytes, &cluster_offset);
>       qemu_co_mutex_unlock(&s->lock);
>       if (ret < 0) {
>
Kevin Wolf Oct. 24, 2019, 11:17 a.m. UTC | #2
Am 24.10.2019 um 12:46 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 23.10.2019 18:26, Kevin Wolf wrote:
> > qcow2_detect_metadata_preallocation() calls qcow2_get_refcount() which
> > requires s->lock to be taken to protect its accesses to the refcount
> > table and refcount blocks. However, nothing in this code path actually
> > took the lock. This could cause the same cache entry to be used by two
> > requests at the same time, for different tables at different offsets,
> > resulting in image corruption.
> > 
> > As it would be preferable to base the detection on consistent data (even
> > though it's just heuristics), let's take the lock not only around the
> > qcow2_get_refcount() calls, but around the whole function.
> > 
> > This patch takes the lock in qcow2_co_block_status() earlier and asserts
> > in qcow2_detect_metadata_preallocation() that we hold the lock.
> > 
> > Fixes: 69f47505ee66afaa513305de0c1895a224e52c45
> > Cc: qemu-stable@nongnu.org
> > Reported-by: Michael Weiser <michael@weiser.dinsnail.net>
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> 
> Oh, I'm very sorry. I was going to backport this patch, and found that it's
> fixed in our downstream long ago, even before last upstream version patch sent :(

Seriously? Is your downstream QEMU so divergent from upstream that you
wouldn't notice something like this? I think you really have to improve
something there.

I mean, we have a serious data corruptor in the 4.1 release and I wasted
days to debug this, and you've been sitting on a fix for months without
telling anyone? This isn't a memory leak or something, this kills
people's images. It's eating their data.

Modifying an image format driver requires utmost care and I think I'll
try to make sure to allow only few qcow2 changes per release in the
future. Too many changes were made in too short time, and with too
little care, and there are even more qcow2 patches pending. Please check
whether these series actually match your downstream code. Anyway, we'll
tread very carefully now with the pending patches, even if it means
delaying them for another release or two. Stability is way more
important for an image format driver than new features and
optimisations.

Do whatever you need to fix your downstream process, but seriously, this
must not ever happen again.

Kevin
Vladimir Sementsov-Ogievskiy Oct. 24, 2019, 12:41 p.m. UTC | #3
24.10.2019 14:17, Kevin Wolf wrote:
> Am 24.10.2019 um 12:46 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> 23.10.2019 18:26, Kevin Wolf wrote:
>>> qcow2_detect_metadata_preallocation() calls qcow2_get_refcount() which
>>> requires s->lock to be taken to protect its accesses to the refcount
>>> table and refcount blocks. However, nothing in this code path actually
>>> took the lock. This could cause the same cache entry to be used by two
>>> requests at the same time, for different tables at different offsets,
>>> resulting in image corruption.
>>>
>>> As it would be preferable to base the detection on consistent data (even
>>> though it's just heuristics), let's take the lock not only around the
>>> qcow2_get_refcount() calls, but around the whole function.
>>>
>>> This patch takes the lock in qcow2_co_block_status() earlier and asserts
>>> in qcow2_detect_metadata_preallocation() that we hold the lock.
>>>
>>> Fixes: 69f47505ee66afaa513305de0c1895a224e52c45
>>> Cc: qemu-stable@nongnu.org
>>> Reported-by: Michael Weiser <michael@weiser.dinsnail.net>
>>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
>>
>> Oh, I'm very sorry. I was going to backport this patch, and found that it's
>> fixed in our downstream long ago, even before last upstream version patch sent :(
> 
> Seriously? Is your downstream QEMU so divergent from upstream that you
> wouldn't notice something like this? I think you really have to improve
> something there.

How to improve it? Months and years are passed to bring patches into upstream,
of course we have to live with a bunch (now it's nearer to 300 and this is a lot
better then 500+ in the recent past) of patches above Rhel release.

> 
> I mean, we have a serious data corruptor in the 4.1 release and I wasted
> days to debug this, and you've been sitting on a fix for months without
> telling anyone?

Of course, it was not my goal to hide the fix, I'll do my best to avoid this
in future. Very sorry for your time wasted.

> This isn't a memory leak or something, this kills
> people's images. It's eating their data.

Possibly, I didn't know about data corruption. For some reason I decided that
lock should be taken here, I don't remember. Still I should have
applied it to upstream version too, of course.

> 
> Modifying an image format driver requires utmost care and I think I'll
> try to make sure to allow only few qcow2 changes per release in the
> future. Too many changes were made in too short time, and with too
> little care, and there are even more qcow2 patches pending. Please check
> whether these series actually match your downstream code.

The difference is that I didn't move the lock() but add separate lock()/unlock()
pair around qcow2_detect_metadata_preallocation(). I think there is no serious
difference.

> Anyway, we'll
> tread very carefully now with the pending patches, even if it means
> delaying them for another release or two.

What do you mean? What to do with sent patches during several months?

> Stability is way more
> important for an image format driver than new features and
> optimisations.

Agree. But I think, the main source of stability is testing.

> 
> Do whatever you need to fix your downstream process, but seriously, this
> must not ever happen again.
> 

I don't see how to improve the process to exclude such problems. But I'll
of course will do my best to avoid them.
diff mbox series

Patch

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index ef965d7895..0d64bf5a5e 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -3455,6 +3455,8 @@  int qcow2_detect_metadata_preallocation(BlockDriverState *bs)
     int64_t i, end_cluster, cluster_count = 0, threshold;
     int64_t file_length, real_allocation, real_clusters;
 
+    qemu_co_mutex_assert_locked(&s->lock);
+
     file_length = bdrv_getlength(bs->file->bs);
     if (file_length < 0) {
         return file_length;
diff --git a/block/qcow2.c b/block/qcow2.c
index 8b05933565..0bc69e6996 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1916,6 +1916,8 @@  static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
     unsigned int bytes;
     int status = 0;
 
+    qemu_co_mutex_lock(&s->lock);
+
     if (!s->metadata_preallocation_checked) {
         ret = qcow2_detect_metadata_preallocation(bs);
         s->metadata_preallocation = (ret == 1);
@@ -1923,7 +1925,6 @@  static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
     }
 
     bytes = MIN(INT_MAX, count);
-    qemu_co_mutex_lock(&s->lock);
     ret = qcow2_get_cluster_offset(bs, offset, &bytes, &cluster_offset);
     qemu_co_mutex_unlock(&s->lock);
     if (ret < 0) {