diff mbox series

btrfs: ensure that a DUP block group has exactly two stripes

Message ID 20190213142602.20436-1-jthumshirn@suse.de (mailing list archive)
State New, archived
Headers show
Series btrfs: ensure that a DUP block group has exactly two stripes | expand

Commit Message

Johannes Thumshirn Feb. 13, 2019, 2:26 p.m. UTC
We recently had a customer issue with a corrupted filesystem. When trying
to mount this image btrfs panicked with a division by zero in
calc_stripe_length().

The corrupt chunk had a 'num_stripes' value of 1. calc_stripe_length()
takes this value and divides it by the number of copies the RAID profile is
expected to have to calculate the amount of data stripes. As a DUP profile
is expected to have 2 copies this division resulted in 1/2 = 0. Later then
the 'data_stripes' variable is used as a divisor in the stripe length
calculation which results in a division by 0 and thus a kernel panic.

When encountering a filesystem with a DUP block group and a 'num_stripes'
value unequal to 2, refuse mounting as the image is corrupted and will lead
to unexpected behaviour.

Fixes: e06cd3dd7cea ("Btrfs: add validadtion checks for chunk loading")
Cc: Liu Bo <obuil.liubo@gmail.com>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 fs/btrfs/volumes.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Nikolay Borisov Feb. 13, 2019, 2:27 p.m. UTC | #1
On 13.02.19 г. 16:26 ч., Johannes Thumshirn wrote:
> We recently had a customer issue with a corrupted filesystem. When trying
> to mount this image btrfs panicked with a division by zero in
> calc_stripe_length().
> 
> The corrupt chunk had a 'num_stripes' value of 1. calc_stripe_length()
> takes this value and divides it by the number of copies the RAID profile is
> expected to have to calculate the amount of data stripes. As a DUP profile
> is expected to have 2 copies this division resulted in 1/2 = 0. Later then
> the 'data_stripes' variable is used as a divisor in the stripe length
> calculation which results in a division by 0 and thus a kernel panic.
> 
> When encountering a filesystem with a DUP block group and a 'num_stripes'
> value unequal to 2, refuse mounting as the image is corrupted and will lead
> to unexpected behaviour.
> 
> Fixes: e06cd3dd7cea ("Btrfs: add validadtion checks for chunk loading")
> Cc: Liu Bo <obuil.liubo@gmail.com>
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

> ---
>  fs/btrfs/volumes.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 03f223aa7194..b40cc7c830f4 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -6794,7 +6794,7 @@ static int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
>  	    (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes < 1) ||
>  	    (type & BTRFS_BLOCK_GROUP_RAID5 && num_stripes < 2) ||
>  	    (type & BTRFS_BLOCK_GROUP_RAID6 && num_stripes < 3) ||
> -	    (type & BTRFS_BLOCK_GROUP_DUP && num_stripes > 2) ||
> +	    (type & BTRFS_BLOCK_GROUP_DUP && num_stripes != 2) ||
>  	    ((type & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0 &&
>  	     num_stripes != 1)) {
>  		btrfs_err(fs_info,
>
Qu Wenruo Feb. 13, 2019, 2:32 p.m. UTC | #2
On 2019/2/13 下午10:26, Johannes Thumshirn wrote:
> We recently had a customer issue with a corrupted filesystem. When trying
> to mount this image btrfs panicked with a division by zero in
> calc_stripe_length().
> 
> The corrupt chunk had a 'num_stripes' value of 1. calc_stripe_length()
> takes this value and divides it by the number of copies the RAID profile is
> expected to have to calculate the amount of data stripes. As a DUP profile
> is expected to have 2 copies this division resulted in 1/2 = 0. Later then
> the 'data_stripes' variable is used as a divisor in the stripe length
> calculation which results in a division by 0 and thus a kernel panic.
> 
> When encountering a filesystem with a DUP block group and a 'num_stripes'
> value unequal to 2, refuse mounting as the image is corrupted and will lead
> to unexpected behaviour.
> 
> Fixes: e06cd3dd7cea ("Btrfs: add validadtion checks for chunk loading")
> Cc: Liu Bo <obuil.liubo@gmail.com>
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu

> ---
>  fs/btrfs/volumes.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 03f223aa7194..b40cc7c830f4 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -6794,7 +6794,7 @@ static int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
>  	    (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes < 1) ||
>  	    (type & BTRFS_BLOCK_GROUP_RAID5 && num_stripes < 2) ||
>  	    (type & BTRFS_BLOCK_GROUP_RAID6 && num_stripes < 3) ||
> -	    (type & BTRFS_BLOCK_GROUP_DUP && num_stripes > 2) ||
> +	    (type & BTRFS_BLOCK_GROUP_DUP && num_stripes != 2) ||
>  	    ((type & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0 &&
>  	     num_stripes != 1)) {
>  		btrfs_err(fs_info,
>
Hans van Kranenburg Feb. 13, 2019, 2:32 p.m. UTC | #3
On 2/13/19 3:26 PM, Johannes Thumshirn wrote:
> We recently had a customer issue with a corrupted filesystem. When trying
> to mount this image btrfs panicked with a division by zero in
> calc_stripe_length().
> 
> The corrupt chunk had a 'num_stripes' value of 1. calc_stripe_length()
> takes this value and divides it by the number of copies the RAID profile is
> expected to have to calculate the amount of data stripes. As a DUP profile
> is expected to have 2 copies this division resulted in 1/2 = 0. Later then
> the 'data_stripes' variable is used as a divisor in the stripe length
> calculation which results in a division by 0 and thus a kernel panic.
> 
> When encountering a filesystem with a DUP block group and a 'num_stripes'
> value unequal to 2, refuse mounting as the image is corrupted and will lead
> to unexpected behaviour.
> 
> Fixes: e06cd3dd7cea ("Btrfs: add validadtion checks for chunk loading")
> Cc: Liu Bo <obuil.liubo@gmail.com>
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  fs/btrfs/volumes.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 03f223aa7194..b40cc7c830f4 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -6794,7 +6794,7 @@ static int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
>  	    (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes < 1) ||
>  	    (type & BTRFS_BLOCK_GROUP_RAID5 && num_stripes < 2) ||
>  	    (type & BTRFS_BLOCK_GROUP_RAID6 && num_stripes < 3) ||
> -	    (type & BTRFS_BLOCK_GROUP_DUP && num_stripes > 2) ||
> +	    (type & BTRFS_BLOCK_GROUP_DUP && num_stripes != 2) ||
>  	    ((type & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0 &&
>  	     num_stripes != 1)) {
>  		btrfs_err(fs_info,
> 

It looks like the RAID1 check has a similar problem. Shouldn't that
check also be != 2 ?

Hans
Johannes Thumshirn Feb. 13, 2019, 2:37 p.m. UTC | #4
On 13/02/2019 15:32, Hans van Kranenburg wrote:
[...]

>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 03f223aa7194..b40cc7c830f4 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -6794,7 +6794,7 @@ static int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
>>  	    (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes < 1) ||
>>  	    (type & BTRFS_BLOCK_GROUP_RAID5 && num_stripes < 2) ||
>>  	    (type & BTRFS_BLOCK_GROUP_RAID6 && num_stripes < 3) ||
>> -	    (type & BTRFS_BLOCK_GROUP_DUP && num_stripes > 2) ||
>> +	    (type & BTRFS_BLOCK_GROUP_DUP && num_stripes != 2) ||
>>  	    ((type & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0 &&
>>  	     num_stripes != 1)) {
>>  		btrfs_err(fs_info,
>>
> 
> It looks like the RAID1 check has a similar problem. Shouldn't that
> check also be != 2 ?

Hmm I guess a degraded RAID1 can have only 1 stripe, doesn't it?
Nikolay Borisov Feb. 13, 2019, 3:02 p.m. UTC | #5
On 13.02.19 г. 16:37 ч., Johannes Thumshirn wrote:
> On 13/02/2019 15:32, Hans van Kranenburg wrote:
> [...]
> 
>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>> index 03f223aa7194..b40cc7c830f4 100644
>>> --- a/fs/btrfs/volumes.c
>>> +++ b/fs/btrfs/volumes.c
>>> @@ -6794,7 +6794,7 @@ static int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
>>>  	    (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes < 1) ||
>>>  	    (type & BTRFS_BLOCK_GROUP_RAID5 && num_stripes < 2) ||
>>>  	    (type & BTRFS_BLOCK_GROUP_RAID6 && num_stripes < 3) ||
>>> -	    (type & BTRFS_BLOCK_GROUP_DUP && num_stripes > 2) ||
>>> +	    (type & BTRFS_BLOCK_GROUP_DUP && num_stripes != 2) ||
>>>  	    ((type & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0 &&
>>>  	     num_stripes != 1)) {
>>>  		btrfs_err(fs_info,
>>>
>>
>> It looks like the RAID1 check has a similar problem. Shouldn't that
>> check also be != 2 ?
> 
> Hmm I guess a degraded RAID1 can have only 1 stripe, doesn't it?

It depends on the POV. 

root@ubuntu-virtual:~# fallocate -l 2g /media/scratch/disk1.img
root@ubuntu-virtual:~# fallocate -l 2g /media/scratch/disk2.img
root@ubuntu-virtual:~# losetup -f /media/scratch/disk1.img 
root@ubuntu-virtual:~# losetup -f /media/scratch/disk2.img 
root@ubuntu-virtual:~# mkfs.btrfs -draid1 /dev/loop0 /dev/loop1
root@ubuntu-virtual:~# losetup -d /dev/loop1
root@ubuntu-virtual:~# mount -odegraded /dev/loop0 /media/test/

 btrfs inspect-internal dump-tree -t chunk /dev/loop0  | grep -B1 -A5 DATA\|RAID1
	item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704) itemoff 15863 itemsize 112
		length 214695936 owner 2 stripe_len 65536 type METADATA|RAID1
		io_align 65536 io_width 65536 sector_size 4096
		num_stripes 2 sub_stripes 0
			stripe 0 devid 2 offset 9437184
			dev_uuid adb924e5-830b-4eb0-b28a-f5bee6b709ca
			stripe 1 devid 1 offset 30408704
--
	item 4 key (FIRST_CHUNK_TREE CHUNK_ITEM 245104640) itemoff 15751 itemsize 112
		length 214695936 owner 2 stripe_len 65536 type DATA|RAID1
		io_align 65536 io_width 65536 sector_size 4096
		num_stripes 2 sub_stripes 0
			stripe 0 devid 2 offset 224133120
			dev_uuid adb924e5-830b-4eb0-b28a-f5bee6b709ca
			stripe 1 devid 1 offset 245104640



DMESG: 
[78623.920424] BTRFS warning (device loop0): devid 2 uuid adb924e5-830b-4eb0-b28a-f5bee6b709ca is missing

> 
>
Johannes Thumshirn Feb. 14, 2019, 4:21 p.m. UTC | #6
On 13/02/2019 15:32, Hans van Kranenburg wrote:
[...]

>> +++ b/fs/btrfs/volumes.c
>> @@ -6794,7 +6794,7 @@ static int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
>>  	    (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes < 1) ||
>>  	    (type & BTRFS_BLOCK_GROUP_RAID5 && num_stripes < 2) ||
>>  	    (type & BTRFS_BLOCK_GROUP_RAID6 && num_stripes < 3) ||
>> -	    (type & BTRFS_BLOCK_GROUP_DUP && num_stripes > 2) ||
>> +	    (type & BTRFS_BLOCK_GROUP_DUP && num_stripes != 2) ||
>>  	    ((type & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0 &&
>>  	     num_stripes != 1)) {
>>  		btrfs_err(fs_info,
>>
> 
> It looks like the RAID1 check has a similar problem. Shouldn't that
> check also be != 2 ?

So looking at the code again I think num_stripes == 1 for RAID1 will
result in the same division by 0 in calc_stripe_length().

I'll send a patch for RAID1 as well unless David speaks up and says he
wants it amended in this one.

Thanks,
	Johannes
Hans van Kranenburg Feb. 14, 2019, 5:27 p.m. UTC | #7
On 2/13/19 3:37 PM, Johannes Thumshirn wrote:
> On 13/02/2019 15:32, Hans van Kranenburg wrote:
> [...]
> 
>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>> index 03f223aa7194..b40cc7c830f4 100644
>>> --- a/fs/btrfs/volumes.c
>>> +++ b/fs/btrfs/volumes.c
>>> @@ -6794,7 +6794,7 @@ static int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
>>>  	    (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes < 1) ||
>>>  	    (type & BTRFS_BLOCK_GROUP_RAID5 && num_stripes < 2) ||
>>>  	    (type & BTRFS_BLOCK_GROUP_RAID6 && num_stripes < 3) ||
>>> -	    (type & BTRFS_BLOCK_GROUP_DUP && num_stripes > 2) ||
>>> +	    (type & BTRFS_BLOCK_GROUP_DUP && num_stripes != 2) ||
>>>  	    ((type & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0 &&
>>>  	     num_stripes != 1)) {
>>>  		btrfs_err(fs_info,
>>>
>>
>> It looks like the RAID1 check has a similar problem. Shouldn't that
>> check also be != 2 ?
> 
> Hmm I guess a degraded RAID1 can have only 1 stripe, doesn't it?

It's reading the metadata from disk here. So, there are always still
exactly two 'stripe' structs inside the chunk item data.

Hans
David Sterba Feb. 15, 2019, 3:20 p.m. UTC | #8
On Thu, Feb 14, 2019 at 05:21:54PM +0100, Johannes Thumshirn wrote:
> On 13/02/2019 15:32, Hans van Kranenburg wrote:
> [...]
> 
> >> +++ b/fs/btrfs/volumes.c
> >> @@ -6794,7 +6794,7 @@ static int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
> >>  	    (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes < 1) ||
> >>  	    (type & BTRFS_BLOCK_GROUP_RAID5 && num_stripes < 2) ||
> >>  	    (type & BTRFS_BLOCK_GROUP_RAID6 && num_stripes < 3) ||
> >> -	    (type & BTRFS_BLOCK_GROUP_DUP && num_stripes > 2) ||
> >> +	    (type & BTRFS_BLOCK_GROUP_DUP && num_stripes != 2) ||
> >>  	    ((type & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0 &&
> >>  	     num_stripes != 1)) {
> >>  		btrfs_err(fs_info,
> >>
> > 
> > It looks like the RAID1 check has a similar problem. Shouldn't that
> > check also be != 2 ?
> 
> So looking at the code again I think num_stripes == 1 for RAID1 will
> result in the same division by 0 in calc_stripe_length().
> 
> I'll send a patch for RAID1 as well unless David speaks up and says he
> wants it amended in this one.

If the explanation and cause is the same, it's ok to put it into one
patch I think, but no strong preference here.
diff mbox series

Patch

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 03f223aa7194..b40cc7c830f4 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6794,7 +6794,7 @@  static int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info,
 	    (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes < 1) ||
 	    (type & BTRFS_BLOCK_GROUP_RAID5 && num_stripes < 2) ||
 	    (type & BTRFS_BLOCK_GROUP_RAID6 && num_stripes < 3) ||
-	    (type & BTRFS_BLOCK_GROUP_DUP && num_stripes > 2) ||
+	    (type & BTRFS_BLOCK_GROUP_DUP && num_stripes != 2) ||
 	    ((type & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0 &&
 	     num_stripes != 1)) {
 		btrfs_err(fs_info,