btrfs_progs: mkfs: match devid order to the stripe index
diff mbox series

Message ID 20190628022611.2844-1-anand.jain@oracle.com
State New
Headers show
Series
  • btrfs_progs: mkfs: match devid order to the stripe index
Related show

Commit Message

Anand Jain June 28, 2019, 2:26 a.m. UTC
At the time mkfs.btrfs the device id and stripe index gets reversed as
shown in [1]. This patch helps to keep them in order at the time of
mkfs.btrfs. And makes it easier to debug.

Before:
Stripe 0 is on devid 2; Stipe 1 is on devid 1;

./mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc && btrfs in dump-tree -d /dev/sdb | grep -A 10000 "chunk tree" | grep -B 10000 "device tree" | grep -A 13  "FIRST_CHUNK_TREE CHUNK_ITEM"
	item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096) itemoff 15975 itemsize 112
		length 8388608 owner 2 stripe_len 65536 type SYSTEM|RAID1
		io_align 65536 io_width 65536 sector_size 4096
		num_stripes 2 sub_stripes 0
			stripe 0 devid 2 offset 1048576
			dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
			stripe 1 devid 1 offset 22020096
			dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab
	item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704) itemoff 15863 itemsize 112
		length 268435456 owner 2 stripe_len 65536 type METADATA|RAID1
		io_align 65536 io_width 65536 sector_size 4096
		num_stripes 2 sub_stripes 0
			stripe 0 devid 2 offset 9437184
			dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
			stripe 1 devid 1 offset 30408704
			dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab
	item 4 key (FIRST_CHUNK_TREE CHUNK_ITEM 298844160) itemoff 15751 itemsize 112
		length 314572800 owner 2 stripe_len 65536 type DATA|RAID1
		io_align 65536 io_width 65536 sector_size 4096
		num_stripes 2 sub_stripes 0
			stripe 0 devid 2 offset 277872640
			dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
			stripe 1 devid 1 offset 298844160
			dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab

After:
Stripe 0 is on devid 1; Stripe 1 is on devid 2

./mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc && btrfs in dump-tree -d /dev/sdb | grep -A 10000 "chunk tree" | grep -B 10000 "device tree" | grep -A 13  "FIRST_CHUNK_TREE CHUNK_ITEM"
/dev/sdb: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48 52 66 53 5f 4d
/dev/sdc: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48 52 66 53 5f 4d
	item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096) itemoff 15975 itemsize 112
		length 8388608 owner 2 stripe_len 65536 type SYSTEM|RAID1
		io_align 65536 io_width 65536 sector_size 4096
		num_stripes 2 sub_stripes 0
			stripe 0 devid 1 offset 22020096
			dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
			stripe 1 devid 2 offset 1048576
			dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d
	item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704) itemoff 15863 itemsize 112
		length 268435456 owner 2 stripe_len 65536 type METADATA|RAID1
		io_align 65536 io_width 65536 sector_size 4096
		num_stripes 2 sub_stripes 0
			stripe 0 devid 1 offset 30408704
			dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
			stripe 1 devid 2 offset 9437184
			dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d
	item 4 key (FIRST_CHUNK_TREE CHUNK_ITEM 298844160) itemoff 15751 itemsize 112
		length 314572800 owner 2 stripe_len 65536 type DATA|RAID1
		io_align 65536 io_width 65536 sector_size 4096
		num_stripes 2 sub_stripes 0
			stripe 0 devid 1 offset 298844160
			dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
			stripe 1 devid 2 offset 277872640
			dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 volumes.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Qu Wenruo June 28, 2019, 2:44 a.m. UTC | #1
On 2019/6/28 上午10:26, Anand Jain wrote:
> At the time mkfs.btrfs the device id and stripe index gets reversed as
> shown in [1]. This patch helps to keep them in order at the time of
> mkfs.btrfs. And makes it easier to debug.
> 
> Before:
> Stripe 0 is on devid 2; Stipe 1 is on devid 1;
> 
> ./mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc && btrfs in dump-tree -d /dev/sdb | grep -A 10000 "chunk tree" | grep -B 10000 "device tree" | grep -A 13  "FIRST_CHUNK_TREE CHUNK_ITEM"
> 	item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096) itemoff 15975 itemsize 112
> 		length 8388608 owner 2 stripe_len 65536 type SYSTEM|RAID1
> 		io_align 65536 io_width 65536 sector_size 4096
> 		num_stripes 2 sub_stripes 0
> 			stripe 0 devid 2 offset 1048576
> 			dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
> 			stripe 1 devid 1 offset 22020096
> 			dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab
> 	item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704) itemoff 15863 itemsize 112
> 		length 268435456 owner 2 stripe_len 65536 type METADATA|RAID1
> 		io_align 65536 io_width 65536 sector_size 4096
> 		num_stripes 2 sub_stripes 0
> 			stripe 0 devid 2 offset 9437184
> 			dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
> 			stripe 1 devid 1 offset 30408704
> 			dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab
> 	item 4 key (FIRST_CHUNK_TREE CHUNK_ITEM 298844160) itemoff 15751 itemsize 112
> 		length 314572800 owner 2 stripe_len 65536 type DATA|RAID1
> 		io_align 65536 io_width 65536 sector_size 4096
> 		num_stripes 2 sub_stripes 0
> 			stripe 0 devid 2 offset 277872640
> 			dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
> 			stripe 1 devid 1 offset 298844160
> 			dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab
> 
> After:
> Stripe 0 is on devid 1; Stripe 1 is on devid 2
> 
> ./mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc && btrfs in dump-tree -d /dev/sdb | grep -A 10000 "chunk tree" | grep -B 10000 "device tree" | grep -A 13  "FIRST_CHUNK_TREE CHUNK_ITEM"
> /dev/sdb: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48 52 66 53 5f 4d
> /dev/sdc: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48 52 66 53 5f 4d
> 	item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096) itemoff 15975 itemsize 112
> 		length 8388608 owner 2 stripe_len 65536 type SYSTEM|RAID1
> 		io_align 65536 io_width 65536 sector_size 4096
> 		num_stripes 2 sub_stripes 0
> 			stripe 0 devid 1 offset 22020096
> 			dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
> 			stripe 1 devid 2 offset 1048576
> 			dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d
> 	item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704) itemoff 15863 itemsize 112
> 		length 268435456 owner 2 stripe_len 65536 type METADATA|RAID1
> 		io_align 65536 io_width 65536 sector_size 4096
> 		num_stripes 2 sub_stripes 0
> 			stripe 0 devid 1 offset 30408704
> 			dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
> 			stripe 1 devid 2 offset 9437184
> 			dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d
> 	item 4 key (FIRST_CHUNK_TREE CHUNK_ITEM 298844160) itemoff 15751 itemsize 112
> 		length 314572800 owner 2 stripe_len 65536 type DATA|RAID1
> 		io_align 65536 io_width 65536 sector_size 4096
> 		num_stripes 2 sub_stripes 0
> 			stripe 0 devid 1 offset 298844160
> 			dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
> 			stripe 1 devid 2 offset 277872640
> 			dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

But please also check the comment inlined below.
> ---
>  volumes.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/volumes.c b/volumes.c
> index 79d1d6a07fb7..8c8b17e814b8 100644
> --- a/volumes.c
> +++ b/volumes.c
> @@ -1109,7 +1109,7 @@ again:
>  			return ret;
>  		cur = cur->next;
>  		if (avail >= min_free) {
> -			list_move_tail(&device->dev_list, &private_devs);
> +			list_move(&device->dev_list, &private_devs);

This is OK since current btrfs-progs chunk allocator doesn't follow the
kernel behavior by sorting devices with its unallocated space.
So it's completely devid based.

But please keep in mind that, if we're going to unify the chunk
allocator behavior of kernel and btrfs-progs, the behavior will change.

As the initial temporary chunk is always allocated on devid 1, reducing
its unallocated space thus reducing its priority in chunk allocator, and
making the devid sequence more unreliable.

Thanks,
Qu

>  			index++;
>  			if (type & BTRFS_BLOCK_GROUP_DUP)
>  				index++;
> @@ -1166,7 +1166,7 @@ again:
>  		/* loop over this device again if we're doing a dup group */
>  		if (!(type & BTRFS_BLOCK_GROUP_DUP) ||
>  		    (index == num_stripes - 1))
> -			list_move_tail(&device->dev_list, dev_list);
> +			list_move(&device->dev_list, dev_list);
>  
>  		ret = btrfs_alloc_dev_extent(trans, device, key.offset,
>  			     calc_size, &dev_offset);
>
Anand Jain June 28, 2019, 3:28 a.m. UTC | #2
On 28/6/19 10:44 AM, Qu Wenruo wrote:
> 
> 
> On 2019/6/28 上午10:26, Anand Jain wrote:
>> At the time mkfs.btrfs the device id and stripe index gets reversed as
>> shown in [1]. This patch helps to keep them in order at the time of
>> mkfs.btrfs. And makes it easier to debug.
>>
>> Before:
>> Stripe 0 is on devid 2; Stipe 1 is on devid 1;
>>
>> ./mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc && btrfs in dump-tree -d /dev/sdb | grep -A 10000 "chunk tree" | grep -B 10000 "device tree" | grep -A 13  "FIRST_CHUNK_TREE CHUNK_ITEM"
>> 	item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096) itemoff 15975 itemsize 112
>> 		length 8388608 owner 2 stripe_len 65536 type SYSTEM|RAID1
>> 		io_align 65536 io_width 65536 sector_size 4096
>> 		num_stripes 2 sub_stripes 0
>> 			stripe 0 devid 2 offset 1048576
>> 			dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
>> 			stripe 1 devid 1 offset 22020096
>> 			dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab
>> 	item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704) itemoff 15863 itemsize 112
>> 		length 268435456 owner 2 stripe_len 65536 type METADATA|RAID1
>> 		io_align 65536 io_width 65536 sector_size 4096
>> 		num_stripes 2 sub_stripes 0
>> 			stripe 0 devid 2 offset 9437184
>> 			dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
>> 			stripe 1 devid 1 offset 30408704
>> 			dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab
>> 	item 4 key (FIRST_CHUNK_TREE CHUNK_ITEM 298844160) itemoff 15751 itemsize 112
>> 		length 314572800 owner 2 stripe_len 65536 type DATA|RAID1
>> 		io_align 65536 io_width 65536 sector_size 4096
>> 		num_stripes 2 sub_stripes 0
>> 			stripe 0 devid 2 offset 277872640
>> 			dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
>> 			stripe 1 devid 1 offset 298844160
>> 			dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab
>>
>> After:
>> Stripe 0 is on devid 1; Stripe 1 is on devid 2
>>
>> ./mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc && btrfs in dump-tree -d /dev/sdb | grep -A 10000 "chunk tree" | grep -B 10000 "device tree" | grep -A 13  "FIRST_CHUNK_TREE CHUNK_ITEM"
>> /dev/sdb: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48 52 66 53 5f 4d
>> /dev/sdc: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48 52 66 53 5f 4d
>> 	item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096) itemoff 15975 itemsize 112
>> 		length 8388608 owner 2 stripe_len 65536 type SYSTEM|RAID1
>> 		io_align 65536 io_width 65536 sector_size 4096
>> 		num_stripes 2 sub_stripes 0
>> 			stripe 0 devid 1 offset 22020096
>> 			dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
>> 			stripe 1 devid 2 offset 1048576
>> 			dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d
>> 	item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704) itemoff 15863 itemsize 112
>> 		length 268435456 owner 2 stripe_len 65536 type METADATA|RAID1
>> 		io_align 65536 io_width 65536 sector_size 4096
>> 		num_stripes 2 sub_stripes 0
>> 			stripe 0 devid 1 offset 30408704
>> 			dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
>> 			stripe 1 devid 2 offset 9437184
>> 			dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d
>> 	item 4 key (FIRST_CHUNK_TREE CHUNK_ITEM 298844160) itemoff 15751 itemsize 112
>> 		length 314572800 owner 2 stripe_len 65536 type DATA|RAID1
>> 		io_align 65536 io_width 65536 sector_size 4096
>> 		num_stripes 2 sub_stripes 0
>> 			stripe 0 devid 1 offset 298844160
>> 			dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
>> 			stripe 1 devid 2 offset 277872640
>> 			dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> 
> Reviewed-by: Qu Wenruo <wqu@suse.com>
> 
> But please also check the comment inlined below.
>> ---
>>   volumes.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/volumes.c b/volumes.c
>> index 79d1d6a07fb7..8c8b17e814b8 100644
>> --- a/volumes.c
>> +++ b/volumes.c
>> @@ -1109,7 +1109,7 @@ again:
>>   			return ret;
>>   		cur = cur->next;
>>   		if (avail >= min_free) {
>> -			list_move_tail(&device->dev_list, &private_devs);
>> +			list_move(&device->dev_list, &private_devs);
> 
> This is OK since current btrfs-progs chunk allocator doesn't follow the
> kernel behavior by sorting devices with its unallocated space.
> So it's completely devid based.
> 
> But please keep in mind that, if we're going to unify the chunk
> allocator behavior of kernel and btrfs-progs, the behavior will change.
> 
> As the initial temporary chunk is always allocated on devid 1, reducing
> its unallocated space thus reducing its priority in chunk allocator, and
> making the devid sequence more unreliable.

  Right. For the debug here, I have an experimental code which disables
  the unallocated space sort in the kernel. I don't have a strong reason
  to disable the sort in the kernel so didn't send the patch.

Thanks, Anand

> Thanks,
> Qu
> 
>>   			index++;
>>   			if (type & BTRFS_BLOCK_GROUP_DUP)
>>   				index++;
>> @@ -1166,7 +1166,7 @@ again:
>>   		/* loop over this device again if we're doing a dup group */
>>   		if (!(type & BTRFS_BLOCK_GROUP_DUP) ||
>>   		    (index == num_stripes - 1))
>> -			list_move_tail(&device->dev_list, dev_list);
>> +			list_move(&device->dev_list, dev_list);
>>   
>>   		ret = btrfs_alloc_dev_extent(trans, device, key.offset,
>>   			     calc_size, &dev_offset);
>>
>
Qu Wenruo June 28, 2019, 6:01 a.m. UTC | #3
On 2019/6/28 上午11:28, Anand Jain wrote:
> On 28/6/19 10:44 AM, Qu Wenruo wrote:
>>
>>
>> On 2019/6/28 上午10:26, Anand Jain wrote:
>>> At the time mkfs.btrfs the device id and stripe index gets reversed as
>>> shown in [1]. This patch helps to keep them in order at the time of
>>> mkfs.btrfs. And makes it easier to debug.
>>>
>>> Before:
>>> Stripe 0 is on devid 2; Stipe 1 is on devid 1;
>>>
>>> ./mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc && btrfs in
>>> dump-tree -d /dev/sdb | grep -A 10000 "chunk tree" | grep -B 10000
>>> "device tree" | grep -A 13  "FIRST_CHUNK_TREE CHUNK_ITEM"
>>>     item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096) itemoff 15975
>>> itemsize 112
>>>         length 8388608 owner 2 stripe_len 65536 type SYSTEM|RAID1
>>>         io_align 65536 io_width 65536 sector_size 4096
>>>         num_stripes 2 sub_stripes 0
>>>             stripe 0 devid 2 offset 1048576
>>>             dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
>>>             stripe 1 devid 1 offset 22020096
>>>             dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab
>>>     item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704) itemoff 15863
>>> itemsize 112
>>>         length 268435456 owner 2 stripe_len 65536 type METADATA|RAID1
>>>         io_align 65536 io_width 65536 sector_size 4096
>>>         num_stripes 2 sub_stripes 0
>>>             stripe 0 devid 2 offset 9437184
>>>             dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
>>>             stripe 1 devid 1 offset 30408704
>>>             dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab
>>>     item 4 key (FIRST_CHUNK_TREE CHUNK_ITEM 298844160) itemoff 15751
>>> itemsize 112
>>>         length 314572800 owner 2 stripe_len 65536 type DATA|RAID1
>>>         io_align 65536 io_width 65536 sector_size 4096
>>>         num_stripes 2 sub_stripes 0
>>>             stripe 0 devid 2 offset 277872640
>>>             dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
>>>             stripe 1 devid 1 offset 298844160
>>>             dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab
>>>
>>> After:
>>> Stripe 0 is on devid 1; Stripe 1 is on devid 2
>>>
>>> ./mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc && btrfs in
>>> dump-tree -d /dev/sdb | grep -A 10000 "chunk tree" | grep -B 10000
>>> "device tree" | grep -A 13  "FIRST_CHUNK_TREE CHUNK_ITEM"
>>> /dev/sdb: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48
>>> 52 66 53 5f 4d
>>> /dev/sdc: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48
>>> 52 66 53 5f 4d
>>>     item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096) itemoff 15975
>>> itemsize 112
>>>         length 8388608 owner 2 stripe_len 65536 type SYSTEM|RAID1
>>>         io_align 65536 io_width 65536 sector_size 4096
>>>         num_stripes 2 sub_stripes 0
>>>             stripe 0 devid 1 offset 22020096
>>>             dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
>>>             stripe 1 devid 2 offset 1048576
>>>             dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d
>>>     item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704) itemoff 15863
>>> itemsize 112
>>>         length 268435456 owner 2 stripe_len 65536 type METADATA|RAID1
>>>         io_align 65536 io_width 65536 sector_size 4096
>>>         num_stripes 2 sub_stripes 0
>>>             stripe 0 devid 1 offset 30408704
>>>             dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
>>>             stripe 1 devid 2 offset 9437184
>>>             dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d
>>>     item 4 key (FIRST_CHUNK_TREE CHUNK_ITEM 298844160) itemoff 15751
>>> itemsize 112
>>>         length 314572800 owner 2 stripe_len 65536 type DATA|RAID1
>>>         io_align 65536 io_width 65536 sector_size 4096
>>>         num_stripes 2 sub_stripes 0
>>>             stripe 0 devid 1 offset 298844160
>>>             dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
>>>             stripe 1 devid 2 offset 277872640
>>>             dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d
>>>
>>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>>
>> Reviewed-by: Qu Wenruo <wqu@suse.com>
>>
>> But please also check the comment inlined below.
>>> ---
>>>   volumes.c | 4 ++--
>>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/volumes.c b/volumes.c
>>> index 79d1d6a07fb7..8c8b17e814b8 100644
>>> --- a/volumes.c
>>> +++ b/volumes.c
>>> @@ -1109,7 +1109,7 @@ again:
>>>               return ret;
>>>           cur = cur->next;
>>>           if (avail >= min_free) {
>>> -            list_move_tail(&device->dev_list, &private_devs);
>>> +            list_move(&device->dev_list, &private_devs);
>>
>> This is OK since current btrfs-progs chunk allocator doesn't follow the
>> kernel behavior by sorting devices with its unallocated space.
>> So it's completely devid based.
>>
>> But please keep in mind that, if we're going to unify the chunk
>> allocator behavior of kernel and btrfs-progs, the behavior will change.
>>
>> As the initial temporary chunk is always allocated on devid 1, reducing
>> its unallocated space thus reducing its priority in chunk allocator, and
>> making the devid sequence more unreliable.
> 
>  Right. For the debug here, I have an experimental code which disables
>  the unallocated space sort in the kernel. I don't have a strong reason
>  to disable the sort in the kernel so didn't send the patch.

I'd say that unallocated sort is a hidden way to prevent starvation.

The mostly common case is 3 disk RAID1. (1024M X 3)
With the unallocated space sort, we can take full use of 1.5T.

While without that, we can only use 1T, as all allocation will happen on
the first (or last) 2 devices, not utilize the remaining disk at all.

So that kernel part is very helpful to prevent starvation.

Thanks,
Qu

> 
> Thanks, Anand
> 
>> Thanks,
>> Qu
>>
>>>               index++;
>>>               if (type & BTRFS_BLOCK_GROUP_DUP)
>>>                   index++;
>>> @@ -1166,7 +1166,7 @@ again:
>>>           /* loop over this device again if we're doing a dup group */
>>>           if (!(type & BTRFS_BLOCK_GROUP_DUP) ||
>>>               (index == num_stripes - 1))
>>> -            list_move_tail(&device->dev_list, dev_list);
>>> +            list_move(&device->dev_list, dev_list);
>>>             ret = btrfs_alloc_dev_extent(trans, device, key.offset,
>>>                    calc_size, &dev_offset);
>>>
>>
>
David Sterba July 3, 2019, 1:21 p.m. UTC | #4
On Fri, Jun 28, 2019 at 10:26:11AM +0800, Anand Jain wrote:
> At the time mkfs.btrfs the device id and stripe index gets reversed as
> shown in [1]. This patch helps to keep them in order at the time of
> mkfs.btrfs. And makes it easier to debug.
> 
> Before:
> Stripe 0 is on devid 2; Stipe 1 is on devid 1;
> 
> ./mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc && btrfs in dump-tree -d /dev/sdb | grep -A 10000 "chunk tree" | grep -B 10000 "device tree" | grep -A 13  "FIRST_CHUNK_TREE CHUNK_ITEM"

I've reformatted that so it's not overly long line. For dumps it's ok
but a command can be split by && or | .

> 	item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096) itemoff 15975 itemsize 112
> 		length 8388608 owner 2 stripe_len 65536 type SYSTEM|RAID1
> 		io_align 65536 io_width 65536 sector_size 4096
> 		num_stripes 2 sub_stripes 0
> 			stripe 0 devid 2 offset 1048576
> 			dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
> 			stripe 1 devid 1 offset 22020096
> 			dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab
> 	item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704) itemoff 15863 itemsize 112
> 		length 268435456 owner 2 stripe_len 65536 type METADATA|RAID1
> 		io_align 65536 io_width 65536 sector_size 4096
> 		num_stripes 2 sub_stripes 0
> 			stripe 0 devid 2 offset 9437184
> 			dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
> 			stripe 1 devid 1 offset 30408704
> 			dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab
> 	item 4 key (FIRST_CHUNK_TREE CHUNK_ITEM 298844160) itemoff 15751 itemsize 112
> 		length 314572800 owner 2 stripe_len 65536 type DATA|RAID1
> 		io_align 65536 io_width 65536 sector_size 4096
> 		num_stripes 2 sub_stripes 0
> 			stripe 0 devid 2 offset 277872640
> 			dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
> 			stripe 1 devid 1 offset 298844160
> 			dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab
> 
> After:
> Stripe 0 is on devid 1; Stripe 1 is on devid 2
> 
> ./mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc && btrfs in dump-tree -d /dev/sdb | grep -A 10000 "chunk tree" | grep -B 10000 "device tree" | grep -A 13  "FIRST_CHUNK_TREE CHUNK_ITEM"
> /dev/sdb: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48 52 66 53 5f 4d
> /dev/sdc: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48 52 66 53 5f 4d
> 	item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096) itemoff 15975 itemsize 112
> 		length 8388608 owner 2 stripe_len 65536 type SYSTEM|RAID1
> 		io_align 65536 io_width 65536 sector_size 4096
> 		num_stripes 2 sub_stripes 0
> 			stripe 0 devid 1 offset 22020096
> 			dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
> 			stripe 1 devid 2 offset 1048576
> 			dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d
> 	item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704) itemoff 15863 itemsize 112
> 		length 268435456 owner 2 stripe_len 65536 type METADATA|RAID1
> 		io_align 65536 io_width 65536 sector_size 4096
> 		num_stripes 2 sub_stripes 0
> 			stripe 0 devid 1 offset 30408704
> 			dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
> 			stripe 1 devid 2 offset 9437184
> 			dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d
> 	item 4 key (FIRST_CHUNK_TREE CHUNK_ITEM 298844160) itemoff 15751 itemsize 112
> 		length 314572800 owner 2 stripe_len 65536 type DATA|RAID1
> 		io_align 65536 io_width 65536 sector_size 4096
> 		num_stripes 2 sub_stripes 0
> 			stripe 0 devid 1 offset 298844160
> 			dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
> 			stripe 1 devid 2 offset 277872640
> 			dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>

Added to devel, thanks.
Anand Jain Aug. 27, 2019, 2:02 a.m. UTC | #5
On 3/7/19 9:21 PM, David Sterba wrote:
> On Fri, Jun 28, 2019 at 10:26:11AM +0800, Anand Jain wrote:
>> At the time mkfs.btrfs the device id and stripe index gets reversed as
>> shown in [1]. This patch helps to keep them in order at the time of
>> mkfs.btrfs. And makes it easier to debug.
>>
>> Before:
>> Stripe 0 is on devid 2; Stipe 1 is on devid 1;
>>
>> ./mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc && btrfs in dump-tree -d /dev/sdb | grep -A 10000 "chunk tree" | grep -B 10000 "device tree" | grep -A 13  "FIRST_CHUNK_TREE CHUNK_ITEM"
> 
> I've reformatted that so it's not overly long line. For dumps it's ok
> but a command can be split by && or | .
> 
>> 	item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096) itemoff 15975 itemsize 112
>> 		length 8388608 owner 2 stripe_len 65536 type SYSTEM|RAID1
>> 		io_align 65536 io_width 65536 sector_size 4096
>> 		num_stripes 2 sub_stripes 0
>> 			stripe 0 devid 2 offset 1048576
>> 			dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
>> 			stripe 1 devid 1 offset 22020096
>> 			dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab
>> 	item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704) itemoff 15863 itemsize 112
>> 		length 268435456 owner 2 stripe_len 65536 type METADATA|RAID1
>> 		io_align 65536 io_width 65536 sector_size 4096
>> 		num_stripes 2 sub_stripes 0
>> 			stripe 0 devid 2 offset 9437184
>> 			dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
>> 			stripe 1 devid 1 offset 30408704
>> 			dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab
>> 	item 4 key (FIRST_CHUNK_TREE CHUNK_ITEM 298844160) itemoff 15751 itemsize 112
>> 		length 314572800 owner 2 stripe_len 65536 type DATA|RAID1
>> 		io_align 65536 io_width 65536 sector_size 4096
>> 		num_stripes 2 sub_stripes 0
>> 			stripe 0 devid 2 offset 277872640
>> 			dev_uuid d9fe51c4-6e79-446d-87ee-5be3184798cd
>> 			stripe 1 devid 1 offset 298844160
>> 			dev_uuid 16f626ca-1a54-469b-ac7e-25623af884ab
>>
>> After:
>> Stripe 0 is on devid 1; Stripe 1 is on devid 2
>>
>> ./mkfs.btrfs -fq -draid1 -mraid1 /dev/sdb /dev/sdc && btrfs in dump-tree -d /dev/sdb | grep -A 10000 "chunk tree" | grep -B 10000 "device tree" | grep -A 13  "FIRST_CHUNK_TREE CHUNK_ITEM"
>> /dev/sdb: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48 52 66 53 5f 4d
>> /dev/sdc: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48 52 66 53 5f 4d
>> 	item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096) itemoff 15975 itemsize 112
>> 		length 8388608 owner 2 stripe_len 65536 type SYSTEM|RAID1
>> 		io_align 65536 io_width 65536 sector_size 4096
>> 		num_stripes 2 sub_stripes 0
>> 			stripe 0 devid 1 offset 22020096
>> 			dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
>> 			stripe 1 devid 2 offset 1048576
>> 			dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d
>> 	item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704) itemoff 15863 itemsize 112
>> 		length 268435456 owner 2 stripe_len 65536 type METADATA|RAID1
>> 		io_align 65536 io_width 65536 sector_size 4096
>> 		num_stripes 2 sub_stripes 0
>> 			stripe 0 devid 1 offset 30408704
>> 			dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
>> 			stripe 1 devid 2 offset 9437184
>> 			dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d
>> 	item 4 key (FIRST_CHUNK_TREE CHUNK_ITEM 298844160) itemoff 15751 itemsize 112
>> 		length 314572800 owner 2 stripe_len 65536 type DATA|RAID1
>> 		io_align 65536 io_width 65536 sector_size 4096
>> 		num_stripes 2 sub_stripes 0
>> 			stripe 0 devid 1 offset 298844160
>> 			dev_uuid 6abc88fa-f42e-4f0c-9bc3-2225735e51d1
>> 			stripe 1 devid 2 offset 277872640
>> 			dev_uuid 73746d27-13a6-4d58-ac6b-48c90c31d94d
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> 
> Added to devel, thanks.
> 

  I don't see this patch is integrated. Any idea?

Thanks, Anand
Anand Jain Sept. 2, 2019, 8:01 a.m. UTC | #6
David,

  I don't see this patch is integrated. Can you please integrated this 
patch thanks.

Thanks, Anand


>>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>>
>> Added to devel, thanks.
>>
> 
>   I don't see this patch is integrated. Any idea?
> 
> Thanks, Anand
David Sterba Sept. 2, 2019, 4:22 p.m. UTC | #7
On Mon, Sep 02, 2019 at 04:01:56PM +0800, Anand Jain wrote:
> 
> David,
> 
>   I don't see this patch is integrated. Can you please integrated this 
> patch thanks.

I don't know why but the patch got lost somewhere, adding to devel
again.
Johannes Thumshirn Sept. 3, 2019, 10:46 a.m. UTC | #8
Hi Anand,

This patch breaks misc-tests/021-image-multi-devices for me:

jthumshirn@adalid:btrfs-progs ((b41eba548d89...)|BISECTING)$ git bisect log
git bisect start
# good: [9a85732d8beaae4b80cab98bb3355660389c1d36] Btrfs progs v5.2.1
git bisect good 9a85732d8beaae4b80cab98bb3355660389c1d36
# bad: [7757aa659556e3654bdd2066ac977cf3c1cf9a4e] btrfs-progs:
fsck-tests: Add test image for valid half-dropped orphan inode
git bisect bad 7757aa659556e3654bdd2066ac977cf3c1cf9a4e
# good: [70351ae92834ff6405b464053871770245165c9b] btrfs-progs: copy
btrfsck.h to check/common.h
git bisect good 70351ae92834ff6405b464053871770245165c9b
# good: [d0848c91d916a203ca45409ed7538120e3b27f69] btrfs-progs: tests:
mkfs and extra large devices
git bisect good d0848c91d916a203ca45409ed7538120e3b27f69
# good: [aa0d7b57658d0aebb8be84456782a55212b49292] btrfs-progs: tests:
cli/003: add resize checks with 'max'
git bisect good aa0d7b57658d0aebb8be84456782a55212b49292
# good: [de8112e7dc53a753a9fbd8c0fb9c7d46359efc56] btrfs-progs:
print-tree add missing DEV_STATS
git bisect good de8112e7dc53a753a9fbd8c0fb9c7d46359efc56
# bad: [1613a542dfaadc9e1d105f6fd077ca7c4fed5900] btrfs-progs:
check/lowmem: Skip nbytes check for orphan inodes
git bisect bad 1613a542dfaadc9e1d105f6fd077ca7c4fed5900
# bad: [17bb2860247876c34bf67f5526cb142f45f2e630] btrfs_progs: mkfs:
match devid order to the stripe index
git bisect bad 17bb2860247876c34bf67f5526cb142f45f2e630
# first bad commit: [17bb2860247876c34bf67f5526cb142f45f2e630]
btrfs_progs: mkfs: match devid order to the stripe index

Can you please have a look?

Thanks,
	Johannes
David Sterba Sept. 3, 2019, 12:06 p.m. UTC | #9
On Mon, Sep 02, 2019 at 06:22:30PM +0200, David Sterba wrote:
> On Mon, Sep 02, 2019 at 04:01:56PM +0800, Anand Jain wrote:
> > 
> > David,
> > 
> >   I don't see this patch is integrated. Can you please integrated this 
> > patch thanks.
> 
> I don't know why but the patch got lost somewhere, adding to devel
> again.

Not lost, but dropped, misc-tests/021 fails. So dropped again, please
fix it and test before posting again. Thanks.
Anand Jain Sept. 4, 2019, 11:10 a.m. UTC | #10
On 9/3/19 8:06 PM, David Sterba wrote:
> On Mon, Sep 02, 2019 at 06:22:30PM +0200, David Sterba wrote:
>> On Mon, Sep 02, 2019 at 04:01:56PM +0800, Anand Jain wrote:
>>>
>>> David,
>>>
>>>    I don't see this patch is integrated. Can you please integrated this
>>> patch thanks.
>>
>> I don't know why but the patch got lost somewhere, adding to devel
>> again.
> 
> Not lost, but dropped, misc-tests/021 fails. So dropped again, please
> fix it and test before posting again. Thanks.
> 

I have unit tested this patch successfully, misc-tests/021 is a false 
positive failure. Because it assumes btrfs-image collects the data and 
attempts to read the file on the restored image, and rightly ends up 
with IO error.

-------
md5sum: /Volumes/ws/btrfs-progs/tests//mnt/foobar: Input/output error
failed: md5sum /Volumes/ws/btrfs-progs/tests//mnt/foobar
------

The test case is successful so far because the images is restored on the 
same loop device, and unfortunately stale data on the loop device 
matched and helped read from the restored image to succeed.

Tweak the test case a little as below and it fails even without this patch.

----------------------
diff --git a/tests/misc-tests/021-image-multi-devices/test.sh 
b/tests/misc-tests/021-image-multi-devices/test.sh
index b1013b5d2596..05f4146b6007 100755
--- a/tests/misc-tests/021-image-multi-devices/test.sh
+++ b/tests/misc-tests/021-image-multi-devices/test.sh
@@ -17,7 +17,7 @@ loop2=${loopdevs[2]}

  # Create the test file system.

-run_check $SUDO_HELPER "$TOP/mkfs.btrfs" -f "$loop1" "$loop2"
+run_check $SUDO_HELPER "$TOP/mkfs.btrfs" -f "$loop2" "$loop1"
  run_check $SUDO_HELPER mount "$loop1" "$TEST_MNT"
  run_check $SUDO_HELPER dd bs=1M count=1 if=/dev/zero of="$TEST_MNT/foobar"
  orig_md5=$(run_check_stdout md5sum "$TEST_MNT/foobar" | cut -d ' ' -f 1)
----------------------


This patch is good to integrate.

Thanks, Anand
Anand Jain Sept. 10, 2019, 7:37 a.m. UTC | #11
Ping.

Thanks, Anand
David Sterba Sept. 12, 2019, 5:54 p.m. UTC | #12
On Tue, Sep 03, 2019 at 02:06:03PM +0200, David Sterba wrote:
> On Mon, Sep 02, 2019 at 06:22:30PM +0200, David Sterba wrote:
> > On Mon, Sep 02, 2019 at 04:01:56PM +0800, Anand Jain wrote:
> > > 
> > > David,
> > > 
> > >   I don't see this patch is integrated. Can you please integrated this 
> > > patch thanks.
> > 
> > I don't know why but the patch got lost somewhere, adding to devel
> > again.
> 
> Not lost, but dropped, misc-tests/021 fails. So dropped again, please
> fix it and test before posting again. Thanks.

With the test misc/021 updated, this patch has been added to devel.
Thanks.
Filipe Manana Dec. 10, 2019, 3:42 p.m. UTC | #13
On Thu, Sep 12, 2019 at 10:39 PM David Sterba <dsterba@suse.cz> wrote:
>
> On Tue, Sep 03, 2019 at 02:06:03PM +0200, David Sterba wrote:
> > On Mon, Sep 02, 2019 at 06:22:30PM +0200, David Sterba wrote:
> > > On Mon, Sep 02, 2019 at 04:01:56PM +0800, Anand Jain wrote:
> > > >
> > > > David,
> > > >
> > > >   I don't see this patch is integrated. Can you please integrated this
> > > > patch thanks.
> > >
> > > I don't know why but the patch got lost somewhere, adding to devel
> > > again.
> >
> > Not lost, but dropped, misc-tests/021 fails. So dropped again, please
> > fix it and test before posting again. Thanks.
>
> With the test misc/021 updated, this patch has been added to devel.
> Thanks.

So having updated my local btrfs-progs from v5.2.2 to 5.4, I started
getting 4 test cases from fstests failing:

root 15:35:09 /home/fdmanana/git/hub/xfstests (master)> ./check
btrfs/142 btrfs/143 btrfs/157 btrfs/158
FSTYP         -- btrfs
PLATFORM      -- Linux/x86_64 debian6 5.4.0-rc8-btrfs-next-51 #1 SMP
PREEMPT Wed Nov 27 10:19:33 WET 2019
MKFS_OPTIONS  -- /dev/sdb
MOUNT_OPTIONS -- /dev/sdb /home/fdmanana/btrfs-tests/scratch_1

btrfs/142 1s ... - output mismatch (see
/home/fdmanana/git/hub/xfstests/results//btrfs/142.out.bad)
    --- tests/btrfs/142.out 2018-09-16 21:30:48.505104287 +0100
    +++ /home/fdmanana/git/hub/xfstests/results//btrfs/142.out.bad
2019-12-10 15:35:40.280392626 +0000
    @@ -3,37 +3,37 @@
     XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
     wrote 65536/65536 bytes
     XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
    -XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
    -XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
    -XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
    -XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
    ...
    (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/btrfs/142.out
/home/fdmanana/git/hub/xfstests/results//btrfs/142.out.bad'  to see
the entire diff)
btrfs/143 2s ... - output mismatch (see
/home/fdmanana/git/hub/xfstests/results//btrfs/143.out.bad)
    --- tests/btrfs/143.out 2018-09-16 21:30:48.505104287 +0100
    +++ /home/fdmanana/git/hub/xfstests/results//btrfs/143.out.bad
2019-12-10 15:35:41.740391311 +0000
    @@ -3,37 +3,37 @@
     XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
     wrote 65536/65536 bytes
     XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
    -XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
    -XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
    -XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
    -XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
    ...
    (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/btrfs/143.out
/home/fdmanana/git/hub/xfstests/results//btrfs/143.out.bad'  to see
the entire diff)
btrfs/157 1s ... - output mismatch (see
/home/fdmanana/git/hub/xfstests/results//btrfs/157.out.bad)
    --- tests/btrfs/157.out 2018-09-16 21:30:48.505104287 +0100
    +++ /home/fdmanana/git/hub/xfstests/results//btrfs/157.out.bad
2019-12-10 15:35:43.112390076 +0000
    @@ -1,9 +1,9 @@
     QA output created by 157
     wrote 131072/131072 bytes at offset 0
     XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
    -wrote 65536/65536 bytes at offset 9437184
    +wrote 65536/65536 bytes at offset 22020096
     XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
    -wrote 65536/65536 bytes at offset 9437184
    ...
    (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/btrfs/157.out
/home/fdmanana/git/hub/xfstests/results//btrfs/157.out.bad'  to see
the entire diff)
btrfs/158 2s ... - output mismatch (see
/home/fdmanana/git/hub/xfstests/results//btrfs/158.out.bad)
    --- tests/btrfs/158.out 2018-09-16 21:30:48.505104287 +0100
    +++ /home/fdmanana/git/hub/xfstests/results//btrfs/158.out.bad
2019-12-10 15:35:44.844388521 +0000
    @@ -1,9 +1,9 @@
     QA output created by 158
     wrote 131072/131072 bytes at offset 0
     XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
    -wrote 65536/65536 bytes at offset 9437184
    +wrote 65536/65536 bytes at offset 22020096
     XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
    -wrote 65536/65536 bytes at offset 9437184
    ...
    (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/btrfs/158.out
/home/fdmanana/git/hub/xfstests/results//btrfs/158.out.bad'  to see
the entire diff)
Ran: btrfs/142 btrfs/143 btrfs/157 btrfs/158
Failures: btrfs/142 btrfs/143 btrfs/157 btrfs/158
Failed 4 of 4 tests

A git bisect points to this patch:

fdmanana 15:38:37 ~/git/hub/btrfs-progs ((c501c9e3...)|BISECTING)> git
bisect log
git bisect start
# bad: [f82e569b33c3c1cfd4f8f405085ff8d439a0a915] Btrfs progs v5.3.1
git bisect bad f82e569b33c3c1cfd4f8f405085ff8d439a0a915
# good: [55a8c9626fb906c20c3206f8fd39b9a8fb259b79] Btrfs progs v5.2.2
git bisect good 55a8c9626fb906c20c3206f8fd39b9a8fb259b79
# bad: [a38eb3d4265873aa0dc55d1f9f2f1d7667a06b3c] btrfs-progs: add
checksum type to checksumming functions
git bisect bad a38eb3d4265873aa0dc55d1f9f2f1d7667a06b3c
# good: [b74d0dffb11c929c88e8ce7805f7aa6b370522b5] btrfs-progs:
qgroups: use parse_size instead of open coding it
git bisect good b74d0dffb11c929c88e8ce7805f7aa6b370522b5
# good: [3de68a0e8704869e72910b1a06e2e71692015e7b] btrfs-progs: tests:
fix misc/021 when restoring overlapping stale data
git bisect good 3de68a0e8704869e72910b1a06e2e71692015e7b
# bad: [779ada6edd81c520fd0acf61be14fc8669b00f79] btrfs-progs: make
checksum type explicit in mkfs context structure
git bisect bad 779ada6edd81c520fd0acf61be14fc8669b00f79
# bad: [669f56177550375713398aa22dfa817547df5566] btrfs-progs: docs:
use docbook5 backend for asciidoctor
git bisect bad 669f56177550375713398aa22dfa817547df5566
# bad: [c501c9e3b8164657e9210a701e0e0b75061e9d3b] btrfs-progs: mkfs:
match devid order to the stripe index
git bisect bad c501c9e3b8164657e9210a701e0e0b75061e9d3b
# first bad commit: [c501c9e3b8164657e9210a701e0e0b75061e9d3b]
btrfs-progs: mkfs: match devid order to the stripe index
fdmanana 15:38:40 ~/git/hub/btrfs-progs ((c501c9e3...)|BISECTING)>

Am I the only one getting this? It's been a while and I can't have
been the only one running fstests with progs 5.3+.
Is there any fix around for btrfs-progs I missed in mailing list (with
devel branch the tests fail as well), or a plan to update the tests?

thanks
Qu Wenruo Dec. 11, 2019, 1:45 a.m. UTC | #14
On 2019/12/10 下午11:42, Filipe Manana wrote:
> On Thu, Sep 12, 2019 at 10:39 PM David Sterba <dsterba@suse.cz> wrote:
>>
>> On Tue, Sep 03, 2019 at 02:06:03PM +0200, David Sterba wrote:
>>> On Mon, Sep 02, 2019 at 06:22:30PM +0200, David Sterba wrote:
>>>> On Mon, Sep 02, 2019 at 04:01:56PM +0800, Anand Jain wrote:
>>>>>
>>>>> David,
>>>>>
>>>>>   I don't see this patch is integrated. Can you please integrated this
>>>>> patch thanks.
>>>>
>>>> I don't know why but the patch got lost somewhere, adding to devel
>>>> again.
>>>
>>> Not lost, but dropped, misc-tests/021 fails. So dropped again, please
>>> fix it and test before posting again. Thanks.
>>
>> With the test misc/021 updated, this patch has been added to devel.
>> Thanks.
> 
> So having updated my local btrfs-progs from v5.2.2 to 5.4, I started
> getting 4 test cases from fstests failing:

Was running with btrfs-progs v5.3.1 before, so not hit the bug...
(And some notrun due to missing make_fail_request config)

> 
> Am I the only one getting this? It's been a while and I can't have
> been the only one running fstests with progs 5.3+.
> Is there any fix around for btrfs-progs I missed in mailing list (with
> devel branch the tests fail as well), or a plan to update the tests?

I'll look into the test case to fix them, since most of them are just
bad golden output, not a big deal to handle.

Thanks for the report,
Qu

> 
> thanks
> 
>
Filipe Manana Dec. 11, 2019, 8:58 a.m. UTC | #15
On Wed, Dec 11, 2019 at 1:45 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
>
> On 2019/12/10 下午11:42, Filipe Manana wrote:
> > On Thu, Sep 12, 2019 at 10:39 PM David Sterba <dsterba@suse.cz> wrote:
> >>
> >> On Tue, Sep 03, 2019 at 02:06:03PM +0200, David Sterba wrote:
> >>> On Mon, Sep 02, 2019 at 06:22:30PM +0200, David Sterba wrote:
> >>>> On Mon, Sep 02, 2019 at 04:01:56PM +0800, Anand Jain wrote:
> >>>>>
> >>>>> David,
> >>>>>
> >>>>>   I don't see this patch is integrated. Can you please integrated this
> >>>>> patch thanks.
> >>>>
> >>>> I don't know why but the patch got lost somewhere, adding to devel
> >>>> again.
> >>>
> >>> Not lost, but dropped, misc-tests/021 fails. So dropped again, please
> >>> fix it and test before posting again. Thanks.
> >>
> >> With the test misc/021 updated, this patch has been added to devel.
> >> Thanks.
> >
> > So having updated my local btrfs-progs from v5.2.2 to 5.4, I started
> > getting 4 test cases from fstests failing:
>
> Was running with btrfs-progs v5.3.1 before, so not hit the bug...
> (And some notrun due to missing make_fail_request config)
>
> >
> > Am I the only one getting this? It's been a while and I can't have
> > been the only one running fstests with progs 5.3+.
> > Is there any fix around for btrfs-progs I missed in mailing list (with
> > devel branch the tests fail as well), or a plan to update the tests?
>
> I'll look into the test case to fix them, since most of them are just
> bad golden output, not a big deal to handle.
>
> Thanks for the report,

Thanks for looking into it.

> Qu
>
> >
> > thanks
> >
> >
>

Patch
diff mbox series

diff --git a/volumes.c b/volumes.c
index 79d1d6a07fb7..8c8b17e814b8 100644
--- a/volumes.c
+++ b/volumes.c
@@ -1109,7 +1109,7 @@  again:
 			return ret;
 		cur = cur->next;
 		if (avail >= min_free) {
-			list_move_tail(&device->dev_list, &private_devs);
+			list_move(&device->dev_list, &private_devs);
 			index++;
 			if (type & BTRFS_BLOCK_GROUP_DUP)
 				index++;
@@ -1166,7 +1166,7 @@  again:
 		/* loop over this device again if we're doing a dup group */
 		if (!(type & BTRFS_BLOCK_GROUP_DUP) ||
 		    (index == num_stripes - 1))
-			list_move_tail(&device->dev_list, dev_list);
+			list_move(&device->dev_list, dev_list);
 
 		ret = btrfs_alloc_dev_extent(trans, device, key.offset,
 			     calc_size, &dev_offset);