
btrfs: zoned: use alloc_list instead of rcu locked device_list

Message ID f7108349f3a7d690166c88691c5dc1932cab3610.1647267391.git.johannes.thumshirn@wdc.com (mailing list archive)
State New, archived
Series btrfs: zoned: use alloc_list instead of rcu locked device_list

Commit Message

Johannes Thumshirn March 14, 2022, 2:16 p.m. UTC
Anand pointed out that instead of using the RCU-locked version of
fs_devices->device_list, we can use fs_devices->alloc_list, protected
by the chunk_mutex, to traverse the list of active devices in
btrfs_can_activate_zone().

Suggested-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/zoned.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

Comments

David Sterba March 14, 2022, 6:55 p.m. UTC | #1
On Mon, Mar 14, 2022 at 07:16:47AM -0700, Johannes Thumshirn wrote:
> Anand pointed out that instead of using the RCU-locked version of
> fs_devices->device_list, we can use fs_devices->alloc_list, protected
> by the chunk_mutex, to traverse the list of active devices in
> btrfs_can_activate_zone().

Why?
Anand Jain March 15, 2022, 12:03 p.m. UTC | #2
On 15/03/2022 02:55, David Sterba wrote:
> On Mon, Mar 14, 2022 at 07:16:47AM -0700, Johannes Thumshirn wrote:
>> Anand pointed out that instead of using the RCU-locked version of
>> fs_devices->device_list, we can use fs_devices->alloc_list, protected
>> by the chunk_mutex, to traverse the list of active devices in
>> btrfs_can_activate_zone().
> 
> Why?

We are in the chunk allocation path. New chunk allocation picks
devices from fs_devices->alloc_list, which is protected by the
chunk_mutex.

  btrfs_create_chunk()
    lockdep_assert_held(&info->chunk_mutex);
    gather_device_info
     list_for_each_entry(device, &fs_devices->alloc_list, dev_alloc_list) {

Also, a device that reappears after the mount won't have joined the
alloc_list yet; it will only be on the dev_list, which we don't want to
consider in the context of chunk allocation.
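
For reference, a btrfs_device sits on these two lists through two
separate list_head members. A simplified sketch of the relevant fields
(not the full definitions from fs/btrfs/volumes.h):

--------------
struct btrfs_fs_devices {
        struct list_head devices;      /* every known device, incl. missing */
        struct list_head alloc_list;   /* only devices eligible for allocation */
        /* ... */
};

struct btrfs_device {
        struct list_head dev_list;        /* node on fs_devices->devices */
        struct list_head dev_alloc_list;  /* node on fs_devices->alloc_list */
        struct btrfs_zoned_device_info *zone_info;
        struct block_device *bdev;        /* NULL for a missing device */
        /* ... */
};
--------------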

Thanks, Anand
David Sterba March 15, 2022, 8:03 p.m. UTC | #3
On Mon, Mar 14, 2022 at 07:16:47AM -0700, Johannes Thumshirn wrote:
> Anand pointed out that instead of using the RCU-locked version of
> fs_devices->device_list, we can use fs_devices->alloc_list, protected
> by the chunk_mutex, to traverse the list of active devices in
> btrfs_can_activate_zone().
> 
> Suggested-by: Anand Jain <anand.jain@oracle.com>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

With updated changelog added to misc-next, thanks.
Anand Jain March 16, 2022, 8:36 a.m. UTC | #4
On 16/03/2022 04:03, David Sterba wrote:
> On Mon, Mar 14, 2022 at 07:16:47AM -0700, Johannes Thumshirn wrote:
>> Anand pointed out that instead of using the RCU-locked version of
>> fs_devices->device_list, we can use fs_devices->alloc_list, protected
>> by the chunk_mutex, to traverse the list of active devices in
>> btrfs_can_activate_zone().
>>
>> Suggested-by: Anand Jain <anand.jain@oracle.com>
>> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> 
> With updated changelog added to misc-next, thanks.


Misc-next hit a NULL pointer dereference (NPD) during new chunk allocation.

---------------------
[  552.785854] BUG: kernel NULL pointer dereference, address: 0000000000000013
::
[  552.785872] RIP: 0010:btrfs_can_activate_zone.cold+0x5b/0x71 [btrfs]
::
[  552.785932] Call Trace:
[  552.785933]  <TASK>
[  552.785934]  find_free_extent+0x1233/0x1430 [btrfs]
[  552.785960]  btrfs_reserve_extent+0x147/0x290 [btrfs]
[  552.785986]  cow_file_range+0x173/0x4b0 [btrfs]
[  552.786014]  run_delalloc_zoned+0x25/0x90 [btrfs]
[  552.786041]  btrfs_run_delalloc_range+0x174/0x5c0 [btrfs]
[  552.786066]  ? find_lock_delalloc_range+0x25c/0x280 [btrfs]
[  552.786096]  writepage_delalloc+0xc1/0x180 [btrfs]
[  552.786126]  __extent_writepage+0x156/0x3b0 [btrfs]
[  552.786155]  extent_write_cache_pages+0x260/0x450 [btrfs]
[  552.786185]  extent_writepages+0x76/0x130 [btrfs]
[  552.786214]  do_writepages+0xbe/0x1a0
[  552.786218]  ? lock_is_held_type+0xd7/0x130
[  552.786221]  __writeback_single_inode+0x58/0x5e0
[  552.786223]  writeback_sb_inodes+0x218/0x5b0
[  552.786226]  __writeback_inodes_wb+0x4c/0xe0
[  552.786228]  wb_writeback+0x298/0x450
[  552.786231]  wb_workfn+0x38d/0x660
[  552.786232]  ? lock_acquire+0xca/0x2f0
[  552.786235]  ? lock_acquire+0xda/0x2f0
[  552.786238]  process_one_work+0x271/0x5a0
[  552.786241]  worker_thread+0x4f/0x3d0
[  552.786243]  ? process_one_work+0x5a0/0x5a0
[  552.786244]  kthread+0xf0/0x120
[  552.786246]  ? kthread_complete_and_exit+0x20/0x20
[  552.786248]  ret_from_fork+0x1f/0x30
[  552.786251]  </TASK>
-----------------------------


We should iterate fs_devices->alloc_list using the dev_alloc_list member,
as below. Also, missing devices are never on the alloc_list, so we don't
need the if (!device->bdev) check.

--------------
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 7da630ff5708..c29e67c621be 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1984,12 +1984,9 @@ bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags)

         /* Check if there is a device with active zones left */
         mutex_lock(&fs_info->chunk_mutex);
-       list_for_each_entry(device, &fs_devices->alloc_list, dev_list) {
+       list_for_each_entry(device, &fs_devices->alloc_list, dev_alloc_list) {
                 struct btrfs_zoned_device_info *zinfo = device->zone_info;

-               if (!device->bdev)
-                       continue;
-
                 if (!zinfo->max_active_zones ||
                     atomic_read(&zinfo->active_zones_left)) {
                         ret = true;
-------------
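
To make the failure mode concrete: list_for_each_entry() recovers the
containing struct from a list node by subtracting the offset of the
named member. Naming dev_list while walking a list that is linked
through dev_alloc_list therefore yields a device pointer that is off by
the distance between the two members, so the following
device->zone_info load reads garbage, which fits the fault at the
small, non-NULL address above. A standalone userspace illustration with
a toy struct (not the real btrfs_device):

--------------
#include <stddef.h>
#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_device {
        struct list_head dev_list;       /* would link into ->devices */
        struct list_head dev_alloc_list; /* would link into ->alloc_list */
        void *zone_info;
};

int main(void)
{
        struct toy_device dev = { .zone_info = (void *)0xabcd };
        struct list_head *node = &dev.dev_alloc_list;

        /* Correct: the node really is a dev_alloc_list member. */
        struct toy_device *ok = container_of(node, struct toy_device,
                                             dev_alloc_list);
        /* Wrong: subtracting offsetof(dev_list) from a dev_alloc_list node. */
        struct toy_device *bad = container_of(node, struct toy_device,
                                              dev_list);

        printf("correct entry %p, zone_info %p\n", (void *)ok, ok->zone_info);
        printf("bogus entry   %p, off by %zu bytes, ->zone_info is garbage\n",
               (void *)bad, offsetof(struct toy_device, dev_alloc_list));
        return 0;
}
--------------

In the kernel the same mismatch means the bogus device->zone_info is
dereferenced right away to check max_active_zones, hence the crash.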

With this fixed you may add

Reviewed-by: Anand Jain <anand.jain@oracle.com>


Thanks, Anand
Johannes Thumshirn March 16, 2022, 8:39 a.m. UTC | #5
On 16/03/2022 09:37, Anand Jain wrote:
> On 16/03/2022 04:03, David Sterba wrote:
>> On Mon, Mar 14, 2022 at 07:16:47AM -0700, Johannes Thumshirn wrote:
>>> Anand pointed out that instead of using the RCU-locked version of
>>> fs_devices->device_list, we can use fs_devices->alloc_list, protected
>>> by the chunk_mutex, to traverse the list of active devices in
>>> btrfs_can_activate_zone().
>>>
>>> Suggested-by: Anand Jain <anand.jain@oracle.com>
>>> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>>
>> With updated changelog added to misc-next, thanks.
> 
> 
> Misc-next hit an NPD for new chunk alloc.


Thanks for the change, but how did you trigger it? I.e., why didn't it
trigger for me?

[...]

> We should iterate fs_devices->alloc_list using the dev_alloc_list member,
> as below. Also, missing devices are never on the alloc_list, so we don't
> need the if (!device->bdev) check.
> 
> --------------
> diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
> index 7da630ff5708..c29e67c621be 100644
> --- a/fs/btrfs/zoned.c
> +++ b/fs/btrfs/zoned.c
> @@ -1984,12 +1984,9 @@ bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags)
> 
>          /* Check if there is a device with active zones left */
>          mutex_lock(&fs_info->chunk_mutex);
> -       list_for_each_entry(device, &fs_devices->alloc_list, dev_list) {
> +       list_for_each_entry(device, &fs_devices->alloc_list, dev_alloc_list) {
>                  struct btrfs_zoned_device_info *zinfo = device->zone_info;
> 
> -               if (!device->bdev)
> -                       continue;
> -
>                  if (!zinfo->max_active_zones ||
>                      atomic_read(&zinfo->active_zones_left)) {
>                          ret = true;
> -------------
> 
> With this fixed you may add
> 
> Reviewed-by: Anand Jain <anand.jain@oracle.com>

Thanks I'll update the patch
Anand Jain March 16, 2022, 8:44 a.m. UTC | #6
On 16/03/2022 16:39, Johannes Thumshirn wrote:
> On 16/03/2022 09:37, Anand Jain wrote:
>> On 16/03/2022 04:03, David Sterba wrote:
>>> On Mon, Mar 14, 2022 at 07:16:47AM -0700, Johannes Thumshirn wrote:
>>>> Anand pointed out that instead of using the RCU-locked version of
>>>> fs_devices->device_list, we can use fs_devices->alloc_list, protected
>>>> by the chunk_mutex, to traverse the list of active devices in
>>>> btrfs_can_activate_zone().
>>>>
>>>> Suggested-by: Anand Jain <anand.jain@oracle.com>
>>>> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>>>
>>> With updated changelog added to misc-next, thanks.
>>
>>
>> Misc-next hit an NPD for new chunk alloc.
> 
> 
> Thanks for the change, but how did you trigger it? I.e., why didn't it
> trigger for me?
> 

Well, nothing special. Just write enough data to force allocation of new chunks.

-------
nullb 4096 64 4 8
Created /dev/nullb0
mkfs.btrfs -f -q -dsingle -msingle /dev/nullb0
mount /dev/nullb0 /btrfs
dd if=/dev/zero of=/btrfs/tf1
dd: writing to '/btrfs/tf1': No space left on device
-------

Thanks, Anand

> [...]
> 
>> We should iterate fs_devices->alloc_list using the dev_alloc_list member,
>> as below. Also, missing devices are never on the alloc_list, so we don't
>> need the if (!device->bdev) check.
>>
>> --------------
>> diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
>> index 7da630ff5708..c29e67c621be 100644
>> --- a/fs/btrfs/zoned.c
>> +++ b/fs/btrfs/zoned.c
>> @@ -1984,12 +1984,9 @@ bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags)
>>
>>           /* Check if there is a device with active zones left */
>>           mutex_lock(&fs_info->chunk_mutex);
>> -       list_for_each_entry(device, &fs_devices->alloc_list, dev_list) {
>> +       list_for_each_entry(device, &fs_devices->alloc_list, dev_alloc_list) {
>>                   struct btrfs_zoned_device_info *zinfo = device->zone_info;
>>
>> -               if (!device->bdev)
>> -                       continue;
>> -
>>                   if (!zinfo->max_active_zones ||
>>                       atomic_read(&zinfo->active_zones_left)) {
>>                           ret = true;
>> -------------
>>
>> With this fixed you may add
>>
>> Reviewed-by: Anand Jain <anand.jain@oracle.com>
> 
> Thanks I'll update the patch

Patch

diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 49446bb5a5d1..7da630ff5708 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1975,15 +1975,16 @@  int btrfs_zone_finish(struct btrfs_block_group *block_group)
 
 bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags)
 {
+	struct btrfs_fs_info *fs_info = fs_devices->fs_info;
 	struct btrfs_device *device;
 	bool ret = false;
 
-	if (!btrfs_is_zoned(fs_devices->fs_info))
+	if (!btrfs_is_zoned(fs_info))
 		return true;
 
 	/* Check if there is a device with active zones left */
-	rcu_read_lock();
-	list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) {
+	mutex_lock(&fs_info->chunk_mutex);
+	list_for_each_entry(device, &fs_devices->alloc_list, dev_list) {
 		struct btrfs_zoned_device_info *zinfo = device->zone_info;
 
 		if (!device->bdev)
@@ -1995,7 +1996,7 @@  bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags)
 			break;
 		}
 	}
-	rcu_read_unlock();
+	mutex_unlock(&fs_info->chunk_mutex);
 
 	return ret;
 }
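
For reference, with Anand's follow-up correction folded in (iterate via
dev_alloc_list, drop the !device->bdev check), the function would read
roughly as below; a sketch assembled from the two diffs, not
necessarily the committed version:

--------------
bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags)
{
        struct btrfs_fs_info *fs_info = fs_devices->fs_info;
        struct btrfs_device *device;
        bool ret = false;

        if (!btrfs_is_zoned(fs_info))
                return true;

        /* Check if there is a device with active zones left */
        mutex_lock(&fs_info->chunk_mutex);
        list_for_each_entry(device, &fs_devices->alloc_list, dev_alloc_list) {
                struct btrfs_zoned_device_info *zinfo = device->zone_info;

                if (!zinfo->max_active_zones ||
                    atomic_read(&zinfo->active_zones_left)) {
                        ret = true;
                        break;
                }
        }
        mutex_unlock(&fs_info->chunk_mutex);

        return ret;
}
--------------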