Message ID | 20240122-reclaim-fix-v1-1-761234a6d005@wdc.com (mailing list archive) |
---|---|
State | New, archived |
Series | btrfs: zoned: kick reclaim earlier on fast zoned devices |

On Mon, Jan 22, 2024 at 02:51:03AM -0800, Johannes Thumshirn wrote:
> As btrfs_zoned_should_reclaim only has to iterate the device list in order
> to collect stats on the device's total and used bytes, we don't need to
> take the full blown mutex, but can iterate the device list in a rcu_read
> context.
>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

Looks good.

Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com>

On Mon, Jan 22, 2024 at 02:51:03AM -0800, Johannes Thumshirn wrote:
> As btrfs_zoned_should_reclaim only has to iterate the device list in order
> to collect stats on the device's total and used bytes, we don't need to
> take the full blown mutex, but can iterate the device list in a rcu_read
> context.
>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> ---
>  fs/btrfs/zoned.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
> index 168af9d000d1..b7e7b5a5a6fa 100644
> --- a/fs/btrfs/zoned.c
> +++ b/fs/btrfs/zoned.c
> @@ -2423,15 +2423,15 @@ bool btrfs_zoned_should_reclaim(struct btrfs_fs_info *fs_info)
>  	if (fs_info->bg_reclaim_threshold == 0)
>  		return false;
>
> -	mutex_lock(&fs_devices->device_list_mutex);
> -	list_for_each_entry(device, &fs_devices->devices, dev_list) {
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) {
>  		if (!device->bdev)
>  			continue;
>
>  		total += device->disk_total_bytes;
>  		used += device->bytes_used;
>  	}
> -	mutex_unlock(&fs_devices->device_list_mutex);
> +	rcu_read_unlock();

This is basically only a hint and inaccuracies in the total or used
values would be transient, right? The sum is calculated each time the
function is called, not stored anywhere, so in the unlikely case of
device removal it may skip reclaim once, but then pick it up later.
Any actual removal of the block groups is verified again and properly
locked in btrfs_reclaim_bgs_work().

On 22.01.24 22:35, David Sterba wrote:
> On Mon, Jan 22, 2024 at 02:51:03AM -0800, Johannes Thumshirn wrote:
>> As btrfs_zoned_should_reclaim only has to iterate the device list in order
>> to collect stats on the device's total and used bytes, we don't need to
>> take the full blown mutex, but can iterate the device list in a rcu_read
>> context.
>>
>> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>> ---
>>  fs/btrfs/zoned.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
>> index 168af9d000d1..b7e7b5a5a6fa 100644
>> --- a/fs/btrfs/zoned.c
>> +++ b/fs/btrfs/zoned.c
>> @@ -2423,15 +2423,15 @@ bool btrfs_zoned_should_reclaim(struct btrfs_fs_info *fs_info)
>>  	if (fs_info->bg_reclaim_threshold == 0)
>>  		return false;
>>
>> -	mutex_lock(&fs_devices->device_list_mutex);
>> -	list_for_each_entry(device, &fs_devices->devices, dev_list) {
>> +	rcu_read_lock();
>> +	list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) {
>>  		if (!device->bdev)
>>  			continue;
>>
>>  		total += device->disk_total_bytes;
>>  		used += device->bytes_used;
>>  	}
>> -	mutex_unlock(&fs_devices->device_list_mutex);
>> +	rcu_read_unlock();
>
> This is basically only a hint and inaccuracies in the total or used
> values would be transient, right? The sum is calculated each time the
> function is called, not stored anywhere, so in the unlikely case of
> device removal it may skip reclaim once, but then pick it up later.
> Any actual removal of the block groups is verified again and properly
> locked in btrfs_reclaim_bgs_work().
>

Yes.

On Tue, Jan 23, 2024 at 07:49:22AM +0000, Johannes Thumshirn wrote:
> On 22.01.24 22:35, David Sterba wrote:
> > On Mon, Jan 22, 2024 at 02:51:03AM -0800, Johannes Thumshirn wrote:
> >> As btrfs_zoned_should_reclaim only has to iterate the device list in order
> >> to collect stats on the device's total and used bytes, we don't need to
> >> take the full blown mutex, but can iterate the device list in a rcu_read
> >> context.
> >>
> >> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> >> ---
> >>  fs/btrfs/zoned.c | 6 +++---
> >>  1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
> >> index 168af9d000d1..b7e7b5a5a6fa 100644
> >> --- a/fs/btrfs/zoned.c
> >> +++ b/fs/btrfs/zoned.c
> >> @@ -2423,15 +2423,15 @@ bool btrfs_zoned_should_reclaim(struct btrfs_fs_info *fs_info)
> >>  	if (fs_info->bg_reclaim_threshold == 0)
> >>  		return false;
> >>
> >> -	mutex_lock(&fs_devices->device_list_mutex);
> >> -	list_for_each_entry(device, &fs_devices->devices, dev_list) {
> >> +	rcu_read_lock();
> >> +	list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) {
> >>  		if (!device->bdev)
> >>  			continue;
> >>
> >>  		total += device->disk_total_bytes;
> >>  		used += device->bytes_used;
> >>  	}
> >> -	mutex_unlock(&fs_devices->device_list_mutex);
> >> +	rcu_read_unlock();
> >
> > This is basically only a hint and inaccuracies in the total or used
> > values would be transient, right? The sum is calculated each time the
> > function is called, not stored anywhere, so in the unlikely case of
> > device removal it may skip reclaim once, but then pick it up later.
> > Any actual removal of the block groups is verified again and properly
> > locked in btrfs_reclaim_bgs_work().
> >
>
> Yes.

So please add it to the changelog as an explanation of why the mutex -> rcu
switch is safe, thanks.

diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 168af9d000d1..b7e7b5a5a6fa 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -2423,15 +2423,15 @@ bool btrfs_zoned_should_reclaim(struct btrfs_fs_info *fs_info)
 	if (fs_info->bg_reclaim_threshold == 0)
 		return false;
 
-	mutex_lock(&fs_devices->device_list_mutex);
-	list_for_each_entry(device, &fs_devices->devices, dev_list) {
+	rcu_read_lock();
+	list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) {
 		if (!device->bdev)
 			continue;
 
 		total += device->disk_total_bytes;
 		used += device->bytes_used;
 	}
-	mutex_unlock(&fs_devices->device_list_mutex);
+	rcu_read_unlock();
 
 	factor = div64_u64(used * 100, total);
 	return factor >= fs_info->bg_reclaim_threshold;

As btrfs_zoned_should_reclaim only has to iterate the device list in order
to collect stats on the device's total and used bytes, we don't need to
take the full blown mutex, but can iterate the device list in a rcu_read
context.

Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/zoned.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
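
For readers following the review discussion, here is a minimal userspace sketch of the decision the patched function makes. It is not the kernel code: struct demo_device, demo_should_reclaim() and the has_bdev flag are hypothetical stand-ins, the device list is modelled as a plain array, and no locking or RCU is involved. It only illustrates that the result is recomputed from scratch on every call and never stored, which is why a momentarily stale device list can at worst shift a single decision.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct btrfs_device fields the check reads. */
struct demo_device {
	bool has_bdev;			/* models device->bdev != NULL */
	uint64_t disk_total_bytes;
	uint64_t bytes_used;
};

/*
 * Sum total and used bytes over all devices that have a block device,
 * then compare the used percentage against the threshold. Nothing is
 * cached: every call starts again from zero.
 */
static bool demo_should_reclaim(const struct demo_device *devs, size_t ndevs,
				unsigned int threshold_pct)
{
	uint64_t total = 0, used = 0;
	size_t i;

	/* A threshold of 0 disables the check, as in the patch context. */
	if (threshold_pct == 0)
		return false;

	for (i = 0; i < ndevs; i++) {
		if (!devs[i].has_bdev)
			continue;

		total += devs[i].disk_total_bytes;
		used += devs[i].bytes_used;
	}

	/* Guard added for this sketch only; it is not part of the patch. */
	if (total == 0)
		return false;

	return (used * 100) / total >= threshold_pct;
}
```

With two present devices of 1000 bytes each holding 800 and 700 used bytes, plus one device without a block device, the sketch computes 1500 * 100 / 2000 = 75, so a threshold of 75 triggers reclaim and a threshold of 76 does not.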