Message ID | ec6b55668686f77593f12c579832886294fc7310.1741596325.git.naohiro.aota@wdc.com (mailing list archive) |
---|---|
State | New |
Series | btrfs: zoned: skip reporting zone for new block group |
On 3/12/25 10:31, Naohiro Aota wrote:
> There is a potential deadlock if we do report zones in an IO context. When one
> process do a report zones and another process freezes the block device, the
> report zones side cannot allocate a tag because the freeze is already started.
> This can thus result in new block group creation to hang forever, blocking the
> write path.

+Shin'ichiro

blktest has a failing test case due to a lockdep splat triggered by this. Would
be good to add that information (with the splat) here.

>
> Thankfully, a new block group should be created on empty zones. So, reporting
> the zones is not necessary and we can set the write pointer = 0 and load the
> zone capacity from the block layer using bdev_zone_capacity() helper.
>
> CC: stable@vger.kernel.org # 6.13+
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>

With that fixed, looks good to me.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

On Mar 12, 2025 / 10:42, Damien Le Moal wrote:
> On 3/12/25 10:31, Naohiro Aota wrote:
> > There is a potential deadlock if we do report zones in an IO context. When one
> > process do a report zones and another process freezes the block device, the
> > report zones side cannot allocate a tag because the freeze is already started.
> > This can thus result in new block group creation to hang forever, blocking the
> > write path.
>
> +Shin'ichiro
>
> blktest has a failing test case due to a lockdep splat triggered by this. Would
> be good to add that information (with the splat) here.

I confirmed that this fix avoids the blktests zbd/009 failure I reported [1].
Thanks for the fix!

Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

[1] https://lore.kernel.org/linux-block/uyijd3ufbrfbiyyaajvhyhdyytssubekvymzgyiqjqmkh33cmi@ksqjpewsqlvw/

diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 4956baf8220a..6c730f6bce10 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1277,7 +1277,7 @@ struct zone_info {
 
 static int btrfs_load_zone_info(struct btrfs_fs_info *fs_info, int zone_idx,
 				struct zone_info *info, unsigned long *active,
-				struct btrfs_chunk_map *map)
+				struct btrfs_chunk_map *map, bool new)
 {
 	struct btrfs_dev_replace *dev_replace = &fs_info->dev_replace;
 	struct btrfs_device *device;
@@ -1307,6 +1307,8 @@ static int btrfs_load_zone_info(struct btrfs_fs_info *fs_info, int zone_idx,
 		return 0;
 	}
 
+	ASSERT(!new || btrfs_dev_is_empty_zone(device, info->physical));
+
 	/* This zone will be used for allocation, so mark this zone non-empty. */
 	btrfs_dev_clear_zone_empty(device, info->physical);
 
@@ -1319,6 +1321,18 @@ static int btrfs_load_zone_info(struct btrfs_fs_info *fs_info, int zone_idx,
 	 * to determine the allocation offset within the zone.
 	 */
 	WARN_ON(!IS_ALIGNED(info->physical, fs_info->zone_size));
+
+	if (new) {
+		sector_t capacity;
+
+		capacity = bdev_zone_capacity(device->bdev, info->physical >> SECTOR_SHIFT);
+		up_read(&dev_replace->rwsem);
+		info->alloc_offset = 0;
+		info->capacity = capacity << SECTOR_SHIFT;
+
+		return 0;
+	}
+
 	nofs_flag = memalloc_nofs_save();
 	ret = btrfs_get_dev_zone(device, info->physical, &zone);
 	memalloc_nofs_restore(nofs_flag);
@@ -1588,7 +1602,7 @@ int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new)
 	}
 
 	for (i = 0; i < map->num_stripes; i++) {
-		ret = btrfs_load_zone_info(fs_info, i, &zone_info[i], active, map);
+		ret = btrfs_load_zone_info(fs_info, i, &zone_info[i], active, map, new);
 		if (ret)
 			goto out;
 

There is a potential deadlock if we report zones in an IO context. When one
process does a report zones and another process freezes the block device, the
report zones side cannot allocate a tag because the freeze has already started.
This can cause new block group creation to hang forever, blocking the
write path.

Thankfully, a new block group should be created on empty zones. So, reporting
the zones is not necessary and we can set the write pointer = 0 and load the
zone capacity from the block layer using the bdev_zone_capacity() helper.

CC: stable@vger.kernel.org # 6.13+
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/zoned.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)
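
For readers who want to see the idea in isolation, the following is a minimal userspace sketch of the fast path described in the commit message. It is not kernel code: every `toy_*` name, the fixed 8-zone array, and the `reports_issued` counter are hypothetical stand-ins invented for illustration, and the locking (`dev_replace->rwsem`) is left out entirely. It only shows that a new block group can take its allocation offset (0) and capacity from cached device data without issuing a zone report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SHIFT 9 /* 512-byte sectors, matching the kernel constant */

/* Hypothetical, simplified model of a zoned device (not a kernel struct). */
struct toy_device {
	uint64_t zone_size;          /* zone size in bytes */
	uint64_t zone_capacity_sect; /* usable sectors per zone */
	bool zone_empty[8];          /* cached "zone is empty" bits */
	uint64_t write_pointer[8];   /* bytes written, as a report would return */
	unsigned int reports_issued; /* number of simulated zone reports */
};

/* Hypothetical counterpart of the fields the patch fills in zone_info. */
struct toy_zone_info {
	uint64_t physical;     /* byte address of the zone start */
	uint64_t alloc_offset; /* where the next allocation goes */
	uint64_t capacity;     /* usable bytes in the zone */
};

/* Stand-in for a zone report: the step that needs a tag in the real kernel. */
uint64_t toy_report_zone(struct toy_device *dev, uint64_t physical)
{
	dev->reports_issued++;
	return dev->write_pointer[physical / dev->zone_size];
}

int toy_load_zone_info(struct toy_device *dev, struct toy_zone_info *info,
		       bool new)
{
	uint64_t idx = info->physical / dev->zone_size;

	/* A new block group must only ever land on an empty zone. */
	assert(!new || dev->zone_empty[idx]);

	/* The zone is about to be used for allocation: mark it non-empty. */
	dev->zone_empty[idx] = false;

	if (new) {
		/* Fast path: no report; the write pointer is known to be 0. */
		info->alloc_offset = 0;
		info->capacity = dev->zone_capacity_sect << SECTOR_SHIFT;
		return 0;
	}

	/* Slow path: existing block group, ask the device. In this toy model
	 * the capacity still comes from the cached per-device value. */
	info->alloc_offset = toy_report_zone(dev, info->physical);
	info->capacity = dev->zone_capacity_sect << SECTOR_SHIFT;
	return 0;
}
```

Calling `toy_load_zone_info()` with `new == true` on an empty zone leaves `reports_issued` untouched, which is the property the patch relies on to stay out of the frozen queue; calling it with `new == false` still goes through the simulated report.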