[v3,0/2] btrfs: zoned: mark relocation as writing

Message ID cover.1645157220.git.naohiro.aota@wdc.com (mailing list archive)
Series btrfs: zoned: mark relocation as writing

Message

Naohiro Aota Feb. 18, 2022, 4:14 a.m. UTC
Running generic/068 on an SMR device triggers a hung_task report. The
hang occurs while a process is trying to thaw the filesystem: it needs
sb->s_umount, but that lock is held by fsstress, which has called
btrfs_sync_fs() and is waiting for an ordered extent to finish. Because
the FS is frozen, the ordered extent never finishes.

Having an ordered extent while the FS is frozen is the root cause of
the hang. The ordered extent is initiated from btrfs_relocate_chunk(),
which is called from btrfs_reclaim_bgs_work().

The first patch is a preparatory patch that adds asserting functions to
check whether sb_start_{write,pagefault,intwrite} has been called.

The second patch adds sb_{start,end}_write calls and the assert function
at the appropriate places.
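
For readers unfamiliar with freeze protection, the idea can be sketched
as a small user-space model. This is only an illustration and not the
code in the patches: every model_* name is hypothetical, the placement
of the calls (worker takes the write section, relocation checks it) is
an assumption, and the model returns false where the kernel would block
in sb_start_write() or wait for writers to drain before freezing.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_sb {
	bool frozen;   /* true once the filesystem is frozen */
	int writers;   /* number of active write sections */
};

/* Models sb_start_write(). The kernel would block while the FS is
 * frozen; this model simply refuses and returns false. */
static bool model_start_write(struct model_sb *sb)
{
	if (sb->frozen)
		return false;
	sb->writers++;
	return true;
}

/* Models sb_end_write(). */
static void model_end_write(struct model_sb *sb)
{
	assert(sb->writers > 0);
	sb->writers--;
}

/* Models the v3-style check: report the state as a bool and let the
 * caller decide what to do, instead of asserting here. */
static bool model_write_started(const struct model_sb *sb)
{
	return sb->writers > 0;
}

/* Simplified freeze: fails while a writer is active. */
static bool model_freeze(struct model_sb *sb)
{
	if (sb->writers > 0)
		return false;
	sb->frozen = true;
	return true;
}

static void model_thaw(struct model_sb *sb)
{
	sb->frozen = false;
}

/* Models relocation: it may create ordered extents, so it must only
 * run inside a write section. Returns 0 on success, -1 otherwise. */
static int model_relocate_chunk(struct model_sb *sb)
{
	if (!model_write_started(sb))
		return -1;
	return 0;
}

/* Models the reclaim worker: bracket relocation with start/end write
 * so that it cannot run while the filesystem is frozen. */
static int model_reclaim_work(struct model_sb *sb)
{
	int ret;

	if (!model_start_write(sb))
		return -1;
	ret = model_relocate_chunk(sb);
	model_end_write(sb);
	return ret;
}
```

In this model, calling model_relocate_chunk() outside a write section
fails, model_freeze() cannot succeed while model_reclaim_work() is in
its write section, and model_reclaim_work() does nothing on a frozen
filesystem, so no ordered extent is left pending across a freeze.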

Changelog:
v3:
  - Return bool instead of asserting and let caller decide what to do
    (suggested by Dave Chinner)
v2:
  - Implement the asserting functions so that callers do not directly
    touch the internal implementation

Naohiro Aota (2):
  fs: add asserting functions for sb_start_{write,pagefault,intwrite}
  btrfs: zoned: mark relocation as writing

 fs/btrfs/block-group.c |  8 +++++++-
 fs/btrfs/volumes.c     |  6 ++++++
 include/linux/fs.h     | 20 ++++++++++++++++++++
 3 files changed, 33 insertions(+), 1 deletion(-)

Comments

David Sterba Feb. 18, 2022, 4:54 p.m. UTC | #1
On Fri, Feb 18, 2022 at 01:14:17PM +0900, Naohiro Aota wrote:
> There is a hung_task issue with running generic/068 on an SMR
> device. The hang occurs while a process is trying to thaw the
> filesystem. The process is trying to take sb->s_umount to thaw the
> FS. The lock is held by fsstress, which calls btrfs_sync_fs() and is
> waiting for an ordered extent to finish. However, as the FS is frozen,
> the ordered extent never finishes.
> 
> Having an ordered extent while the FS is frozen is the root cause of
> the hang. The ordered extent is initiated from btrfs_relocate_chunk()
> which is called from btrfs_reclaim_bgs_work().
> 
> The first patch is a preparation patch to add asserting functions to
> check if sb_start_{write,pagefault,intwrite} is called.
> 
> The second patch adds sb_{start,end}_write and the assert function at
> proper places.
> 
> Changelog:
> v3:
>   - Return bool instead of asserting and let caller decide what to do
>     (suggested by Dave Chinner)
> v2:
>   - Implement asserting functions not to directly touch the internal
>     implementation
> 
> Naohiro Aota (2):
>   fs: add asserting functions for sb_start_{write,pagefault,intwrite}
>   btrfs: zoned: mark relocation as writing

Topic branch updated, thanks.