diff mbox series

[1/5] Btrfs: fix hang when loading existing inode cache off disk

Message ID 20190704152409.20665-1-fdmanana@kernel.org (mailing list archive)
State New, archived
Headers show
Series Btrfs: fixes for inode cache (-o inode_cache) | expand

Commit Message

Filipe Manana July 4, 2019, 3:24 p.m. UTC
From: Filipe Manana <fdmanana@suse.com>

If we are able to load an existing inode cache off disk, we set the state
of the cache to BTRFS_CACHE_FINISHED, but we don't wake up any one waiting
for the cache to be available. This means that anyone waiting for the
cache to be available, waiting on the condition that either its state is
BTRFS_CACHE_FINISHED or its available free space is greather than zero,
can hang forever.

This could be observed running fstests with MOUNT_OPTIONS="-o inode_cache",
in particular test case generic/161 triggered it very frequently for me,
producing a trace like the following:

  [63795.739712] BTRFS info (device sdc): enabling inode map caching
  [63795.739714] BTRFS info (device sdc): disk space caching is enabled
  [63795.739716] BTRFS info (device sdc): has skinny extents
  [64036.653886] INFO: task btrfs-transacti:3917 blocked for more than 120 seconds.
  [64036.654079]       Not tainted 5.2.0-rc4-btrfs-next-50 #1
  [64036.654143] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [64036.654232] btrfs-transacti D    0  3917      2 0x80004000
  [64036.654239] Call Trace:
  [64036.654258]  ? __schedule+0x3ae/0x7b0
  [64036.654271]  schedule+0x3a/0xb0
  [64036.654325]  btrfs_commit_transaction+0x978/0xae0 [btrfs]
  [64036.654339]  ? remove_wait_queue+0x60/0x60
  [64036.654395]  transaction_kthread+0x146/0x180 [btrfs]
  [64036.654450]  ? btrfs_cleanup_transaction+0x620/0x620 [btrfs]
  [64036.654456]  kthread+0x103/0x140
  [64036.654464]  ? kthread_create_worker_on_cpu+0x70/0x70
  [64036.654476]  ret_from_fork+0x3a/0x50
  [64036.654504] INFO: task xfs_io:3919 blocked for more than 120 seconds.
  [64036.654568]       Not tainted 5.2.0-rc4-btrfs-next-50 #1
  [64036.654617] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [64036.654685] xfs_io          D    0  3919   3633 0x00000000
  [64036.654691] Call Trace:
  [64036.654703]  ? __schedule+0x3ae/0x7b0
  [64036.654716]  schedule+0x3a/0xb0
  [64036.654756]  btrfs_find_free_ino+0xa9/0x120 [btrfs]
  [64036.654764]  ? remove_wait_queue+0x60/0x60
  [64036.654809]  btrfs_create+0x72/0x1f0 [btrfs]
  [64036.654822]  lookup_open+0x6bc/0x790
  [64036.654849]  path_openat+0x3bc/0xc00
  [64036.654854]  ? __lock_acquire+0x331/0x1cb0
  [64036.654869]  do_filp_open+0x99/0x110
  [64036.654884]  ? __alloc_fd+0xee/0x200
  [64036.654895]  ? do_raw_spin_unlock+0x49/0xc0
  [64036.654909]  ? do_sys_open+0x132/0x220
  [64036.654913]  do_sys_open+0x132/0x220
  [64036.654926]  do_syscall_64+0x60/0x1d0
  [64036.654933]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fix this by adding a wake_up() call right after setting the cache state to
BTRFS_CACHE_FINISHED, at start_caching(), when we are able to load the
cache from disk.

Fixes: 82d5902d9c681b ("Btrfs: Support reading/writing on disk free ino cache")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/inode-map.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Nikolay Borisov July 5, 2019, 9:09 a.m. UTC | #1
On 4.07.19 г. 18:24 ч., fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> If we are able to load an existing inode cache off disk, we set the state
> of the cache to BTRFS_CACHE_FINISHED, but we don't wake up any one waiting
> for the cache to be available. This means that anyone waiting for the
> cache to be available, waiting on the condition that either its state is
> BTRFS_CACHE_FINISHED or its available free space is greather than zero,
> can hang forever.
> 
> This could be observed running fstests with MOUNT_OPTIONS="-o inode_cache",
> in particular test case generic/161 triggered it very frequently for me,
> producing a trace like the following:
> 
>   [63795.739712] BTRFS info (device sdc): enabling inode map caching
>   [63795.739714] BTRFS info (device sdc): disk space caching is enabled
>   [63795.739716] BTRFS info (device sdc): has skinny extents
>   [64036.653886] INFO: task btrfs-transacti:3917 blocked for more than 120 seconds.
>   [64036.654079]       Not tainted 5.2.0-rc4-btrfs-next-50 #1
>   [64036.654143] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>   [64036.654232] btrfs-transacti D    0  3917      2 0x80004000
>   [64036.654239] Call Trace:
>   [64036.654258]  ? __schedule+0x3ae/0x7b0
>   [64036.654271]  schedule+0x3a/0xb0
>   [64036.654325]  btrfs_commit_transaction+0x978/0xae0 [btrfs]
>   [64036.654339]  ? remove_wait_queue+0x60/0x60
>   [64036.654395]  transaction_kthread+0x146/0x180 [btrfs]
>   [64036.654450]  ? btrfs_cleanup_transaction+0x620/0x620 [btrfs]
>   [64036.654456]  kthread+0x103/0x140
>   [64036.654464]  ? kthread_create_worker_on_cpu+0x70/0x70
>   [64036.654476]  ret_from_fork+0x3a/0x50
>   [64036.654504] INFO: task xfs_io:3919 blocked for more than 120 seconds.
>   [64036.654568]       Not tainted 5.2.0-rc4-btrfs-next-50 #1
>   [64036.654617] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>   [64036.654685] xfs_io          D    0  3919   3633 0x00000000
>   [64036.654691] Call Trace:
>   [64036.654703]  ? __schedule+0x3ae/0x7b0
>   [64036.654716]  schedule+0x3a/0xb0
>   [64036.654756]  btrfs_find_free_ino+0xa9/0x120 [btrfs]
>   [64036.654764]  ? remove_wait_queue+0x60/0x60
>   [64036.654809]  btrfs_create+0x72/0x1f0 [btrfs]
>   [64036.654822]  lookup_open+0x6bc/0x790
>   [64036.654849]  path_openat+0x3bc/0xc00
>   [64036.654854]  ? __lock_acquire+0x331/0x1cb0
>   [64036.654869]  do_filp_open+0x99/0x110
>   [64036.654884]  ? __alloc_fd+0xee/0x200
>   [64036.654895]  ? do_raw_spin_unlock+0x49/0xc0
>   [64036.654909]  ? do_sys_open+0x132/0x220
>   [64036.654913]  do_sys_open+0x132/0x220
>   [64036.654926]  do_syscall_64+0x60/0x1d0
>   [64036.654933]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> 
> Fix this by adding a wake_up() call right after setting the cache state to
> BTRFS_CACHE_FINISHED, at start_caching(), when we are able to load the
> cache from disk.
> 
> Fixes: 82d5902d9c681b ("Btrfs: Support reading/writing on disk free ino cache")
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> ---
>  fs/btrfs/inode-map.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c
> index ffca2abf13d0..4a5882665f8a 100644
> --- a/fs/btrfs/inode-map.c
> +++ b/fs/btrfs/inode-map.c
> @@ -145,6 +145,7 @@ static void start_caching(struct btrfs_root *root)
>  		spin_lock(&root->ino_cache_lock);
>  		root->ino_cache_state = BTRFS_CACHE_FINISHED;
>  		spin_unlock(&root->ino_cache_lock);
> +		wake_up(&root->ino_cache_wait);

One of the two callers of start_caching doesn't actually wait for the
cache to load - btrfs_return_ino. Is this expected or is it also a bug?

The presence of such a glaring omission of the wake up means this code
hasn't been tested much.

>  		return;
>  	}
>  
>
Filipe Manana July 5, 2019, 10:44 a.m. UTC | #2
On Fri, Jul 5, 2019 at 10:09 AM Nikolay Borisov <nborisov@suse.com> wrote:
>
>
>
> On 4.07.19 г. 18:24 ч., fdmanana@kernel.org wrote:
> > From: Filipe Manana <fdmanana@suse.com>
> >
> > If we are able to load an existing inode cache off disk, we set the state
> > of the cache to BTRFS_CACHE_FINISHED, but we don't wake up any one waiting
> > for the cache to be available. This means that anyone waiting for the
> > cache to be available, waiting on the condition that either its state is
> > BTRFS_CACHE_FINISHED or its available free space is greather than zero,
> > can hang forever.
> >
> > This could be observed running fstests with MOUNT_OPTIONS="-o inode_cache",
> > in particular test case generic/161 triggered it very frequently for me,
> > producing a trace like the following:
> >
> >   [63795.739712] BTRFS info (device sdc): enabling inode map caching
> >   [63795.739714] BTRFS info (device sdc): disk space caching is enabled
> >   [63795.739716] BTRFS info (device sdc): has skinny extents
> >   [64036.653886] INFO: task btrfs-transacti:3917 blocked for more than 120 seconds.
> >   [64036.654079]       Not tainted 5.2.0-rc4-btrfs-next-50 #1
> >   [64036.654143] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >   [64036.654232] btrfs-transacti D    0  3917      2 0x80004000
> >   [64036.654239] Call Trace:
> >   [64036.654258]  ? __schedule+0x3ae/0x7b0
> >   [64036.654271]  schedule+0x3a/0xb0
> >   [64036.654325]  btrfs_commit_transaction+0x978/0xae0 [btrfs]
> >   [64036.654339]  ? remove_wait_queue+0x60/0x60
> >   [64036.654395]  transaction_kthread+0x146/0x180 [btrfs]
> >   [64036.654450]  ? btrfs_cleanup_transaction+0x620/0x620 [btrfs]
> >   [64036.654456]  kthread+0x103/0x140
> >   [64036.654464]  ? kthread_create_worker_on_cpu+0x70/0x70
> >   [64036.654476]  ret_from_fork+0x3a/0x50
> >   [64036.654504] INFO: task xfs_io:3919 blocked for more than 120 seconds.
> >   [64036.654568]       Not tainted 5.2.0-rc4-btrfs-next-50 #1
> >   [64036.654617] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >   [64036.654685] xfs_io          D    0  3919   3633 0x00000000
> >   [64036.654691] Call Trace:
> >   [64036.654703]  ? __schedule+0x3ae/0x7b0
> >   [64036.654716]  schedule+0x3a/0xb0
> >   [64036.654756]  btrfs_find_free_ino+0xa9/0x120 [btrfs]
> >   [64036.654764]  ? remove_wait_queue+0x60/0x60
> >   [64036.654809]  btrfs_create+0x72/0x1f0 [btrfs]
> >   [64036.654822]  lookup_open+0x6bc/0x790
> >   [64036.654849]  path_openat+0x3bc/0xc00
> >   [64036.654854]  ? __lock_acquire+0x331/0x1cb0
> >   [64036.654869]  do_filp_open+0x99/0x110
> >   [64036.654884]  ? __alloc_fd+0xee/0x200
> >   [64036.654895]  ? do_raw_spin_unlock+0x49/0xc0
> >   [64036.654909]  ? do_sys_open+0x132/0x220
> >   [64036.654913]  do_sys_open+0x132/0x220
> >   [64036.654926]  do_syscall_64+0x60/0x1d0
> >   [64036.654933]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >
> > Fix this by adding a wake_up() call right after setting the cache state to
> > BTRFS_CACHE_FINISHED, at start_caching(), when we are able to load the
> > cache from disk.
> >
> > Fixes: 82d5902d9c681b ("Btrfs: Support reading/writing on disk free ino cache")
> > Signed-off-by: Filipe Manana <fdmanana@suse.com>
>
> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> > ---
> >  fs/btrfs/inode-map.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c
> > index ffca2abf13d0..4a5882665f8a 100644
> > --- a/fs/btrfs/inode-map.c
> > +++ b/fs/btrfs/inode-map.c
> > @@ -145,6 +145,7 @@ static void start_caching(struct btrfs_root *root)
> >               spin_lock(&root->ino_cache_lock);
> >               root->ino_cache_state = BTRFS_CACHE_FINISHED;
> >               spin_unlock(&root->ino_cache_lock);
> > +             wake_up(&root->ino_cache_wait);
>
> One of the two callers of start_caching doesn't actually wait for the
> cache to load - btrfs_return_ino. Is this expected or is it also a bug?

It's expected. It doesn't need to allocate an inode, so it doesn't
need to wait for the caching to complete - it just wants to mark an
inode as free in the cache.

>
> The presence of such a glaring omission of the wake up means this code
> hasn't been tested much.

It hasn't. Last time I tried it was more than one year ago, maybe two.

Thanks.

>
> >               return;
> >       }
> >
> >
diff mbox series

Patch

diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c
index ffca2abf13d0..4a5882665f8a 100644
--- a/fs/btrfs/inode-map.c
+++ b/fs/btrfs/inode-map.c
@@ -145,6 +145,7 @@  static void start_caching(struct btrfs_root *root)
 		spin_lock(&root->ino_cache_lock);
 		root->ino_cache_state = BTRFS_CACHE_FINISHED;
 		spin_unlock(&root->ino_cache_lock);
+		wake_up(&root->ino_cache_wait);
 		return;
 	}