
Btrfs: stop all workers before cleaning up roots

Message ID 1369947496-27707-1-git-send-email-jbacik@fusionio.com (mailing list archive)
State New, archived

Commit Message

Josef Bacik May 30, 2013, 8:58 p.m. UTC
Dave reported a panic because extent_root->commit_root was NULL in the
caching kthread.  That happens because we just unset it in free_root_pointers,
which is not the correct thing to do: we have to either wait for the caching
kthread to complete or hold the extent_commit_sem lock so we know the thread
has exited.  This patch makes all the kthreads stop first and then does the
cleanup.  This should fix the race.  Thanks,

Reported-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
---
 fs/btrfs/disk-io.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)
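
Editorial illustration (not part of the patch): a minimal userspace sketch of
the ordering described above, assuming POSIX threads stand in for the btrfs
workers. Every demo_* name is hypothetical and is not a kernel API. The point
is only the order in demo_teardown(): the worker is stopped and joined before
the pointer it dereferences is cleared.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's real types. */
struct demo_root {
	int *commit_root;		/* shared data the worker dereferences */
};

struct demo_worker {
	pthread_t thread;
	atomic_bool should_stop;	/* set by teardown to request exit */
	struct demo_root *root;
	long reads;			/* sum of values read by the worker */
};

static void *demo_worker_fn(void *arg)
{
	struct demo_worker *w = arg;

	while (!atomic_load(&w->should_stop)) {
		/* Safe only because teardown clears the pointer after join. */
		assert(w->root->commit_root != NULL);
		w->reads += *w->root->commit_root;
	}
	return NULL;
}

static int demo_start_worker(struct demo_worker *w, struct demo_root *root)
{
	w->root = root;
	w->reads = 0;
	atomic_init(&w->should_stop, false);
	return pthread_create(&w->thread, NULL, demo_worker_fn, w);
}

/* Ask the worker to exit and wait until it really has. */
static void demo_stop_worker(struct demo_worker *w)
{
	atomic_store(&w->should_stop, true);
	pthread_join(w->thread, NULL);
}

/* Models the effect of unsetting the root pointer during cleanup. */
static void demo_free_root_pointers(struct demo_root *root)
{
	root->commit_root = NULL;
}

/* Returns 0 if teardown completed in the safe order, -1 on failure. */
int demo_teardown(void)
{
	int value = 1;
	struct demo_root root = { .commit_root = &value };
	struct demo_worker w;

	if (demo_start_worker(&w, &root) != 0)
		return -1;

	demo_stop_worker(&w);		/* 1. stop every worker first */
	demo_free_root_pointers(&root);	/* 2. only then drop the pointers */

	return root.commit_root == NULL ? 0 : -1;
}
```

Calling demo_teardown() from a small main() exercises the sketch. Swapping the
two numbered steps reintroduces the kind of NULL dereference window the commit
message describes.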

Comments

Alex Lyakas Aug. 1, 2013, 2:05 p.m. UTC | #1
Hi Josef,

On Thu, May 30, 2013 at 11:58 PM, Josef Bacik <jbacik@fusionio.com> wrote:
> Dave reported a panic because the extent_root->commit_root was NULL in the
> caching kthread.  That is because we just unset it in free_root_pointers, which
> is not the correct thing to do, we have to either wait for the caching kthread
> to complete or hold the extent_commit_sem lock so we know the thread has exited.
> This patch makes the kthreads all stop first and then we do our cleanup.  This
> should fix the race.  Thanks,
>
> Reported-by: David Sterba <dsterba@suse.cz>
> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
> ---
>  fs/btrfs/disk-io.c |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 2b53afd..77cb566 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -3547,13 +3547,13 @@ int close_ctree(struct btrfs_root *root)
>
>         btrfs_free_block_groups(fs_info);

Do you think it would be safer to stop all workers first, make sure
they are stopped, and only then call btrfs_free_block_groups()? I see, for
example, that btrfs_free_block_groups() checks:
if (block_group->cached == BTRFS_CACHE_STARTED)
which could perhaps race with other tasks spawning caching_threads.

So maybe it is better to stop all threads (including the cleaner and committer)
and then free everything?

>
> -       free_root_pointers(fs_info, 1);
> +       btrfs_stop_all_workers(fs_info);
>
>         del_fs_roots(fs_info);
>
> -       iput(fs_info->btree_inode);
> +       free_root_pointers(fs_info, 1);
>
> -       btrfs_stop_all_workers(fs_info);
> +       iput(fs_info->btree_inode);
>
>  #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY
>         if (btrfs_test_opt(root, CHECK_INTEGRITY))
> --
> 1.7.7.6
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Alex.

Josef Bacik Aug. 5, 2013, 3:09 p.m. UTC | #2
On Thu, Aug 01, 2013 at 05:05:35PM +0300, Alex Lyakas wrote:
> Hi Josef,
> 
> On Thu, May 30, 2013 at 11:58 PM, Josef Bacik <jbacik@fusionio.com> wrote:
> > Dave reported a panic because the extent_root->commit_root was NULL in the
> > caching kthread.  That is because we just unset it in free_root_pointers, which
> > is not the correct thing to do, we have to either wait for the caching kthread
> > to complete or hold the extent_commit_sem lock so we know the thread has exited.
> > This patch makes the kthreads all stop first and then we do our cleanup.  This
> > should fix the race.  Thanks,
> >
> > Reported-by: David Sterba <dsterba@suse.cz>
> > Signed-off-by: Josef Bacik <jbacik@fusionio.com>
> > ---
> >  fs/btrfs/disk-io.c |    6 +++---
> >  1 files changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> > index 2b53afd..77cb566 100644
> > --- a/fs/btrfs/disk-io.c
> > +++ b/fs/btrfs/disk-io.c
> > @@ -3547,13 +3547,13 @@ int close_ctree(struct btrfs_root *root)
> >
> >         btrfs_free_block_groups(fs_info);
> 
> do you think it would be safer to stop all workers first and make sure
> they are stopped, then do btrfs_free_block_groups()? I see, for
> example, that btrfs_free_block_groups() checks:
> if (block_group->cached == BTRFS_CACHE_STARTED)
> which could be perhaps racy with other people spawning caching_threads.
> 
> So maybe better to stop all threads (including cleaner and committer)
> and then free everything?
> 

Well, nobody should be writing anymore, so we shouldn't be starting any new
caching_kthreads; we should just be cleaning up threads that are already
running.  btrfs_free_block_groups() will wait on any kthreads it spawned, so we
are good there.  Hth,

Josef
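
Editorial illustration (not from the thread): a rough userspace sketch of the
"wait on any kthreads it spawned" idea from the reply above, assuming a POSIX
mutex and condition variable stand in for whatever the kernel actually uses.
The demo_* names and the DEMO_CACHE_* states are hypothetical and only loosely
mirror the BTRFS_CACHE_STARTED check quoted earlier in the thread.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace model; not the kernel's structures or functions. */
enum demo_cache_state {
	DEMO_CACHE_NO,
	DEMO_CACHE_STARTED,
	DEMO_CACHE_FINISHED,
};

struct demo_block_group {
	pthread_mutex_t lock;
	pthread_cond_t cache_done;	/* signalled when caching finishes */
	enum demo_cache_state cached;
	pthread_t caching_thread;
	int freed;			/* set once teardown has released it */
};

/* Caching thread: its real work is elided; it then publishes FINISHED. */
static void *demo_caching_thread(void *arg)
{
	struct demo_block_group *bg = arg;

	pthread_mutex_lock(&bg->lock);
	assert(!bg->freed);		/* must never run after the free */
	bg->cached = DEMO_CACHE_FINISHED;
	pthread_cond_broadcast(&bg->cache_done);
	pthread_mutex_unlock(&bg->lock);
	return NULL;
}

static int demo_start_caching(struct demo_block_group *bg)
{
	int ret;

	pthread_mutex_init(&bg->lock, NULL);
	pthread_cond_init(&bg->cache_done, NULL);
	bg->freed = 0;
	bg->cached = DEMO_CACHE_STARTED;	/* set before the thread exists */

	ret = pthread_create(&bg->caching_thread, NULL,
			     demo_caching_thread, bg);
	if (ret != 0) {
		pthread_cond_destroy(&bg->cache_done);
		pthread_mutex_destroy(&bg->lock);
	}
	return ret;
}

/* Free path: block while caching is in flight, then release. */
static void demo_free_block_group(struct demo_block_group *bg)
{
	pthread_mutex_lock(&bg->lock);
	while (bg->cached == DEMO_CACHE_STARTED)
		pthread_cond_wait(&bg->cache_done, &bg->lock);
	bg->freed = 1;
	pthread_mutex_unlock(&bg->lock);

	pthread_join(bg->caching_thread, NULL);
	pthread_cond_destroy(&bg->cache_done);
	pthread_mutex_destroy(&bg->lock);
}

/* Returns 0 if the free path observed a finished cache, -1 otherwise. */
int demo_free_waits_for_caching(void)
{
	struct demo_block_group bg;

	if (demo_start_caching(&bg) != 0)
		return -1;
	demo_free_block_group(&bg);
	return (bg.cached == DEMO_CACHE_FINISHED && bg.freed) ? 0 : -1;
}
```

The sketch covers only threads that are already running when the free path
starts. It relies on the same assumption stated in the reply, namely that
nothing spawns a new caching thread once teardown has begun.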

Patch

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 2b53afd..77cb566 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3547,13 +3547,13 @@  int close_ctree(struct btrfs_root *root)
 
 	btrfs_free_block_groups(fs_info);
 
-	free_root_pointers(fs_info, 1);
+	btrfs_stop_all_workers(fs_info);
 
 	del_fs_roots(fs_info);
 
-	iput(fs_info->btree_inode);
+	free_root_pointers(fs_info, 1);
 
-	btrfs_stop_all_workers(fs_info);
+	iput(fs_info->btree_inode);
 
 #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY
 	if (btrfs_test_opt(root, CHECK_INTEGRITY))