diff mbox series

[1/3] btrfs: call btrfs_close_devices from ->kill_sb

Message ID 20231213040018.73803-2-ebiggers@kernel.org (mailing list archive)
State New
Headers show
Series Move fscrypt keyring destruction to after ->put_super | expand

Commit Message

Eric Biggers Dec. 13, 2023, 4 a.m. UTC
From: Christoph Hellwig <hch@lst.de>

blkdev_put must not be called under sb->s_umount to avoid a lock order
reversal with disk->open_mutex once call backs from block devices to
the file system using the holder ops are supported.  Move the call
to btrfs_close_devices into btrfs_free_fs_info so that it is closed
from ->kill_sb (which is also called from the mount failure handling
path unlike ->put_super) as well as when an fs_info is freed because
an existing superblock already exists.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/btrfs/disk-io.c | 4 ++--
 fs/btrfs/super.c   | 7 ++-----
 2 files changed, 4 insertions(+), 7 deletions(-)

Comments

Christoph Hellwig Dec. 13, 2023, 8:41 a.m. UTC | #1
On Tue, Dec 12, 2023 at 08:00:16PM -0800, Eric Biggers wrote:
> From: Christoph Hellwig <hch@lst.de>
> 
> blkdev_put must not be called under sb->s_umount to avoid a lock order
> reversal with disk->open_mutex once call backs from block devices to
> the file system using the holder ops are supported.  Move the call
> to btrfs_close_devices into btrfs_free_fs_info so that it is closed
> from ->kill_sb (which is also called from the mount failure handling
> path unlike ->put_super) as well as when an fs_info is freed because
> an existing superblock already exists.

Thanks, this looks roughly the same to what I have locally.

I did in fact forward port everything missing from the get_super
series yesterday, but on my test setup btrfs/142 hangs even in the
baseline setup.  I went back to Linux before giving up for now.

Josef, any chane you could throw this branch:

    git://git.infradead.org/users/hch/misc.git btrfs-holder

into your CI setup and see if it sticks?  Except for the trivial last
three patches this is basically what you reviewed already, although
there was some heavy rebasing due to the mount API converison.
Josef Bacik Dec. 15, 2023, 2:26 p.m. UTC | #2
On Wed, Dec 13, 2023 at 09:41:23AM +0100, Christoph Hellwig wrote:
> On Tue, Dec 12, 2023 at 08:00:16PM -0800, Eric Biggers wrote:
> > From: Christoph Hellwig <hch@lst.de>
> > 
> > blkdev_put must not be called under sb->s_umount to avoid a lock order
> > reversal with disk->open_mutex once call backs from block devices to
> > the file system using the holder ops are supported.  Move the call
> > to btrfs_close_devices into btrfs_free_fs_info so that it is closed
> > from ->kill_sb (which is also called from the mount failure handling
> > path unlike ->put_super) as well as when an fs_info is freed because
> > an existing superblock already exists.
> 
> Thanks, this looks roughly the same to what I have locally.
> 
> I did in fact forward port everything missing from the get_super
> series yesterday, but on my test setup btrfs/142 hangs even in the
> baseline setup.  I went back to Linux before giving up for now.
> 
> Josef, any chane you could throw this branch:
> 
>     git://git.infradead.org/users/hch/misc.git btrfs-holder
> 
> into your CI setup and see if it sticks?  Except for the trivial last
> three patches this is basically what you reviewed already, although
> there was some heavy rebasing due to the mount API converison.
> 

Yup, sorry Christoph I missed this email when you sent it, I'll throw it in
there now.  Thanks,

Josef
Josef Bacik Dec. 15, 2023, 9:45 p.m. UTC | #3
On Wed, Dec 13, 2023 at 09:41:23AM +0100, Christoph Hellwig wrote:
> On Tue, Dec 12, 2023 at 08:00:16PM -0800, Eric Biggers wrote:
> > From: Christoph Hellwig <hch@lst.de>
> > 
> > blkdev_put must not be called under sb->s_umount to avoid a lock order
> > reversal with disk->open_mutex once call backs from block devices to
> > the file system using the holder ops are supported.  Move the call
> > to btrfs_close_devices into btrfs_free_fs_info so that it is closed
> > from ->kill_sb (which is also called from the mount failure handling
> > path unlike ->put_super) as well as when an fs_info is freed because
> > an existing superblock already exists.
> 
> Thanks, this looks roughly the same to what I have locally.
> 
> I did in fact forward port everything missing from the get_super
> series yesterday, but on my test setup btrfs/142 hangs even in the
> baseline setup.  I went back to Linux before giving up for now.
> 
> Josef, any chane you could throw this branch:
> 
>     git://git.infradead.org/users/hch/misc.git btrfs-holder
> 
> into your CI setup and see if it sticks?  Except for the trivial last
> three patches this is basically what you reviewed already, although
> there was some heavy rebasing due to the mount API converison.

I ran it through, you broke a test that isn't upstream yet to test the old mount
api double mount thing that I have a test for

https://github.com/btrfs/fstests/commit/2796723e77adb0f9da1059acf13fc402467f7ac4

In this case we end up leaking a reference on the fs_devices.  If you add this
fixup to "btrfs: call btrfs_close_devices from ->kill_sb" it fixes that failure.
I'm re-running with that fixup applied, but I assume the rest is fine.  Thanks,

Josef
Christoph Hellwig Dec. 16, 2023, 4:12 a.m. UTC | #4
On Fri, Dec 15, 2023 at 04:45:50PM -0500, Josef Bacik wrote:
> I ran it through, you broke a test that isn't upstream yet to test the old mount
> api double mount thing that I have a test for
> 
> https://github.com/btrfs/fstests/commit/2796723e77adb0f9da1059acf13fc402467f7ac4
> 
> In this case we end up leaking a reference on the fs_devices.  If you add this
> fixup to "btrfs: call btrfs_close_devices from ->kill_sb" it fixes that failure.
> I'm re-running with that fixup applied, but I assume the rest is fine.  Thanks,

Is "this fixup" referring to a patch that was supposed to be attached
but is't? :)
Josef Bacik Dec. 16, 2023, 2:52 p.m. UTC | #5
On Sat, Dec 16, 2023 at 05:12:21AM +0100, Christoph Hellwig wrote:
> On Fri, Dec 15, 2023 at 04:45:50PM -0500, Josef Bacik wrote:
> > I ran it through, you broke a test that isn't upstream yet to test the old mount
> > api double mount thing that I have a test for
> > 
> > https://github.com/btrfs/fstests/commit/2796723e77adb0f9da1059acf13fc402467f7ac4
> > 
> > In this case we end up leaking a reference on the fs_devices.  If you add this
> > fixup to "btrfs: call btrfs_close_devices from ->kill_sb" it fixes that failure.
> > I'm re-running with that fixup applied, but I assume the rest is fine.  Thanks,
> 
> Is "this fixup" referring to a patch that was supposed to be attached
> but is't? :)

Sorry, vacation brain, here you go.

Josef

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index f93fe2e5e378..2dfa2274b193 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1950,10 +1950,20 @@ static int btrfs_get_tree_super(struct fs_context *fc)
  */
 static struct vfsmount *btrfs_reconfigure_for_mount(struct fs_context *fc)
 {
+	struct btrfs_fs_info *fs_info = fc->s_fs_info;
 	struct vfsmount *mnt;
 	int ret;
 	const bool ro2rw = !(fc->sb_flags & SB_RDONLY);
 
+	/*
+	 * We got a reference to our fs_devices, so we need to close it here to
+	 * make sure we don't leak our reference on the fs_devices.
+	 */
+	if (fs_info->fs_devices) {
+		btrfs_close_devices(fs_info->fs_devices);
+		fs_info->fs_devices = NULL;
+	}
+
 	/*
 	 * We got an EBUSY because our SB_RDONLY flag didn't match the existing
 	 * super block, so invert our setting here and retry the mount so we
diff mbox series

Patch

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index bbcc3df776461..fe98e6b1d9c61 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1237,20 +1237,22 @@  static void free_global_roots(struct btrfs_fs_info *fs_info)
 
 	while ((node = rb_first_postorder(&fs_info->global_root_tree)) != NULL) {
 		root = rb_entry(node, struct btrfs_root, rb_node);
 		rb_erase(&root->rb_node, &fs_info->global_root_tree);
 		btrfs_put_root(root);
 	}
 }
 
 void btrfs_free_fs_info(struct btrfs_fs_info *fs_info)
 {
+	if (fs_info->fs_devices)
+		btrfs_close_devices(fs_info->fs_devices);
 	percpu_counter_destroy(&fs_info->dirty_metadata_bytes);
 	percpu_counter_destroy(&fs_info->delalloc_bytes);
 	percpu_counter_destroy(&fs_info->ordered_bytes);
 	percpu_counter_destroy(&fs_info->dev_replace.bio_counter);
 	btrfs_free_csum_hash(fs_info);
 	btrfs_free_stripe_hash_table(fs_info);
 	btrfs_free_ref_cache(fs_info);
 	kfree(fs_info->balance_ctl);
 	kfree(fs_info->delayed_root);
 	free_global_roots(fs_info);
@@ -3605,21 +3607,20 @@  int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 	invalidate_inode_pages2(fs_info->btree_inode->i_mapping);
 
 fail_sb_buffer:
 	btrfs_stop_all_workers(fs_info);
 	btrfs_free_block_groups(fs_info);
 fail_alloc:
 	btrfs_mapping_tree_free(&fs_info->mapping_tree);
 
 	iput(fs_info->btree_inode);
 fail:
-	btrfs_close_devices(fs_info->fs_devices);
 	ASSERT(ret < 0);
 	return ret;
 }
 ALLOW_ERROR_INJECTION(open_ctree, ERRNO);
 
 static void btrfs_end_super_write(struct bio *bio)
 {
 	struct btrfs_device *device = bio->bi_private;
 	struct bio_vec *bvec;
 	struct bvec_iter_all iter_all;
@@ -4385,21 +4386,20 @@  void __cold close_ctree(struct btrfs_fs_info *fs_info)
 	 * have had an IO error and have left over tree log blocks that aren't
 	 * cleaned up until the fs roots are freed.  This makes the block group
 	 * accounting appear to be wrong because there's pending reserved bytes,
 	 * so make sure we do the block group cleanup afterwards.
 	 */
 	btrfs_free_block_groups(fs_info);
 
 	iput(fs_info->btree_inode);
 
 	btrfs_mapping_tree_free(&fs_info->mapping_tree);
-	btrfs_close_devices(fs_info->fs_devices);
 }
 
 void btrfs_mark_buffer_dirty(struct btrfs_trans_handle *trans,
 			     struct extent_buffer *buf)
 {
 	struct btrfs_fs_info *fs_info = buf->fs_info;
 	u64 transid = btrfs_header_generation(buf);
 
 #ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
 	/*
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index ef256b944c72a..9616ce63c5630 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1450,55 +1450,52 @@  static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
 	fs_devices = device->fs_devices;
 	fs_info->fs_devices = fs_devices;
 
 	error = btrfs_open_devices(fs_devices, mode, fs_type);
 	mutex_unlock(&uuid_mutex);
 	if (error)
 		goto error_fs_info;
 
 	if (!(flags & SB_RDONLY) && fs_devices->rw_devices == 0) {
 		error = -EACCES;
-		goto error_close_devices;
+		goto error_fs_info;
 	}
 
 	bdev = fs_devices->latest_dev->bdev;
 	s = sget(fs_type, btrfs_test_super, btrfs_set_super, flags | SB_NOSEC,
 		 fs_info);
 	if (IS_ERR(s)) {
 		error = PTR_ERR(s);
-		goto error_close_devices;
+		goto error_fs_info;
 	}
 
 	if (s->s_root) {
-		btrfs_close_devices(fs_devices);
 		btrfs_free_fs_info(fs_info);
 		if ((flags ^ s->s_flags) & SB_RDONLY)
 			error = -EBUSY;
 	} else {
 		snprintf(s->s_id, sizeof(s->s_id), "%pg", bdev);
 		shrinker_debugfs_rename(s->s_shrink, "sb-%s:%s", fs_type->name,
 					s->s_id);
 		btrfs_sb(s)->bdev_holder = fs_type;
 		error = btrfs_fill_super(s, fs_devices, data);
 	}
 	if (!error)
 		error = security_sb_set_mnt_opts(s, new_sec_opts, 0, NULL);
 	security_free_mnt_opts(&new_sec_opts);
 	if (error) {
 		deactivate_locked_super(s);
 		return ERR_PTR(error);
 	}
 
 	return dget(s->s_root);
 
-error_close_devices:
-	btrfs_close_devices(fs_devices);
 error_fs_info:
 	btrfs_free_fs_info(fs_info);
 error_sec_opts:
 	security_free_mnt_opts(&new_sec_opts);
 	return ERR_PTR(error);
 }
 
 /*
  * Mount function which is called by VFS layer.
  *