Message ID | 20230802154131.2221419-12-hch@lst.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [01/12] fs: export setup_bdev_super | expand |
On Wed, Aug 02, 2023 at 05:41:30PM +0200, Christoph Hellwig wrote: > Just like get_tree_bdev needs to drop s_umount when opening the main > device, we need to do the same for the xfs log and RT devices to avoid a > potential lock order reversal with s_unmount for the mark_dead path. > > It might be preferable to just drop s_umount over ->fill_super entirely, > but that will require a fairly massive audit first, so we'll do the easy > version here first. > > Signed-off-by: Christoph Hellwig <hch@lst.de> > --- > fs/xfs/xfs_super.c | 15 +++++++++++---- > 1 file changed, 11 insertions(+), 4 deletions(-) > > diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c > index 8185102431301d..d5042419ed9997 100644 > --- a/fs/xfs/xfs_super.c > +++ b/fs/xfs/xfs_super.c > @@ -448,17 +448,21 @@ STATIC int > xfs_open_devices( > struct xfs_mount *mp) > { > - struct block_device *ddev = mp->m_super->s_bdev; > + struct super_block *sb = mp->m_super; > + struct block_device *ddev = sb->s_bdev; > struct block_device *logdev = NULL, *rtdev = NULL; > int error; > > + /* see get_tree_bdev why this is needed and safe */ Which part of get_tree_bdev? Is it this? /* * s_umount nests inside open_mutex during * __invalidate_device(). blkdev_put() acquires * open_mutex and can't be called under s_umount. Drop * s_umount temporarily. This is safe as we're * holding an active reference. */ up_write(&s->s_umount); blkdev_put(bdev, fc->fs_type); down_write(&s->s_umount); <confused> > + up_write(&sb->s_umount); > + > /* > * Open real time and log devices - order is important. > */ > if (mp->m_logname) { > error = xfs_blkdev_get(mp, mp->m_logname, &logdev); > if (error) > - return error; > + goto out_unlock; > } > > if (mp->m_rtname) { > @@ -496,7 +500,10 @@ xfs_open_devices( > mp->m_logdev_targp = mp->m_ddev_targp; > } > > - return 0; > + error = 0; > +out_unlock: > + down_write(&sb->s_umount); Isn't down_write taking s_umount? I think the label should be out_relock or something less misleading. --D > + return error; > > out_free_rtdev_targ: > if (mp->m_rtdev_targp) > @@ -508,7 +515,7 @@ xfs_open_devices( > out_close_logdev: > if (logdev && logdev != ddev) > xfs_blkdev_put(mp, logdev); > - return error; > + goto out_unlock; > } > > /* > -- > 2.39.2 >
On Wed, Aug 02, 2023 at 09:32:19AM -0700, Darrick J. Wong wrote: > > + /* see get_tree_bdev why this is needed and safe */ > > Which part of get_tree_bdev? Is it this? > > /* > * s_umount nests inside open_mutex during > * __invalidate_device(). blkdev_put() acquires > * open_mutex and can't be called under s_umount. Drop > * s_umount temporarily. This is safe as we're > * holding an active reference. > */ > up_write(&s->s_umount); > blkdev_put(bdev, fc->fs_type); > down_write(&s->s_umount); Yes. With the refactoring earlier in the series get_tree_bdev should be trivial enough to not need a more specific reference. If you think there's a better way to refer to it I can update the comment, though. > > mp->m_logdev_targp = mp->m_ddev_targp; > > } > > > > - return 0; > > + error = 0; > > +out_unlock: > > + down_write(&sb->s_umount); > > Isn't down_write taking s_umount? I think the label should be > out_relock or something less misleading. Agreed. Christian, can you just change this in your branch, or should I send an incremental patch?
On Sat, Aug 05, 2023 at 10:32:39AM +0200, Christoph Hellwig wrote: > On Wed, Aug 02, 2023 at 09:32:19AM -0700, Darrick J. Wong wrote: > > > + /* see get_tree_bdev why this is needed and safe */ > > > > Which part of get_tree_bdev? Is it this? > > > > /* > > * s_umount nests inside open_mutex during > > * __invalidate_device(). blkdev_put() acquires > > * open_mutex and can't be called under s_umount. Drop > > * s_umount temporarily. This is safe as we're > > * holding an active reference. > > */ > > up_write(&s->s_umount); > > blkdev_put(bdev, fc->fs_type); > > down_write(&s->s_umount); > > Yes. With the refactoring earlier in the series get_tree_bdev should > be trivial enough to not need a more specific reference. If you > think there's a better way to refer to it I can update the comment, > though. > > > > mp->m_logdev_targp = mp->m_ddev_targp; > > > } > > > > > > - return 0; > > > + error = 0; > > > +out_unlock: > > > + down_write(&sb->s_umount); > > > > Isn't down_write taking s_umount? I think the label should be > > out_relock or something less misleading. > > Agreed. Christian, can you just change this in your branch, or should > I send an incremental patch? No need to send an incremental patch. I just s/out_unlock/out_relock/g in-tree. Thanks!
On Sat, Aug 05, 2023 at 10:32:39AM +0200, Christoph Hellwig wrote: > On Wed, Aug 02, 2023 at 09:32:19AM -0700, Darrick J. Wong wrote: > > > + /* see get_tree_bdev why this is needed and safe */ > > > > Which part of get_tree_bdev? Is it this? > > > > /* > > * s_umount nests inside open_mutex during > > * __invalidate_device(). blkdev_put() acquires > > * open_mutex and can't be called under s_umount. Drop > > * s_umount temporarily. This is safe as we're > > * holding an active reference. > > */ > > up_write(&s->s_umount); > > blkdev_put(bdev, fc->fs_type); > > down_write(&s->s_umount); > > Yes. With the refactoring earlier in the series get_tree_bdev should > be trivial enough to not need a more specific reference. If you > think there's a better way to refer to it I can update the comment, > though. How about: /* * blkdev_put can't be called under s_umount, see the comment in * get_tree_bdev for more details */ with that and the label name change, Reviewed-by: Darrick J. Wong <djwong@kernel.org> --D > > > mp->m_logdev_targp = mp->m_ddev_targp; > > > } > > > > > > - return 0; > > > + error = 0; > > > +out_unlock: > > > + down_write(&sb->s_umount); > > > > Isn't down_write taking s_umount? I think the label should be > > out_relock or something less misleading. > > Agreed. Christian, can you just change this in your branch, or should > I send an incremental patch? >
On Sat, Aug 05, 2023 at 09:19:04AM -0700, Darrick J. Wong wrote: > On Sat, Aug 05, 2023 at 10:32:39AM +0200, Christoph Hellwig wrote: > > On Wed, Aug 02, 2023 at 09:32:19AM -0700, Darrick J. Wong wrote: > > > > + /* see get_tree_bdev why this is needed and safe */ > > > > > > Which part of get_tree_bdev? Is it this? > > > > > > /* > > > * s_umount nests inside open_mutex during > > > * __invalidate_device(). blkdev_put() acquires > > > * open_mutex and can't be called under s_umount. Drop > > > * s_umount temporarily. This is safe as we're > > > * holding an active reference. > > > */ > > > up_write(&s->s_umount); > > > blkdev_put(bdev, fc->fs_type); > > > down_write(&s->s_umount); > > > > Yes. With the refactoring earlier in the series get_tree_bdev should > > be trivial enough to not need a more specific reference. If you > > think there's a better way to refer to it I can update the comment, > > though. > > How about: > > /* > * blkdev_put can't be called under s_umount, see the comment in > * get_tree_bdev for more details > */ > > with that and the label name change, > Reviewed-by: Darrick J. Wong <djwong@kernel.org> Added that comment and you rvb in-tree.
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index 8185102431301d..d5042419ed9997 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -448,17 +448,21 @@ STATIC int xfs_open_devices( struct xfs_mount *mp) { - struct block_device *ddev = mp->m_super->s_bdev; + struct super_block *sb = mp->m_super; + struct block_device *ddev = sb->s_bdev; struct block_device *logdev = NULL, *rtdev = NULL; int error; + /* see get_tree_bdev why this is needed and safe */ + up_write(&sb->s_umount); + /* * Open real time and log devices - order is important. */ if (mp->m_logname) { error = xfs_blkdev_get(mp, mp->m_logname, &logdev); if (error) - return error; + goto out_unlock; } if (mp->m_rtname) { @@ -496,7 +500,10 @@ xfs_open_devices( mp->m_logdev_targp = mp->m_ddev_targp; } - return 0; + error = 0; +out_unlock: + down_write(&sb->s_umount); + return error; out_free_rtdev_targ: if (mp->m_rtdev_targp) @@ -508,7 +515,7 @@ xfs_open_devices( out_close_logdev: if (logdev && logdev != ddev) xfs_blkdev_put(mp, logdev); - return error; + goto out_unlock; } /*
Just like get_tree_bdev needs to drop s_umount when opening the main device, we need to do the same for the xfs log and RT devices to avoid a potential lock order reversal with s_unmount for the mark_dead path. It might be preferable to just drop s_umount over ->fill_super entirely, but that will require a fairly massive audit first, so we'll do the easy version here first. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_super.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)