Message ID | 20230913111013.77623-12-hch@lst.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [01/19] fs: reflow deactivate_locked_super | expand |
On Wed, Sep 13, 2023 at 08:10:05AM -0300, Christoph Hellwig wrote: > Currently super_blocks are shut down using the ->kill_sb method, which > must call generic_shutdown_super, but allows the file system to > add extra work before or after the call to generic_shutdown_super. > > File systems tend to get rather confused by this, so add an alternative > shutdown sequence where generic_shutdown_super is called by the core > code, and there are extra ->shutdown_sb and ->free_sb hooks before and > after it. To remove the amount of boilerplate code ->shutdown_sb is only > called if the super has finished initialization and ->d_root is set. The last sentence doesn't match the patchset. That aside, there is an issue with method names. ->shutdown_sb() is... odd. ->begin_shutdown_sb(), perhaps? For the majority of filesystems it's NULL, after all... Worse, ->free_sb() is seriously misguiding - the name implies that we are, well, freeing a superblock passed to it. Which is not what is happening here - superblock itself is freed only when all passive references go away. It's asking for trouble down the road. We already have more than enough confusion in the area. Note, BTW, that there's a delicate issue around RCU accesses and freeing stuff - ->d_compare() can bloody well be called when superblock is getting shut down. For anything that might be needed by it (or by other RCU'd methods) we must arrange for RCU-delayed destruction. E.g. in case of fatfs we have sbi freeing done via call_rcu() (from fat_put_super(), called by generic_shutdown_super()). <checks the current tree> Oh, bugger... AFAICS, exfat has a problem - exfat_free_sbi() is called directly from exfat_kill_sb(), without any concern for this: static int exfat_utf8_d_cmp(const struct dentry *dentry, unsigned int len, const char *str, const struct qstr *name) { struct super_block *sb = dentry->d_sb; unsigned int alen = exfat_striptail_len(name->len, name->name, EXFAT_SB(sb)->options.keep_last_dots); That kfree() needs to be RCU-delayed... While we are at it, there's this: static int exfat_d_hash(const struct dentry *dentry, struct qstr *qstr) { struct super_block *sb = dentry->d_sb; struct nls_table *t = EXFAT_SB(sb)->nls_io; and we need this unload_nls(sbi->nls_io); in exfat_put_super() RCU-delayed as well. And I suspect that exfat_free_upcase_table(sbi); right after it needs the same treatment. AFFS: similar problem, wants ->s_fs_info freeing RCU-delayed. hfsplus: similar, including non-delayed unlock_nls() calls. ntfs3: /* * Try slow way with current upcase table */ sbi = dentry->d_sb->s_fs_info; uni1 = __getname(); if (!uni1) return -ENOMEM; __getname(). "Give me a page and you might block, while you are at it". Done from ->d_compare(). Called under dentry->d_lock and rcu_read_lock(). OK, any further investigation of that one is... probably not worth bothering with at that point. Other in-tree instances appear to be correct. I'll push fixes for those (well, ntfs3 aside) out tomorrow.
diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst index 7be2900806c853..c33e2f03ed1f69 100644 --- a/Documentation/filesystems/locking.rst +++ b/Documentation/filesystems/locking.rst @@ -220,7 +220,9 @@ prototypes:: struct dentry *(*mount) (struct file_system_type *, int, const char *, void *); + void (*shutdown_sb) (struct super_block *); void (*kill_sb) (struct super_block *); + void (*free_sb) (struct super_block *); locking rules: @@ -228,7 +230,9 @@ locking rules: ops may block ======= ========= mount yes +shutdown_sb yes kill_sb yes +free_sb yes ======= ========= ->mount() returns ERR_PTR or the root dentry; its superblock should be locked diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst index 99acc2e9867391..1a7c6926c31f34 100644 --- a/Documentation/filesystems/vfs.rst +++ b/Documentation/filesystems/vfs.rst @@ -119,7 +119,9 @@ members are defined: const struct fs_parameter_spec *parameters; struct dentry *(*mount) (struct file_system_type *, int, const char *, void *); + void (*shutdown_sb) (struct super_block *); void (*kill_sb) (struct super_block *); + void (*free_sb) (struct super_block *); struct module *owner; struct file_system_type * next; struct hlist_head fs_supers; @@ -155,10 +157,20 @@ members are defined: the method to call when a new instance of this filesystem should be mounted +``shutdown_sb`` + Cleanup after a super_block has reached a zero active count, and before + the VFS level cleanup happens. Typical picks all fs-specific objects + (if any) that need destruction out of superblock and releases them. + Note: dentries and inodes are normally taken care of and do not need + specific handling unless they are pinned by kernel users. + ``kill_sb`` the method to call when an instance of this filesystem should be shut down +``free_sb`` + Free file system specific resources like sb->s_fs_info that are + still needed while inodes are freed during umount. ``owner`` for internal VFS use: you should initialize this to THIS_MODULE diff --git a/fs/super.c b/fs/super.c index 5c685b4944c2d6..8e173eccc8c113 100644 --- a/fs/super.c +++ b/fs/super.c @@ -480,10 +480,15 @@ void deactivate_locked_super(struct super_block *s) unregister_shrinker(&s->s_shrink); - if (fs->kill_sb) + if (fs->kill_sb) { fs->kill_sb(s); - else + } else { + if (fs->shutdown_sb) + fs->shutdown_sb(s); generic_shutdown_super(s); + if (fs->free_sb) + fs->free_sb(s); + } kill_super_notify(s); diff --git a/include/linux/fs.h b/include/linux/fs.h index 31b6b235b36efa..12fff7df3cc46b 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2341,6 +2341,8 @@ struct file_system_type { struct dentry *(*mount) (struct file_system_type *, int, const char *, void *); void (*kill_sb) (struct super_block *); + void (*shutdown_sb)(struct super_block *sb); + void (*free_sb)(struct super_block *sb); struct module *owner; struct file_system_type * next; struct hlist_head fs_supers;
Currently super_blocks are shut down using the ->kill_sb method, which must call generic_shutdown_super, but allows the file system to add extra work before or after the call to generic_shutdown_super. File systems tend to get rather confused by this, so add an alternative shutdown sequence where generic_shutdown_super is called by the core code, and there are extra ->shutdown_sb and ->free_sb hooks before and after it. To remove the amount of boilerplate code ->shutdown_sb is only called if the super has finished initialization and ->d_root is set. Signed-off-by: Christoph Hellwig <hch@lst.de> --- Documentation/filesystems/locking.rst | 4 ++++ Documentation/filesystems/vfs.rst | 12 ++++++++++++ fs/super.c | 9 +++++++-- include/linux/fs.h | 2 ++ 4 files changed, 25 insertions(+), 2 deletions(-)