Message ID | 20230611194706.1583818-2-amir73il@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Handle notifications on overlayfs fake path files | expand |
On Sun, Jun 11, 2023 at 10:47:05PM +0300, Amir Goldstein wrote: > Overlayfs and cachefiles use open_with_fake_path() to allocate internal > files, where overlayfs also puts a "fake" path in f_path - a path which > is not on the same fs as f_inode. But cachefs doesn't, so this needs a better explanation / documentation. > Allocate a container struct file_fake for those internal files, that > is used to hold the fake path along with an optional real path. The idea looks sensible, but fake a is a really weird term here. I know open_with_fake_path also uses it, but we really need to come up with a better name, and also good documentation of the concept here. > +/* Returns the real_path field that could be empty */ > +struct path *__f_real_path(struct file *f) > +{ > + struct file_fake *ff = file_fake(f); > + > + if (f->f_mode & FMODE_FAKE_PATH) > + return &ff->real_path; > + else > + return &f->f_path; > +} two of the three callers always have FMODE_FAKE_PATH set, so please just drop this helper and open code it in the three callers. > + > +/* Returns the real_path if not empty or f_path */ > +const struct path *f_real_path(struct file *f) > +{ > + const struct path *path = __f_real_path(f); > + > + return path->dentry ? path : &f->f_path; > +} > +EXPORT_SYMBOL(f_real_path); This is only needed by the few places like nfsd or btrfs send that directlycall fsnotify and should at very least be EXPORT_SYMBOL_GPL. But I suspect with all the exta code, fsnotify_file really should move out of line and have an EXORT_SYMBOL_GPL instead. > + > +const struct path *f_fake_path(struct file *f) > +{ > + return &f->f_path; > +} > +EXPORT_SYMBOL(f_fake_path); .. and this helper is completely pointless. > +extern struct file *alloc_empty_file(int flags, const struct cred *cred); > +extern struct file *alloc_empty_file_fake(int flags, const struct cred *cred); > +extern struct path *__f_real_path(struct file *f); Please drop all the pointless externs while you're at it.
On Mon, Jun 12, 2023 at 7:45 AM Christoph Hellwig <hch@infradead.org> wrote: > > On Sun, Jun 11, 2023 at 10:47:05PM +0300, Amir Goldstein wrote: > > Overlayfs and cachefiles use open_with_fake_path() to allocate internal > > files, where overlayfs also puts a "fake" path in f_path - a path which > > is not on the same fs as f_inode. > > But cachefs doesn't, so this needs a better explanation / documentation. > > > Allocate a container struct file_fake for those internal files, that > > is used to hold the fake path along with an optional real path. > > The idea looks sensible, but fake a is a really weird term here. > I know open_with_fake_path also uses it, but we really need to > come up with a better name, and also good documentation of the > concept here. > > > +/* Returns the real_path field that could be empty */ > > +struct path *__f_real_path(struct file *f) > > +{ > > + struct file_fake *ff = file_fake(f); > > + > > + if (f->f_mode & FMODE_FAKE_PATH) > > + return &ff->real_path; > > + else > > + return &f->f_path; > > +} > > two of the three callers always have FMODE_FAKE_PATH set, so please > just drop this helper and open code it in the three callers. > I wanted to keep the container opaque I can make __f_real_path not check the flag at all and only f_real_path will check the flag > > + > > +/* Returns the real_path if not empty or f_path */ > > +const struct path *f_real_path(struct file *f) > > +{ > > + const struct path *path = __f_real_path(f); > > + > > + return path->dentry ? path : &f->f_path; > > +} > > +EXPORT_SYMBOL(f_real_path); > > This is only needed by the few places like nfsd or btrfs send > that directlycall fsnotify and should at very least be > EXPORT_SYMBOL_GPL. But I suspect with all the exta code, fsnotify_file > really should move out of line and have an EXORT_SYMBOL_GPL instead. > fsnotify_file() inline is expected to avoid to call to fsnotify() due to the optimization of zero s_fsnotify_connector in fsnotify_parents() - this was observed as needed by some benchmarks. > > + > > +const struct path *f_fake_path(struct file *f) > > +{ > > + return &f->f_path; > > +} > > +EXPORT_SYMBOL(f_fake_path); > > .. and this helper is completely pointless. > > > +extern struct file *alloc_empty_file(int flags, const struct cred *cred); > > +extern struct file *alloc_empty_file_fake(int flags, const struct cred *cred); > > +extern struct path *__f_real_path(struct file *f); > > Please drop all the pointless externs while you're at it. sure, Thanks, Amir.
On Mon, Jun 12, 2023 at 09:37:51AM +0300, Amir Goldstein wrote: > On Mon, Jun 12, 2023 at 7:45 AM Christoph Hellwig <hch@infradead.org> wrote: > > > > On Sun, Jun 11, 2023 at 10:47:05PM +0300, Amir Goldstein wrote: > > > Overlayfs and cachefiles use open_with_fake_path() to allocate internal > > > files, where overlayfs also puts a "fake" path in f_path - a path which > > > is not on the same fs as f_inode. > > > > But cachefs doesn't, so this needs a better explanation / documentation. > > > > > Allocate a container struct file_fake for those internal files, that > > > is used to hold the fake path along with an optional real path. > > > > The idea looks sensible, but fake a is a really weird term here. > > I know open_with_fake_path also uses it, but we really need to > > come up with a better name, and also good documentation of the > > concept here. > > > > > +/* Returns the real_path field that could be empty */ > > > +struct path *__f_real_path(struct file *f) > > > +{ > > > + struct file_fake *ff = file_fake(f); > > > + > > > + if (f->f_mode & FMODE_FAKE_PATH) > > > + return &ff->real_path; > > > + else > > > + return &f->f_path; > > > +} > > > > two of the three callers always have FMODE_FAKE_PATH set, so please > > just drop this helper and open code it in the three callers. > > > > I wanted to keep the container opaque This is preferable to me tbh to not leak this detail into callsites.
On Sun, Jun 11, 2023 at 09:45:45PM -0700, Christoph Hellwig wrote: > On Sun, Jun 11, 2023 at 10:47:05PM +0300, Amir Goldstein wrote: > > Overlayfs and cachefiles use open_with_fake_path() to allocate internal > > files, where overlayfs also puts a "fake" path in f_path - a path which > > is not on the same fs as f_inode. > > But cachefs doesn't, so this needs a better explanation / documentation. > > > Allocate a container struct file_fake for those internal files, that > > is used to hold the fake path along with an optional real path. > > The idea looks sensible, but fake a is a really weird term here. > I know open_with_fake_path also uses it, but we really need to > come up with a better name, and also good documentation of the > concept here. It's basically a stack so I'd either use struct file_stack or struct file_proxy; with a preference for the latter.
On Mon, Jun 12, 2023 at 11:07 AM Christian Brauner <brauner@kernel.org> wrote: > > On Sun, Jun 11, 2023 at 09:45:45PM -0700, Christoph Hellwig wrote: > > On Sun, Jun 11, 2023 at 10:47:05PM +0300, Amir Goldstein wrote: > > > Overlayfs and cachefiles use open_with_fake_path() to allocate internal > > > files, where overlayfs also puts a "fake" path in f_path - a path which > > > is not on the same fs as f_inode. > > > > But cachefs doesn't, so this needs a better explanation / documentation. > > > > > Allocate a container struct file_fake for those internal files, that > > > is used to hold the fake path along with an optional real path. > > > > The idea looks sensible, but fake a is a really weird term here. > > I know open_with_fake_path also uses it, but we really need to > > come up with a better name, and also good documentation of the > > concept here. > > It's basically a stack so I'd either use struct file_stack or > struct file_proxy; with a preference for the latter. Let the bikeshedding begin :) file_proxy too generic to my taste How about: /* File is embedded in backing_file object */ #define FMODE_BACKING ((__force fmode_t)0x2000000) This backing_file container may be usable for cachefiles or FUSE passthrough in the future to store the path of the (netfs/fuse) file that is backed by the (local) backing file. We had a short attempt to do the same for ovl - store the ovl path in the container and not the real_path, to get rid of file_dentry() calls, but it got complicated so deferreing that for later. Thanks, Amir.
On Mon, Jun 12, 2023 at 11:28:23AM +0300, Amir Goldstein wrote: > On Mon, Jun 12, 2023 at 11:07 AM Christian Brauner <brauner@kernel.org> wrote: > > > > On Sun, Jun 11, 2023 at 09:45:45PM -0700, Christoph Hellwig wrote: > > > On Sun, Jun 11, 2023 at 10:47:05PM +0300, Amir Goldstein wrote: > > > > Overlayfs and cachefiles use open_with_fake_path() to allocate internal > > > > files, where overlayfs also puts a "fake" path in f_path - a path which > > > > is not on the same fs as f_inode. > > > > > > But cachefs doesn't, so this needs a better explanation / documentation. > > > > > > > Allocate a container struct file_fake for those internal files, that > > > > is used to hold the fake path along with an optional real path. > > > > > > The idea looks sensible, but fake a is a really weird term here. > > > I know open_with_fake_path also uses it, but we really need to > > > come up with a better name, and also good documentation of the > > > concept here. > > > > It's basically a stack so I'd either use struct file_stack or > > struct file_proxy; with a preference for the latter. > > Let the bikeshedding begin :) > > file_proxy too generic to my taste > > How about: > > /* File is embedded in backing_file object */ > #define FMODE_BACKING ((__force fmode_t)0x2000000) Yeah, that'd be ok with me.
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c index 82219a8f6084..a71bdf2d03ba 100644 --- a/fs/cachefiles/namei.c +++ b/fs/cachefiles/namei.c @@ -561,7 +561,7 @@ static bool cachefiles_open_file(struct cachefiles_object *object, path.mnt = cache->mnt; path.dentry = dentry; file = open_with_fake_path(&path, O_RDWR | O_LARGEFILE | O_DIRECT, - d_backing_inode(dentry), cache->cache_cred); + &path, cache->cache_cred); if (IS_ERR(file)) { trace_cachefiles_vfs_error(object, d_backing_inode(dentry), PTR_ERR(file), diff --git a/fs/file_table.c b/fs/file_table.c index 372653b92617..7c392f639fa1 100644 --- a/fs/file_table.c +++ b/fs/file_table.c @@ -44,18 +44,60 @@ static struct kmem_cache *filp_cachep __read_mostly; static struct percpu_counter nr_files __cacheline_aligned_in_smp; +/* Container for file with optional fake path */ +struct file_fake { + struct file file; + struct path real_path; +}; + +static inline struct file_fake *file_fake(struct file *f) +{ + return container_of(f, struct file_fake, file); +} + +/* Returns the real_path field that could be empty */ +struct path *__f_real_path(struct file *f) +{ + struct file_fake *ff = file_fake(f); + + if (f->f_mode & FMODE_FAKE_PATH) + return &ff->real_path; + else + return &f->f_path; +} + +/* Returns the real_path if not empty or f_path */ +const struct path *f_real_path(struct file *f) +{ + const struct path *path = __f_real_path(f); + + return path->dentry ? path : &f->f_path; +} +EXPORT_SYMBOL(f_real_path); + +const struct path *f_fake_path(struct file *f) +{ + return &f->f_path; +} +EXPORT_SYMBOL(f_fake_path); + static void file_free_rcu(struct rcu_head *head) { struct file *f = container_of(head, struct file, f_rcuhead); put_cred(f->f_cred); - kmem_cache_free(filp_cachep, f); + if (f->f_mode & FMODE_FAKE_PATH) + kfree(file_fake(f)); + else + kmem_cache_free(filp_cachep, f); } static inline void file_free(struct file *f) { security_file_free(f); - if (!(f->f_mode & FMODE_NOACCOUNT)) + if (f->f_mode & FMODE_FAKE_PATH) + path_put(__f_real_path(f)); + else percpu_counter_dec(&nr_files); call_rcu(&f->f_rcuhead, file_free_rcu); } @@ -131,20 +173,15 @@ static int __init init_fs_stat_sysctls(void) fs_initcall(init_fs_stat_sysctls); #endif -static struct file *__alloc_file(int flags, const struct cred *cred) +static int init_file(struct file *f, int flags, const struct cred *cred) { - struct file *f; int error; - f = kmem_cache_zalloc(filp_cachep, GFP_KERNEL); - if (unlikely(!f)) - return ERR_PTR(-ENOMEM); - f->f_cred = get_cred(cred); error = security_file_alloc(f); if (unlikely(error)) { file_free_rcu(&f->f_rcuhead); - return ERR_PTR(error); + return error; } atomic_long_set(&f->f_count, 1); @@ -155,6 +192,22 @@ static struct file *__alloc_file(int flags, const struct cred *cred) f->f_mode = OPEN_FMODE(flags); /* f->f_version: 0 */ + return 0; +} + +static struct file *__alloc_file(int flags, const struct cred *cred) +{ + struct file *f; + int error; + + f = kmem_cache_zalloc(filp_cachep, GFP_KERNEL); + if (unlikely(!f)) + return ERR_PTR(-ENOMEM); + + error = init_file(f, flags, cred); + if (unlikely(error)) + return ERR_PTR(error); + return f; } @@ -201,18 +254,26 @@ struct file *alloc_empty_file(int flags, const struct cred *cred) } /* - * Variant of alloc_empty_file() that doesn't check and modify nr_files. + * Variant of alloc_empty_file() that allocates a file_fake container + * and doesn't check and modify nr_files. * * Should not be used unless there's a very good reason to do so. */ -struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred) +struct file *alloc_empty_file_fake(int flags, const struct cred *cred) { - struct file *f = __alloc_file(flags, cred); + struct file_fake *ff; + int error; - if (!IS_ERR(f)) - f->f_mode |= FMODE_NOACCOUNT; + ff = kzalloc(sizeof(struct file_fake), GFP_KERNEL); + if (unlikely(!ff)) + return ERR_PTR(-ENOMEM); - return f; + error = init_file(&ff->file, flags, cred); + if (unlikely(error)) + return ERR_PTR(error); + + ff->file.f_mode |= FMODE_FAKE_PATH; + return &ff->file; } /** diff --git a/fs/internal.h b/fs/internal.h index bd3b2810a36b..df7956b74b08 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -97,8 +97,9 @@ extern void chroot_fs_refs(const struct path *, const struct path *); /* * file_table.c */ -extern struct file *alloc_empty_file(int, const struct cred *); -extern struct file *alloc_empty_file_noaccount(int, const struct cred *); +extern struct file *alloc_empty_file(int flags, const struct cred *cred); +extern struct file *alloc_empty_file_fake(int flags, const struct cred *cred); +extern struct path *__f_real_path(struct file *f); static inline void put_file_access(struct file *file) { diff --git a/fs/namei.c b/fs/namei.c index e4fe0879ae55..329e89043277 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -3721,14 +3721,16 @@ struct file *vfs_tmpfile_open(struct mnt_idmap *idmap, struct file *file; int error; - file = alloc_empty_file_noaccount(open_flag, cred); - if (!IS_ERR(file)) { - error = vfs_tmpfile(idmap, parentpath, file, mode); - if (error) { - fput(file); - file = ERR_PTR(error); - } + file = alloc_empty_file_fake(open_flag, cred); + if (IS_ERR(file)) + return file; + + error = vfs_tmpfile(idmap, parentpath, file, mode); + if (error) { + fput(file); + file = ERR_PTR(error); } + return file; } EXPORT_SYMBOL(vfs_tmpfile_open); diff --git a/fs/open.c b/fs/open.c index 4478adcc4f3a..0398ab38ce0c 100644 --- a/fs/open.c +++ b/fs/open.c @@ -1116,20 +1116,26 @@ struct file *dentry_create(const struct path *path, int flags, umode_t mode, } EXPORT_SYMBOL(dentry_create); -struct file *open_with_fake_path(const struct path *path, int flags, - struct inode *inode, const struct cred *cred) +struct file *open_with_fake_path(const struct path *fake_path, int flags, + const struct path *real_path, + const struct cred *cred) { - struct file *f = alloc_empty_file_noaccount(flags, cred); - if (!IS_ERR(f)) { - int error; + struct file *f; + int error; - f->f_path = *path; - error = do_dentry_open(f, inode, NULL); - if (error) { - fput(f); - f = ERR_PTR(error); - } + f = alloc_empty_file_fake(flags, cred); + if (IS_ERR(f)) + return f; + + f->f_path = *fake_path; + path_get(real_path); + *__f_real_path(f) = *real_path; + error = do_dentry_open(f, d_inode(real_path->dentry), NULL); + if (error) { + fput(f); + f = ERR_PTR(error); } + return f; } EXPORT_SYMBOL(open_with_fake_path); diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c index 39737c2aaa84..6309dab46985 100644 --- a/fs/overlayfs/file.c +++ b/fs/overlayfs/file.c @@ -61,7 +61,7 @@ static struct file *ovl_open_realfile(const struct file *file, if (!inode_owner_or_capable(real_idmap, realinode)) flags &= ~O_NOATIME; - realfile = open_with_fake_path(&file->f_path, flags, realinode, + realfile = open_with_fake_path(&file->f_path, flags, realpath, current_cred()); } revert_creds(old_cred); diff --git a/include/linux/fs.h b/include/linux/fs.h index 21a981680856..2454025eee1c 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -180,8 +180,8 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset, /* File represents mount that needs unmounting */ #define FMODE_NEED_UNMOUNT ((__force fmode_t)0x10000000) -/* File does not contribute to nr_files count */ -#define FMODE_NOACCOUNT ((__force fmode_t)0x20000000) +/* File is embedded in file_fake and doesn't contribute to nr_files */ +#define FMODE_FAKE_PATH ((__force fmode_t)0x20000000) /* File supports async buffered reads */ #define FMODE_BUF_RASYNC ((__force fmode_t)0x40000000) @@ -2349,11 +2349,16 @@ static inline struct file *file_open_root_mnt(struct vfsmount *mnt, return file_open_root(&(struct path){.mnt = mnt, .dentry = mnt->mnt_root}, name, flags, mode); } -extern struct file * dentry_open(const struct path *, int, const struct cred *); +extern struct file *dentry_open(const struct path *path, int flags, + const struct cred *creds); extern struct file *dentry_create(const struct path *path, int flags, umode_t mode, const struct cred *cred); -extern struct file * open_with_fake_path(const struct path *, int, - struct inode*, const struct cred *); +extern struct file *open_with_fake_path(const struct path *fake_path, int flags, + const struct path *real_path, + const struct cred *cred); +extern const struct path *f_real_path(struct file *file); +extern const struct path *f_fake_path(struct file *file); + static inline struct file *file_clone_open(struct file *file) { return dentry_open(&file->f_path, file->f_flags, file->f_cred);
Overlayfs and cachefiles use open_with_fake_path() to allocate internal files, where overlayfs also puts a "fake" path in f_path - a path which is not on the same fs as f_inode. Allocate a container struct file_fake for those internal files, that is used to hold the fake path along with an optional real path. Signed-off-by: Amir Goldstein <amir73il@gmail.com> --- fs/cachefiles/namei.c | 2 +- fs/file_table.c | 91 ++++++++++++++++++++++++++++++++++++------- fs/internal.h | 5 ++- fs/namei.c | 16 ++++---- fs/open.c | 28 +++++++------ fs/overlayfs/file.c | 2 +- include/linux/fs.h | 15 ++++--- 7 files changed, 117 insertions(+), 42 deletions(-)