diff mbox series

[v3,1/2] fs: use fake_file container for internal files with fake f_path

Message ID 20230611194706.1583818-2-amir73il@gmail.com (mailing list archive)
State New, archived
Headers show
Series Handle notifications on overlayfs fake path files | expand

Commit Message

Amir Goldstein June 11, 2023, 7:47 p.m. UTC
Overlayfs and cachefiles use open_with_fake_path() to allocate internal
files, where overlayfs also puts a "fake" path in f_path - a path which
is not on the same fs as f_inode.

Allocate a container struct file_fake for those internal files, that
is used to hold the fake path along with an optional real path.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/cachefiles/namei.c |  2 +-
 fs/file_table.c       | 91 ++++++++++++++++++++++++++++++++++++-------
 fs/internal.h         |  5 ++-
 fs/namei.c            | 16 ++++----
 fs/open.c             | 28 +++++++------
 fs/overlayfs/file.c   |  2 +-
 include/linux/fs.h    | 15 ++++---
 7 files changed, 117 insertions(+), 42 deletions(-)

Comments

Christoph Hellwig June 12, 2023, 4:45 a.m. UTC | #1
On Sun, Jun 11, 2023 at 10:47:05PM +0300, Amir Goldstein wrote:
> Overlayfs and cachefiles use open_with_fake_path() to allocate internal
> files, where overlayfs also puts a "fake" path in f_path - a path which
> is not on the same fs as f_inode.

But cachefs doesn't, so this needs a better explanation / documentation.

> Allocate a container struct file_fake for those internal files, that
> is used to hold the fake path along with an optional real path.

The idea looks sensible, but fake a is a really weird term here.
I know open_with_fake_path also uses it, but we really need to
come up with a better name, and also good documentation of the
concept here.

> +/* Returns the real_path field that could be empty */
> +struct path *__f_real_path(struct file *f)
> +{
> +	struct file_fake *ff = file_fake(f);
> +
> +	if (f->f_mode & FMODE_FAKE_PATH)
> +		return &ff->real_path;
> +	else
> +		return &f->f_path;
> +}

two of the three callers always have FMODE_FAKE_PATH set, so please
just drop this helper and open code it in the three callers.

> +
> +/* Returns the real_path if not empty or f_path */
> +const struct path *f_real_path(struct file *f)
> +{
> +	const struct path *path = __f_real_path(f);
> +
> +	return path->dentry ? path : &f->f_path;
> +}
> +EXPORT_SYMBOL(f_real_path);

This is only needed by the few places like nfsd or btrfs send
that directlycall fsnotify and should at very least be
EXPORT_SYMBOL_GPL.  But I suspect with all the exta code, fsnotify_file
really should move out of line and have an EXORT_SYMBOL_GPL instead.

> +
> +const struct path *f_fake_path(struct file *f)
> +{
> +	return &f->f_path;
> +}
> +EXPORT_SYMBOL(f_fake_path);

.. and this helper is completely pointless.

> +extern struct file *alloc_empty_file(int flags, const struct cred *cred);
> +extern struct file *alloc_empty_file_fake(int flags, const struct cred *cred);
> +extern struct path *__f_real_path(struct file *f);

Please drop all the pointless externs while you're at it.
Amir Goldstein June 12, 2023, 6:37 a.m. UTC | #2
On Mon, Jun 12, 2023 at 7:45 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Sun, Jun 11, 2023 at 10:47:05PM +0300, Amir Goldstein wrote:
> > Overlayfs and cachefiles use open_with_fake_path() to allocate internal
> > files, where overlayfs also puts a "fake" path in f_path - a path which
> > is not on the same fs as f_inode.
>
> But cachefs doesn't, so this needs a better explanation / documentation.
>
> > Allocate a container struct file_fake for those internal files, that
> > is used to hold the fake path along with an optional real path.
>
> The idea looks sensible, but fake a is a really weird term here.
> I know open_with_fake_path also uses it, but we really need to
> come up with a better name, and also good documentation of the
> concept here.
>
> > +/* Returns the real_path field that could be empty */
> > +struct path *__f_real_path(struct file *f)
> > +{
> > +     struct file_fake *ff = file_fake(f);
> > +
> > +     if (f->f_mode & FMODE_FAKE_PATH)
> > +             return &ff->real_path;
> > +     else
> > +             return &f->f_path;
> > +}
>
> two of the three callers always have FMODE_FAKE_PATH set, so please
> just drop this helper and open code it in the three callers.
>

I wanted to keep the container opaque
I can make __f_real_path not check the flag at all
and only f_real_path will check the flag

> > +
> > +/* Returns the real_path if not empty or f_path */
> > +const struct path *f_real_path(struct file *f)
> > +{
> > +     const struct path *path = __f_real_path(f);
> > +
> > +     return path->dentry ? path : &f->f_path;
> > +}
> > +EXPORT_SYMBOL(f_real_path);
>
> This is only needed by the few places like nfsd or btrfs send
> that directlycall fsnotify and should at very least be
> EXPORT_SYMBOL_GPL.  But I suspect with all the exta code, fsnotify_file
> really should move out of line and have an EXORT_SYMBOL_GPL instead.
>

fsnotify_file() inline is expected to avoid to call to fsnotify()
due to the optimization of zero s_fsnotify_connector
in fsnotify_parents() - this was observed as needed by some benchmarks.

> > +
> > +const struct path *f_fake_path(struct file *f)
> > +{
> > +     return &f->f_path;
> > +}
> > +EXPORT_SYMBOL(f_fake_path);
>
> .. and this helper is completely pointless.
>
> > +extern struct file *alloc_empty_file(int flags, const struct cred *cred);
> > +extern struct file *alloc_empty_file_fake(int flags, const struct cred *cred);
> > +extern struct path *__f_real_path(struct file *f);
>
> Please drop all the pointless externs while you're at it.

sure,

Thanks,
Amir.
Christian Brauner June 12, 2023, 8:01 a.m. UTC | #3
On Mon, Jun 12, 2023 at 09:37:51AM +0300, Amir Goldstein wrote:
> On Mon, Jun 12, 2023 at 7:45 AM Christoph Hellwig <hch@infradead.org> wrote:
> >
> > On Sun, Jun 11, 2023 at 10:47:05PM +0300, Amir Goldstein wrote:
> > > Overlayfs and cachefiles use open_with_fake_path() to allocate internal
> > > files, where overlayfs also puts a "fake" path in f_path - a path which
> > > is not on the same fs as f_inode.
> >
> > But cachefs doesn't, so this needs a better explanation / documentation.
> >
> > > Allocate a container struct file_fake for those internal files, that
> > > is used to hold the fake path along with an optional real path.
> >
> > The idea looks sensible, but fake a is a really weird term here.
> > I know open_with_fake_path also uses it, but we really need to
> > come up with a better name, and also good documentation of the
> > concept here.
> >
> > > +/* Returns the real_path field that could be empty */
> > > +struct path *__f_real_path(struct file *f)
> > > +{
> > > +     struct file_fake *ff = file_fake(f);
> > > +
> > > +     if (f->f_mode & FMODE_FAKE_PATH)
> > > +             return &ff->real_path;
> > > +     else
> > > +             return &f->f_path;
> > > +}
> >
> > two of the three callers always have FMODE_FAKE_PATH set, so please
> > just drop this helper and open code it in the three callers.
> >
> 
> I wanted to keep the container opaque

This is preferable to me tbh to not leak this detail into callsites.
Christian Brauner June 12, 2023, 8:07 a.m. UTC | #4
On Sun, Jun 11, 2023 at 09:45:45PM -0700, Christoph Hellwig wrote:
> On Sun, Jun 11, 2023 at 10:47:05PM +0300, Amir Goldstein wrote:
> > Overlayfs and cachefiles use open_with_fake_path() to allocate internal
> > files, where overlayfs also puts a "fake" path in f_path - a path which
> > is not on the same fs as f_inode.
> 
> But cachefs doesn't, so this needs a better explanation / documentation.
> 
> > Allocate a container struct file_fake for those internal files, that
> > is used to hold the fake path along with an optional real path.
> 
> The idea looks sensible, but fake a is a really weird term here.
> I know open_with_fake_path also uses it, but we really need to
> come up with a better name, and also good documentation of the
> concept here.

It's basically a stack so I'd either use struct file_stack or
struct file_proxy; with a preference for the latter.
Amir Goldstein June 12, 2023, 8:28 a.m. UTC | #5
On Mon, Jun 12, 2023 at 11:07 AM Christian Brauner <brauner@kernel.org> wrote:
>
> On Sun, Jun 11, 2023 at 09:45:45PM -0700, Christoph Hellwig wrote:
> > On Sun, Jun 11, 2023 at 10:47:05PM +0300, Amir Goldstein wrote:
> > > Overlayfs and cachefiles use open_with_fake_path() to allocate internal
> > > files, where overlayfs also puts a "fake" path in f_path - a path which
> > > is not on the same fs as f_inode.
> >
> > But cachefs doesn't, so this needs a better explanation / documentation.
> >
> > > Allocate a container struct file_fake for those internal files, that
> > > is used to hold the fake path along with an optional real path.
> >
> > The idea looks sensible, but fake a is a really weird term here.
> > I know open_with_fake_path also uses it, but we really need to
> > come up with a better name, and also good documentation of the
> > concept here.
>
> It's basically a stack so I'd either use struct file_stack or
> struct file_proxy; with a preference for the latter.

Let the bikeshedding begin :)

file_proxy too generic to my taste

How about:

/* File is embedded in backing_file object */
#define FMODE_BACKING           ((__force fmode_t)0x2000000)

This backing_file container may be usable for cachefiles or FUSE
passthrough in the future to store the path of the (netfs/fuse) file that
is backed by the (local) backing file.

We had a short attempt to do the same for ovl - store the ovl path
in the container and not the real_path, to get rid of file_dentry() calls,
but it got complicated so deferreing that for later.

Thanks,
Amir.
Christian Brauner June 12, 2023, 8:31 a.m. UTC | #6
On Mon, Jun 12, 2023 at 11:28:23AM +0300, Amir Goldstein wrote:
> On Mon, Jun 12, 2023 at 11:07 AM Christian Brauner <brauner@kernel.org> wrote:
> >
> > On Sun, Jun 11, 2023 at 09:45:45PM -0700, Christoph Hellwig wrote:
> > > On Sun, Jun 11, 2023 at 10:47:05PM +0300, Amir Goldstein wrote:
> > > > Overlayfs and cachefiles use open_with_fake_path() to allocate internal
> > > > files, where overlayfs also puts a "fake" path in f_path - a path which
> > > > is not on the same fs as f_inode.
> > >
> > > But cachefs doesn't, so this needs a better explanation / documentation.
> > >
> > > > Allocate a container struct file_fake for those internal files, that
> > > > is used to hold the fake path along with an optional real path.
> > >
> > > The idea looks sensible, but fake a is a really weird term here.
> > > I know open_with_fake_path also uses it, but we really need to
> > > come up with a better name, and also good documentation of the
> > > concept here.
> >
> > It's basically a stack so I'd either use struct file_stack or
> > struct file_proxy; with a preference for the latter.
> 
> Let the bikeshedding begin :)
> 
> file_proxy too generic to my taste
> 
> How about:
> 
> /* File is embedded in backing_file object */
> #define FMODE_BACKING           ((__force fmode_t)0x2000000)

Yeah, that'd be ok with me.
diff mbox series

Patch

diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 82219a8f6084..a71bdf2d03ba 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -561,7 +561,7 @@  static bool cachefiles_open_file(struct cachefiles_object *object,
 	path.mnt = cache->mnt;
 	path.dentry = dentry;
 	file = open_with_fake_path(&path, O_RDWR | O_LARGEFILE | O_DIRECT,
-				   d_backing_inode(dentry), cache->cache_cred);
+				   &path, cache->cache_cred);
 	if (IS_ERR(file)) {
 		trace_cachefiles_vfs_error(object, d_backing_inode(dentry),
 					   PTR_ERR(file),
diff --git a/fs/file_table.c b/fs/file_table.c
index 372653b92617..7c392f639fa1 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -44,18 +44,60 @@  static struct kmem_cache *filp_cachep __read_mostly;
 
 static struct percpu_counter nr_files __cacheline_aligned_in_smp;
 
+/* Container for file with optional fake path */
+struct file_fake {
+	struct file file;
+	struct path real_path;
+};
+
+static inline struct file_fake *file_fake(struct file *f)
+{
+	return container_of(f, struct file_fake, file);
+}
+
+/* Returns the real_path field that could be empty */
+struct path *__f_real_path(struct file *f)
+{
+	struct file_fake *ff = file_fake(f);
+
+	if (f->f_mode & FMODE_FAKE_PATH)
+		return &ff->real_path;
+	else
+		return &f->f_path;
+}
+
+/* Returns the real_path if not empty or f_path */
+const struct path *f_real_path(struct file *f)
+{
+	const struct path *path = __f_real_path(f);
+
+	return path->dentry ? path : &f->f_path;
+}
+EXPORT_SYMBOL(f_real_path);
+
+const struct path *f_fake_path(struct file *f)
+{
+	return &f->f_path;
+}
+EXPORT_SYMBOL(f_fake_path);
+
 static void file_free_rcu(struct rcu_head *head)
 {
 	struct file *f = container_of(head, struct file, f_rcuhead);
 
 	put_cred(f->f_cred);
-	kmem_cache_free(filp_cachep, f);
+	if (f->f_mode & FMODE_FAKE_PATH)
+		kfree(file_fake(f));
+	else
+		kmem_cache_free(filp_cachep, f);
 }
 
 static inline void file_free(struct file *f)
 {
 	security_file_free(f);
-	if (!(f->f_mode & FMODE_NOACCOUNT))
+	if (f->f_mode & FMODE_FAKE_PATH)
+		path_put(__f_real_path(f));
+	else
 		percpu_counter_dec(&nr_files);
 	call_rcu(&f->f_rcuhead, file_free_rcu);
 }
@@ -131,20 +173,15 @@  static int __init init_fs_stat_sysctls(void)
 fs_initcall(init_fs_stat_sysctls);
 #endif
 
-static struct file *__alloc_file(int flags, const struct cred *cred)
+static int init_file(struct file *f, int flags, const struct cred *cred)
 {
-	struct file *f;
 	int error;
 
-	f = kmem_cache_zalloc(filp_cachep, GFP_KERNEL);
-	if (unlikely(!f))
-		return ERR_PTR(-ENOMEM);
-
 	f->f_cred = get_cred(cred);
 	error = security_file_alloc(f);
 	if (unlikely(error)) {
 		file_free_rcu(&f->f_rcuhead);
-		return ERR_PTR(error);
+		return error;
 	}
 
 	atomic_long_set(&f->f_count, 1);
@@ -155,6 +192,22 @@  static struct file *__alloc_file(int flags, const struct cred *cred)
 	f->f_mode = OPEN_FMODE(flags);
 	/* f->f_version: 0 */
 
+	return 0;
+}
+
+static struct file *__alloc_file(int flags, const struct cred *cred)
+{
+	struct file *f;
+	int error;
+
+	f = kmem_cache_zalloc(filp_cachep, GFP_KERNEL);
+	if (unlikely(!f))
+		return ERR_PTR(-ENOMEM);
+
+	error = init_file(f, flags, cred);
+	if (unlikely(error))
+		return ERR_PTR(error);
+
 	return f;
 }
 
@@ -201,18 +254,26 @@  struct file *alloc_empty_file(int flags, const struct cred *cred)
 }
 
 /*
- * Variant of alloc_empty_file() that doesn't check and modify nr_files.
+ * Variant of alloc_empty_file() that allocates a file_fake container
+ * and doesn't check and modify nr_files.
  *
  * Should not be used unless there's a very good reason to do so.
  */
-struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred)
+struct file *alloc_empty_file_fake(int flags, const struct cred *cred)
 {
-	struct file *f = __alloc_file(flags, cred);
+	struct file_fake *ff;
+	int error;
 
-	if (!IS_ERR(f))
-		f->f_mode |= FMODE_NOACCOUNT;
+	ff = kzalloc(sizeof(struct file_fake), GFP_KERNEL);
+	if (unlikely(!ff))
+		return ERR_PTR(-ENOMEM);
 
-	return f;
+	error = init_file(&ff->file, flags, cred);
+	if (unlikely(error))
+		return ERR_PTR(error);
+
+	ff->file.f_mode |= FMODE_FAKE_PATH;
+	return &ff->file;
 }
 
 /**
diff --git a/fs/internal.h b/fs/internal.h
index bd3b2810a36b..df7956b74b08 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -97,8 +97,9 @@  extern void chroot_fs_refs(const struct path *, const struct path *);
 /*
  * file_table.c
  */
-extern struct file *alloc_empty_file(int, const struct cred *);
-extern struct file *alloc_empty_file_noaccount(int, const struct cred *);
+extern struct file *alloc_empty_file(int flags, const struct cred *cred);
+extern struct file *alloc_empty_file_fake(int flags, const struct cred *cred);
+extern struct path *__f_real_path(struct file *f);
 
 static inline void put_file_access(struct file *file)
 {
diff --git a/fs/namei.c b/fs/namei.c
index e4fe0879ae55..329e89043277 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3721,14 +3721,16 @@  struct file *vfs_tmpfile_open(struct mnt_idmap *idmap,
 	struct file *file;
 	int error;
 
-	file = alloc_empty_file_noaccount(open_flag, cred);
-	if (!IS_ERR(file)) {
-		error = vfs_tmpfile(idmap, parentpath, file, mode);
-		if (error) {
-			fput(file);
-			file = ERR_PTR(error);
-		}
+	file = alloc_empty_file_fake(open_flag, cred);
+	if (IS_ERR(file))
+		return file;
+
+	error = vfs_tmpfile(idmap, parentpath, file, mode);
+	if (error) {
+		fput(file);
+		file = ERR_PTR(error);
 	}
+
 	return file;
 }
 EXPORT_SYMBOL(vfs_tmpfile_open);
diff --git a/fs/open.c b/fs/open.c
index 4478adcc4f3a..0398ab38ce0c 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1116,20 +1116,26 @@  struct file *dentry_create(const struct path *path, int flags, umode_t mode,
 }
 EXPORT_SYMBOL(dentry_create);
 
-struct file *open_with_fake_path(const struct path *path, int flags,
-				struct inode *inode, const struct cred *cred)
+struct file *open_with_fake_path(const struct path *fake_path, int flags,
+				 const struct path *real_path,
+				 const struct cred *cred)
 {
-	struct file *f = alloc_empty_file_noaccount(flags, cred);
-	if (!IS_ERR(f)) {
-		int error;
+	struct file *f;
+	int error;
 
-		f->f_path = *path;
-		error = do_dentry_open(f, inode, NULL);
-		if (error) {
-			fput(f);
-			f = ERR_PTR(error);
-		}
+	f = alloc_empty_file_fake(flags, cred);
+	if (IS_ERR(f))
+		return f;
+
+	f->f_path = *fake_path;
+	path_get(real_path);
+	*__f_real_path(f) = *real_path;
+	error = do_dentry_open(f, d_inode(real_path->dentry), NULL);
+	if (error) {
+		fput(f);
+		f = ERR_PTR(error);
 	}
+
 	return f;
 }
 EXPORT_SYMBOL(open_with_fake_path);
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 39737c2aaa84..6309dab46985 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -61,7 +61,7 @@  static struct file *ovl_open_realfile(const struct file *file,
 		if (!inode_owner_or_capable(real_idmap, realinode))
 			flags &= ~O_NOATIME;
 
-		realfile = open_with_fake_path(&file->f_path, flags, realinode,
+		realfile = open_with_fake_path(&file->f_path, flags, realpath,
 					       current_cred());
 	}
 	revert_creds(old_cred);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 21a981680856..2454025eee1c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -180,8 +180,8 @@  typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
 /* File represents mount that needs unmounting */
 #define FMODE_NEED_UNMOUNT	((__force fmode_t)0x10000000)
 
-/* File does not contribute to nr_files count */
-#define FMODE_NOACCOUNT		((__force fmode_t)0x20000000)
+/* File is embedded in file_fake and doesn't contribute to nr_files */
+#define FMODE_FAKE_PATH		((__force fmode_t)0x20000000)
 
 /* File supports async buffered reads */
 #define FMODE_BUF_RASYNC	((__force fmode_t)0x40000000)
@@ -2349,11 +2349,16 @@  static inline struct file *file_open_root_mnt(struct vfsmount *mnt,
 	return file_open_root(&(struct path){.mnt = mnt, .dentry = mnt->mnt_root},
 			      name, flags, mode);
 }
-extern struct file * dentry_open(const struct path *, int, const struct cred *);
+extern struct file *dentry_open(const struct path *path, int flags,
+				const struct cred *creds);
 extern struct file *dentry_create(const struct path *path, int flags,
 				  umode_t mode, const struct cred *cred);
-extern struct file * open_with_fake_path(const struct path *, int,
-					 struct inode*, const struct cred *);
+extern struct file *open_with_fake_path(const struct path *fake_path, int flags,
+					const struct path *real_path,
+					const struct cred *cred);
+extern const struct path *f_real_path(struct file *file);
+extern const struct path *f_fake_path(struct file *file);
+
 static inline struct file *file_clone_open(struct file *file)
 {
 	return dentry_open(&file->f_path, file->f_flags, file->f_cred);