diff mbox series

[v3,7/9] vfs: move open right after ->tmpfile()

Message ID 20220920193632.2215598-8-mszeredi@redhat.com (mailing list archive)
State New, archived
Headers show
Series fuse tmpfile | expand

Commit Message

Miklos Szeredi Sept. 20, 2022, 7:36 p.m. UTC
Create a helper finish_open_simple() that opens the file with the original
dentry.  Handle the error case here as well to simplify callers.

Call this helper right after ->tmpfile() is called.

Next patch will change the tmpfile API and move this call into tmpfile
instances.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/namei.c         | 79 ++++++++++++++++++----------------------------
 include/linux/fs.h |  9 ++++++
 2 files changed, 40 insertions(+), 48 deletions(-)

Comments

Al Viro Sept. 20, 2022, 8:57 p.m. UTC | #1
On Tue, Sep 20, 2022 at 09:36:30PM +0200, Miklos Szeredi wrote:

>  	inode = child->d_inode;

Better
	inode = file_inode(file);

so that child would be completely ignored after dput().

> +	error = vfs_tmpfile(mnt_userns, &path, file, op->mode);
> +	if (error)
>  		goto out2;
> -	dput(path.dentry);
> -	path.dentry = child;
> -	audit_inode(nd->name, child, 0);
> +	audit_inode(nd->name, file->f_path.dentry, 0);
>  	/* Don't check for other permissions, the inode was just created */
> -	error = may_open(mnt_userns, &path, 0, op->open_flag);

Umm...  I'm not sure that losing it is the right thing - it might
be argued that ->permission(..., MAY_OPEN) is to be ignored for
tmpfile (and the only thing checking for MAY_OPEN is nfs, which is
*not* going to grow tmpfile any time soon - certainly not with these
calling conventions), but you are also dropping the call of
security_inode_permission(inode, MAY_OPEN) and that's a change
compared to what LSM crowd used to get...
Miklos Szeredi Sept. 21, 2022, 3:06 a.m. UTC | #2
On Tue, 20 Sept 2022 at 22:57, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Tue, Sep 20, 2022 at 09:36:30PM +0200, Miklos Szeredi wrote:
>
> >       inode = child->d_inode;
>
> Better
>         inode = file_inode(file);
>
> so that child would be completely ignored after dput().
>
> > +     error = vfs_tmpfile(mnt_userns, &path, file, op->mode);
> > +     if (error)
> >               goto out2;
> > -     dput(path.dentry);
> > -     path.dentry = child;
> > -     audit_inode(nd->name, child, 0);
> > +     audit_inode(nd->name, file->f_path.dentry, 0);
> >       /* Don't check for other permissions, the inode was just created */
> > -     error = may_open(mnt_userns, &path, 0, op->open_flag);
>
> Umm...  I'm not sure that losing it is the right thing - it might
> be argued that ->permission(..., MAY_OPEN) is to be ignored for
> tmpfile (and the only thing checking for MAY_OPEN is nfs, which is
> *not* going to grow tmpfile any time soon - certainly not with these
> calling conventions), but you are also dropping the call of
> security_inode_permission(inode, MAY_OPEN) and that's a change
> compared to what LSM crowd used to get...

Not losing it, just moving it into vfs_tmpfile().

Thanks,
Miklos
Christian Brauner Sept. 21, 2022, 8:54 a.m. UTC | #3
On Wed, Sep 21, 2022 at 05:06:57AM +0200, Miklos Szeredi wrote:
> On Tue, 20 Sept 2022 at 22:57, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > On Tue, Sep 20, 2022 at 09:36:30PM +0200, Miklos Szeredi wrote:
> >
> > >       inode = child->d_inode;
> >
> > Better
> >         inode = file_inode(file);
> >
> > so that child would be completely ignored after dput().
> >
> > > +     error = vfs_tmpfile(mnt_userns, &path, file, op->mode);
> > > +     if (error)
> > >               goto out2;
> > > -     dput(path.dentry);
> > > -     path.dentry = child;
> > > -     audit_inode(nd->name, child, 0);
> > > +     audit_inode(nd->name, file->f_path.dentry, 0);
> > >       /* Don't check for other permissions, the inode was just created */
> > > -     error = may_open(mnt_userns, &path, 0, op->open_flag);
> >
> > Umm...  I'm not sure that losing it is the right thing - it might
> > be argued that ->permission(..., MAY_OPEN) is to be ignored for
> > tmpfile (and the only thing checking for MAY_OPEN is nfs, which is
> > *not* going to grow tmpfile any time soon - certainly not with these
> > calling conventions), but you are also dropping the call of
> > security_inode_permission(inode, MAY_OPEN) and that's a change
> > compared to what LSM crowd used to get...
> 
> Not losing it, just moving it into vfs_tmpfile().

Afaict, we haven't called may_open() for tmpfile creation in either
cachefiles or overlayfs before. So from that perspective I wonder if
there's a good reason for us to do it now.

The fact that we don't account these kernel internal tmpfiles feels like
another exemption that points to them being internal files that don't
need to be subject to common restrictions.
Christian Brauner Sept. 21, 2022, 9:03 a.m. UTC | #4
On Tue, Sep 20, 2022 at 09:36:30PM +0200, Miklos Szeredi wrote:
> Create a helper finish_open_simple() that opens the file with the original
> dentry.  Handle the error case here as well to simplify callers.
> 
> Call this helper right after ->tmpfile() is called.
> 
> Next patch will change the tmpfile API and move this call into tmpfile
> instances.
> 
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> ---
>  fs/namei.c         | 79 ++++++++++++++++++----------------------------
>  include/linux/fs.h |  9 ++++++
>  2 files changed, 40 insertions(+), 48 deletions(-)
> 
> diff --git a/fs/namei.c b/fs/namei.c
> index 652d09ae66fb..4faf7e743664 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -3583,44 +3583,43 @@ static int do_open(struct nameidata *nd,
>   * On non-idmapped mounts or if permission checking is to be performed on the
>   * raw inode simply passs init_user_ns.
>   */
> -static struct dentry *vfs_tmpfile(struct user_namespace *mnt_userns,
> -			   struct dentry *dentry, umode_t mode, int open_flag)
> +static int vfs_tmpfile(struct user_namespace *mnt_userns,
> +		       const struct path *parentpath,
> +		       struct file *file, umode_t mode)
>  {
> -	struct dentry *child = NULL;
> -	struct inode *dir = dentry->d_inode;
> +	struct dentry *child;
> +	struct inode *dir = d_inode(parentpath->dentry);
>  	struct inode *inode;
>  	int error;
>  
>  	/* we want directory to be writable */
>  	error = inode_permission(mnt_userns, dir, MAY_WRITE | MAY_EXEC);
>  	if (error)
> -		goto out_err;
> -	error = -EOPNOTSUPP;
> +		return error;
>  	if (!dir->i_op->tmpfile)
> -		goto out_err;
> -	error = -ENOMEM;
> -	child = d_alloc(dentry, &slash_name);
> +		return -EOPNOTSUPP;
> +	child = d_alloc(parentpath->dentry, &slash_name);
>  	if (unlikely(!child))
> -		goto out_err;
> +		return -ENOMEM;
> +	file->f_path.mnt = parentpath->mnt;
> +	file->f_path.dentry = child;
>  	mode = vfs_prepare_mode(mnt_userns, dir, mode, mode, mode);
>  	error = dir->i_op->tmpfile(mnt_userns, dir, child, mode);
> +	error = finish_open_simple(file, error);
> +	dput(child);
>  	if (error)
> -		goto out_err;
> -	error = -ENOENT;
> +		return error;
> +	error = may_open(mnt_userns, &file->f_path, 0, file->f_flags);
> +	if (error)
> +		return error;
>  	inode = child->d_inode;
> -	if (unlikely(!inode))
> -		goto out_err;
> -	if (!(open_flag & O_EXCL)) {
> +	if (!(file->f_flags & O_EXCL)) {
>  		spin_lock(&inode->i_lock);
>  		inode->i_state |= I_LINKABLE;
>  		spin_unlock(&inode->i_lock);
>  	}
>  	ima_post_create_tmpfile(mnt_userns, inode);
> -	return child;
> -
> -out_err:
> -	dput(child);
> -	return ERR_PTR(error);
> +	return 0;
>  }
>  
>  /**
> @@ -3641,25 +3640,15 @@ struct file *tmpfile_open(struct user_namespace *mnt_userns,
>  {
>  	struct file *file;
>  	int error;
> -	struct path path = { .mnt = parentpath->mnt };
> -
> -	path.dentry = vfs_tmpfile(mnt_userns, parentpath->dentry, mode, open_flag);
> -	if (IS_ERR(path.dentry))
> -		return ERR_CAST(path.dentry);
> -
> -	error = may_open(mnt_userns, &path, 0, open_flag);
> -	file = ERR_PTR(error);
> -	if (error)
> -		goto out_dput;
> -
> -	/*
> -	 * This relies on the "noaccount" property of fake open, otherwise
> -	 * equivalent to dentry_open().
> -	 */
> -	file = open_with_fake_path(&path, open_flag, d_inode(path.dentry), cred);
> -out_dput:
> -	dput(path.dentry);
>  
> +	file = alloc_empty_file_noaccount(open_flag, cred);
> +	if (!IS_ERR(file)) {
> +		error = vfs_tmpfile(mnt_userns, parentpath, file, mode);
> +		if (error) {
> +			fput(file);
> +			file = ERR_PTR(error);
> +		}
> +	}
>  	return file;
>  }
>  EXPORT_SYMBOL(tmpfile_open);
> @@ -3669,26 +3658,20 @@ static int do_tmpfile(struct nameidata *nd, unsigned flags,
>  		struct file *file)
>  {
>  	struct user_namespace *mnt_userns;
> -	struct dentry *child;
>  	struct path path;
>  	int error = path_lookupat(nd, flags | LOOKUP_DIRECTORY, &path);
> +
>  	if (unlikely(error))
>  		return error;
>  	error = mnt_want_write(path.mnt);
>  	if (unlikely(error))
>  		goto out;
>  	mnt_userns = mnt_user_ns(path.mnt);
> -	child = vfs_tmpfile(mnt_userns, path.dentry, op->mode, op->open_flag);
> -	error = PTR_ERR(child);
> -	if (IS_ERR(child))
> +	error = vfs_tmpfile(mnt_userns, &path, file, op->mode);
> +	if (error)
>  		goto out2;
> -	dput(path.dentry);
> -	path.dentry = child;
> -	audit_inode(nd->name, child, 0);
> +	audit_inode(nd->name, file->f_path.dentry, 0);
>  	/* Don't check for other permissions, the inode was just created */
> -	error = may_open(mnt_userns, &path, 0, op->open_flag);
> -	if (!error)
> -		error = vfs_open(&path, file);
>  out2:
>  	mnt_drop_write(path.mnt);
>  out:
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index a445da4842e0..f0d17eefb966 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2780,6 +2780,15 @@ extern int finish_open(struct file *file, struct dentry *dentry,
>  			int (*open)(struct inode *, struct file *));
>  extern int finish_no_open(struct file *file, struct dentry *dentry);
>  
> +/* Helper for the simple case when original dentry is used */
> +static inline int finish_open_simple(struct file *file, int error)

It would be nice if the new helpers would would be called
vfs_finish_open()/vfs_finish_open_simple() and vfs_tmpfile_open() to
stick with our vfs_* prefix convention.

It is extremely helpful when looking/grepping for helpers and the
consistency we have gained there in recent years is pretty good.
Miklos Szeredi Sept. 21, 2022, 2:53 p.m. UTC | #5
On Wed, 21 Sept 2022 at 10:54, Christian Brauner <brauner@kernel.org> wrote:
>
> On Wed, Sep 21, 2022 at 05:06:57AM +0200, Miklos Szeredi wrote:
> > On Tue, 20 Sept 2022 at 22:57, Al Viro <viro@zeniv.linux.org.uk> wrote:
> > >
> > > On Tue, Sep 20, 2022 at 09:36:30PM +0200, Miklos Szeredi wrote:
> > >
> > > >       inode = child->d_inode;
> > >
> > > Better
> > >         inode = file_inode(file);
> > >
> > > so that child would be completely ignored after dput().
> > >
> > > > +     error = vfs_tmpfile(mnt_userns, &path, file, op->mode);
> > > > +     if (error)
> > > >               goto out2;
> > > > -     dput(path.dentry);
> > > > -     path.dentry = child;
> > > > -     audit_inode(nd->name, child, 0);
> > > > +     audit_inode(nd->name, file->f_path.dentry, 0);
> > > >       /* Don't check for other permissions, the inode was just created */
> > > > -     error = may_open(mnt_userns, &path, 0, op->open_flag);
> > >
> > > Umm...  I'm not sure that losing it is the right thing - it might
> > > be argued that ->permission(..., MAY_OPEN) is to be ignored for
> > > tmpfile (and the only thing checking for MAY_OPEN is nfs, which is
> > > *not* going to grow tmpfile any time soon - certainly not with these
> > > calling conventions), but you are also dropping the call of
> > > security_inode_permission(inode, MAY_OPEN) and that's a change
> > > compared to what LSM crowd used to get...
> >
> > Not losing it, just moving it into vfs_tmpfile().
>
> Afaict, we haven't called may_open() for tmpfile creation in either
> cachefiles or overlayfs before. So from that perspective I wonder if
> there's a good reason for us to do it now.

For overlayfs we did check MAY_WRITE | MAY_OPEN through
ovl_path_open().  Just checking MAY_OPEN relaxes this, but it's in
line with the overlay model of checking the same permissions as if the
operation was invoked directly.

For cachefiles no permission was checked before this patch, so in
theory it could change behavior.  Moving the permission check back out
to callers would fix this, but I'm not entirely sure that that is the
best way forward.

David, what is the model for cachefiles?  Is this okay to check for
permissions on underlying ops, or that must be avoided?

Thanks,
Miklos
Miklos Szeredi Sept. 21, 2022, 2:56 p.m. UTC | #6
On Wed, 21 Sept 2022 at 11:03, Christian Brauner <brauner@kernel.org> wrote:
>
> On Tue, Sep 20, 2022 at 09:36:30PM +0200, Miklos Szeredi wrote:


> > +/* Helper for the simple case when original dentry is used */
> > +static inline int finish_open_simple(struct file *file, int error)
>
> It would be nice if the new helpers would would be called
> vfs_finish_open()/vfs_finish_open_simple() and vfs_tmpfile_open() to
> stick with our vfs_* prefix convention.
>
> It is extremely helpful when looking/grepping for helpers and the
> consistency we have gained there in recent years is pretty good.

Agreed.  However only finish_open_simple() is the new one, and naming
it vfs_finish_open_simple() makes it inconsistent with
finish_open_simple().    I'd just leave this renaming to a separate
patchset and discussion, as it's hard enough to make progress with the
current one without expanding its scope.

Thanks,
Miklos
Christian Brauner Sept. 21, 2022, 3:09 p.m. UTC | #7
On Wed, Sep 21, 2022 at 04:56:01PM +0200, Miklos Szeredi wrote:
> On Wed, 21 Sept 2022 at 11:03, Christian Brauner <brauner@kernel.org> wrote:
> >
> > On Tue, Sep 20, 2022 at 09:36:30PM +0200, Miklos Szeredi wrote:
> 
> 
> > > +/* Helper for the simple case when original dentry is used */
> > > +static inline int finish_open_simple(struct file *file, int error)
> >
> > It would be nice if the new helpers would would be called
> > vfs_finish_open()/vfs_finish_open_simple() and vfs_tmpfile_open() to
> > stick with our vfs_* prefix convention.
> >
> > It is extremely helpful when looking/grepping for helpers and the
> > consistency we have gained there in recent years is pretty good.
> 
> Agreed.  However only finish_open_simple() is the new one, and naming
> it vfs_finish_open_simple() makes it inconsistent with
> finish_open_simple().    I'd just leave this renaming to a separate
> patchset and discussion, as it's hard enough to make progress with the
> current one without expanding its scope.

Ok, the finish_open* thing can be left for later. But vfs_tmpfile_open()
should be doable for this patchset already.
Miklos Szeredi Sept. 21, 2022, 3:24 p.m. UTC | #8
On Wed, 21 Sept 2022 at 17:09, Christian Brauner <brauner@kernel.org> wrote:

> Ok, the finish_open* thing can be left for later. But vfs_tmpfile_open()
> should be doable for this patchset already.

Fixed.

Thanks,
Miklos
Al Viro Sept. 21, 2022, 7:55 p.m. UTC | #9
On Wed, Sep 21, 2022 at 05:06:57AM +0200, Miklos Szeredi wrote:
> On Tue, 20 Sept 2022 at 22:57, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > On Tue, Sep 20, 2022 at 09:36:30PM +0200, Miklos Szeredi wrote:
> >
> > >       inode = child->d_inode;
> >
> > Better
> >         inode = file_inode(file);
> >
> > so that child would be completely ignored after dput().
> >
> > > +     error = vfs_tmpfile(mnt_userns, &path, file, op->mode);
> > > +     if (error)
> > >               goto out2;
> > > -     dput(path.dentry);
> > > -     path.dentry = child;
> > > -     audit_inode(nd->name, child, 0);
> > > +     audit_inode(nd->name, file->f_path.dentry, 0);
> > >       /* Don't check for other permissions, the inode was just created */
> > > -     error = may_open(mnt_userns, &path, 0, op->open_flag);
> >
> > Umm...  I'm not sure that losing it is the right thing - it might
> > be argued that ->permission(..., MAY_OPEN) is to be ignored for
> > tmpfile (and the only thing checking for MAY_OPEN is nfs, which is
> > *not* going to grow tmpfile any time soon - certainly not with these
> > calling conventions), but you are also dropping the call of
> > security_inode_permission(inode, MAY_OPEN) and that's a change
> > compared to what LSM crowd used to get...
> 
> Not losing it, just moving it into vfs_tmpfile().

Sorry, missed that bit...
diff mbox series

Patch

diff --git a/fs/namei.c b/fs/namei.c
index 652d09ae66fb..4faf7e743664 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3583,44 +3583,43 @@  static int do_open(struct nameidata *nd,
  * On non-idmapped mounts or if permission checking is to be performed on the
  * raw inode simply passs init_user_ns.
  */
-static struct dentry *vfs_tmpfile(struct user_namespace *mnt_userns,
-			   struct dentry *dentry, umode_t mode, int open_flag)
+static int vfs_tmpfile(struct user_namespace *mnt_userns,
+		       const struct path *parentpath,
+		       struct file *file, umode_t mode)
 {
-	struct dentry *child = NULL;
-	struct inode *dir = dentry->d_inode;
+	struct dentry *child;
+	struct inode *dir = d_inode(parentpath->dentry);
 	struct inode *inode;
 	int error;
 
 	/* we want directory to be writable */
 	error = inode_permission(mnt_userns, dir, MAY_WRITE | MAY_EXEC);
 	if (error)
-		goto out_err;
-	error = -EOPNOTSUPP;
+		return error;
 	if (!dir->i_op->tmpfile)
-		goto out_err;
-	error = -ENOMEM;
-	child = d_alloc(dentry, &slash_name);
+		return -EOPNOTSUPP;
+	child = d_alloc(parentpath->dentry, &slash_name);
 	if (unlikely(!child))
-		goto out_err;
+		return -ENOMEM;
+	file->f_path.mnt = parentpath->mnt;
+	file->f_path.dentry = child;
 	mode = vfs_prepare_mode(mnt_userns, dir, mode, mode, mode);
 	error = dir->i_op->tmpfile(mnt_userns, dir, child, mode);
+	error = finish_open_simple(file, error);
+	dput(child);
 	if (error)
-		goto out_err;
-	error = -ENOENT;
+		return error;
+	error = may_open(mnt_userns, &file->f_path, 0, file->f_flags);
+	if (error)
+		return error;
 	inode = child->d_inode;
-	if (unlikely(!inode))
-		goto out_err;
-	if (!(open_flag & O_EXCL)) {
+	if (!(file->f_flags & O_EXCL)) {
 		spin_lock(&inode->i_lock);
 		inode->i_state |= I_LINKABLE;
 		spin_unlock(&inode->i_lock);
 	}
 	ima_post_create_tmpfile(mnt_userns, inode);
-	return child;
-
-out_err:
-	dput(child);
-	return ERR_PTR(error);
+	return 0;
 }
 
 /**
@@ -3641,25 +3640,15 @@  struct file *tmpfile_open(struct user_namespace *mnt_userns,
 {
 	struct file *file;
 	int error;
-	struct path path = { .mnt = parentpath->mnt };
-
-	path.dentry = vfs_tmpfile(mnt_userns, parentpath->dentry, mode, open_flag);
-	if (IS_ERR(path.dentry))
-		return ERR_CAST(path.dentry);
-
-	error = may_open(mnt_userns, &path, 0, open_flag);
-	file = ERR_PTR(error);
-	if (error)
-		goto out_dput;
-
-	/*
-	 * This relies on the "noaccount" property of fake open, otherwise
-	 * equivalent to dentry_open().
-	 */
-	file = open_with_fake_path(&path, open_flag, d_inode(path.dentry), cred);
-out_dput:
-	dput(path.dentry);
 
+	file = alloc_empty_file_noaccount(open_flag, cred);
+	if (!IS_ERR(file)) {
+		error = vfs_tmpfile(mnt_userns, parentpath, file, mode);
+		if (error) {
+			fput(file);
+			file = ERR_PTR(error);
+		}
+	}
 	return file;
 }
 EXPORT_SYMBOL(tmpfile_open);
@@ -3669,26 +3658,20 @@  static int do_tmpfile(struct nameidata *nd, unsigned flags,
 		struct file *file)
 {
 	struct user_namespace *mnt_userns;
-	struct dentry *child;
 	struct path path;
 	int error = path_lookupat(nd, flags | LOOKUP_DIRECTORY, &path);
+
 	if (unlikely(error))
 		return error;
 	error = mnt_want_write(path.mnt);
 	if (unlikely(error))
 		goto out;
 	mnt_userns = mnt_user_ns(path.mnt);
-	child = vfs_tmpfile(mnt_userns, path.dentry, op->mode, op->open_flag);
-	error = PTR_ERR(child);
-	if (IS_ERR(child))
+	error = vfs_tmpfile(mnt_userns, &path, file, op->mode);
+	if (error)
 		goto out2;
-	dput(path.dentry);
-	path.dentry = child;
-	audit_inode(nd->name, child, 0);
+	audit_inode(nd->name, file->f_path.dentry, 0);
 	/* Don't check for other permissions, the inode was just created */
-	error = may_open(mnt_userns, &path, 0, op->open_flag);
-	if (!error)
-		error = vfs_open(&path, file);
 out2:
 	mnt_drop_write(path.mnt);
 out:
diff --git a/include/linux/fs.h b/include/linux/fs.h
index a445da4842e0..f0d17eefb966 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2780,6 +2780,15 @@  extern int finish_open(struct file *file, struct dentry *dentry,
 			int (*open)(struct inode *, struct file *));
 extern int finish_no_open(struct file *file, struct dentry *dentry);
 
+/* Helper for the simple case when original dentry is used */
+static inline int finish_open_simple(struct file *file, int error)
+{
+	if (error)
+		return error;
+
+	return finish_open(file, file->f_path.dentry, NULL);
+}
+
 /* fs/dcache.c */
 extern void __init vfs_caches_init_early(void);
 extern void __init vfs_caches_init(void);