[07/15] remap_range: move file_start_write() to after permission hook

Message ID 20231114153321.1716028-8-amir73il@gmail.com (mailing list archive)
State New
Series Tidy up file permission hooks

Commit Message

Amir Goldstein Nov. 14, 2023, 3:33 p.m. UTC
In vfs code, file_start_write() is usually called after the permission
hook in rw_verify_area().  vfs_dedupe_file_range_one() is an exception
to this rule.

In vfs_dedupe_file_range_one(), move file_start_write() to after the
rw_verify_area() checks to make them "start-write-safe".

This is needed for fanotify "pre content" events.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/remap_range.c | 32 +++++++++++++-------------------
 1 file changed, 13 insertions(+), 19 deletions(-)

Comments

Christian Brauner Nov. 21, 2023, 3:10 p.m. UTC | #1
On Tue, Nov 14, 2023 at 05:33:13PM +0200, Amir Goldstein wrote:
> In vfs code, file_start_write() is usually called after the permission
> hook in rw_verify_area().  vfs_dedupe_file_range_one() is an exception
> to this rule.
> 
> In vfs_dedupe_file_range_one(), move file_start_write() to after the
> the rw_verify_area() checks to make them "start-write-safe".
> 
> This is needed for fanotify "pre content" events.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/remap_range.c | 32 +++++++++++++-------------------
>  1 file changed, 13 insertions(+), 19 deletions(-)
> 
> diff --git a/fs/remap_range.c b/fs/remap_range.c
> index 42f79cb2b1b1..de4b09d0ba1d 100644
> --- a/fs/remap_range.c
> +++ b/fs/remap_range.c
> @@ -445,46 +445,40 @@ loff_t vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos,
>  	WARN_ON_ONCE(remap_flags & ~(REMAP_FILE_DEDUP |
>  				     REMAP_FILE_CAN_SHORTEN));
>  
> -	ret = mnt_want_write_file(dst_file);
> -	if (ret)
> -		return ret;
> -
>  	/*
>  	 * This is redundant if called from vfs_dedupe_file_range(), but other
>  	 * callers need it and it's not performance sesitive...
>  	 */
>  	ret = remap_verify_area(src_file, src_pos, len, false);
>  	if (ret)
> -		goto out_drop_write;
> +		return ret;
>  
>  	ret = remap_verify_area(dst_file, dst_pos, len, true);
>  	if (ret)
> -		goto out_drop_write;
> +		return ret;
>  
> -	ret = -EPERM;
>  	if (!allow_file_dedupe(dst_file))
> -		goto out_drop_write;
> +		return -EPERM;

So that check specifically should come after mnt_want_write_file()
because it calls inode_permission() which takes the mount's idmapping
into account. And before you hold mnt_want_write_file() the idmapping of
the mount can still change. Once you've gotten write access though we
tell anyone trying to change the mount's write-relevant properties
to go away.

With your changes that check might succeed now but fail later. So please
move that check below mnt_want_write_file(). That shouldn't be a
problem.
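
As a rough illustration of the ordering requested above (not the actual
kernel code), here is a small userspace C model. Every model_* name, the
integer "idmap", and the ownership rule are hypothetical stand-ins; only
the ordering is the point: write access is taken first, the
idmapping-dependent check runs while it is held, and the error path drops
write access again.

```c
/* Userspace model only: every name below is a hypothetical stand-in. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct model_mount {
	int writers;	/* active write-access holders */
	int idmap;	/* stand-in for the mount's idmapping */
};

struct model_file {
	struct model_mount *mnt;
	int owner;	/* id allowed to dedupe into this file when idmap == 0 */
};

/* Stand-in for mnt_want_write_file(): pin write access to the mount. */
static int model_want_write(struct model_file *f)
{
	f->mnt->writers++;
	return 0;
}

/* Stand-in for mnt_drop_write_file(). */
static void model_drop_write(struct model_file *f)
{
	assert(f->mnt->writers > 0);
	f->mnt->writers--;
}

/* Changing write-relevant mount properties is refused while writers exist. */
static int model_set_idmap(struct model_mount *m, int idmap)
{
	if (m->writers > 0)
		return -EBUSY;
	m->idmap = idmap;
	return 0;
}

/* Stand-in for may_dedupe_file(): the answer depends on the idmapping. */
static bool model_may_dedupe(const struct model_file *f, int caller)
{
	return caller == f->owner + f->mnt->idmap;
}

static int model_dedupe_one(struct model_file *dst, int caller)
{
	int ret;

	/* rw_verify_area()-style hooks would run here, before write access. */

	ret = model_want_write(dst);
	if (ret)
		return ret;

	/* The idmapping cannot change from here until model_drop_write(). */
	ret = -EPERM;
	if (!model_may_dedupe(dst, caller))
		goto out_drop_write;

	ret = 0;	/* the real code would call ->remap_file_range() here */
out_drop_write:
	model_drop_write(dst);
	return ret;
}
```

In this model, model_set_idmap() returns -EBUSY while a writer is active,
which corresponds to the "go away" behaviour described above. Whether
fs/remap_range.c ends up structured exactly like this is up to the next
revision of the patch.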

Fwiw, for security_file_permission() it doesn't matter because the LSMs
don't care about DAC permission - at least not the ones that currently
implement the hook. I verified that years ago and just rechecked. If
they start caring - which I sincerely hope they don't - then we have to
do a bunch of rework anyway to make that work reliably. But I doubt
that'll happen or we'll let that happen.

While at it, please rename allow_file_dedupe() to may_dedupe_file() so
it mirrors our helpers in fs/namei.c.

Christian Brauner Nov. 21, 2023, 3:47 p.m. UTC | #2
> the mount can still change. Once you've gotten write access though we
> tell the anyone trying to change the mount's write-relevant properties
> to go away.

I should also clarify that this is unlikely to matter in practice. It's
more about correctness. You have to be in a very specific scenario for
that to even be a relevant concern.

Amir Goldstein Nov. 21, 2023, 6:39 p.m. UTC | #3
On Tue, Nov 21, 2023 at 5:10 PM Christian Brauner <brauner@kernel.org> wrote:
>
> On Tue, Nov 14, 2023 at 05:33:13PM +0200, Amir Goldstein wrote:
> > In vfs code, file_start_write() is usually called after the permission
> > hook in rw_verify_area().  vfs_dedupe_file_range_one() is an exception
> > to this rule.
> >
> > In vfs_dedupe_file_range_one(), move file_start_write() to after the
> > the rw_verify_area() checks to make them "start-write-safe".
> >
> > This is needed for fanotify "pre content" events.
> >
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > ---
> >  fs/remap_range.c | 32 +++++++++++++-------------------
> >  1 file changed, 13 insertions(+), 19 deletions(-)
> >
> > diff --git a/fs/remap_range.c b/fs/remap_range.c
> > index 42f79cb2b1b1..de4b09d0ba1d 100644
> > --- a/fs/remap_range.c
> > +++ b/fs/remap_range.c
> > @@ -445,46 +445,40 @@ loff_t vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos,
> >       WARN_ON_ONCE(remap_flags & ~(REMAP_FILE_DEDUP |
> >                                    REMAP_FILE_CAN_SHORTEN));
> >
> > -     ret = mnt_want_write_file(dst_file);
> > -     if (ret)
> > -             return ret;
> > -
> >       /*
> >        * This is redundant if called from vfs_dedupe_file_range(), but other
> >        * callers need it and it's not performance sesitive...
> >        */
> >       ret = remap_verify_area(src_file, src_pos, len, false);
> >       if (ret)
> > -             goto out_drop_write;
> > +             return ret;
> >
> >       ret = remap_verify_area(dst_file, dst_pos, len, true);
> >       if (ret)
> > -             goto out_drop_write;
> > +             return ret;
> >
> > -     ret = -EPERM;
> >       if (!allow_file_dedupe(dst_file))
> > -             goto out_drop_write;
> > +             return -EPERM;
>
> So that check specifically should come after mnt_want_write_file()
> because it calls inode_permission() which takes the mount's idmapping
> into account. And before you hold mnt_want_write_file() the idmapping of
> the mount can still change. Once you've gotten write access though we
> tell the anyone trying to change the mount's write-relevant properties
> to go away.
>
> With your changes that check might succeed now but fail later. So please
> move that check below mnt_want_write_file(). That shouldn't be a
> problem.
>

Right. Good catch!

Thanks,
Amir.

Patch

diff --git a/fs/remap_range.c b/fs/remap_range.c
index 42f79cb2b1b1..de4b09d0ba1d 100644
--- a/fs/remap_range.c
+++ b/fs/remap_range.c
@@ -445,46 +445,40 @@  loff_t vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos,
 	WARN_ON_ONCE(remap_flags & ~(REMAP_FILE_DEDUP |
 				     REMAP_FILE_CAN_SHORTEN));
 
-	ret = mnt_want_write_file(dst_file);
-	if (ret)
-		return ret;
-
 	/*
 	 * This is redundant if called from vfs_dedupe_file_range(), but other
 	 * callers need it and it's not performance sesitive...
 	 */
 	ret = remap_verify_area(src_file, src_pos, len, false);
 	if (ret)
-		goto out_drop_write;
+		return ret;
 
 	ret = remap_verify_area(dst_file, dst_pos, len, true);
 	if (ret)
-		goto out_drop_write;
+		return ret;
 
-	ret = -EPERM;
 	if (!allow_file_dedupe(dst_file))
-		goto out_drop_write;
+		return -EPERM;
 
-	ret = -EXDEV;
 	if (file_inode(src_file)->i_sb != file_inode(dst_file)->i_sb)
-		goto out_drop_write;
+		return -EXDEV;
 
-	ret = -EISDIR;
 	if (S_ISDIR(file_inode(dst_file)->i_mode))
-		goto out_drop_write;
+		return -EISDIR;
 
-	ret = -EINVAL;
 	if (!dst_file->f_op->remap_file_range)
-		goto out_drop_write;
+		return -EINVAL;
 
-	if (len == 0) {
-		ret = 0;
-		goto out_drop_write;
-	}
+	if (len == 0)
+		return 0;
+
+	ret = mnt_want_write_file(dst_file);
+	if (ret)
+		return ret;
 
 	ret = dst_file->f_op->remap_file_range(src_file, src_pos, dst_file,
 			dst_pos, len, remap_flags | REMAP_FILE_DEDUP);
-out_drop_write:
+
 	mnt_drop_write_file(dst_file);
 
 	return ret;