Message ID | 20231114153321.1716028-8-amir73il@gmail.com (mailing list archive) |
---|---|
State | New |
Series | Tidy up file permission hooks |
On Tue, Nov 14, 2023 at 05:33:13PM +0200, Amir Goldstein wrote:
> In vfs code, file_start_write() is usually called after the permission
> hook in rw_verify_area().  vfs_dedupe_file_range_one() is an exception
> to this rule.
>
> In vfs_dedupe_file_range_one(), move file_start_write() to after the
> rw_verify_area() checks to make them "start-write-safe".
>
> This is needed for fanotify "pre content" events.
>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/remap_range.c | 32 +++++++++++++-------------------
>  1 file changed, 13 insertions(+), 19 deletions(-)
>
> diff --git a/fs/remap_range.c b/fs/remap_range.c
> index 42f79cb2b1b1..de4b09d0ba1d 100644
> --- a/fs/remap_range.c
> +++ b/fs/remap_range.c
> @@ -445,46 +445,40 @@ loff_t vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos,
>  	WARN_ON_ONCE(remap_flags & ~(REMAP_FILE_DEDUP |
>  				     REMAP_FILE_CAN_SHORTEN));
>
> -	ret = mnt_want_write_file(dst_file);
> -	if (ret)
> -		return ret;
> -
>  	/*
>  	 * This is redundant if called from vfs_dedupe_file_range(), but other
>  	 * callers need it and it's not performance sesitive...
>  	 */
>  	ret = remap_verify_area(src_file, src_pos, len, false);
>  	if (ret)
> -		goto out_drop_write;
> +		return ret;
>
>  	ret = remap_verify_area(dst_file, dst_pos, len, true);
>  	if (ret)
> -		goto out_drop_write;
> +		return ret;
>
> -	ret = -EPERM;
>  	if (!allow_file_dedupe(dst_file))
> -		goto out_drop_write;
> +		return -EPERM;

So that check specifically should come after mnt_want_write_file()
because it calls inode_permission() which takes the mount's idmapping
into account. And before you hold mnt_want_write_file() the idmapping of
the mount can still change. Once you've gotten write access though we
tell anyone trying to change the mount's write-relevant properties
to go away.

With your changes that check might succeed now but fail later. So please
move that check below mnt_want_write_file(). That shouldn't be a
problem.

Fwiw, for security_file_permission() it doesn't matter because the LSMs
don't care about DAC permission - at least not the ones that currently
implement the hook. I verified that years ago and just rechecked. If
they start caring - which I sincerely hope they don't - then we have to
do a bunch of rework anyway to make that work reliably. But I doubt
that'll happen or that we'll let it happen.

While at it, please rename allow_file_dedupe() to may_dedupe_file() so
it mirrors our helpers in fs/namei.c.
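
For illustration only, the ordering concern above can be modelled in user space. The sketch below is a toy, not kernel code: struct toy_mount, toy_want_write(), toy_drop_write(), toy_change_idmap(), toy_may_dedupe() and toy_dedupe() are hypothetical stand-ins, and the -EBUSY return for a refused idmapping change is an assumption of the model. It shows the permission check running only once write access is held, with write access dropped on every exit path.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a mount; not the kernel's struct. */
struct toy_mount {
	int writers;      /* number of active write holders */
	int idmap_owner;  /* uid the (toy) idmapping maps the file owner to */
};

/* Models mnt_want_write_file(): pins the mount against property changes. */
int toy_want_write(struct toy_mount *m)
{
	m->writers++;
	return 0;
}

/* Models mnt_drop_write_file(). */
void toy_drop_write(struct toy_mount *m)
{
	m->writers--;
}

/*
 * Models a change to a write-relevant mount property. In this toy model
 * (an assumption) it is refused with -EBUSY while any writer is active.
 */
int toy_change_idmap(struct toy_mount *m, int new_owner)
{
	if (m->writers > 0)
		return -EBUSY;
	m->idmap_owner = new_owner;
	return 0;
}

/* Models the permission check: its answer depends on the idmapping. */
bool toy_may_dedupe(const struct toy_mount *m, int caller_uid)
{
	return m->idmap_owner == caller_uid;
}

/* Ordering asked for in review: get write access first, then check. */
int toy_dedupe(struct toy_mount *m, int caller_uid)
{
	int ret = toy_want_write(m);

	if (ret)
		return ret;

	ret = -EPERM;
	if (!toy_may_dedupe(m, caller_uid))
		goto out_drop_write;

	/* The real code would call ->remap_file_range() here. */
	ret = 0;

out_drop_write:
	toy_drop_write(m);
	return ret;
}
```

While a writer is pinned, toy_change_idmap() is refused in this model, so the answer from toy_may_dedupe() cannot go stale between the check and the operation.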

> the mount can still change. Once you've gotten write access though we
> tell anyone trying to change the mount's write-relevant properties
> to go away.

I should also clarify that this is unlikely to matter in practice. It's
more about correctness. You have to be in a very specific scenario for
that to even be a relevant concern.

On Tue, Nov 21, 2023 at 5:10 PM Christian Brauner <brauner@kernel.org> wrote:
>
> On Tue, Nov 14, 2023 at 05:33:13PM +0200, Amir Goldstein wrote:
> > In vfs code, file_start_write() is usually called after the permission
> > hook in rw_verify_area().  vfs_dedupe_file_range_one() is an exception
> > to this rule.
> >
> > In vfs_dedupe_file_range_one(), move file_start_write() to after the
> > rw_verify_area() checks to make them "start-write-safe".
> >
> > This is needed for fanotify "pre content" events.
> >
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > ---
> >  fs/remap_range.c | 32 +++++++++++++-------------------
> >  1 file changed, 13 insertions(+), 19 deletions(-)
> >
> > diff --git a/fs/remap_range.c b/fs/remap_range.c
> > index 42f79cb2b1b1..de4b09d0ba1d 100644
> > --- a/fs/remap_range.c
> > +++ b/fs/remap_range.c
> > @@ -445,46 +445,40 @@ loff_t vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos,
> >  	WARN_ON_ONCE(remap_flags & ~(REMAP_FILE_DEDUP |
> >  				     REMAP_FILE_CAN_SHORTEN));
> >
> > -	ret = mnt_want_write_file(dst_file);
> > -	if (ret)
> > -		return ret;
> > -
> >  	/*
> >  	 * This is redundant if called from vfs_dedupe_file_range(), but other
> >  	 * callers need it and it's not performance sesitive...
> >  	 */
> >  	ret = remap_verify_area(src_file, src_pos, len, false);
> >  	if (ret)
> > -		goto out_drop_write;
> > +		return ret;
> >
> >  	ret = remap_verify_area(dst_file, dst_pos, len, true);
> >  	if (ret)
> > -		goto out_drop_write;
> > +		return ret;
> >
> > -	ret = -EPERM;
> >  	if (!allow_file_dedupe(dst_file))
> > -		goto out_drop_write;
> > +		return -EPERM;
>
> So that check specifically should come after mnt_want_write_file()
> because it calls inode_permission() which takes the mount's idmapping
> into account. And before you hold mnt_want_write_file() the idmapping of
> the mount can still change. Once you've gotten write access though we
> tell anyone trying to change the mount's write-relevant properties
> to go away.
>
> With your changes that check might succeed now but fail later. So please
> move that check below mnt_want_write_file(). That shouldn't be a
> problem.
>

Right. Good catch!

Thanks,
Amir.

diff --git a/fs/remap_range.c b/fs/remap_range.c
index 42f79cb2b1b1..de4b09d0ba1d 100644
--- a/fs/remap_range.c
+++ b/fs/remap_range.c
@@ -445,46 +445,40 @@ loff_t vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos,
 	WARN_ON_ONCE(remap_flags & ~(REMAP_FILE_DEDUP |
 				     REMAP_FILE_CAN_SHORTEN));
 
-	ret = mnt_want_write_file(dst_file);
-	if (ret)
-		return ret;
-
 	/*
 	 * This is redundant if called from vfs_dedupe_file_range(), but other
 	 * callers need it and it's not performance sesitive...
 	 */
 	ret = remap_verify_area(src_file, src_pos, len, false);
 	if (ret)
-		goto out_drop_write;
+		return ret;
 
 	ret = remap_verify_area(dst_file, dst_pos, len, true);
 	if (ret)
-		goto out_drop_write;
+		return ret;
 
-	ret = -EPERM;
 	if (!allow_file_dedupe(dst_file))
-		goto out_drop_write;
+		return -EPERM;
 
-	ret = -EXDEV;
 	if (file_inode(src_file)->i_sb != file_inode(dst_file)->i_sb)
-		goto out_drop_write;
+		return -EXDEV;
 
-	ret = -EISDIR;
 	if (S_ISDIR(file_inode(dst_file)->i_mode))
-		goto out_drop_write;
+		return -EISDIR;
 
-	ret = -EINVAL;
 	if (!dst_file->f_op->remap_file_range)
-		goto out_drop_write;
+		return -EINVAL;
 
-	if (len == 0) {
-		ret = 0;
-		goto out_drop_write;
-	}
+	if (len == 0)
+		return 0;
+
+	ret = mnt_want_write_file(dst_file);
+	if (ret)
+		return ret;
 
 	ret = dst_file->f_op->remap_file_range(src_file, src_pos, dst_file,
 			dst_pos, len, remap_flags | REMAP_FILE_DEDUP);
-out_drop_write:
+
 	mnt_drop_write_file(dst_file);
 
 	return ret;

In vfs code, file_start_write() is usually called after the permission
hook in rw_verify_area().  vfs_dedupe_file_range_one() is an exception
to this rule.

In vfs_dedupe_file_range_one(), move file_start_write() to after the
rw_verify_area() checks to make them "start-write-safe".

This is needed for fanotify "pre content" events.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/remap_range.c | 32 +++++++++++++-------------------
 1 file changed, 13 insertions(+), 19 deletions(-)