diff mbox series

[06/15] vfs: strengthen checking of file range inputs to clone/dedupe range

Message ID 153870031519.29072.18289185889660082318.stgit@magnolia (mailing list archive)
State New, archived
Headers show
Series fs: fixes for serious clone/dedupe problems | expand

Commit Message

Darrick J. Wong Oct. 5, 2018, 12:45 a.m. UTC
From: Darrick J. Wong <darrick.wong@oracle.com>

Clone range is an optimization on a regular file write.  File writes
that extend the file length are subject to various constraints which are
not checked by clonerange.  This is a correctness problem, because we're
never allowed to touch ranges that the page cache can't support
(s_maxbytes); we're not supposed to deal with large offsets
(MAX_NON_LFS) if O_LARGEFILE isn't set; and we must obey resource limits
(RLIMIT_FSIZE).

Therefore, add these checks to the new generic_clone_checks function so
that we curtail unexpected behavior.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 mm/filemap.c |   31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

Comments

Amir Goldstein Oct. 5, 2018, 6:10 a.m. UTC | #1
On Fri, Oct 5, 2018 at 3:46 AM Darrick J. Wong <darrick.wong@oracle.com> wrote:
>
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> Clone range is an optimization on a regular file write.  File writes
> that extend the file length are subject to various constraints which are
> not checked by clonerange.  This is a correctness problem, because we're
> never allowed to touch ranges that the page cache can't support
> (s_maxbytes); we're not supposed to deal with large offsets
> (MAX_NON_LFS) if O_LARGEFILE isn't set; and we must obey resource limits
> (RLIMIT_FSIZE).
>
> Therefore, add these checks to the new generic_clone_checks function so
> that we curtail unexpected behavior.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  mm/filemap.c |   31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
>
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 68ec91d05c7b..f74391721234 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -3015,6 +3015,37 @@ int generic_clone_checks(struct file *file_in, loff_t pos_in,
>                 return -EINVAL;
>         count = min(count, size_in - (uint64_t)pos_in);
>
> +       /* Don't exceed RLMIT_FSIZE in the file we're writing into. */
> +       if (limit != RLIM_INFINITY) {
> +               if (pos_out >= limit) {
> +                       send_sig(SIGXFSZ, current, 0);
> +                       return -EFBIG;
> +               }
> +               count = min(count, limit - (uint64_t)pos_out);
> +       }
> +
> +       /* Don't exceed the LFS limits. */
> +       if (unlikely(pos_out + count > MAX_NON_LFS &&
> +                               !(file_out->f_flags & O_LARGEFILE))) {
> +               if (pos_out >= MAX_NON_LFS)
> +                       return -EFBIG;
> +               count = min(count, MAX_NON_LFS - (uint64_t)pos_out);
> +       }
> +       if (unlikely(pos_in + count > MAX_NON_LFS &&
> +                               !(file_in->f_flags & O_LARGEFILE))) {
> +               if (pos_in >= MAX_NON_LFS)
> +                       return -EFBIG;
> +               count = min(count, MAX_NON_LFS - (uint64_t)pos_in);
> +       }
> +
> +       /* Don't operate on ranges the page cache doesn't support. */
> +       if (unlikely(pos_out >= inode_out->i_sb->s_maxbytes ||
> +                    pos_in >= inode_in->i_sb->s_maxbytes))
> +               return -EFBIG;
> +

Forget my standards, this doesn't abide by your own standards ;-)
Please factor out generic_write_checks() and use it instead of
duplicating the code. The in/out variant doesn't justify not calling
the helper twice IMO.

Thanks,
Amir.
Darrick J. Wong Oct. 5, 2018, 5:36 p.m. UTC | #2
On Fri, Oct 05, 2018 at 09:10:12AM +0300, Amir Goldstein wrote:
> On Fri, Oct 5, 2018 at 3:46 AM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> >
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> >
> > Clone range is an optimization on a regular file write.  File writes
> > that extend the file length are subject to various constraints which are
> > not checked by clonerange.  This is a correctness problem, because we're
> > never allowed to touch ranges that the page cache can't support
> > (s_maxbytes); we're not supposed to deal with large offsets
> > (MAX_NON_LFS) if O_LARGEFILE isn't set; and we must obey resource limits
> > (RLIMIT_FSIZE).
> >
> > Therefore, add these checks to the new generic_clone_checks function so
> > that we curtail unexpected behavior.
> >
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  mm/filemap.c |   31 +++++++++++++++++++++++++++++++
> >  1 file changed, 31 insertions(+)
> >
> >
> > diff --git a/mm/filemap.c b/mm/filemap.c
> > index 68ec91d05c7b..f74391721234 100644
> > --- a/mm/filemap.c
> > +++ b/mm/filemap.c
> > @@ -3015,6 +3015,37 @@ int generic_clone_checks(struct file *file_in, loff_t pos_in,
> >                 return -EINVAL;
> >         count = min(count, size_in - (uint64_t)pos_in);
> >
> > +       /* Don't exceed RLMIT_FSIZE in the file we're writing into. */
> > +       if (limit != RLIM_INFINITY) {
> > +               if (pos_out >= limit) {
> > +                       send_sig(SIGXFSZ, current, 0);
> > +                       return -EFBIG;
> > +               }
> > +               count = min(count, limit - (uint64_t)pos_out);
> > +       }
> > +
> > +       /* Don't exceed the LFS limits. */
> > +       if (unlikely(pos_out + count > MAX_NON_LFS &&
> > +                               !(file_out->f_flags & O_LARGEFILE))) {
> > +               if (pos_out >= MAX_NON_LFS)
> > +                       return -EFBIG;
> > +               count = min(count, MAX_NON_LFS - (uint64_t)pos_out);
> > +       }
> > +       if (unlikely(pos_in + count > MAX_NON_LFS &&
> > +                               !(file_in->f_flags & O_LARGEFILE))) {
> > +               if (pos_in >= MAX_NON_LFS)
> > +                       return -EFBIG;
> > +               count = min(count, MAX_NON_LFS - (uint64_t)pos_in);
> > +       }
> > +
> > +       /* Don't operate on ranges the page cache doesn't support. */
> > +       if (unlikely(pos_out >= inode_out->i_sb->s_maxbytes ||
> > +                    pos_in >= inode_in->i_sb->s_maxbytes))
> > +               return -EFBIG;
> > +
> 
> Forget my standards, this doesn't abide by your own standards ;-)
> Please factor out generic_write_checks() and use it instead of
> duplicating the code. The in/out variant doesn't justify not calling
> the helper twice IMO.

Factor generic_write_checks and generic_clone_checks how?  They operate
on very different parameter types.

Or were you suggeseting refactoring just the "Dont' exceed LFS limits"
and "Don't operate on ranges the page cache..." sections of
generic_clone_checks to reduce copy paste?  That I'll do.

--D

> 
> Thanks,
> Amir.
diff mbox series

Patch

diff --git a/mm/filemap.c b/mm/filemap.c
index 68ec91d05c7b..f74391721234 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3015,6 +3015,37 @@  int generic_clone_checks(struct file *file_in, loff_t pos_in,
 		return -EINVAL;
 	count = min(count, size_in - (uint64_t)pos_in);
 
+	/* Don't exceed RLMIT_FSIZE in the file we're writing into. */
+	if (limit != RLIM_INFINITY) {
+		if (pos_out >= limit) {
+			send_sig(SIGXFSZ, current, 0);
+			return -EFBIG;
+		}
+		count = min(count, limit - (uint64_t)pos_out);
+	}
+
+	/* Don't exceed the LFS limits. */
+	if (unlikely(pos_out + count > MAX_NON_LFS &&
+				!(file_out->f_flags & O_LARGEFILE))) {
+		if (pos_out >= MAX_NON_LFS)
+			return -EFBIG;
+		count = min(count, MAX_NON_LFS - (uint64_t)pos_out);
+	}
+	if (unlikely(pos_in + count > MAX_NON_LFS &&
+				!(file_in->f_flags & O_LARGEFILE))) {
+		if (pos_in >= MAX_NON_LFS)
+			return -EFBIG;
+		count = min(count, MAX_NON_LFS - (uint64_t)pos_in);
+	}
+
+	/* Don't operate on ranges the page cache doesn't support. */
+	if (unlikely(pos_out >= inode_out->i_sb->s_maxbytes ||
+		     pos_in >= inode_in->i_sb->s_maxbytes))
+		return -EFBIG;
+
+	count = min(count, inode_out->i_sb->s_maxbytes - (uint64_t)pos_out);
+	count = min(count, inode_in->i_sb->s_maxbytes - (uint64_t)pos_in);
+
 	/*
 	 * If the user wanted us to link to the infile's EOF, round up to the
 	 * next block boundary for this check.