diff mbox series

[1/5] string.h: add array-wrappers for (v)memdup_user()

Message ID 46f667e154393a930a97d2218d8e90286d93a062.1693386602.git.pstanner@redhat.com (mailing list archive)
State New, archived
Headers show
Series Introduce new wrappers to copy user-arrays | expand

Commit Message

Philipp Stanner Aug. 30, 2023, 1:45 p.m. UTC
Currently, user array duplications are sometimes done without an
overflow check. Sometimes the checks are done manually; sometimes the
array size is calculated with array_size() and sometimes by calculating
n * size directly in code.

Introduce wrappers for arrays for memdup_user() and vmemdup_user() to
provide a standardized and safe way for duplicating user arrays.

This is both for new code as well as replacing usage of (v)memdup_user()
in existing code that uses, e.g., n * size to calculate array sizes.

Suggested-by: David Airlie <airlied@redhat.com>
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
---
 include/linux/string.h | 42 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

Comments

Andy Shevchenko Aug. 30, 2023, 2:11 p.m. UTC | #1
On Wed, Aug 30, 2023 at 4:46 PM Philipp Stanner <pstanner@redhat.com> wrote:
>
> Currently, user array duplications are sometimes done without an
> overflow check. Sometimes the checks are done manually; sometimes the
> array size is calculated with array_size() and sometimes by calculating
> n * size directly in code.
>
> Introduce wrappers for arrays for memdup_user() and vmemdup_user() to
> provide a standardized and safe way for duplicating user arrays.
>
> This is both for new code as well as replacing usage of (v)memdup_user()
> in existing code that uses, e.g., n * size to calculate array sizes.

...

> --- a/include/linux/string.h
> +++ b/include/linux/string.h

I'm wondering if this has no side-effects as string.h/string.c IIRC is
used also for early stages where some of the APIs are not available.

> @@ -6,6 +6,8 @@
>  #include <linux/types.h>       /* for size_t */
>  #include <linux/stddef.h>      /* for NULL */
>  #include <linux/errno.h>       /* for E2BIG */
> +#include <linux/overflow.h>    /* for check_mul_overflow() */
> +#include <linux/err.h>         /* for ERR_PTR() */

Can we preserve order (to some extent)?

>  #include <linux/stdarg.h>
>  #include <uapi/linux/string.h>

...

> +/**
> + * memdup_array_user - duplicate array from user space

> + *

Do we need this blank line?

> + * @src: source address in user space
> + * @n: number of array members to copy
> + * @size: size of one array member
> + *
> + * Return: an ERR_PTR() on failure.  Result is physically
> + * contiguous, to be freed by kfree().
> + */

...

> +/**
> + * vmemdup_array_user - duplicate array from user space

> + *

Redundant?

> + * @src: source address in user space
> + * @n: number of array members to copy
> + * @size: size of one array member
> + *
> + * Return: an ERR_PTR() on failure.  Result may be not
> + * physically contiguous.  Use kvfree() to free.
> + */
Andy Shevchenko Aug. 30, 2023, 2:15 p.m. UTC | #2
On Wed, Aug 30, 2023 at 4:46 PM Philipp Stanner <pstanner@redhat.com> wrote:

> +       if (unlikely(check_mul_overflow(n, size, &nbytes)))
> +               return ERR_PTR(-EINVAL);

> +       if (unlikely(check_mul_overflow(n, size, &nbytes)))
> +               return ERR_PTR(-EINVAL);

Btw, why not -EOVERFLOW ?
Philipp Stanner Aug. 30, 2023, 2:19 p.m. UTC | #3
On Wed, 2023-08-30 at 17:11 +0300, Andy Shevchenko wrote:
> On Wed, Aug 30, 2023 at 4:46 PM Philipp Stanner <pstanner@redhat.com>
> wrote:
> > 
> > Currently, user array duplications are sometimes done without an
> > overflow check. Sometimes the checks are done manually; sometimes
> > the
> > array size is calculated with array_size() and sometimes by
> > calculating
> > n * size directly in code.
> > 
> > Introduce wrappers for arrays for memdup_user() and vmemdup_user()
> > to
> > provide a standardized and safe way for duplicating user arrays.
> > 
> > This is both for new code as well as replacing usage of
> > (v)memdup_user()
> > in existing code that uses, e.g., n * size to calculate array
> > sizes.
> 
> ...
> 
> > --- a/include/linux/string.h
> > +++ b/include/linux/string.h
> 
> I'm wondering if this has no side-effects as string.h/string.c IIRC
> is
> used also for early stages where some of the APIs are not available.
> 
> > @@ -6,6 +6,8 @@
> >  #include <linux/types.h>       /* for size_t */
> >  #include <linux/stddef.h>      /* for NULL */
> >  #include <linux/errno.h>       /* for E2BIG */
> > +#include <linux/overflow.h>    /* for check_mul_overflow() */
> > +#include <linux/err.h>         /* for ERR_PTR() */
> 
> Can we preserve order (to some extent)?

Sure. I just put it there so the comments build a congruent block.
Which order would you prefer?

> 
> >  #include <linux/stdarg.h>
> >  #include <uapi/linux/string.h>
> 
> ...
> 
> > +/**
> > + * memdup_array_user - duplicate array from user space
> 
> > + *
> 
> Do we need this blank line?

I more or less directly copied the docstring format from the original
functions (v)memdup_user() in mm/util.c
I guess this is common style?

> 
> > + * @src: source address in user space
> > + * @n: number of array members to copy
> > + * @size: size of one array member
> > + *
> > + * Return: an ERR_PTR() on failure.  Result is physically
> > + * contiguous, to be freed by kfree().
> > + */
> 
> ...
> 
> > +/**
> > + * vmemdup_array_user - duplicate array from user space
> 
> > + *
> 
> Redundant?

No, there are two functions:
 * memdup_array_user()
 * vmemdup_array_user()

On the deeper layers they utilize kmalloc() or kvmalloc(),
respectively.


Greetings,
P.

> 
> > + * @src: source address in user space
> > + * @n: number of array members to copy
> > + * @size: size of one array member
> > + *
> > + * Return: an ERR_PTR() on failure.  Result may be not
> > + * physically contiguous.  Use kvfree() to free.
> > + */
>
Philipp Stanner Aug. 30, 2023, 2:23 p.m. UTC | #4
On Wed, 2023-08-30 at 17:15 +0300, Andy Shevchenko wrote:
> On Wed, Aug 30, 2023 at 4:46 PM Philipp Stanner <pstanner@redhat.com>
> wrote:
> 
> > +       if (unlikely(check_mul_overflow(n, size, &nbytes)))
> > +               return ERR_PTR(-EINVAL);
> 
> > +       if (unlikely(check_mul_overflow(n, size, &nbytes)))
> > +               return ERR_PTR(-EINVAL);
> 
> Btw, why not -EOVERFLOW ?
> 

Good question, actually.
To be honest I wasn't quite sure which code to pick (-E2BIG was also
once I candidate).

-EINVAL was picked because the idea was that a request overflowing a
size_t could surely be expected to contain an invalid parameter,
because no one would ever request an array _that_ large

?
Andy Shevchenko Aug. 30, 2023, 2:29 p.m. UTC | #5
On Wed, Aug 30, 2023 at 5:19 PM <pstanner@redhat.com> wrote:
> On Wed, 2023-08-30 at 17:11 +0300, Andy Shevchenko wrote:
> > On Wed, Aug 30, 2023 at 4:46 PM Philipp Stanner <pstanner@redhat.com>
> > wrote:

> > > --- a/include/linux/string.h
> > > +++ b/include/linux/string.h
> >
> > I'm wondering if this has no side-effects as string.h/string.c IIRC
> > is
> > used also for early stages where some of the APIs are not available.
> >
> > > @@ -6,6 +6,8 @@
> > >  #include <linux/types.h>       /* for size_t */
> > >  #include <linux/stddef.h>      /* for NULL */
> > >  #include <linux/errno.h>       /* for E2BIG */
> > > +#include <linux/overflow.h>    /* for check_mul_overflow() */
> > > +#include <linux/err.h>         /* for ERR_PTR() */
> >
> > Can we preserve order (to some extent)?
>
> Sure. I just put it there so the comments build a congruent block.
> Which order would you prefer?

Alphabetical.

compiler.h
err.h
overflow.h
...the rest that is a bit unordered...

> > >  #include <linux/stdarg.h>
> > >  #include <uapi/linux/string.h>

...

> > > +/**
> > > + * memdup_array_user - duplicate array from user space
> >
> > > + *
> >
> > Do we need this blank line?
>
> I more or less directly copied the docstring format from the original
> functions (v)memdup_user() in mm/util.c
> I guess this is common style?

I think it's not. But you may grep kernel source tree and tell which
one occurs more often with or without this (unneeded) blank line.

> > > + * @src: source address in user space
> > > + * @n: number of array members to copy
> > > + * @size: size of one array member
> > > + *
> > > + * Return: an ERR_PTR() on failure.  Result is physically
> > > + * contiguous, to be freed by kfree().
> > > + */

...

> > > +/**
> > > + * vmemdup_array_user - duplicate array from user space
> >
> > > + *
> >
> > Redundant?
>
> No, there are two functions:
>  * memdup_array_user()
>  * vmemdup_array_user()
>
> On the deeper layers they utilize kmalloc() or kvmalloc(),
> respectively.

I guess you misunderstood my comment. I was talking about kernel doc
(as in the previous function).

> > > + * @src: source address in user space
> > > + * @n: number of array members to copy
> > > + * @size: size of one array member
> > > + *
> > > + * Return: an ERR_PTR() on failure.  Result may be not
> > > + * physically contiguous.  Use kvfree() to free.
> > > + */
Philipp Stanner Aug. 30, 2023, 7:15 p.m. UTC | #6
On Wed, 2023-08-30 at 17:29 +0300, Andy Shevchenko wrote:
> On Wed, Aug 30, 2023 at 5:19 PM <pstanner@redhat.com> wrote:
> > On Wed, 2023-08-30 at 17:11 +0300, Andy Shevchenko wrote:
> > > On Wed, Aug 30, 2023 at 4:46 PM Philipp Stanner
> > > <pstanner@redhat.com>
> > > wrote:
> 
> > > > --- a/include/linux/string.h
> > > > +++ b/include/linux/string.h
> > > 
> > > I'm wondering if this has no side-effects as string.h/string.c
> > > IIRC
> > > is
> > > used also for early stages where some of the APIs are not
> > > available.
> > > 
> > > > @@ -6,6 +6,8 @@
> > > >  #include <linux/types.h>       /* for size_t */
> > > >  #include <linux/stddef.h>      /* for NULL */
> > > >  #include <linux/errno.h>       /* for E2BIG */
> > > > +#include <linux/overflow.h>    /* for check_mul_overflow() */
> > > > +#include <linux/err.h>         /* for ERR_PTR() */
> > > 
> > > Can we preserve order (to some extent)?
> > 
> > Sure. I just put it there so the comments build a congruent block.
> > Which order would you prefer?
> 
> Alphabetical.
> 
> compiler.h
> err.h
> overflow.h
> ...the rest that is a bit unordered...
> 
> > > >  #include <linux/stdarg.h>
> > > >  #include <uapi/linux/string.h>
> 
> ...

I mean I could include my own in a sorted manner – but the existing
ones are not sorted:

/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _LINUX_STRING_H_
#define _LINUX_STRING_H_

#include <linux/compiler.h>	/* for inline */
#include <linux/types.h>	/* for size_t */
#include <linux/stddef.h>	/* for NULL */
#include <linux/errno.h>	/* for E2BIG */
#include <linux/stdarg.h>
#include <uapi/linux/string.h>

extern char *strndup_user(const char __user *, long);

We could sort them all, but I'd prefer to do that in a separate patch
so that this commit does not make the impression of doing anything else
than including the two new headers

Such a separate patch could also unify the docstring style, see below

> 
> > > > +/**
> > > > + * memdup_array_user - duplicate array from user space
> > > 
> > > > + *
> > > 
> > > Do we need this blank line?
> > 
> > I more or less directly copied the docstring format from the
> > original
> > functions (v)memdup_user() in mm/util.c
> > I guess this is common style?
> 
> I think it's not. But you may grep kernel source tree and tell which
> one occurs more often with or without this (unneeded) blank line.


It seems to be very much mixed. string.h itself is mixed.
When you look at the bottom of string.h, you'll find functions such as
kbasename() that have the extra line.

That's not really a super decisive point for me, though. We can remove
the line I guess


P.


> 
> > > > + * @src: source address in user space
> > > > + * @n: number of array members to copy
> > > > + * @size: size of one array member
> > > > + *
> > > > + * Return: an ERR_PTR() on failure.  Result is physically
> > > > + * contiguous, to be freed by kfree().
> > > > + */
> 
> ...
> 
> > > > +/**
> > > > + * vmemdup_array_user - duplicate array from user space
> > > 
> > > > + *
> > > 
> > > Redundant?
> > 
> > No, there are two functions:
> >  * memdup_array_user()
> >  * vmemdup_array_user()
> > 
> > On the deeper layers they utilize kmalloc() or kvmalloc(),
> > respectively.
> 
> I guess you misunderstood my comment. I was talking about kernel doc
> (as in the previous function).
> 
> > > > + * @src: source address in user space
> > > > + * @n: number of array members to copy
> > > > + * @size: size of one array member
> > > > + *
> > > > + * Return: an ERR_PTR() on failure.  Result may be not
> > > > + * physically contiguous.  Use kvfree() to free.
> > > > + */
> 
>
Andy Shevchenko Aug. 31, 2023, 8:59 a.m. UTC | #7
On Wed, Aug 30, 2023 at 10:15 PM <pstanner@redhat.com> wrote:
> On Wed, 2023-08-30 at 17:29 +0300, Andy Shevchenko wrote:
> > On Wed, Aug 30, 2023 at 5:19 PM <pstanner@redhat.com> wrote:
> > > On Wed, 2023-08-30 at 17:11 +0300, Andy Shevchenko wrote:
> > > > On Wed, Aug 30, 2023 at 4:46 PM Philipp Stanner
> > > > <pstanner@redhat.com>
> > > > wrote:

...

> > > > >  #include <linux/types.h>       /* for size_t */
> > > > >  #include <linux/stddef.h>      /* for NULL */
> > > > >  #include <linux/errno.h>       /* for E2BIG */
> > > > > +#include <linux/overflow.h>    /* for check_mul_overflow() */
> > > > > +#include <linux/err.h>         /* for ERR_PTR() */
> > > >
> > > > Can we preserve order (to some extent)?
> > >
> > > Sure. I just put it there so the comments build a congruent block.
> > > Which order would you prefer?
> >
> > Alphabetical.
> >
> > compiler.h
> > err.h
> > overflow.h
> > ...the rest that is a bit unordered...
> >
> > > > >  #include <linux/stdarg.h>
> > > > >  #include <uapi/linux/string.h>
>
> I mean I could include my own in a sorted manner – but the existing
> ones are not sorted:

I know, that's why I put "(to some extent)" in my initial comment.

> We could sort them all, but I'd prefer to do that in a separate patch
> so that this commit does not make the impression of doing anything else
> than including the two new headers

But you can do your stuff first with better ordering than you proposed
initially, so there will be less churn for any additional change
(either that simply unifies the thing or something else).

> Such a separate patch could also unify the docstring style, see below

Sure, patches are welcome!

> > > > > +/**
> > > > > + * memdup_array_user - duplicate array from user space
> > > >
> > > > > + *
> > > >
> > > > Do we need this blank line?
> > >
> > > I more or less directly copied the docstring format from the
> > > original
> > > functions (v)memdup_user() in mm/util.c
> > > I guess this is common style?
> >
> > I think it's not. But you may grep kernel source tree and tell which
> > one occurs more often with or without this (unneeded) blank line.
>
>
> It seems to be very much mixed. string.h itself is mixed.
> When you look at the bottom of string.h, you'll find functions such as
> kbasename() that have the extra line.
>
> That's not really a super decisive point for me, though. We can remove
> the line I guess

I think the less LoCs the better generally speaking. So, if we don't
need that line and it doesn't make the readability worse, why to keep
it?

> > > > > + * @src: source address in user space
> > > > > + * @n: number of array members to copy
> > > > > + * @size: size of one array member
> > > > > + *
> > > > > + * Return: an ERR_PTR() on failure.  Result is physically
> > > > > + * contiguous, to be freed by kfree().
> > > > > + */

--
With Best Regards,
Andy Shevchenko
Philipp Stanner Aug. 31, 2023, 12:16 p.m. UTC | #8
On Thu, 2023-08-31 at 11:59 +0300, Andy Shevchenko wrote:
> On Wed, Aug 30, 2023 at 10:15 PM <pstanner@redhat.com> wrote:
> > On Wed, 2023-08-30 at 17:29 +0300, Andy Shevchenko wrote:
> > > On Wed, Aug 30, 2023 at 5:19 PM <pstanner@redhat.com> wrote:
> > > > On Wed, 2023-08-30 at 17:11 +0300, Andy Shevchenko wrote:
> > > > > On Wed, Aug 30, 2023 at 4:46 PM Philipp Stanner
> > > > > <pstanner@redhat.com>
> > > Alphabetical.
> > > 
> > > compiler.h
> > > err.h
> > > overflow.h
> > > ...the rest that is a bit unordered...
> > > 
> > > > > >  #include <linux/stdarg.h>
> > > > > >  #include <uapi/linux/string.h>
> > 
> > I mean I could include my own in a sorted manner – but the existing
> > ones are not sorted:
> 
> I know, that's why I put "(to some extent)" in my initial comment.
> 
> > We could sort them all, but I'd prefer to do that in a separate
> > patch
> > so that this commit does not make the impression of doing anything
> > else
> > than including the two new headers
> 
> But you can do your stuff first with better ordering than you
> proposed
> initially, so there will be less churn for any additional change
> (either that simply unifies the thing or something else).
> 
> 

I'll take care of those points in a v2 once I gathered some more
feedback from the other parties.

P.
Philipp Stanner Aug. 31, 2023, 12:22 p.m. UTC | #9
On Wed, 2023-08-30 at 17:11 +0300, Andy Shevchenko wrote:
> On Wed, Aug 30, 2023 at 4:46 PM Philipp Stanner <pstanner@redhat.com>
> wrote:
> > 
> > Currently, user array duplications are sometimes done without an
> > overflow check. Sometimes the checks are done manually; sometimes
> > the
> > array size is calculated with array_size() and sometimes by
> > calculating
> > n * size directly in code.
> > 
> > Introduce wrappers for arrays for memdup_user() and vmemdup_user()
> > to
> > provide a standardized and safe way for duplicating user arrays.
> > 
> > This is both for new code as well as replacing usage of
> > (v)memdup_user()
> > in existing code that uses, e.g., n * size to calculate array
> > sizes.
> 
> ...
> 
> > --- a/include/linux/string.h
> > +++ b/include/linux/string.h
> 
> I'm wondering if this has no side-effects as string.h/string.c IIRC
> is
> used also for early stages where some of the APIs are not available.
> 

I forgot to address this point in my previous reply.

Who's going to decide whether this is a problem or not?


My personal guess is that this is unlikely to be a problem because

   A. either (v)memdup_user() is available, in which case
      (v)memdup_array_user() will always work – 
   B. or (v)memdup_user() is not available, which would cause the code
      that currently uses (v)memdup_user() for copying arrays to fail
      anyways.


P.
Andy Shevchenko Aug. 31, 2023, 1:16 p.m. UTC | #10
On Thu, Aug 31, 2023 at 3:22 PM Philipp Stanner <pstanner@redhat.com> wrote:
> On Wed, 2023-08-30 at 17:11 +0300, Andy Shevchenko wrote:
> > On Wed, Aug 30, 2023 at 4:46 PM Philipp Stanner <pstanner@redhat.com>
> > wrote:

...

> > I'm wondering if this has no side-effects as string.h/string.c IIRC
> > is used also for early stages where some of the APIs are not available.
>
> I forgot to address this point in my previous reply.
>
> Who's going to decide whether this is a problem or not?
>
> My personal guess is that this is unlikely to be a problem because
>
>    A. either (v)memdup_user() is available, in which case
>       (v)memdup_array_user() will always work –
>    B. or (v)memdup_user() is not available, which would cause the code
>       that currently uses (v)memdup_user() for copying arrays to fail
>       anyways.

It also uses something from overflow.h. I don't remember if that
header was ever been used in early stage modules (like
boot/decompressor).
Andy Shevchenko Aug. 31, 2023, 1:17 p.m. UTC | #11
On Thu, Aug 31, 2023 at 4:16 PM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
> On Thu, Aug 31, 2023 at 3:22 PM Philipp Stanner <pstanner@redhat.com> wrote:
> > On Wed, 2023-08-30 at 17:11 +0300, Andy Shevchenko wrote:
> > > On Wed, Aug 30, 2023 at 4:46 PM Philipp Stanner <pstanner@redhat.com>
> > > wrote:

...

> > > I'm wondering if this has no side-effects as string.h/string.c IIRC
> > > is used also for early stages where some of the APIs are not available.
> >
> > I forgot to address this point in my previous reply.
> >
> > Who's going to decide whether this is a problem or not?
> >
> > My personal guess is that this is unlikely to be a problem because
> >
> >    A. either (v)memdup_user() is available, in which case
> >       (v)memdup_array_user() will always work –
> >    B. or (v)memdup_user() is not available, which would cause the code
> >       that currently uses (v)memdup_user() for copying arrays to fail
> >       anyways.
>
> It also uses something from overflow.h. I don't remember if that
> header was ever been used in early stage modules (like
> boot/decompressor).

Also we need to be sure UML is still buildable. Dunno if they are
using anything from this, but it appeared to me recently that someone
tried to optimize something using (internal) kernel headers and broke
the build in some cases.
diff mbox series

Patch

diff --git a/include/linux/string.h b/include/linux/string.h
index dbfc66400050..0e8e7a40bae7 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -6,6 +6,8 @@ 
 #include <linux/types.h>	/* for size_t */
 #include <linux/stddef.h>	/* for NULL */
 #include <linux/errno.h>	/* for E2BIG */
+#include <linux/overflow.h>	/* for check_mul_overflow() */
+#include <linux/err.h>		/* for ERR_PTR() */
 #include <linux/stdarg.h>
 #include <uapi/linux/string.h>
 
@@ -14,6 +16,46 @@  extern void *memdup_user(const void __user *, size_t);
 extern void *vmemdup_user(const void __user *, size_t);
 extern void *memdup_user_nul(const void __user *, size_t);
 
+/**
+ * memdup_array_user - duplicate array from user space
+ *
+ * @src: source address in user space
+ * @n: number of array members to copy
+ * @size: size of one array member
+ *
+ * Return: an ERR_PTR() on failure.  Result is physically
+ * contiguous, to be freed by kfree().
+ */
+static inline void *memdup_array_user(const void __user *src, size_t n, size_t size)
+{
+	size_t nbytes;
+
+	if (unlikely(check_mul_overflow(n, size, &nbytes)))
+		return ERR_PTR(-EINVAL);
+
+	return memdup_user(src, nbytes);
+}
+
+/**
+ * vmemdup_array_user - duplicate array from user space
+ *
+ * @src: source address in user space
+ * @n: number of array members to copy
+ * @size: size of one array member
+ *
+ * Return: an ERR_PTR() on failure.  Result may be not
+ * physically contiguous.  Use kvfree() to free.
+ */
+static inline void *vmemdup_array_user(const void __user *src, size_t n, size_t size)
+{
+	size_t nbytes;
+
+	if (unlikely(check_mul_overflow(n, size, &nbytes)))
+		return ERR_PTR(-EINVAL);
+
+	return vmemdup_user(src, nbytes);
+}
+
 /*
  * Include machine specific inline routines
  */