mbox series

[0/5] Introduce new wrappers to copy user-arrays

Message ID cover.1693386602.git.pstanner@redhat.com (mailing list archive)
Headers show
Series Introduce new wrappers to copy user-arrays | expand

Message

Philipp Stanner Aug. 30, 2023, 1:45 p.m. UTC
Hi!

David Airlie suggested that we could implement new wrappers around
(v)memdup_user() for duplicating user arrays.

This small patch series first implements the two new wrapper functions
memdup_array_user() and vmemdup_array_user(). They calculate the
array-sizes safely, i.e., they return an error in case of an overflow.

It then implements the new wrappers in two components in kernel/ and two
in the drm-subsystem.

In total, there are 18 files in the kernel that use (v)memdup_user() to
duplicate arrays. My plan is to provide patches for the other 14
successively once this series has been merged.

P.

Philipp Stanner (5):
  string.h: add array-wrappers for (v)memdup_user()
  kernel: kexec: copy user-array safely
  kernel: watch_queue: copy user-array safely
  drm_lease.c: copy user-array safely
  drm: vmgfx_surface.c: copy user-array safely

 drivers/gpu/drm/drm_lease.c             |  4 +--
 drivers/gpu/drm/vmwgfx/vmwgfx_surface.c |  4 +--
 include/linux/string.h                  | 42 +++++++++++++++++++++++++
 kernel/kexec.c                          |  2 +-
 kernel/watch_queue.c                    |  2 +-
 5 files changed, 48 insertions(+), 6 deletions(-)

Comments

Andy Shevchenko Aug. 30, 2023, 2:11 p.m. UTC | #1
On Wed, Aug 30, 2023 at 4:46 PM Philipp Stanner <pstanner@redhat.com> wrote:
>
> Currently, user array duplications are sometimes done without an
> overflow check. Sometimes the checks are done manually; sometimes the
> array size is calculated with array_size() and sometimes by calculating
> n * size directly in code.
>
> Introduce wrappers for arrays for memdup_user() and vmemdup_user() to
> provide a standardized and safe way for duplicating user arrays.
>
> This is both for new code as well as replacing usage of (v)memdup_user()
> in existing code that uses, e.g., n * size to calculate array sizes.

...

> --- a/include/linux/string.h
> +++ b/include/linux/string.h

I'm wondering if this has no side-effects as string.h/string.c IIRC is
used also for early stages where some of the APIs are not available.

> @@ -6,6 +6,8 @@
>  #include <linux/types.h>       /* for size_t */
>  #include <linux/stddef.h>      /* for NULL */
>  #include <linux/errno.h>       /* for E2BIG */
> +#include <linux/overflow.h>    /* for check_mul_overflow() */
> +#include <linux/err.h>         /* for ERR_PTR() */

Can we preserve order (to some extent)?

>  #include <linux/stdarg.h>
>  #include <uapi/linux/string.h>

...

> +/**
> + * memdup_array_user - duplicate array from user space

> + *

Do we need this blank line?

> + * @src: source address in user space
> + * @n: number of array members to copy
> + * @size: size of one array member
> + *
> + * Return: an ERR_PTR() on failure.  Result is physically
> + * contiguous, to be freed by kfree().
> + */

...

> +/**
> + * vmemdup_array_user - duplicate array from user space

> + *

Redundant?

> + * @src: source address in user space
> + * @n: number of array members to copy
> + * @size: size of one array member
> + *
> + * Return: an ERR_PTR() on failure.  Result may be not
> + * physically contiguous.  Use kvfree() to free.
> + */
Philipp Stanner Aug. 30, 2023, 7:15 p.m. UTC | #2
On Wed, 2023-08-30 at 17:29 +0300, Andy Shevchenko wrote:
> On Wed, Aug 30, 2023 at 5:19 PM <pstanner@redhat.com> wrote:
> > On Wed, 2023-08-30 at 17:11 +0300, Andy Shevchenko wrote:
> > > On Wed, Aug 30, 2023 at 4:46 PM Philipp Stanner
> > > <pstanner@redhat.com>
> > > wrote:
> 
> > > > --- a/include/linux/string.h
> > > > +++ b/include/linux/string.h
> > > 
> > > I'm wondering if this has no side-effects as string.h/string.c
> > > IIRC
> > > is
> > > used also for early stages where some of the APIs are not
> > > available.
> > > 
> > > > @@ -6,6 +6,8 @@
> > > >  #include <linux/types.h>       /* for size_t */
> > > >  #include <linux/stddef.h>      /* for NULL */
> > > >  #include <linux/errno.h>       /* for E2BIG */
> > > > +#include <linux/overflow.h>    /* for check_mul_overflow() */
> > > > +#include <linux/err.h>         /* for ERR_PTR() */
> > > 
> > > Can we preserve order (to some extent)?
> > 
> > Sure. I just put it there so the comments build a congruent block.
> > Which order would you prefer?
> 
> Alphabetical.
> 
> compiler.h
> err.h
> overflow.h
> ...the rest that is a bit unordered...
> 
> > > >  #include <linux/stdarg.h>
> > > >  #include <uapi/linux/string.h>
> 
> ...

I mean I could include my own in a sorted manner – but the existing
ones are not sorted:

/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _LINUX_STRING_H_
#define _LINUX_STRING_H_

#include <linux/compiler.h>	/* for inline */
#include <linux/types.h>	/* for size_t */
#include <linux/stddef.h>	/* for NULL */
#include <linux/errno.h>	/* for E2BIG */
#include <linux/stdarg.h>
#include <uapi/linux/string.h>

extern char *strndup_user(const char __user *, long);

We could sort them all, but I'd prefer to do that in a separate patch
so that this commit does not make the impression of doing anything else
than including the two new headers

Such a separate patch could also unify the docstring style, see below

> 
> > > > +/**
> > > > + * memdup_array_user - duplicate array from user space
> > > 
> > > > + *
> > > 
> > > Do we need this blank line?
> > 
> > I more or less directly copied the docstring format from the
> > original
> > functions (v)memdup_user() in mm/util.c
> > I guess this is common style?
> 
> I think it's not. But you may grep kernel source tree and tell which
> one occurs more often with or without this (unneeded) blank line.


It seems to be very much mixed. string.h itself is mixed.
When you look at the bottom of string.h, you'll find functions such as
kbasename() that have the extra line.

That's not really a super decisive point for me, though. We can remove
the line I guess


P.


> 
> > > > + * @src: source address in user space
> > > > + * @n: number of array members to copy
> > > > + * @size: size of one array member
> > > > + *
> > > > + * Return: an ERR_PTR() on failure.  Result is physically
> > > > + * contiguous, to be freed by kfree().
> > > > + */
> 
> ...
> 
> > > > +/**
> > > > + * vmemdup_array_user - duplicate array from user space
> > > 
> > > > + *
> > > 
> > > Redundant?
> > 
> > No, there are two functions:
> >  * memdup_array_user()
> >  * vmemdup_array_user()
> > 
> > On the deeper layers they utilize kmalloc() or kvmalloc(),
> > respectively.
> 
> I guess you misunderstood my comment. I was talking about kernel doc
> (as in the previous function).
> 
> > > > + * @src: source address in user space
> > > > + * @n: number of array members to copy
> > > > + * @size: size of one array member
> > > > + *
> > > > + * Return: an ERR_PTR() on failure.  Result may be not
> > > > + * physically contiguous.  Use kvfree() to free.
> > > > + */
> 
>
Andy Shevchenko Aug. 31, 2023, 8:59 a.m. UTC | #3
On Wed, Aug 30, 2023 at 10:15 PM <pstanner@redhat.com> wrote:
> On Wed, 2023-08-30 at 17:29 +0300, Andy Shevchenko wrote:
> > On Wed, Aug 30, 2023 at 5:19 PM <pstanner@redhat.com> wrote:
> > > On Wed, 2023-08-30 at 17:11 +0300, Andy Shevchenko wrote:
> > > > On Wed, Aug 30, 2023 at 4:46 PM Philipp Stanner
> > > > <pstanner@redhat.com>
> > > > wrote:

...

> > > > >  #include <linux/types.h>       /* for size_t */
> > > > >  #include <linux/stddef.h>      /* for NULL */
> > > > >  #include <linux/errno.h>       /* for E2BIG */
> > > > > +#include <linux/overflow.h>    /* for check_mul_overflow() */
> > > > > +#include <linux/err.h>         /* for ERR_PTR() */
> > > >
> > > > Can we preserve order (to some extent)?
> > >
> > > Sure. I just put it there so the comments build a congruent block.
> > > Which order would you prefer?
> >
> > Alphabetical.
> >
> > compiler.h
> > err.h
> > overflow.h
> > ...the rest that is a bit unordered...
> >
> > > > >  #include <linux/stdarg.h>
> > > > >  #include <uapi/linux/string.h>
>
> I mean I could include my own in a sorted manner – but the existing
> ones are not sorted:

I know, that's why I put "(to some extent)" in my initial comment.

> We could sort them all, but I'd prefer to do that in a separate patch
> so that this commit does not make the impression of doing anything else
> than including the two new headers

But you can do your stuff first with better ordering than you proposed
initially, so there will be less churn for any additional change
(either that simply unifies the thing or something else).

> Such a separate patch could also unify the docstring style, see below

Sure, patches are welcome!

> > > > > +/**
> > > > > + * memdup_array_user - duplicate array from user space
> > > >
> > > > > + *
> > > >
> > > > Do we need this blank line?
> > >
> > > I more or less directly copied the docstring format from the
> > > original
> > > functions (v)memdup_user() in mm/util.c
> > > I guess this is common style?
> >
> > I think it's not. But you may grep kernel source tree and tell which
> > one occurs more often with or without this (unneeded) blank line.
>
>
> It seems to be very much mixed. string.h itself is mixed.
> When you look at the bottom of string.h, you'll find functions such as
> kbasename() that have the extra line.
>
> That's not really a super decisive point for me, though. We can remove
> the line I guess

I think the less LoCs the better generally speaking. So, if we don't
need that line and it doesn't make the readability worse, why to keep
it?

> > > > > + * @src: source address in user space
> > > > > + * @n: number of array members to copy
> > > > > + * @size: size of one array member
> > > > > + *
> > > > > + * Return: an ERR_PTR() on failure.  Result is physically
> > > > > + * contiguous, to be freed by kfree().
> > > > > + */

--
With Best Regards,
Andy Shevchenko
Philipp Stanner Aug. 31, 2023, 12:16 p.m. UTC | #4
On Thu, 2023-08-31 at 11:59 +0300, Andy Shevchenko wrote:
> On Wed, Aug 30, 2023 at 10:15 PM <pstanner@redhat.com> wrote:
> > On Wed, 2023-08-30 at 17:29 +0300, Andy Shevchenko wrote:
> > > On Wed, Aug 30, 2023 at 5:19 PM <pstanner@redhat.com> wrote:
> > > > On Wed, 2023-08-30 at 17:11 +0300, Andy Shevchenko wrote:
> > > > > On Wed, Aug 30, 2023 at 4:46 PM Philipp Stanner
> > > > > <pstanner@redhat.com>
> > > Alphabetical.
> > > 
> > > compiler.h
> > > err.h
> > > overflow.h
> > > ...the rest that is a bit unordered...
> > > 
> > > > > >  #include <linux/stdarg.h>
> > > > > >  #include <uapi/linux/string.h>
> > 
> > I mean I could include my own in a sorted manner – but the existing
> > ones are not sorted:
> 
> I know, that's why I put "(to some extent)" in my initial comment.
> 
> > We could sort them all, but I'd prefer to do that in a separate
> > patch
> > so that this commit does not make the impression of doing anything
> > else
> > than including the two new headers
> 
> But you can do your stuff first with better ordering than you
> proposed
> initially, so there will be less churn for any additional change
> (either that simply unifies the thing or something else).
> 
> 

I'll take care of those points in a v2 once I gathered some more
feedback from the other parties.

P.
Philipp Stanner Aug. 31, 2023, 12:22 p.m. UTC | #5
On Wed, 2023-08-30 at 17:11 +0300, Andy Shevchenko wrote:
> On Wed, Aug 30, 2023 at 4:46 PM Philipp Stanner <pstanner@redhat.com>
> wrote:
> > 
> > Currently, user array duplications are sometimes done without an
> > overflow check. Sometimes the checks are done manually; sometimes
> > the
> > array size is calculated with array_size() and sometimes by
> > calculating
> > n * size directly in code.
> > 
> > Introduce wrappers for arrays for memdup_user() and vmemdup_user()
> > to
> > provide a standardized and safe way for duplicating user arrays.
> > 
> > This is both for new code as well as replacing usage of
> > (v)memdup_user()
> > in existing code that uses, e.g., n * size to calculate array
> > sizes.
> 
> ...
> 
> > --- a/include/linux/string.h
> > +++ b/include/linux/string.h
> 
> I'm wondering if this has no side-effects as string.h/string.c IIRC
> is
> used also for early stages where some of the APIs are not available.
> 

I forgot to address this point in my previous reply.

Who's going to decide whether this is a problem or not?


My personal guess is that this is unlikely to be a problem because

   A. either (v)memdup_user() is available, in which case
      (v)memdup_array_user() will always work – 
   B. or (v)memdup_user() is not available, which would cause the code
      that currently uses (v)memdup_user() for copying arrays to fail
      anyways.


P.
Andy Shevchenko Aug. 31, 2023, 1:16 p.m. UTC | #6
On Thu, Aug 31, 2023 at 3:22 PM Philipp Stanner <pstanner@redhat.com> wrote:
> On Wed, 2023-08-30 at 17:11 +0300, Andy Shevchenko wrote:
> > On Wed, Aug 30, 2023 at 4:46 PM Philipp Stanner <pstanner@redhat.com>
> > wrote:

...

> > I'm wondering if this has no side-effects as string.h/string.c IIRC
> > is used also for early stages where some of the APIs are not available.
>
> I forgot to address this point in my previous reply.
>
> Who's going to decide whether this is a problem or not?
>
> My personal guess is that this is unlikely to be a problem because
>
>    A. either (v)memdup_user() is available, in which case
>       (v)memdup_array_user() will always work –
>    B. or (v)memdup_user() is not available, which would cause the code
>       that currently uses (v)memdup_user() for copying arrays to fail
>       anyways.

It also uses something from overflow.h. I don't remember if that
header was ever been used in early stage modules (like
boot/decompressor).
Andy Shevchenko Aug. 31, 2023, 1:17 p.m. UTC | #7
On Thu, Aug 31, 2023 at 4:16 PM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
> On Thu, Aug 31, 2023 at 3:22 PM Philipp Stanner <pstanner@redhat.com> wrote:
> > On Wed, 2023-08-30 at 17:11 +0300, Andy Shevchenko wrote:
> > > On Wed, Aug 30, 2023 at 4:46 PM Philipp Stanner <pstanner@redhat.com>
> > > wrote:

...

> > > I'm wondering if this has no side-effects as string.h/string.c IIRC
> > > is used also for early stages where some of the APIs are not available.
> >
> > I forgot to address this point in my previous reply.
> >
> > Who's going to decide whether this is a problem or not?
> >
> > My personal guess is that this is unlikely to be a problem because
> >
> >    A. either (v)memdup_user() is available, in which case
> >       (v)memdup_array_user() will always work –
> >    B. or (v)memdup_user() is not available, which would cause the code
> >       that currently uses (v)memdup_user() for copying arrays to fail
> >       anyways.
>
> It also uses something from overflow.h. I don't remember if that
> header was ever been used in early stage modules (like
> boot/decompressor).

Also we need to be sure UML is still buildable. Dunno if they are
using anything from this, but it appeared to me recently that someone
tried to optimize something using (internal) kernel headers and broke
the build in some cases.