mm: Make kvfree safe to call

Message ID 20190726210137.23395-1-willy@infradead.org (mailing list archive)
State New, archived

Commit Message

Matthew Wilcox July 26, 2019, 9:01 p.m. UTC
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Since vfree() can sleep, calling kvfree() from contexts where sleeping
is not permitted (eg holding a spinlock) is a bit of a lottery whether
it'll work.  Introduce kvfree_safe() for situations where we know we can
sleep, but make kvfree() safe by default.

Reported-by: Jeff Layton <jlayton@kernel.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Luis Henriques <lhenriques@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/util.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

Comments

Alexander Duyck July 26, 2019, 9:10 p.m. UTC | #1
On Fri, Jul 26, 2019 at 2:01 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
>
> Since vfree() can sleep, calling kvfree() from contexts where sleeping
> is not permitted (eg holding a spinlock) is a bit of a lottery whether
> it'll work.  Introduce kvfree_safe() for situations where we know we can
> sleep, but make kvfree() safe by default.
>
> Reported-by: Jeff Layton <jlayton@kernel.org>
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: Luis Henriques <lhenriques@suse.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Carlos Maiolino <cmaiolino@redhat.com>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

So you say you are adding kvfree_safe() in the patch description, but
it looks like you are introducing kvfree_fast() below. Did something
change and the patch description wasn't updated, or is this just the
wrong description for this patch?

> ---
>  mm/util.c | 26 ++++++++++++++++++++++++--
>  1 file changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/mm/util.c b/mm/util.c
> index bab284d69c8c..992f0332dced 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -470,6 +470,28 @@ void *kvmalloc_node(size_t size, gfp_t flags, int node)
>  }
>  EXPORT_SYMBOL(kvmalloc_node);
>
> +/**
> + * kvfree_fast() - Free memory.
> + * @addr: Pointer to allocated memory.
> + *
> + * kvfree_fast frees memory allocated by any of vmalloc(), kmalloc() or
> + * kvmalloc().  It is slightly more efficient to use kfree() or vfree() if
> + * you are certain that you know which one to use.
> + *
> + * Context: Either preemptible task context or not-NMI interrupt.  Must not
> + * hold a spinlock as it can sleep.
> + */
> +void kvfree_fast(const void *addr)
> +{
> +       might_sleep();
> +
> +       if (is_vmalloc_addr(addr))
> +               vfree(addr);
> +       else
> +               kfree(addr);
> +}
> +EXPORT_SYMBOL(kvfree_fast);
> +
>  /**
>   * kvfree() - Free memory.
>   * @addr: Pointer to allocated memory.
> @@ -478,12 +500,12 @@ EXPORT_SYMBOL(kvmalloc_node);
>   * It is slightly more efficient to use kfree() or vfree() if you are certain
>   * that you know which one to use.
>   *
> - * Context: Either preemptible task context or not-NMI interrupt.
> + * Context: Any context except NMI.
>   */
>  void kvfree(const void *addr)
>  {
>         if (is_vmalloc_addr(addr))
> -               vfree(addr);
> +               vfree_atomic(addr);
>         else
>                 kfree(addr);
>  }
> --
> 2.20.1
>
Jeff Layton July 26, 2019, 9:25 p.m. UTC | #2
On Fri, 2019-07-26 at 14:10 -0700, Alexander Duyck wrote:
> On Fri, Jul 26, 2019 at 2:01 PM Matthew Wilcox <willy@infradead.org> wrote:
> > From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> > 
> > Since vfree() can sleep, calling kvfree() from contexts where sleeping
> > is not permitted (eg holding a spinlock) is a bit of a lottery whether
> > it'll work.  Introduce kvfree_safe() for situations where we know we can
> > sleep, but make kvfree() safe by default.
> > 
> > Reported-by: Jeff Layton <jlayton@kernel.org>
> > Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> > Cc: Luis Henriques <lhenriques@suse.com>
> > Cc: Christoph Hellwig <hch@lst.de>
> > Cc: Carlos Maiolino <cmaiolino@redhat.com>
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> 
> So you say you are adding kvfree_safe() in the patch description, but
> it looks like you are introducing kvfree_fast() below. Did something
> change and the patch description wasn't updated, or is this just the
> wrong description for this patch?
> 
> > ---
> >  mm/util.c | 26 ++++++++++++++++++++++++--
> >  1 file changed, 24 insertions(+), 2 deletions(-)
> > 
> > diff --git a/mm/util.c b/mm/util.c
> > index bab284d69c8c..992f0332dced 100644
> > --- a/mm/util.c
> > +++ b/mm/util.c
> > @@ -470,6 +470,28 @@ void *kvmalloc_node(size_t size, gfp_t flags, int node)
> >  }
> >  EXPORT_SYMBOL(kvmalloc_node);
> > 
> > +/**
> > + * kvfree_fast() - Free memory.
> > + * @addr: Pointer to allocated memory.
> > + *
> > + * kvfree_fast frees memory allocated by any of vmalloc(), kmalloc() or
> > + * kvmalloc().  It is slightly more efficient to use kfree() or vfree() if
> > + * you are certain that you know which one to use.
> > + *
> > + * Context: Either preemptible task context or not-NMI interrupt.  Must not
> > + * hold a spinlock as it can sleep.
> > + */
> > +void kvfree_fast(const void *addr)
> > +{
> > +       might_sleep();
> > +

    might_sleep_if(!in_interrupt());

That's what vfree does anyway, so we might as well exempt the case where
you are.

> > +       if (is_vmalloc_addr(addr))
> > +               vfree(addr);
> > +       else
> > +               kfree(addr);
> > +}
> > +EXPORT_SYMBOL(kvfree_fast);
> > +

That said -- is this really useful?

The only way to know that this is safe is to know what sort of
allocation it is, and in that case you can just call kfree or vfree as
appropriate.

> >  /**
> >   * kvfree() - Free memory.
> >   * @addr: Pointer to allocated memory.
> > @@ -478,12 +500,12 @@ EXPORT_SYMBOL(kvmalloc_node);
> >   * It is slightly more efficient to use kfree() or vfree() if you are certain
> >   * that you know which one to use.
> >   *
> > - * Context: Either preemptible task context or not-NMI interrupt.
> > + * Context: Any context except NMI.
> >   */
> >  void kvfree(const void *addr)
> >  {
> >         if (is_vmalloc_addr(addr))
> > -               vfree(addr);
> > +               vfree_atomic(addr);
> >         else
> >                 kfree(addr);
> >  }
> > --
> > 2.20.1
> >
Matthew Wilcox July 27, 2019, 12:38 a.m. UTC | #3
On Fri, Jul 26, 2019 at 05:25:03PM -0400, Jeff Layton wrote:
> On Fri, 2019-07-26 at 14:10 -0700, Alexander Duyck wrote:
> > On Fri, Jul 26, 2019 at 2:01 PM Matthew Wilcox <willy@infradead.org> wrote:
> > > From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> > > 
> > > Since vfree() can sleep, calling kvfree() from contexts where sleeping
> > > is not permitted (eg holding a spinlock) is a bit of a lottery whether
> > > it'll work.  Introduce kvfree_safe() for situations where we know we can
> > > sleep, but make kvfree() safe by default.
> > > 
> > > Reported-by: Jeff Layton <jlayton@kernel.org>
> > > Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> > > Cc: Luis Henriques <lhenriques@suse.com>
> > > Cc: Christoph Hellwig <hch@lst.de>
> > > Cc: Carlos Maiolino <cmaiolino@redhat.com>
> > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > 
> > So you say you are adding kvfree_safe() in the patch description, but
> > it looks like you are introducing kvfree_fast() below. Did something
> > change and the patch description wasn't updated, or is this just the
> > wrong description for this patch?

Oops, bad description.  Thanks, I'll fix it for v2.

> > > +/**
> > > + * kvfree_fast() - Free memory.
> > > + * @addr: Pointer to allocated memory.
> > > + *
> > > + * kvfree_fast frees memory allocated by any of vmalloc(), kmalloc() or
> > > + * kvmalloc().  It is slightly more efficient to use kfree() or vfree() if
> > > + * you are certain that you know which one to use.
> > > + *
> > > + * Context: Either preemptible task context or not-NMI interrupt.  Must not
> > > + * hold a spinlock as it can sleep.
> > > + */
> > > +void kvfree_fast(const void *addr)
> > > +{
> > > +       might_sleep();
> > > +
> 
>     might_sleep_if(!in_interrupt());
> 
> That's what vfree does anyway, so we might as well exempt the case where
> you are.

True, but if we are in interrupt, then we may as well call kvfree() since
it'll do the same thing, and this way the rules are clearer.

> > > +       if (is_vmalloc_addr(addr))
> > > +               vfree(addr);
> > > +       else
> > > +               kfree(addr);
> > > +}
> > > +EXPORT_SYMBOL(kvfree_fast);
> > > +
> 
> That said -- is this really useful?
> 
> The only way to know that this is safe is to know what sort of
> allocation it is, and in that case you can just call kfree or vfree as
> appropriate.

It's safe if you know you're not holding any spinlocks, for example ...
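
The difference between the two checks discussed above can be sketched in userspace as follows. This is a simplified model, not kernel code: the ctx enum and the function names are assumptions made for illustration, and the real might_sleep() only reports a problem when sleep debugging is enabled in the kernel configuration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified set of calling contexts. */
enum ctx {
	CTX_TASK,       /* preemptible task context, no locks held */
	CTX_SPINLOCK,   /* task context with a spinlock held */
	CTX_INTERRUPT   /* non-NMI interrupt context */
};

/* Only plain task context is allowed to sleep in this model. */
static bool model_can_sleep(enum ctx c)
{
	return c == CTX_TASK;
}

/* Models might_sleep(): complain in every context that cannot sleep. */
static bool model_strict_check_warns(enum ctx c)
{
	return !model_can_sleep(c);
}

/*
 * Models might_sleep_if(!in_interrupt()): interrupt context is exempt,
 * every other non-sleeping context still triggers a complaint.
 */
static bool model_relaxed_check_warns(enum ctx c)
{
	return c != CTX_INTERRUPT && !model_can_sleep(c);
}
```

With the strict check, an interrupt-context caller of kvfree_fast() is flagged and pointed at kvfree() instead; with the relaxed check it would pass silently. In both variants a caller holding a spinlock is flagged.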
Jeff Layton July 27, 2019, 2:54 p.m. UTC | #4
On Fri, 2019-07-26 at 17:38 -0700, Matthew Wilcox wrote:
> On Fri, Jul 26, 2019 at 05:25:03PM -0400, Jeff Layton wrote:
> > On Fri, 2019-07-26 at 14:10 -0700, Alexander Duyck wrote:
> > > On Fri, Jul 26, 2019 at 2:01 PM Matthew Wilcox <willy@infradead.org> wrote:
> > > > From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> > > > 
> > > > Since vfree() can sleep, calling kvfree() from contexts where sleeping
> > > > is not permitted (eg holding a spinlock) is a bit of a lottery whether
> > > > it'll work.  Introduce kvfree_safe() for situations where we know we can
> > > > sleep, but make kvfree() safe by default.
> > > > 
> > > > Reported-by: Jeff Layton <jlayton@kernel.org>
> > > > Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> > > > Cc: Luis Henriques <lhenriques@suse.com>
> > > > Cc: Christoph Hellwig <hch@lst.de>
> > > > Cc: Carlos Maiolino <cmaiolino@redhat.com>
> > > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > > 
> > > So you say you are adding kvfree_safe() in the patch description, but
> > > it looks like you are introducing kvfree_fast() below. Did something
> > > change and the patch description wasn't updated, or is this just the
> > > wrong description for this patch?
> 
> Oops, bad description.  Thanks, I'll fix it for v2.
> 
> > > > +/**
> > > > + * kvfree_fast() - Free memory.
> > > > + * @addr: Pointer to allocated memory.
> > > > + *
> > > > + * kvfree_fast frees memory allocated by any of vmalloc(), kmalloc() or
> > > > + * kvmalloc().  It is slightly more efficient to use kfree() or vfree() if
> > > > + * you are certain that you know which one to use.
> > > > + *
> > > > + * Context: Either preemptible task context or not-NMI interrupt.  Must not
> > > > + * hold a spinlock as it can sleep.
> > > > + */
> > > > +void kvfree_fast(const void *addr)
> > > > +{
> > > > +       might_sleep();
> > > > +
> > 
> >     might_sleep_if(!in_interrupt());
> > 
> > That's what vfree does anyway, so we might as well exempt the case where
> > you are.
> 
> True, but if we are in interrupt, then we may as well call kvfree() since
> it'll do the same thing, and this way the rules are clearer.
> 
> > > > +       if (is_vmalloc_addr(addr))
> > > > +               vfree(addr);
> > > > +       else
> > > > +               kfree(addr);
> > > > +}
> > > > +EXPORT_SYMBOL(kvfree_fast);
> > > > +
> > 
> > That said -- is this really useful?
> > 
> > The only way to know that this is safe is to know what sort of
> > allocation it is, and in that case you can just call kfree or vfree as
> > appropriate.
> 
> It's safe if you know you're not holding any spinlocks, for example ...
> 

Fair points all around. You can add:

    Reviewed-by: Jeff Layton <jlayton@kernel.org>

The only real question then is whether we'll incur any extra overhead
when some of these kvfree sites suddenly start queueing these up. One
would hope it wouldn't matter much on most workloads.
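
The queueing that the overhead question refers to can be pictured with a small userspace model. This is a sketch under stated assumptions, not the kernel implementation: the model_ names and the single global list are hypothetical, and the model assumes every block is at least sizeof(struct pending) bytes so that the block itself can carry the list link.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* List link stored inside the block that is waiting to be freed. */
struct pending {
	struct pending *next;
};

static struct pending *model_deferred_head;
static size_t model_deferred_count;

/* Deferred free: only queue the block, do no real work in the caller. */
static void model_vfree_atomic(void *addr)
{
	struct pending *p = addr;

	if (!addr)
		return;
	p->next = model_deferred_head;
	model_deferred_head = p;
	model_deferred_count++;
}

/* Stand-in for the worker that later frees everything that was queued. */
static size_t model_drain_deferred(void)
{
	size_t freed = 0;

	while (model_deferred_head) {
		struct pending *p = model_deferred_head;

		model_deferred_head = p->next;
		free(p);
		freed++;
	}
	model_deferred_count = 0;
	return freed;
}
```

In this model the caller pays only for a list insertion, while the memory stays allocated until the drain step runs. That delay, multiplied across many call sites, is the extra cost being asked about.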
Michal Hocko July 29, 2019, 9:28 a.m. UTC | #5
On Fri 26-07-19 14:01:37, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> Since vfree() can sleep, calling kvfree() from contexts where sleeping
> is not permitted (eg holding a spinlock) is a bit of a lottery whether
> it'll work.  Introduce kvfree_safe() for situations where we know we can
> sleep, but make kvfree() safe by default.

So now you have converted all kvfree callers to an atomic version. Is
that really desirable? Aren't we adding way too much work to be done in
a deferred context? If not then why a regular vfree cannot do this
already and then we do not need vfree_atomic and kvfree_safe.

In other words, why do we want to complicate the API further?

> Reported-by: Jeff Layton <jlayton@kernel.org>
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: Luis Henriques <lhenriques@suse.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Carlos Maiolino <cmaiolino@redhat.com>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  mm/util.c | 26 ++++++++++++++++++++++++--
>  1 file changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/util.c b/mm/util.c
> index bab284d69c8c..992f0332dced 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -470,6 +470,28 @@ void *kvmalloc_node(size_t size, gfp_t flags, int node)
>  }
>  EXPORT_SYMBOL(kvmalloc_node);
>  
> +/**
> + * kvfree_fast() - Free memory.
> + * @addr: Pointer to allocated memory.
> + *
> + * kvfree_fast frees memory allocated by any of vmalloc(), kmalloc() or
> + * kvmalloc().  It is slightly more efficient to use kfree() or vfree() if
> + * you are certain that you know which one to use.
> + *
> + * Context: Either preemptible task context or not-NMI interrupt.  Must not
> + * hold a spinlock as it can sleep.
> + */
> +void kvfree_fast(const void *addr)
> +{
> +	might_sleep();
> +
> +	if (is_vmalloc_addr(addr))
> +		vfree(addr);
> +	else
> +		kfree(addr);
> +}
> +EXPORT_SYMBOL(kvfree_fast);
> +
>  /**
>   * kvfree() - Free memory.
>   * @addr: Pointer to allocated memory.
> @@ -478,12 +500,12 @@ EXPORT_SYMBOL(kvmalloc_node);
>   * It is slightly more efficient to use kfree() or vfree() if you are certain
>   * that you know which one to use.
>   *
> - * Context: Either preemptible task context or not-NMI interrupt.
> + * Context: Any context except NMI.
>   */
>  void kvfree(const void *addr)
>  {
>  	if (is_vmalloc_addr(addr))
> -		vfree(addr);
> +		vfree_atomic(addr);
>  	else
>  		kfree(addr);
>  }
> -- 
> 2.20.1
Jason Gunthorpe July 29, 2019, 1:42 p.m. UTC | #6
On Mon, Jul 29, 2019 at 11:28:30AM +0200, Michal Hocko wrote:
> On Fri 26-07-19 14:01:37, Matthew Wilcox wrote:
> > From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> > 
> > Since vfree() can sleep, calling kvfree() from contexts where sleeping
> > is not permitted (eg holding a spinlock) is a bit of a lottery whether
> > it'll work.  Introduce kvfree_safe() for situations where we know we can
> > sleep, but make kvfree() safe by default.
> 
> So now you have converted all kvfree callers to an atomic version. Is
> that really desirable? Aren't we adding way too much work to be done in
> a deferred context? If not then why a regular vfree cannot do this
> already and then we do not need vfree_atomic and kvfree_safe.

I know infiniband has kvfree calls under user control; many uses of
kvfree are related to allocating kernel memory for some potentially
large user data on the syscall path.

I'm also nervous about making them all queuing.

If we added kvfree_atomic() & a warning how many places would hit the
warning and need conversion?

Jason
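
The alternative raised above (a separate kvfree_atomic() plus a warning in kvfree()) could look roughly like this userspace model. Everything here is hypothetical: kvfree_atomic() does not exist in the patch under review, and the model_ names, the context flag and the counters are assumptions used only to show how call sites needing conversion would surface as warnings.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state. */
static bool model_atomic_ctx;              /* caller cannot sleep */
static unsigned int model_warn_count;      /* warnings that would be logged */
static unsigned int model_deferred_count;  /* frees pushed to deferred work */

/*
 * kvfree() keeps freeing directly, but flags vmalloc frees that are
 * attempted from a context which cannot sleep.
 */
static void model_kvfree(bool is_vmalloc)
{
	if (is_vmalloc && model_atomic_ctx)
		model_warn_count++;
}

/*
 * Hypothetical kvfree_atomic(): never warns and defers vmalloc frees,
 * so only converted call sites pay the queueing cost.
 */
static void model_kvfree_atomic(bool is_vmalloc)
{
	if (is_vmalloc)
		model_deferred_count++;
}
```

Under this design the warning count answers the question directly: each distinct call site that triggers it is one place that would need converting to the atomic variant.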

Patch

diff --git a/mm/util.c b/mm/util.c
index bab284d69c8c..992f0332dced 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -470,6 +470,28 @@  void *kvmalloc_node(size_t size, gfp_t flags, int node)
 }
 EXPORT_SYMBOL(kvmalloc_node);
 
+/**
+ * kvfree_fast() - Free memory.
+ * @addr: Pointer to allocated memory.
+ *
+ * kvfree_fast frees memory allocated by any of vmalloc(), kmalloc() or
+ * kvmalloc().  It is slightly more efficient to use kfree() or vfree() if
+ * you are certain that you know which one to use.
+ *
+ * Context: Either preemptible task context or not-NMI interrupt.  Must not
+ * hold a spinlock as it can sleep.
+ */
+void kvfree_fast(const void *addr)
+{
+	might_sleep();
+
+	if (is_vmalloc_addr(addr))
+		vfree(addr);
+	else
+		kfree(addr);
+}
+EXPORT_SYMBOL(kvfree_fast);
+
 /**
  * kvfree() - Free memory.
  * @addr: Pointer to allocated memory.
@@ -478,12 +500,12 @@  EXPORT_SYMBOL(kvmalloc_node);
  * It is slightly more efficient to use kfree() or vfree() if you are certain
  * that you know which one to use.
  *
- * Context: Either preemptible task context or not-NMI interrupt.
+ * Context: Any context except NMI.
  */
 void kvfree(const void *addr)
 {
 	if (is_vmalloc_addr(addr))
-		vfree(addr);
+		vfree_atomic(addr);
 	else
 		kfree(addr);
 }