Message ID | 20190726210137.23395-1-willy@infradead.org (mailing list archive) |
---|---|
State | New, archived |
Series | mm: Make kvfree safe to call |
On Fri, Jul 26, 2019 at 2:01 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
>
> Since vfree() can sleep, calling kvfree() from contexts where sleeping
> is not permitted (eg holding a spinlock) is a bit of a lottery whether
> it'll work. Introduce kvfree_safe() for situations where we know we can
> sleep, but make kvfree() safe by default.
>
> Reported-by: Jeff Layton <jlayton@kernel.org>
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: Luis Henriques <lhenriques@suse.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Carlos Maiolino <cmaiolino@redhat.com>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

So you say you are adding kvfree_safe() in the patch description, but
it looks like you are introducing kvfree_fast() below. Did something
change and the patch description wasn't updated, or is this just the
wrong description for this patch?

> ---
>  mm/util.c | 26 ++++++++++++++++++++++++--
>  1 file changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/mm/util.c b/mm/util.c
> index bab284d69c8c..992f0332dced 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -470,6 +470,28 @@ void *kvmalloc_node(size_t size, gfp_t flags, int node)
>  }
>  EXPORT_SYMBOL(kvmalloc_node);
>
> +/**
> + * kvfree_fast() - Free memory.
> + * @addr: Pointer to allocated memory.
> + *
> + * kvfree_fast frees memory allocated by any of vmalloc(), kmalloc() or
> + * kvmalloc(). It is slightly more efficient to use kfree() or vfree() if
> + * you are certain that you know which one to use.
> + *
> + * Context: Either preemptible task context or not-NMI interrupt. Must not
> + * hold a spinlock as it can sleep.
> + */
> +void kvfree_fast(const void *addr)
> +{
> +	might_sleep();
> +
> +	if (is_vmalloc_addr(addr))
> +		vfree(addr);
> +	else
> +		kfree(addr);
> +}
> +EXPORT_SYMBOL(kvfree_fast);
> +
>  /**
>   * kvfree() - Free memory.
>   * @addr: Pointer to allocated memory.
> @@ -478,12 +500,12 @@ EXPORT_SYMBOL(kvmalloc_node);
>   * It is slightly more efficient to use kfree() or vfree() if you are certain
>   * that you know which one to use.
>   *
> - * Context: Either preemptible task context or not-NMI interrupt.
> + * Context: Any context except NMI.
>   */
>  void kvfree(const void *addr)
>  {
>  	if (is_vmalloc_addr(addr))
> -		vfree(addr);
> +		vfree_atomic(addr);
>  	else
>  		kfree(addr);
>  }
> --
> 2.20.1
>
On Fri, 2019-07-26 at 14:10 -0700, Alexander Duyck wrote:
> On Fri, Jul 26, 2019 at 2:01 PM Matthew Wilcox <willy@infradead.org> wrote:
> > From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> >
> > Since vfree() can sleep, calling kvfree() from contexts where sleeping
> > is not permitted (eg holding a spinlock) is a bit of a lottery whether
> > it'll work. Introduce kvfree_safe() for situations where we know we can
> > sleep, but make kvfree() safe by default.
> >
> > Reported-by: Jeff Layton <jlayton@kernel.org>
> > Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> > Cc: Luis Henriques <lhenriques@suse.com>
> > Cc: Christoph Hellwig <hch@lst.de>
> > Cc: Carlos Maiolino <cmaiolino@redhat.com>
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
>
> So you say you are adding kvfree_safe() in the patch description, but
> it looks like you are introducing kvfree_fast() below. Did something
> change and the patch description wasn't updated, or is this just the
> wrong description for this patch?
>
> > ---
> >  mm/util.c | 26 ++++++++++++++++++++++++--
> >  1 file changed, 24 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/util.c b/mm/util.c
> > index bab284d69c8c..992f0332dced 100644
> > --- a/mm/util.c
> > +++ b/mm/util.c
> > @@ -470,6 +470,28 @@ void *kvmalloc_node(size_t size, gfp_t flags, int node)
> >  }
> >  EXPORT_SYMBOL(kvmalloc_node);
> >
> > +/**
> > + * kvfree_fast() - Free memory.
> > + * @addr: Pointer to allocated memory.
> > + *
> > + * kvfree_fast frees memory allocated by any of vmalloc(), kmalloc() or
> > + * kvmalloc(). It is slightly more efficient to use kfree() or vfree() if
> > + * you are certain that you know which one to use.
> > + *
> > + * Context: Either preemptible task context or not-NMI interrupt. Must not
> > + * hold a spinlock as it can sleep.
> > + */
> > +void kvfree_fast(const void *addr)
> > +{
> > +	might_sleep();
> > +

might_sleep_if(!in_interrupt());

That's what vfree does anyway, so we might as well exempt the case where
you are.

> > +	if (is_vmalloc_addr(addr))
> > +		vfree(addr);
> > +	else
> > +		kfree(addr);
> > +}
> > +EXPORT_SYMBOL(kvfree_fast);
> > +

That said -- is this really useful?

The only way to know that this is safe is to know what sort of
allocation it is, and in that case you can just call kfree or vfree as
appropriate.

> >  /**
> >   * kvfree() - Free memory.
> >   * @addr: Pointer to allocated memory.
> > @@ -478,12 +500,12 @@ EXPORT_SYMBOL(kvmalloc_node);
> >   * It is slightly more efficient to use kfree() or vfree() if you are certain
> >   * that you know which one to use.
> >   *
> > - * Context: Either preemptible task context or not-NMI interrupt.
> > + * Context: Any context except NMI.
> >   */
> >  void kvfree(const void *addr)
> >  {
> >  	if (is_vmalloc_addr(addr))
> > -		vfree(addr);
> > +		vfree_atomic(addr);
> >  	else
> >  		kfree(addr);
> >  }
> > --
> > 2.20.1
> >
On Fri, Jul 26, 2019 at 05:25:03PM -0400, Jeff Layton wrote:
> On Fri, 2019-07-26 at 14:10 -0700, Alexander Duyck wrote:
> > On Fri, Jul 26, 2019 at 2:01 PM Matthew Wilcox <willy@infradead.org> wrote:
> > > From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> > >
> > > Since vfree() can sleep, calling kvfree() from contexts where sleeping
> > > is not permitted (eg holding a spinlock) is a bit of a lottery whether
> > > it'll work. Introduce kvfree_safe() for situations where we know we can
> > > sleep, but make kvfree() safe by default.
> > >
> > > Reported-by: Jeff Layton <jlayton@kernel.org>
> > > Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> > > Cc: Luis Henriques <lhenriques@suse.com>
> > > Cc: Christoph Hellwig <hch@lst.de>
> > > Cc: Carlos Maiolino <cmaiolino@redhat.com>
> > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> >
> > So you say you are adding kvfree_safe() in the patch description, but
> > it looks like you are introducing kvfree_fast() below. Did something
> > change and the patch description wasn't updated, or is this just the
> > wrong description for this patch?

Oops, bad description. Thanks, I'll fix it for v2.

> > > +/**
> > > + * kvfree_fast() - Free memory.
> > > + * @addr: Pointer to allocated memory.
> > > + *
> > > + * kvfree_fast frees memory allocated by any of vmalloc(), kmalloc() or
> > > + * kvmalloc(). It is slightly more efficient to use kfree() or vfree() if
> > > + * you are certain that you know which one to use.
> > > + *
> > > + * Context: Either preemptible task context or not-NMI interrupt. Must not
> > > + * hold a spinlock as it can sleep.
> > > + */
> > > +void kvfree_fast(const void *addr)
> > > +{
> > > +	might_sleep();
> > > +
>
> might_sleep_if(!in_interrupt());
>
> That's what vfree does anyway, so we might as well exempt the case where
> you are.

True, but if we are in interrupt, then we may as well call kvfree() since
it'll do the same thing, and this way the rules are clearer.

> > > +	if (is_vmalloc_addr(addr))
> > > +		vfree(addr);
> > > +	else
> > > +		kfree(addr);
> > > +}
> > > +EXPORT_SYMBOL(kvfree_fast);
> > > +
>
> That said -- is this really useful?
>
> The only way to know that this is safe is to know what sort of
> allocation it is, and in that case you can just call kfree or vfree as
> appropriate.

It's safe if you know you're not holding any spinlocks, for example ...
On Fri, 2019-07-26 at 17:38 -0700, Matthew Wilcox wrote:
> On Fri, Jul 26, 2019 at 05:25:03PM -0400, Jeff Layton wrote:
> > On Fri, 2019-07-26 at 14:10 -0700, Alexander Duyck wrote:
> > > On Fri, Jul 26, 2019 at 2:01 PM Matthew Wilcox <willy@infradead.org> wrote:
> > > > From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> > > >
> > > > Since vfree() can sleep, calling kvfree() from contexts where sleeping
> > > > is not permitted (eg holding a spinlock) is a bit of a lottery whether
> > > > it'll work. Introduce kvfree_safe() for situations where we know we can
> > > > sleep, but make kvfree() safe by default.
> > > >
> > > > Reported-by: Jeff Layton <jlayton@kernel.org>
> > > > Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> > > > Cc: Luis Henriques <lhenriques@suse.com>
> > > > Cc: Christoph Hellwig <hch@lst.de>
> > > > Cc: Carlos Maiolino <cmaiolino@redhat.com>
> > > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > >
> > > So you say you are adding kvfree_safe() in the patch description, but
> > > it looks like you are introducing kvfree_fast() below. Did something
> > > change and the patch description wasn't updated, or is this just the
> > > wrong description for this patch?
>
> Oops, bad description. Thanks, I'll fix it for v2.
>
> > > > +/**
> > > > + * kvfree_fast() - Free memory.
> > > > + * @addr: Pointer to allocated memory.
> > > > + *
> > > > + * kvfree_fast frees memory allocated by any of vmalloc(), kmalloc() or
> > > > + * kvmalloc(). It is slightly more efficient to use kfree() or vfree() if
> > > > + * you are certain that you know which one to use.
> > > > + *
> > > > + * Context: Either preemptible task context or not-NMI interrupt. Must not
> > > > + * hold a spinlock as it can sleep.
> > > > + */
> > > > +void kvfree_fast(const void *addr)
> > > > +{
> > > > +	might_sleep();
> > > > +
> >
> > might_sleep_if(!in_interrupt());
> >
> > That's what vfree does anyway, so we might as well exempt the case where
> > you are.
>
> True, but if we are in interrupt, then we may as well call kvfree() since
> it'll do the same thing, and this way the rules are clearer.
>
> > > > +	if (is_vmalloc_addr(addr))
> > > > +		vfree(addr);
> > > > +	else
> > > > +		kfree(addr);
> > > > +}
> > > > +EXPORT_SYMBOL(kvfree_fast);
> > > > +
> >
> > That said -- is this really useful?
> >
> > The only way to know that this is safe is to know what sort of
> > allocation it is, and in that case you can just call kfree or vfree as
> > appropriate.
>
> It's safe if you know you're not holding any spinlocks, for example ...
>

Fair points all around. You can add:

Reviewed-by: Jeff Layton <jlayton@kernel.org>

The only real question then is whether we'll incur any extra overhead
when some of these kvfree sites suddenly start queueing these up. One
would hope it wouldn't matter much on most workloads.
On Fri 26-07-19 14:01:37, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
>
> Since vfree() can sleep, calling kvfree() from contexts where sleeping
> is not permitted (eg holding a spinlock) is a bit of a lottery whether
> it'll work. Introduce kvfree_safe() for situations where we know we can
> sleep, but make kvfree() safe by default.

So now you have converted all kvfree callers to an atomic version. Is
that really desirable? Aren't we adding way too much work to be done in
a deferred context? If not then why a regular vfree cannot do this
already and then we do not need vfree_atomic and kvfree_safe.

In other words, why do we want to complicate the API further?

> Reported-by: Jeff Layton <jlayton@kernel.org>
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: Luis Henriques <lhenriques@suse.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Carlos Maiolino <cmaiolino@redhat.com>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  mm/util.c | 26 ++++++++++++++++++++++++--
>  1 file changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/mm/util.c b/mm/util.c
> index bab284d69c8c..992f0332dced 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -470,6 +470,28 @@ void *kvmalloc_node(size_t size, gfp_t flags, int node)
>  }
>  EXPORT_SYMBOL(kvmalloc_node);
>
> +/**
> + * kvfree_fast() - Free memory.
> + * @addr: Pointer to allocated memory.
> + *
> + * kvfree_fast frees memory allocated by any of vmalloc(), kmalloc() or
> + * kvmalloc(). It is slightly more efficient to use kfree() or vfree() if
> + * you are certain that you know which one to use.
> + *
> + * Context: Either preemptible task context or not-NMI interrupt. Must not
> + * hold a spinlock as it can sleep.
> + */
> +void kvfree_fast(const void *addr)
> +{
> +	might_sleep();
> +
> +	if (is_vmalloc_addr(addr))
> +		vfree(addr);
> +	else
> +		kfree(addr);
> +}
> +EXPORT_SYMBOL(kvfree_fast);
> +
>  /**
>   * kvfree() - Free memory.
>   * @addr: Pointer to allocated memory.
> @@ -478,12 +500,12 @@ EXPORT_SYMBOL(kvmalloc_node);
>   * It is slightly more efficient to use kfree() or vfree() if you are certain
>   * that you know which one to use.
>   *
> - * Context: Either preemptible task context or not-NMI interrupt.
> + * Context: Any context except NMI.
>   */
>  void kvfree(const void *addr)
>  {
>  	if (is_vmalloc_addr(addr))
> -		vfree(addr);
> +		vfree_atomic(addr);
>  	else
>  		kfree(addr);
>  }
> --
> 2.20.1
On Mon, Jul 29, 2019 at 11:28:30AM +0200, Michal Hocko wrote:
> On Fri 26-07-19 14:01:37, Matthew Wilcox wrote:
> > From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> >
> > Since vfree() can sleep, calling kvfree() from contexts where sleeping
> > is not permitted (eg holding a spinlock) is a bit of a lottery whether
> > it'll work. Introduce kvfree_safe() for situations where we know we can
> > sleep, but make kvfree() safe by default.
>
> So now you have converted all kvfree callers to an atomic version. Is
> that really desirable? Aren't we adding way too much work to be done in
> a deferred context? If not then why a regular vfree cannot do this
> already and then we do not need vfree_atomic and kvfree_safe.

I know infiniband has kvfree calls under user control; many uses of
kvfree are related to allocating kernel memory for some potentially
large user data on the syscall path.

I'm also nervous about making them all queuing.

If we added kvfree_atomic() & a warning, how many places would hit the
warning and need conversion?

Jason
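Jason's alternative (keep kvfree() as the may-sleep call, add an atomic variant, and warn on misuse) can be modelled in userspace to show how the warning would count call sites that need converting. This is only a sketch under stated assumptions: every `model_*` name is hypothetical, the "spinlock" is a plain counter, and the warning is assumed to fire on every atomic-context call, in the spirit of the might_sleep() check in the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Userspace model only; every name below is hypothetical. */
int model_atomic_depth;		/* > 0 models "a spinlock is held" */
unsigned int model_warn_count;	/* frees that would need converting */

void model_spin_lock(void)
{
	model_atomic_depth++;
}

void model_spin_unlock(void)
{
	assert(model_atomic_depth > 0);
	model_atomic_depth--;
}

/* May-sleep variant: complain when called from atomic context. */
void model_kvfree(void *p)
{
	if (model_atomic_depth > 0) {
		model_warn_count++;
		fprintf(stderr, "model_kvfree: called in atomic context\n");
	}
	free(p);
}

/*
 * Atomic-safe variant: never complains. A real one would defer the
 * free; userspace free() does not sleep, so the model frees inline.
 */
void model_kvfree_atomic(void *p)
{
	free(p);
}
```

In this model, `model_warn_count` after a test run is the number of call sites that would have to move to the atomic variant, which is the figure Jason asks about.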
diff --git a/mm/util.c b/mm/util.c
index bab284d69c8c..992f0332dced 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -470,6 +470,28 @@ void *kvmalloc_node(size_t size, gfp_t flags, int node)
 }
 EXPORT_SYMBOL(kvmalloc_node);
 
+/**
+ * kvfree_fast() - Free memory.
+ * @addr: Pointer to allocated memory.
+ *
+ * kvfree_fast frees memory allocated by any of vmalloc(), kmalloc() or
+ * kvmalloc(). It is slightly more efficient to use kfree() or vfree() if
+ * you are certain that you know which one to use.
+ *
+ * Context: Either preemptible task context or not-NMI interrupt. Must not
+ * hold a spinlock as it can sleep.
+ */
+void kvfree_fast(const void *addr)
+{
+	might_sleep();
+
+	if (is_vmalloc_addr(addr))
+		vfree(addr);
+	else
+		kfree(addr);
+}
+EXPORT_SYMBOL(kvfree_fast);
+
 /**
  * kvfree() - Free memory.
  * @addr: Pointer to allocated memory.
@@ -478,12 +500,12 @@ EXPORT_SYMBOL(kvmalloc_node);
  * It is slightly more efficient to use kfree() or vfree() if you are certain
  * that you know which one to use.
  *
- * Context: Either preemptible task context or not-NMI interrupt.
+ * Context: Any context except NMI.
  */
 void kvfree(const void *addr)
 {
 	if (is_vmalloc_addr(addr))
-		vfree(addr);
+		vfree_atomic(addr);
 	else
 		kfree(addr);
 }
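The second hunk routes vmalloc addresses through vfree_atomic(), which the reviewers in this thread describe as queueing the free for a deferred context. A minimal single-threaded userspace sketch of that queue-then-drain pattern follows. It is a model under stated assumptions, not the kernel implementation: `model_free_deferred()`, `model_drain_deferred()` and the two globals are hypothetical names, and a real version would need a lock-free list and a worker to run the drain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Userspace model only. All names here are hypothetical; this is not
 * the kernel's vfree_atomic() implementation.
 */
struct deferred_node {
	struct deferred_node *next;
};

struct deferred_node *deferred_head;	/* buffers waiting to be freed */
size_t deferred_pending;		/* how many are waiting */

/*
 * Queue a buffer instead of freeing it inline. The link is stored in the
 * buffer itself, so queueing needs no allocation and cannot block.
 * The buffer must be at least sizeof(struct deferred_node) bytes.
 */
void model_free_deferred(void *p)
{
	struct deferred_node *n = p;

	if (!p)
		return;
	n->next = deferred_head;
	deferred_head = n;
	deferred_pending++;
}

/*
 * Runs later, in a context where the real free would be allowed to sleep.
 * Returns the number of buffers released.
 */
size_t model_drain_deferred(void)
{
	struct deferred_node *n = deferred_head;
	size_t freed = 0;

	deferred_head = NULL;
	while (n) {
		struct deferred_node *next = n->next;

		free(n);
		n = next;
		freed++;
	}
	assert(freed == deferred_pending);
	deferred_pending = 0;
	return freed;
}
```

The model makes the trade-off raised in the replies visible: the caller pays only for a list insertion, but the memory stays allocated until the drain runs, so a burst of frees leaves a matching backlog for the deferred context.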