
slub: fix __kmem_cache_empty for !CONFIG_SLUB_DEBUG

Message ID 20180619213352.71740-1-shakeelb@google.com (mailing list archive)
State New, archived

Commit Message

Shakeel Butt June 19, 2018, 9:33 p.m. UTC
For !CONFIG_SLUB_DEBUG, SLUB does not maintain the number of slabs
allocated per node for a kmem_cache. Thus, slabs_node() in
__kmem_cache_empty() will always return 0. In that situation, the
per-cpu slabs also have to be checked to determine whether a
kmem_cache is empty.

Please note that __kmem_cache_shutdown() and __kmem_cache_shrink() are
not affected by !CONFIG_SLUB_DEBUG as they call flush_all() to clear
per-cpu slabs.

Fixes: f9e13c0a5a33 ("slab, slub: skip unnecessary kasan_cache_shutdown()")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Reported-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@vger.kernel.org>
---
 mm/slub.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)
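
For context, the reason slabs_node() cannot be relied on here: without
CONFIG_SLUB_DEBUG the per-node slab counters are never maintained, so the
helper degenerates to a constant. A simplified sketch of the relevant
helpers (paraphrased from mm/slub.c; the exact code may differ between
kernel versions):

#ifdef CONFIG_SLUB_DEBUG
static inline unsigned long slabs_node(struct kmem_cache *s, int node)
{
	struct kmem_cache_node *n = get_node(s, node);

	return atomic_long_read(&n->nr_slabs);
}
#else
/* Counters are never updated: inc/dec_slabs_node() are no-ops here. */
static inline unsigned long slabs_node(struct kmem_cache *s, int node)
{
	return 0;
}
#endif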

Comments

Shakeel Butt June 19, 2018, 9:38 p.m. UTC | #1
On Tue, Jun 19, 2018 at 2:33 PM Shakeel Butt <shakeelb@google.com> wrote:
>
> For !CONFIG_SLUB_DEBUG, SLUB does not maintain the number of slabs
> allocated per node for a kmem_cache. Thus, slabs_node() in
> __kmem_cache_empty() will always return 0. So, in such situation, it is
> required to check per-cpu slabs to make sure if a kmem_cache is empty or
> not.
>
> Please note that __kmem_cache_shutdown() and __kmem_cache_shrink() are
> not affected by !CONFIG_SLUB_DEBUG as they call flush_all() to clear
> per-cpu slabs.
>
> Fixes: f9e13c0a5a33 ("slab, slub: skip unnecessary kasan_cache_shutdown()")
> Signed-off-by: Shakeel Butt <shakeelb@google.com>
> Reported-by: Jason A . Donenfeld <Jason@zx2c4.com>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Pekka Enberg <penberg@kernel.org>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: <stable@vger.kernel.org>

Forgot to Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>

> ---
>  mm/slub.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index a3b8467c14af..731c02b371ae 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3673,9 +3673,23 @@ static void free_partial(struct kmem_cache *s, struct kmem_cache_node *n)
>
>  bool __kmem_cache_empty(struct kmem_cache *s)
>  {
> -       int node;
> +       int cpu, node;
>         struct kmem_cache_node *n;
>
> +       /*
> +        * slabs_node will always be 0 for !CONFIG_SLUB_DEBUG. So, manually
> +        * check slabs for all cpus.
> +        */
> +       if (!IS_ENABLED(CONFIG_SLUB_DEBUG)) {
> +               for_each_online_cpu(cpu) {
> +                       struct kmem_cache_cpu *c;
> +
> +                       c = per_cpu_ptr(s->cpu_slab, cpu);
> +                       if (c->page || slub_percpu_partial(c))
> +                               return false;
> +               }
> +       }
> +
>         for_each_kmem_cache_node(s, node, n)
>                 if (n->nr_partial || slabs_node(s, node))
>                         return false;
> --
> 2.18.0.rc1.244.gcf134e6275-goog
>
Jason A. Donenfeld June 19, 2018, 9:53 p.m. UTC | #2
On Tue, Jun 19, 2018 at 11:34 PM Shakeel Butt <shakeelb@google.com> wrote:
>
> For !CONFIG_SLUB_DEBUG, SLUB does not maintain the number of slabs
> allocated per node for a kmem_cache. Thus, slabs_node() in
> __kmem_cache_empty() will always return 0. So, in such situation, it is
> required to check per-cpu slabs to make sure if a kmem_cache is empty or
> not.
>
> Please note that __kmem_cache_shutdown() and __kmem_cache_shrink() are
> not affected by !CONFIG_SLUB_DEBUG as they call flush_all() to clear
> per-cpu slabs.
>
> Fixes: f9e13c0a5a33 ("slab, slub: skip unnecessary kasan_cache_shutdown()")
> Signed-off-by: Shakeel Butt <shakeelb@google.com>
> Reported-by: Jason A . Donenfeld <Jason@zx2c4.com>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Pekka Enberg <penberg@kernel.org>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: <stable@vger.kernel.org>
> ---
>  mm/slub.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index a3b8467c14af..731c02b371ae 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3673,9 +3673,23 @@ static void free_partial(struct kmem_cache *s, struct kmem_cache_node *n)
>
>  bool __kmem_cache_empty(struct kmem_cache *s)
>  {
> -       int node;
> +       int cpu, node;
>         struct kmem_cache_node *n;
>
> +       /*
> +        * slabs_node will always be 0 for !CONFIG_SLUB_DEBUG. So, manually
> +        * check slabs for all cpus.
> +        */
> +       if (!IS_ENABLED(CONFIG_SLUB_DEBUG)) {
> +               for_each_online_cpu(cpu) {
> +                       struct kmem_cache_cpu *c;
> +
> +                       c = per_cpu_ptr(s->cpu_slab, cpu);
> +                       if (c->page || slub_percpu_partial(c))
> +                               return false;
> +               }
> +       }
> +
>         for_each_kmem_cache_node(s, node, n)
>                 if (n->nr_partial || slabs_node(s, node))
>                         return false;
> --
> 2.18.0.rc1.244.gcf134e6275-goog
>

I can confirm that this fixes the test case on build.wireguard.com.
Andrew Morton June 19, 2018, 10:21 p.m. UTC | #3
On Tue, 19 Jun 2018 14:33:52 -0700 Shakeel Butt <shakeelb@google.com> wrote:

> For !CONFIG_SLUB_DEBUG, SLUB does not maintain the number of slabs
> allocated per node for a kmem_cache. Thus, slabs_node() in
> __kmem_cache_empty() will always return 0. So, in such situation, it is
> required to check per-cpu slabs to make sure if a kmem_cache is empty or
> not.
> 
> Please note that __kmem_cache_shutdown() and __kmem_cache_shrink() are
> not affected by !CONFIG_SLUB_DEBUG as they call flush_all() to clear
> per-cpu slabs.

Thanks guys.  I'll beef up this changelog a bit by adding

f9e13c0a5a33 ("slab, slub: skip unnecessary kasan_cache_shutdown()")
causes crashes when using slub, as described at
http://lkml.kernel.org/r/CAHmME9rtoPwxUSnktxzKso14iuVCWT7BE_-_8PAC=pGw1iJnQg@mail.gmail.com

So that a) Greg knows why we're sending it to him and b) other people
who are seeing the same sorts of crashes in their various kernels will
know that this patch will probably fix them.
David Rientjes June 20, 2018, 12:49 a.m. UTC | #4
On Tue, 19 Jun 2018, Shakeel Butt wrote:

> diff --git a/mm/slub.c b/mm/slub.c
> index a3b8467c14af..731c02b371ae 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3673,9 +3673,23 @@ static void free_partial(struct kmem_cache *s, struct kmem_cache_node *n)
>  
>  bool __kmem_cache_empty(struct kmem_cache *s)
>  {
> -	int node;
> +	int cpu, node;

Nit: wouldn't cpu be unused if CONFIG_SLUB_DEBUG is disabled?

>  	struct kmem_cache_node *n;
>  
> +	/*
> +	 * slabs_node will always be 0 for !CONFIG_SLUB_DEBUG. So, manually
> +	 * check slabs for all cpus.
> +	 */
> +	if (!IS_ENABLED(CONFIG_SLUB_DEBUG)) {
> +		for_each_online_cpu(cpu) {
> +			struct kmem_cache_cpu *c;
> +
> +			c = per_cpu_ptr(s->cpu_slab, cpu);
> +			if (c->page || slub_percpu_partial(c))
> +				return false;
> +		}
> +	}
> +
>  	for_each_kmem_cache_node(s, node, n)
>  		if (n->nr_partial || slabs_node(s, node))
>  			return false;

Wouldn't it just be better to allow {inc,dec}_slabs_node() to adjust the 
nr_slabs counter instead of doing the per-cpu iteration on every shutdown?
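
A rough sketch of what that suggestion could look like (hypothetical; it
assumes nr_slabs is moved out of the CONFIG_SLUB_DEBUG-only part of
struct kmem_cache_node, and the actual follow-up patch may differ):

/* Maintain the per-node counter unconditionally ... */
static inline void inc_slabs_node(struct kmem_cache *s, int node, int objects)
{
	struct kmem_cache_node *n = get_node(s, node);

	/* n may be NULL while bootstrapping the kmem_cache_node cache */
	if (likely(n))
		atomic_long_inc(&n->nr_slabs);
}

static inline void dec_slabs_node(struct kmem_cache *s, int node, int objects)
{
	atomic_long_dec(&get_node(s, node)->nr_slabs);
}

/* ... so slabs_node() is meaningful even for !CONFIG_SLUB_DEBUG. */
static inline unsigned long slabs_node(struct kmem_cache *s, int node)
{
	return atomic_long_read(&get_node(s, node)->nr_slabs);
}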
Shakeel Butt June 20, 2018, 1:24 a.m. UTC | #5
On Tue, Jun 19, 2018 at 5:49 PM David Rientjes <rientjes@google.com> wrote:
>
> On Tue, 19 Jun 2018, Shakeel Butt wrote:
>
> > diff --git a/mm/slub.c b/mm/slub.c
> > index a3b8467c14af..731c02b371ae 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -3673,9 +3673,23 @@ static void free_partial(struct kmem_cache *s, struct kmem_cache_node *n)
> >
> >  bool __kmem_cache_empty(struct kmem_cache *s)
> >  {
> > -     int node;
> > +     int cpu, node;
>
> Nit: wouldn't cpu be unused if CONFIG_SLUB_DEBUG is disabled?
>

I think I didn't get the warning because I used IS_ENABLED() rather than
#ifdef, so the loop (and the use of 'cpu') is still compiled and then
optimized away.

> >       struct kmem_cache_node *n;
> >
> > +     /*
> > +      * slabs_node will always be 0 for !CONFIG_SLUB_DEBUG. So, manually
> > +      * check slabs for all cpus.
> > +      */
> > +     if (!IS_ENABLED(CONFIG_SLUB_DEBUG)) {
> > +             for_each_online_cpu(cpu) {
> > +                     struct kmem_cache_cpu *c;
> > +
> > +                     c = per_cpu_ptr(s->cpu_slab, cpu);
> > +                     if (c->page || slub_percpu_partial(c))
> > +                             return false;
> > +             }
> > +     }
> > +
> >       for_each_kmem_cache_node(s, node, n)
> >               if (n->nr_partial || slabs_node(s, node))
> >                       return false;
>
> Wouldn't it just be better to allow {inc,dec}_slabs_node() to adjust the
> nr_slabs counter instead of doing the per-cpu iteration on every shutdown?

Yes, that is doable, as {inc,dec}_slabs_node() are only called in the
slow path, so I can move them out of CONFIG_SLUB_DEBUG. I think the
reason 0f389ec63077 ("slub: No need for per node slab counters if
!SLUB_DEBUG") put them under CONFIG_SLUB_DEBUG is that these counters
were only read through the sysfs API, which is disabled for
!CONFIG_SLUB_DEBUG. However, we now have a use case other than the
sysfs API.

Please let me know if there is any objection to this conversion. For
large machines I think this is the preferable approach.

thanks,
Shakeel
Andrey Ryabinin June 20, 2018, 12:09 p.m. UTC | #6
On 06/20/2018 12:33 AM, Shakeel Butt wrote:
> For !CONFIG_SLUB_DEBUG, SLUB does not maintain the number of slabs
> allocated per node for a kmem_cache. Thus, slabs_node() in
> __kmem_cache_empty() will always return 0. So, in such situation, it is
> required to check per-cpu slabs to make sure if a kmem_cache is empty or
> not.
> 
> Please note that __kmem_cache_shutdown() and __kmem_cache_shrink() are
> not affected by !CONFIG_SLUB_DEBUG as they call flush_all() to clear
> per-cpu slabs.

So what? Yes, they call flush_all() and then check if there are non-empty slabs left.
And that check doesn't work when CONFIG_SLUB_DEBUG is disabled.
How is flush_all() or per-cpu slabs even relevant here?
Shakeel Butt June 20, 2018, 9:36 p.m. UTC | #7
On Wed, Jun 20, 2018 at 5:08 AM Andrey Ryabinin <aryabinin@virtuozzo.com> wrote:
>
>
>
> On 06/20/2018 12:33 AM, Shakeel Butt wrote:
> > For !CONFIG_SLUB_DEBUG, SLUB does not maintain the number of slabs
> > allocated per node for a kmem_cache. Thus, slabs_node() in
> > __kmem_cache_empty() will always return 0. So, in such situation, it is
> > required to check per-cpu slabs to make sure if a kmem_cache is empty or
> > not.
> >
> > Please note that __kmem_cache_shutdown() and __kmem_cache_shrink() are
> > not affected by !CONFIG_SLUB_DEBUG as they call flush_all() to clear
> > per-cpu slabs.
>
> So what? Yes, they call flush_all() and then check if there are non-empty slabs left.
> And that check doesn't work in case of disabled CONFIG_SLUB_DEBUG.
> How is flush_all() or per-cpu slabs even relevant here?
>

flush_all() moves all cpu slabs and per-cpu partials to the node's
partial list, so the later check of the node's partial list handles the
non-empty case. However, what I missed is the 'full' slabs, which are
not on any list for !CONFIG_SLUB_DEBUG. So, this patch is not a
complete solution. I think David's suggestion is the complete solution,
and I will post a patch based on it.
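
To illustrate the gap described above, here is the node-level check (as
in __kmem_cache_empty(), and in __kmem_cache_shutdown() after
flush_all()), annotated for a CONFIG_SLUB_DEBUG=n build; a sketch, not a
quote of the exact code:

	/*
	 * A slab whose objects are all allocated ("full") sits on no list
	 * and, with CONFIG_SLUB_DEBUG=n, is not counted anywhere, so
	 * neither condition below can see it.
	 */
	for_each_kmem_cache_node(s, node, n)
		if (n->nr_partial || slabs_node(s, node))	/* always 0 */
			return false;
	return true;	/* reported "empty" despite live objects on full slabs */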

Patch

diff --git a/mm/slub.c b/mm/slub.c
index a3b8467c14af..731c02b371ae 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3673,9 +3673,23 @@  static void free_partial(struct kmem_cache *s, struct kmem_cache_node *n)
 
 bool __kmem_cache_empty(struct kmem_cache *s)
 {
-	int node;
+	int cpu, node;
 	struct kmem_cache_node *n;
 
+	/*
+	 * slabs_node will always be 0 for !CONFIG_SLUB_DEBUG. So, manually
+	 * check slabs for all cpus.
+	 */
+	if (!IS_ENABLED(CONFIG_SLUB_DEBUG)) {
+		for_each_online_cpu(cpu) {
+			struct kmem_cache_cpu *c;
+
+			c = per_cpu_ptr(s->cpu_slab, cpu);
+			if (c->page || slub_percpu_partial(c))
+				return false;
+		}
+	}
+
 	for_each_kmem_cache_node(s, node, n)
 		if (n->nr_partial || slabs_node(s, node))
 			return false;