
[6/7] slub: fix unreclaimable slab stat for bulk free

Message ID 20210729215350.SZC9InNuL%akpm@linux-foundation.org (mailing list archive)
State New
Series [1/7] lib/test_string.c: move string selftest in the Runtime Testing menu

Commit Message

Andrew Morton July 29, 2021, 9:53 p.m. UTC
From: Shakeel Butt <shakeelb@google.com>
Subject: slub: fix unreclaimable slab stat for bulk free

SLUB uses the page allocator for higher-order allocations and updates
the unreclaimable slab stat for such allocations.  At the moment, the
bulk free path for SLUB does not share code with the normal free path
for these types of allocations and has missed the stat update.  So, fix
the stat update by using common code.  The user-visible impact of the
bug is potentially inconsistent unreclaimable slab stats visible through
meminfo and vmstat.

Link: https://lkml.kernel.org/r/20210728155354.3440560-1-shakeelb@google.com
Fixes: 6a486c0ad4dc ("mm, sl[ou]b: improve memory accounting")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slub.c |   22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)
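
For context, the allocation side of this accounting looks roughly like the
following (a simplified sketch of the kmalloc() page-allocator fallback, not
verbatim kernel code; kmalloc_large_sketch is a made-up name and details vary
by kernel version):

/*
 * Simplified sketch: large kmalloc() requests bypass the slab caches and
 * are charged to NR_SLAB_UNRECLAIMABLE_B, so every free path has to
 * subtract the same amount again.
 */
static void *kmalloc_large_sketch(gfp_t flags, unsigned int order)
{
	struct page *page;

	page = alloc_pages(flags | __GFP_COMP, order);
	if (!page)
		return NULL;

	mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE_B,
			      PAGE_SIZE << order);
	return page_address(page);
}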

Comments

Nathan Chancellor July 31, 2021, 10:18 p.m. UTC | #1
On Thu, Jul 29, 2021 at 02:53:50PM -0700, Andrew Morton wrote:
> From: Shakeel Butt <shakeelb@google.com>
> Subject: slub: fix unreclaimable slab stat for bulk free
> 
> SLUB uses the page allocator for higher-order allocations and updates
> the unreclaimable slab stat for such allocations.  At the moment, the
> bulk free path for SLUB does not share code with the normal free path
> for these types of allocations and has missed the stat update.  So, fix
> the stat update by using common code.  The user-visible impact of the
> bug is potentially inconsistent unreclaimable slab stats visible through
> meminfo and vmstat.
> 
> Link: https://lkml.kernel.org/r/20210728155354.3440560-1-shakeelb@google.com
> Fixes: 6a486c0ad4dc ("mm, sl[ou]b: improve memory accounting")
> Signed-off-by: Shakeel Butt <shakeelb@google.com>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Acked-by: Roman Gushchin <guro@fb.com>
> Reviewed-by: Muchun Song <songmuchun@bytedance.com>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Pekka Enberg <penberg@kernel.org>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
>  mm/slub.c |   22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> --- a/mm/slub.c~slub-fix-unreclaimable-slab-stat-for-bulk-free
> +++ a/mm/slub.c
> @@ -3236,6 +3236,16 @@ struct detached_freelist {
>  	struct kmem_cache *s;
>  };
>  
> +static inline void free_nonslab_page(struct page *page)
> +{
> +	unsigned int order = compound_order(page);
> +
> +	VM_BUG_ON_PAGE(!PageCompound(page), page);
> +	kfree_hook(page_address(page));
> +	mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE_B, -(PAGE_SIZE << order));
> +	__free_pages(page, order);
> +}
> +
>  /*
>   * This function progressively scans the array with free objects (with
>   * a limited look ahead) and extract objects belonging to the same
> @@ -3272,9 +3282,7 @@ int build_detached_freelist(struct kmem_
>  	if (!s) {
>  		/* Handle kalloc'ed objects */
>  		if (unlikely(!PageSlab(page))) {
> -			BUG_ON(!PageCompound(page));
> -			kfree_hook(object);
> -			__free_pages(page, compound_order(page));
> +			free_nonslab_page(page);
>  			p[size] = NULL; /* mark object processed */
>  			return size;
>  		}
> @@ -4250,13 +4258,7 @@ void kfree(const void *x)
>  
>  	page = virt_to_head_page(x);
>  	if (unlikely(!PageSlab(page))) {
> -		unsigned int order = compound_order(page);
> -
> -		BUG_ON(!PageCompound(page));
> -		kfree_hook(object);
> -		mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE_B,
> -				      -(PAGE_SIZE << order));
> -		__free_pages(page, order);
> +		free_nonslab_page(page);
>  		return;
>  	}
>  	slab_free(page->slab_cache, page, object, NULL, 1, _RET_IP_);
> _

This patch, now in mainline as commit f227f0faf63b ("slub: fix
unreclaimable slab stat for bulk free"), causes the KASAN KUnit test
kmalloc_pagealloc_invalid_free to no longer pass:

[    0.000000] Linux version 5.14.0-rc3-00066-gf227f0faf63b (nathan@archlinux-ax161) (x86_64-linux-gcc (GCC) 11.2.0, GNU ld (GNU Binutils) 2.37) #1 SMP Sat Jul 31 15:08:11 MST 2021
...
[    5.717678]     # kmalloc_pagealloc_invalid_free: EXPECTATION FAILED at lib/test_kasan.c:203
[    5.717678]     KASAN failure expected in "kfree(ptr + 1)", but none occurred
[    5.718909]     not ok 6 - kmalloc_pagealloc_invalid_free
...
[    9.481520] not ok 1 - kasan

The previous commit is fine:

[    0.000000] Linux version 5.14.0-rc3-00065-gb5916c025432 (nathan@archlinux-ax161) (x86_64-linux-gcc (GCC) 11.2.0, GNU ld (GNU Binutils) 2.37) #1 SMP Sat Jul 31 15:05:09 MST 2021
...
[    9.347598] ok 1 - kasan

I am by no means a KASAN or mm/ expert; I noticed this when trying to
test KASAN with clang for ClangBuiltLinux's CI, so it does not appear
to be compiler-dependent. It is reproducible for me in QEMU with
x86_64_defconfig + CONFIG_KASAN=y + CONFIG_KUNIT=y +
CONFIG_KASAN_KUNIT_TEST=y.
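
For reference, the failing check in lib/test_kasan.c amounts to roughly the
following (a simplified sketch of the test, not a verbatim copy):

static void kmalloc_pagealloc_invalid_free(struct kunit *test)
{
	char *ptr;
	size_t size = KMALLOC_MAX_CACHE_SIZE + 10;	/* force the page allocator path */

	ptr = kmalloc(size, GFP_KERNEL);
	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);

	/* Freeing an offset pointer is expected to trigger a KASAN report. */
	KUNIT_EXPECT_KASAN_FAIL(test, kfree(ptr + 1));
}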

Please let me know if there is any other information I can provide or
testing I can do.

Cheers,
Nathan
Shakeel Butt Aug. 1, 2021, 5:32 a.m. UTC | #2
Hi Nathan,

On Sat, Jul 31, 2021 at 3:18 PM Nathan Chancellor <nathan@kernel.org> wrote:
>
> On Thu, Jul 29, 2021 at 02:53:50PM -0700, Andrew Morton wrote:
> > From: Shakeel Butt <shakeelb@google.com>
> > Subject: slub: fix unreclaimable slab stat for bulk free
> >
> > [...]
>
> This patch, now in mainline as commit f227f0faf63b ("slub: fix
> unreclaimable slab stat for bulk free"), causes the KASAN KUnit test
> kmalloc_pagealloc_invalid_free to no longer pass:
>
> [    0.000000] Linux version 5.14.0-rc3-00066-gf227f0faf63b (nathan@archlinux-ax161) (x86_64-linux-gcc (GCC) 11.2.0, GNU ld (GNU Binutils) 2.37) #1 SMP Sat Jul 31 15:08:11 MST 2021
> ...
> [    5.717678]     # kmalloc_pagealloc_invalid_free: EXPECTATION FAILED at lib/test_kasan.c:203
> [    5.717678]     KASAN failure expected in "kfree(ptr + 1)", but none occurred
> [    5.718909]     not ok 6 - kmalloc_pagealloc_invalid_free
> ...
> [    9.481520] not ok 1 - kasan
>
> The previous commit is fine:
>
> [    0.000000] Linux version 5.14.0-rc3-00065-gb5916c025432 (nathan@archlinux-ax161) (x86_64-linux-gcc (GCC) 11.2.0, GNU ld (GNU Binutils) 2.37) #1 SMP Sat Jul 31 15:05:09 MST 2021
> ...
> [    9.347598] ok 1 - kasan
>
> I am by no means a KASAN or mm/ expert; I noticed this when trying to
> test KASAN with clang for ClangBuiltLinux's CI, so it does not appear
> to be compiler-dependent. It is reproducible for me in QEMU with
> x86_64_defconfig + CONFIG_KASAN=y + CONFIG_KUNIT=y +
> CONFIG_KASAN_KUNIT_TEST=y.
>
> Please let me know if there is any other information I can provide or
> testing I can do.
>

Thanks for the report. This is actually due to changing
kfree_hook(object) to kfree_hook(page_address(page)). The test forces
SLUB to go to the page allocator and then frees with the address of the
next byte instead of the address that kmalloc() returned. Since both
addresses are on the same page, the code is fine, but the KASAN test is
not happy.
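
For context, the invalid-free check that KASAN applies to page-allocator
backed kfree() boils down to something like this (a paraphrase of the idea
with a made-up helper name, not the exact mm/kasan code):

/*
 * Paraphrased sketch: for an object that came straight from the page
 * allocator, KASAN only accepts a free of the exact start of the
 * allocation.  With kfree_hook(page_address(page)) the pointer always
 * matches, so kfree(ptr + 1) is no longer flagged.
 */
static bool large_kfree_is_invalid(const void *ptr)
{
	return ptr != page_address(virt_to_head_page(ptr));
}
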

The test is making sure that programmers use the address returned by
kmalloc in the kfree. I don't think this is urgent but I will send the
patch to fix this during the week.

thanks,
Shakeel
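
One way to restore the expected KASAN report (a hypothetical sketch, not
necessarily the patch that was eventually sent) is to pass the original
object pointer through to the KASAN hook while keeping the shared stat
update:

/*
 * Hypothetical sketch: hand the caller-supplied pointer to kfree_hook()
 * so kfree(ptr + 1) is still reported as an invalid free, while the stat
 * update and __free_pages() keep using the head page.
 */
static inline void free_nonslab_page(struct page *page, void *object)
{
	unsigned int order = compound_order(page);

	VM_BUG_ON_PAGE(!PageCompound(page), page);
	kfree_hook(object);
	mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE_B,
			      -(PAGE_SIZE << order));
	__free_pages(page, order);
}

The two callers in build_detached_freelist() and kfree() would then pass the
object they were handed instead of the page address.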

Patch

--- a/mm/slub.c~slub-fix-unreclaimable-slab-stat-for-bulk-free
+++ a/mm/slub.c
@@ -3236,6 +3236,16 @@  struct detached_freelist {
 	struct kmem_cache *s;
 };
 
+static inline void free_nonslab_page(struct page *page)
+{
+	unsigned int order = compound_order(page);
+
+	VM_BUG_ON_PAGE(!PageCompound(page), page);
+	kfree_hook(page_address(page));
+	mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE_B, -(PAGE_SIZE << order));
+	__free_pages(page, order);
+}
+
 /*
  * This function progressively scans the array with free objects (with
  * a limited look ahead) and extract objects belonging to the same
@@ -3272,9 +3282,7 @@  int build_detached_freelist(struct kmem_
 	if (!s) {
 		/* Handle kalloc'ed objects */
 		if (unlikely(!PageSlab(page))) {
-			BUG_ON(!PageCompound(page));
-			kfree_hook(object);
-			__free_pages(page, compound_order(page));
+			free_nonslab_page(page);
 			p[size] = NULL; /* mark object processed */
 			return size;
 		}
@@ -4250,13 +4258,7 @@  void kfree(const void *x)
 
 	page = virt_to_head_page(x);
 	if (unlikely(!PageSlab(page))) {
-		unsigned int order = compound_order(page);
-
-		BUG_ON(!PageCompound(page));
-		kfree_hook(object);
-		mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE_B,
-				      -(PAGE_SIZE << order));
-		__free_pages(page, order);
+		free_nonslab_page(page);
 		return;
 	}
 	slab_free(page->slab_cache, page, object, NULL, 1, _RET_IP_);