
[STABLE] slub: make ->cpu_partial unsigned int

Message ID 1538303301-61784-1-git-send-email-zhongjiang@huawei.com (mailing list archive)
State New, archived
Series [STABLE] slub: make ->cpu_partial unsigned int

Commit Message

zhong jiang Sept. 30, 2018, 10:28 a.m. UTC
From: Alexey Dobriyan <adobriyan@gmail.com>

[ Upstream commit e5d9998f3e09359b372a037a6ac55ba235d95d57 ]

        /*
         * cpu_partial determined the maximum number of objects
         * kept in the per cpu partial lists of a processor.
         */

Can't be negative.

I hit a real issue that results in a large memory leak. Because slabs
can be freed in interrupt context, put_cpu_partial() can be interrupted
more than once. lru and pobjects share a union in struct page, so when
another core manipulates the page->lru list (for example,
remove_partial() in the slab-freeing path), pobjects can end up holding
a negative value (0xdead0000). As a result, a large number of slabs are
added to the per-CPU partial list.
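
To illustrate, here is a simplified user-space sketch of how the
overlap produces that value. It assumes the 4.x-era struct page layout
on a little-endian 64-bit kernel and the usual 0xdead... list-poison
constant; the struct and macro below are stand-ins rather than the
kernel definitions.

#include <stdio.h>

/* Stand-ins for the kernel definitions (illustration only). */
struct list_head { struct list_head *next, *prev; };
#define LIST_POISON2 ((struct list_head *)0xdead000000000200UL)

/* Simplified sketch of the relevant union in the 4.x struct page;
 * everything unrelated is omitted. */
struct page_sketch {
	union {
		struct list_head lru;		/* node partial list linkage */
		struct {			/* slub per-CPU partial list */
			struct page_sketch *next;	/* next partial slab */
			int pages;			/* nr of partial slabs left */
			int pobjects;			/* approximate object count */
		};
	};
};

int main(void)
{
	struct page_sketch page = { 0 };

	/* remove_partial() -> list_del() poisons page->lru.prev ... */
	page.lru.prev = LIST_POISON2;

	/* ... and pobjects shares storage with the high half of lru.prev,
	 * so it reads back as a large negative int. */
	printf("pobjects = %#x (%d)\n", (unsigned int)page.pobjects,
	       page.pobjects);
	return 0;
}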

I posted this issue to the community earlier; the detailed description is here:

https://www.spinics.net/lists/kernel/msg2870979.html

After applying the patch, the issue is fixed, so it is an effective
bugfix and should go into stable.

Link: http://lkml.kernel.org/r/20180305200730.15812-15-adobriyan@gmail.com
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <stable@vger.kernel.org> # 4.4.x
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
 include/linux/slub_def.h | 3 ++-
 mm/slub.c                | 6 +++---
 2 files changed, 5 insertions(+), 4 deletions(-)

Comments

Greg KH Sept. 30, 2018, 12:37 p.m. UTC | #1
On Sun, Sep 30, 2018 at 06:28:21PM +0800, zhong jiang wrote:
> From: Alexey Dobriyan <adobriyan@gmail.com>
> 
> [ Upstream commit e5d9998f3e09359b372a037a6ac55ba235d95d57 ]
> 
> [...]
> 
> Cc: <stable@vger.kernel.org> # 4.4.x

This didn't apply to 4.14.y. Any reason you didn't cc: the stable
mailing list so the other stable developers could see it?

I've fixed up the patch, but next time please always cc: the stable
list.

thanks,

greg k-h
Matthew Wilcox (Oracle) Sept. 30, 2018, 12:50 p.m. UTC | #2
On Sun, Sep 30, 2018 at 06:28:21PM +0800, zhong jiang wrote:
> From: Alexey Dobriyan <adobriyan@gmail.com>
> 
> [ Upstream commit e5d9998f3e09359b372a037a6ac55ba235d95d57 ]
> 
> [...]
> 
> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
> Acked-by: Christoph Lameter <cl@linux.com>

Hang on.  Christoph acked the _original_ patch going into upstream.
When he reviewed this patch for _stable_ last week, he asked for more
investigation.  Including this patch in stable is misleading.

Greg KH Sept. 30, 2018, 1:10 p.m. UTC | #3
On Sun, Sep 30, 2018 at 05:50:38AM -0700, Matthew Wilcox wrote:
> On Sun, Sep 30, 2018 at 06:28:21PM +0800, zhong jiang wrote:
> > [...]
> > 
> > Acked-by: Christoph Lameter <cl@linux.com>
> 
> Hang on.  Christoph acked the _original_ patch going into upstream.
> When he reviewed this patch for _stable_ last week, he asked for more
> investigation.  Including this patch in stable is misleading.

But the original patch has been in upstream for a long time now (it went
into 4.17-rc1).  If there was a real problem here, wouldn't it have
been resolved already?

And the patch in mainline has Christoph's ack...

thanks,

greg k-h
Matthew Wilcox (Oracle) Sept. 30, 2018, 1:23 p.m. UTC | #4
On Sun, Sep 30, 2018 at 06:10:26AM -0700, Greg KH wrote:
> On Sun, Sep 30, 2018 at 05:50:38AM -0700, Matthew Wilcox wrote:
> > On Sun, Sep 30, 2018 at 06:28:21PM +0800, zhong jiang wrote:
> > > [...]
> > > 
> > > Acked-by: Christoph Lameter <cl@linux.com>
> > 
> > Hang on.  Christoph acked the _original_ patch going into upstream.
> > When he reviewed this patch for _stable_ last week, he asked for more
> > investigation.  Including this patch in stable is misleading.
> 
> But the original patch has been in upstream for a long time now (it went
> into 4.17-rc1).  If there was a real problem here, wouldn't it have
> been resolved already?
> 
> And the patch in mainline has Christoph's ack...

I'm not saying there's a problem with the patch.  It's that the rationale
for backporting doesn't make any damned sense.  There's something going
on that nobody understands.  This patch is probably masking an underlying
problem that will pop back up and bite us again someday.
Christoph Lameter (Ampere) Oct. 2, 2018, 2:50 p.m. UTC | #5
On Sun, 30 Sep 2018, Matthew Wilcox wrote:

> > And the patch in mainline has Christoph's ack...
>
> I'm not saying there's a problem with the patch.  It's that the rationale
> for backporting doesn't make any damned sense.  There's something going
> on that nobody understands.  This patch is probably masking an underlying
> problem that will pop back up and bite us again someday.

Right. That is why I raised the issue. I do not see any harm in
backporting, but I do not think it fixes the real issue, which may lie
in concurrent use of overlapping struct page fields.
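
To make the masking mechanism concrete: with a signed cpu_partial, the
drain check in put_cpu_partial() compares the poisoned pobjects as a
negative int, so it never fires and the per-CPU partial list grows
without bound; once cpu_partial is unsigned, the usual arithmetic
conversions make the same comparison unsigned, the poisoned value looks
huge, and draining kicks in. That hides the symptom without touching
the underlying struct page race. A minimal sketch (the check is
paraphrased from put_cpu_partial(); names and values here are
illustrative):

#include <stdio.h>

/* Paraphrased drain check from put_cpu_partial(); all other logic omitted. */
static int should_drain_old(int pobjects, int cpu_partial)
{
	return pobjects > cpu_partial;	/* int vs int: signed comparison */
}

static int should_drain_new(int pobjects, unsigned int cpu_partial)
{
	/* int vs unsigned int: pobjects is converted to unsigned first */
	return pobjects > cpu_partial;
}

int main(void)
{
	int poisoned = (int)0xdead0000;	/* pobjects after list poisoning */
	unsigned int limit = 30;	/* an illustrative cpu_partial value */

	/* Before the patch: never true, so slabs keep piling up. */
	printf("signed   drain check: %d\n",
	       should_drain_old(poisoned, (int)limit));

	/* After the patch: the poisoned value compares as huge, so we drain. */
	printf("unsigned drain check: %d\n",
	       should_drain_new(poisoned, limit));
	return 0;
}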

Patch

diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 3388511..9b681f2 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -67,7 +67,8 @@ struct kmem_cache {
 	int size;		/* The size of an object including meta data */
 	int object_size;	/* The size of an object without meta data */
 	int offset;		/* Free pointer offset. */
-	int cpu_partial;	/* Number of per cpu partial objects to keep around */
+	/* Number of per cpu partial objects to keep around */
+	unsigned int cpu_partial;
 	struct kmem_cache_order_objects oo;
 
 	/* Allocation and freeing of slabs */
diff --git a/mm/slub.c b/mm/slub.c
index 2284c43..c33b0e1 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1661,7 +1661,7 @@ static void *get_partial_node(struct kmem_cache *s, struct kmem_cache_node *n,
 {
 	struct page *page, *page2;
 	void *object = NULL;
-	int available = 0;
+	unsigned int available = 0;
 	int objects;
 
 	/*
@@ -4674,10 +4674,10 @@ static ssize_t cpu_partial_show(struct kmem_cache *s, char *buf)
 static ssize_t cpu_partial_store(struct kmem_cache *s, const char *buf,
 				 size_t length)
 {
-	unsigned long objects;
+	unsigned int objects;
 	int err;
 
-	err = kstrtoul(buf, 10, &objects);
+	err = kstrtouint(buf, 10, &objects);
 	if (err)
 		return err;
 	if (objects && !kmem_cache_has_cpu_partial(s))