diff mbox series

[4/5] mm: memcg/slab: Fix root memcg vmstats

Message ID 20201027080256.76497-5-songmuchun@bytedance.com (mailing list archive)
State New, archived
Series Fix some bugs in memcg/slab

Commit Message

Muchun Song Oct. 27, 2020, 8:02 a.m. UTC
If we reparent slab objects to the root memcg, then when we free
such an object we need to update the per-memcg vmstats to keep
them correct for the root memcg. Currently this at least affects the
NR_KERNEL_STACK_KB vmstat for !CONFIG_VMAP_STACK when the thread
stack size is smaller than PAGE_SIZE.

Fixes: ec9f02384f60 ("mm: workingset: fix vmstat counters for shadow nodes")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 mm/memcontrol.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

Roman Gushchin Oct. 27, 2020, 6:48 p.m. UTC | #1
On Tue, Oct 27, 2020 at 04:02:55PM +0800, Muchun Song wrote:
> If we reparent the slab objects to the root memcg, when we free
> the slab object, we need to update the per-memcg vmstats to keep
> it correct for the root memcg. Now this at least affects the vmstat
> of NR_KERNEL_STACK_KB for !CONFIG_VMAP_STACK when the thread stack
> size is smaller than the PAGE_SIZE.
> 
> Fixes: ec9f02384f60 ("mm: workingset: fix vmstat counters for shadow nodes")
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>

Can you, please, drop this patch for now?

I'm working on a bigger cleanup related to the handling of the root memory
cgroup (I sent a link earlier in this thread), which already does a similar change.
There are several issues like this one, so it will be nice to fix them all at once.

Thank you!

> ---
>  mm/memcontrol.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 22b4fb941b54..70345b15b150 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -875,8 +875,13 @@ void __mod_lruvec_kmem_state(void *p, enum node_stat_item idx, int val)
>  	rcu_read_lock();
>  	memcg = mem_cgroup_from_obj(p);
>  
> -	/* Untracked pages have no memcg, no lruvec. Update only the node */
> -	if (!memcg || memcg == root_mem_cgroup) {
> +	/*
> +	 * Untracked pages have no memcg, no lruvec. Update only the
> +	 * node. If we reparent the slab objects to the root memcg,
> +	 * when we free the slab object, we need to update the per-memcg
> +	 * vmstats to keep it correct for the root memcg.
> +	 */
> +	if (!memcg) {
>  		__mod_node_page_state(pgdat, idx, val);
>  	} else {
>  		lruvec = mem_cgroup_lruvec(memcg, pgdat);
> -- 
> 2.20.1
>
Muchun Song Oct. 28, 2020, 2:56 a.m. UTC | #2
On Wed, Oct 28, 2020 at 2:48 AM Roman Gushchin <guro@fb.com> wrote:
>
> On Tue, Oct 27, 2020 at 04:02:55PM +0800, Muchun Song wrote:
> > If we reparent the slab objects to the root memcg, when we free
> > the slab object, we need to update the per-memcg vmstats to keep
> > it correct for the root memcg. Now this at least affects the vmstat
> > of NR_KERNEL_STACK_KB for !CONFIG_VMAP_STACK when the thread stack
> > size is smaller than the PAGE_SIZE.
> >
> > Fixes: ec9f02384f60 ("mm: workingset: fix vmstat counters for shadow nodes")
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>
> Can you, please, drop this patch for now?
>
> I'm working on a bigger cleanup related to the handling of the root memory
> cgroup (I sent a link earlier in this thread), which already does a similar change.
> There are several issues like this one, so it will be nice to fix them all at once.

I have read the patch at https://lkml.org/lkml/2020/10/14/869. Do you
mean that this patch fixes the issue? It chooses to uncharge the root
memcg. But here we may need to uncharge the root memcg to keep the root
vmstats correct. If we do not do this, we can see wrong vmstats via the
root memory.stat (e.g. NR_KERNEL_STACK_KB).

Thanks.

>
> Thank you!
>
> > ---
> >  mm/memcontrol.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 22b4fb941b54..70345b15b150 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -875,8 +875,13 @@ void __mod_lruvec_kmem_state(void *p, enum node_stat_item idx, int val)
> >       rcu_read_lock();
> >       memcg = mem_cgroup_from_obj(p);
> >
> > -     /* Untracked pages have no memcg, no lruvec. Update only the node */
> > -     if (!memcg || memcg == root_mem_cgroup) {
> > +     /*
> > +      * Untracked pages have no memcg, no lruvec. Update only the
> > +      * node. If we reparent the slab objects to the root memcg,
> > +      * when we free the slab object, we need to update the per-memcg
> > +      * vmstats to keep it correct for the root memcg.
> > +      */
> > +     if (!memcg) {
> >               __mod_node_page_state(pgdat, idx, val);
> >       } else {
> >               lruvec = mem_cgroup_lruvec(memcg, pgdat);
> > --
> > 2.20.1
> >
Muchun Song Oct. 28, 2020, 3:47 a.m. UTC | #3
On Wed, Oct 28, 2020 at 10:56 AM Muchun Song <songmuchun@bytedance.com> wrote:
>
> On Wed, Oct 28, 2020 at 2:48 AM Roman Gushchin <guro@fb.com> wrote:
> >
> > On Tue, Oct 27, 2020 at 04:02:55PM +0800, Muchun Song wrote:
> > > If we reparent the slab objects to the root memcg, when we free
> > > the slab object, we need to update the per-memcg vmstats to keep
> > > it correct for the root memcg. Now this at least affects the vmstat
> > > of NR_KERNEL_STACK_KB for !CONFIG_VMAP_STACK when the thread stack
> > > size is smaller than the PAGE_SIZE.
> > >
> > > Fixes: ec9f02384f60 ("mm: workingset: fix vmstat counters for shadow nodes")
> > > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> >
> > Can you, please, drop this patch for now?
> >
> > I'm working on a bigger cleanup related to the handling of the root memory
> > cgroup (I sent a link earlier in this thread), which already does a similar change.
> > There are several issues like this one, so it will be nice to fix them all at once.
>
> I have read the patch of https://lkml.org/lkml/2020/10/14/869. You
> mean this patch
> fixes this issue? It chooses to uncharge the root memcg. But here we may need to

Here I mean "It chooses to not uncharge the root memcg", sorry.

> uncharge the root memcg to keep root vmstats correct. If we do not do
> this, we can
> see the wrong vmstats via root memory.stat(e.g. NR_KERNEL_STACK_KB).
>
> Thanks.
>
> >
> > Thank you!
> >
> > > ---
> > >  mm/memcontrol.c | 9 +++++++--
> > >  1 file changed, 7 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > index 22b4fb941b54..70345b15b150 100644
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -875,8 +875,13 @@ void __mod_lruvec_kmem_state(void *p, enum node_stat_item idx, int val)
> > >       rcu_read_lock();
> > >       memcg = mem_cgroup_from_obj(p);
> > >
> > > -     /* Untracked pages have no memcg, no lruvec. Update only the node */
> > > -     if (!memcg || memcg == root_mem_cgroup) {
> > > +     /*
> > > +      * Untracked pages have no memcg, no lruvec. Update only the
> > > +      * node. If we reparent the slab objects to the root memcg,
> > > +      * when we free the slab object, we need to update the per-memcg
> > > +      * vmstats to keep it correct for the root memcg.
> > > +      */
> > > +     if (!memcg) {
> > >               __mod_node_page_state(pgdat, idx, val);
> > >       } else {
> > >               lruvec = mem_cgroup_lruvec(memcg, pgdat);
> > > --
> > > 2.20.1
> > >
>
>
>
> --
> Yours,
> Muchun
Roman Gushchin Oct. 29, 2020, 12:14 a.m. UTC | #4
On Wed, Oct 28, 2020 at 10:56:20AM +0800, Muchun Song wrote:
> On Wed, Oct 28, 2020 at 2:48 AM Roman Gushchin <guro@fb.com> wrote:
> >
> > On Tue, Oct 27, 2020 at 04:02:55PM +0800, Muchun Song wrote:
> > > If we reparent the slab objects to the root memcg, when we free
> > > the slab object, we need to update the per-memcg vmstats to keep
> > > it correct for the root memcg. Now this at least affects the vmstat
> > > of NR_KERNEL_STACK_KB for !CONFIG_VMAP_STACK when the thread stack
> > > size is smaller than the PAGE_SIZE.
> > >
> > > Fixes: ec9f02384f60 ("mm: workingset: fix vmstat counters for shadow nodes")
> > > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> >
> > Can you, please, drop this patch for now?
> >
> > I'm working on a bigger cleanup related to the handling of the root memory
> > cgroup (I sent a link earlier in this thread), which already does a similar change.
> > There are several issues like this one, so it will be nice to fix them all at once.
> 
> I have read the patch of https://lkml.org/lkml/2020/10/14/869. You
> mean this patch
> fixes this issue? It chooses to uncharge the root memcg. But here we may need to
> uncharge the root memcg to keep root vmstats correct. If we do not do
> this, we can
> see the wrong vmstats via root memory.stat(e.g. NR_KERNEL_STACK_KB).

I pointed at a different patch in the same thread (it looks like you read the first one):
https://lkml.org/lkml/2020/10/21/612

It contained the following part:

@@ -868,7 +860,7 @@ void __mod_lruvec_slab_state(void *p, enum node_stat_item idx, int val)
 	memcg = mem_cgroup_from_obj(p);
 
 	/* Untracked pages have no memcg, no lruvec. Update only the node */
-	if (!memcg || memcg == root_mem_cgroup) {
+	if (!memcg) {
 		__mod_node_page_state(pgdat, idx, val);
 	} else {
 		lruvec = mem_cgroup_lruvec(memcg, pgdat);

So it's exactly what your patch does.

Thanks!
Muchun Song Oct. 29, 2020, 6:15 a.m. UTC | #5
On Thu, Oct 29, 2020 at 8:14 AM Roman Gushchin <guro@fb.com> wrote:
>
> On Wed, Oct 28, 2020 at 10:56:20AM +0800, Muchun Song wrote:
> > On Wed, Oct 28, 2020 at 2:48 AM Roman Gushchin <guro@fb.com> wrote:
> > >
> > > On Tue, Oct 27, 2020 at 04:02:55PM +0800, Muchun Song wrote:
> > > > If we reparent the slab objects to the root memcg, when we free
> > > > the slab object, we need to update the per-memcg vmstats to keep
> > > > it correct for the root memcg. Now this at least affects the vmstat
> > > > of NR_KERNEL_STACK_KB for !CONFIG_VMAP_STACK when the thread stack
> > > > size is smaller than the PAGE_SIZE.
> > > >
> > > > Fixes: ec9f02384f60 ("mm: workingset: fix vmstat counters for shadow nodes")
> > > > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > >
> > > Can you, please, drop this patch for now?
> > >
> > > I'm working on a bigger cleanup related to the handling of the root memory
> > > cgroup (I sent a link earlier in this thread), which already does a similar change.
> > > There are several issues like this one, so it will be nice to fix them all at once.
> >
> > I have read the patch of https://lkml.org/lkml/2020/10/14/869. You
> > mean this patch
> > fixes this issue? It chooses to uncharge the root memcg. But here we may need to
> > uncharge the root memcg to keep root vmstats correct. If we do not do
> > this, we can
> > see the wrong vmstats via root memory.stat(e.g. NR_KERNEL_STACK_KB).
>
> I pointed at a different patch in the same thread (it looks like you read the first one):
> https://lkml.org/lkml/2020/10/21/612

Got it. Thanks. That is fine with me.

>
> It contained the following part:
>
> @@ -868,7 +860,7 @@ void __mod_lruvec_slab_state(void *p, enum node_stat_item idx, int val)
>         memcg = mem_cgroup_from_obj(p);
>
>         /* Untracked pages have no memcg, no lruvec. Update only the node */
> -       if (!memcg || memcg == root_mem_cgroup) {
> +       if (!memcg) {
>                 __mod_node_page_state(pgdat, idx, val);
>         } else {
>                 lruvec = mem_cgroup_lruvec(memcg, pgdat);
>
> So it's exactly what your patch does.
>
> Thanks!
Roman Gushchin Nov. 10, 2020, 1:32 a.m. UTC | #6
On Thu, Oct 29, 2020 at 02:15:43PM +0800, Muchun Song wrote:
> On Thu, Oct 29, 2020 at 8:14 AM Roman Gushchin <guro@fb.com> wrote:
> >
> > On Wed, Oct 28, 2020 at 10:56:20AM +0800, Muchun Song wrote:
> > > On Wed, Oct 28, 2020 at 2:48 AM Roman Gushchin <guro@fb.com> wrote:
> > > >
> > > > On Tue, Oct 27, 2020 at 04:02:55PM +0800, Muchun Song wrote:
> > > > > If we reparent the slab objects to the root memcg, when we free
> > > > > the slab object, we need to update the per-memcg vmstats to keep
> > > > > it correct for the root memcg. Now this at least affects the vmstat
> > > > > of NR_KERNEL_STACK_KB for !CONFIG_VMAP_STACK when the thread stack
> > > > > size is smaller than the PAGE_SIZE.
> > > > >
> > > > > Fixes: ec9f02384f60 ("mm: workingset: fix vmstat counters for shadow nodes")
> > > > > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > > >
> > > > Can you, please, drop this patch for now?
> > > >
> > > > I'm working on a bigger cleanup related to the handling of the root memory
> > > > cgroup (I sent a link earlier in this thread), which already does a similar change.
> > > > There are several issues like this one, so it will be nice to fix them all at once.
> > >
> > > I have read the patch of https://lkml.org/lkml/2020/10/14/869. You
> > > mean this patch
> > > fixes this issue? It chooses to uncharge the root memcg. But here we may need to
> > > uncharge the root memcg to keep root vmstats correct. If we do not do
> > > this, we can
> > > see the wrong vmstats via root memory.stat(e.g. NR_KERNEL_STACK_KB).
> >
> > I pointed at a different patch in the same thread (it looks like you read the first one):
> > https://lkml.org/lkml/2020/10/21/612

Hi Muchun!

Can you please resend your patch? The planned cleanup of the root memory cgroup
is more complex than expected, so I think it makes sense to merge your patch without
waiting for it. I'm sorry for delaying it initially.

Please, feel free to add
Acked-by: Roman Gushchin <guro@fb.com>

Thank you!

Roman
Muchun Song Nov. 10, 2020, 2:57 a.m. UTC | #7
On Tue, Nov 10, 2020 at 9:33 AM Roman Gushchin <guro@fb.com> wrote:
>
> On Thu, Oct 29, 2020 at 02:15:43PM +0800, Muchun Song wrote:
> > On Thu, Oct 29, 2020 at 8:14 AM Roman Gushchin <guro@fb.com> wrote:
> > >
> > > On Wed, Oct 28, 2020 at 10:56:20AM +0800, Muchun Song wrote:
> > > > On Wed, Oct 28, 2020 at 2:48 AM Roman Gushchin <guro@fb.com> wrote:
> > > > >
> > > > > On Tue, Oct 27, 2020 at 04:02:55PM +0800, Muchun Song wrote:
> > > > > > If we reparent the slab objects to the root memcg, when we free
> > > > > > the slab object, we need to update the per-memcg vmstats to keep
> > > > > > it correct for the root memcg. Now this at least affects the vmstat
> > > > > > of NR_KERNEL_STACK_KB for !CONFIG_VMAP_STACK when the thread stack
> > > > > > size is smaller than the PAGE_SIZE.
> > > > > >
> > > > > > Fixes: ec9f02384f60 ("mm: workingset: fix vmstat counters for shadow nodes")
> > > > > > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > > > >
> > > > > Can you, please, drop this patch for now?
> > > > >
> > > > > I'm working on a bigger cleanup related to the handling of the root memory
> > > > > cgroup (I sent a link earlier in this thread), which already does a similar change.
> > > > > There are several issues like this one, so it will be nice to fix them all at once.
> > > >
> > > > I have read the patch of https://lkml.org/lkml/2020/10/14/869. You
> > > > mean this patch
> > > > fixes this issue? It chooses to uncharge the root memcg. But here we may need to
> > > > uncharge the root memcg to keep root vmstats correct. If we do not do
> > > > this, we can
> > > > see the wrong vmstats via root memory.stat(e.g. NR_KERNEL_STACK_KB).
> > >
> > > I pointed at a different patch in the same thread (it looks like you read the first one):
> > > https://lkml.org/lkml/2020/10/21/612
>
> Hi Muchun!
>
> Can you please, resend your patch? The planned cleanup of the root memory cgroup
> is more complex than expected, so I think it makes sense to merge your patch without
> waiting for it. I'm sorry for delaying it initially.

OK, I will do that. Thanks.

>
> Please, feel free to add
> Acked-by: Roman Gushchin <guro@fb.com>
>
> Thank you!
>
> Roman

Patch

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 22b4fb941b54..70345b15b150 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -875,8 +875,13 @@  void __mod_lruvec_kmem_state(void *p, enum node_stat_item idx, int val)
 	rcu_read_lock();
 	memcg = mem_cgroup_from_obj(p);
 
-	/* Untracked pages have no memcg, no lruvec. Update only the node */
-	if (!memcg || memcg == root_mem_cgroup) {
+	/*
+	 * Untracked pages have no memcg, no lruvec. Update only the
+	 * node. If we reparent the slab objects to the root memcg,
+	 * when we free the slab object, we need to update the per-memcg
+	 * vmstats to keep it correct for the root memcg.
+	 */
+	if (!memcg) {
 		__mod_node_page_state(pgdat, idx, val);
 	} else {
 		lruvec = mem_cgroup_lruvec(memcg, pgdat);
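
To illustrate why the old check can leave the root counter skewed, here is a
minimal userspace sketch. It is not kernel code: struct model_memcg,
struct model_state, model_mod_state() and model_root_residue() are hypothetical
names made up for this example. It assumes a single node, one child memcg
directly under the root, and immediate propagation of updates to ancestors
(no per-cpu batching).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg with one hierarchical counter. */
struct model_memcg {
	struct model_memcg *parent;	/* NULL for the root memcg */
	long stat_kb;			/* e.g. NR_KERNEL_STACK_KB, in KB */
};

/* One node-wide counter plus a root memcg and a single child. */
struct model_state {
	long node_kb;
	struct model_memcg root;
	struct model_memcg child;
};

void model_init(struct model_state *s)
{
	s->node_kb = 0;
	s->root.parent = NULL;
	s->root.stat_kb = 0;
	s->child.parent = &s->root;
	s->child.stat_kb = 0;
}

/*
 * Models the branch touched by the patch. With skip_root set, the root
 * memcg is treated like "no memcg" (pre-patch check); otherwise only a
 * NULL memcg takes the node-only path (post-patch check).
 */
void model_mod_state(struct model_state *s, struct model_memcg *memcg,
		     long val, bool skip_root)
{
	s->node_kb += val;	/* the node counter is updated on both paths */

	if (!memcg || (skip_root && memcg == &s->root))
		return;

	/* Propagate to the owner and all of its ancestors. */
	for (; memcg; memcg = memcg->parent)
		memcg->stat_kb += val;
}

/*
 * Allocate 4 KB owned by the child, reparent it to the root, then free
 * it. Returns what is left in the root memcg counter afterwards.
 */
long model_root_residue(bool skip_root)
{
	struct model_state s;
	struct model_memcg *owner;

	model_init(&s);
	owner = &s.child;
	model_mod_state(&s, owner, 4, skip_root);	/* allocation */

	owner = owner->parent;				/* reparent to root */
	assert(owner == &s.root);

	model_mod_state(&s, owner, -4, skip_root);	/* free */
	assert(s.node_kb == 0);		/* node counter balances either way */

	return s.root.stat_kb;
}
```

Under these assumptions, model_root_residue(true) leaves 4 KB stuck in the
root counter because the free is never applied to the root memcg, while
model_root_residue(false) brings it back to 0. The node-wide counter
balances in both cases, which is why only the root memory.stat view is affected.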