[mmotm] mm: memcontrol: decouple reference counting from page accounting fix

Message ID alpine.LSU.2.11.2007302011450.2347@eggly.anvils (mailing list archive)
State New, archived
Series [mmotm] mm: memcontrol: decouple reference counting from page accounting fix

Commit Message

Hugh Dickins July 31, 2020, 3:17 a.m. UTC
Moving tasks between mem cgroups with memory.move_charge_at_immigrate 3,
while swapping, crashes soon on mmotm (and so presumably on linux-next):
for example, spinlock found corrupted when lock_page_memcg() is called.
It's as if the mem cgroup structures have been freed too early.

Stab in the dark: what if all the accounting is right, except that the
css_put_many() in __mem_cgroup_clear_mc() is now (worse than) redundant?
Removing it fixes the crashes, but that's hardly surprising; and stats
temporarily hacked into mem_cgroup_css_alloc() and mem_cgroup_css_free()
showed that mem cgroups were not being leaked with this change.

Note: this removes the last call to css_put_many() from the tree; and
mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
removes the last call to css_get_many(): now that their last references
have gone, I expect them soon to be freed from include/linux/cgroup.h.

Signed-off-by: Hugh Dickins <hughd@google.com>
---
Fixes mm-memcontrol-decouple-reference-counting-from-page-accounting.patch

 mm/memcontrol.c |    2 --
 1 file changed, 2 deletions(-)
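
The leak check mentioned in the commit message (temporary stats in
mem_cgroup_css_alloc() and mem_cgroup_css_free()) can be pictured with a
small userspace model. This is only a sketch of the idea, not the debug code
that was actually used: every model_* name below is hypothetical, and plain
calloc()/free() stand in for the kernel's allocation and teardown paths.

```c
#include <stdlib.h>

/* Hypothetical debug counters; nothing like this exists in mm/memcontrol.c. */
static long model_nr_alloc;
static long model_nr_free;

/* Hypothetical stand-in for struct mem_cgroup. */
struct model_memcg {
	int id;
};

/* Models mem_cgroup_css_alloc(): count every successful allocation. */
static struct model_memcg *model_css_alloc(void)
{
	struct model_memcg *memcg = calloc(1, sizeof(*memcg));

	if (memcg)
		model_nr_alloc++;
	return memcg;
}

/* Models mem_cgroup_css_free(): count every group that is torn down. */
static void model_css_free(struct model_memcg *memcg)
{
	if (!memcg)
		return;
	free(memcg);
	model_nr_free++;
}

/* Groups allocated but not yet freed; nonzero after teardown means a leak. */
static long model_nr_live(void)
{
	return model_nr_alloc - model_nr_free;
}
```

If the live count returns to zero once every test cgroup has been removed,
no group is being pinned forever by a stray reference, which is the property
the commit message reports after dropping the css_put_many() call.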

Comments

Johannes Weiner July 31, 2020, 3:04 p.m. UTC | #1
On Thu, Jul 30, 2020 at 08:17:50PM -0700, Hugh Dickins wrote:
> Moving tasks between mem cgroups with memory.move_charge_at_immigrate 3,
> while swapping, crashes soon on mmotm (and so presumably on linux-next):
> for example, spinlock found corrupted when lock_page_memcg() is called.
> It's as if the mem cgroup structures have been freed too early.
> 
> Stab in the dark: what if all the accounting is right, except that the
> css_put_many() in __mem_cgroup_clear_mc() is now (worse than) redundant?
> Removing it fixes the crashes, but that's hardly surprising; and stats
> temporarily hacked into mem_cgroup_css_alloc() and mem_cgroup_css_free()
> showed that mem cgroups were not being leaked with this change.
> 
> Note: this removes the last call to css_put_many() from the tree; and
> mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
> removes the last call to css_get_many(): now that their last references
> have gone, I expect them soon to be freed from include/linux/cgroup.h.
> 
> Signed-off-by: Hugh Dickins <hughd@google.com>

Thanks, Hugh. This fix looks correct to me.

And I'd agree with the put being worse than redundant. Its counterpart
in try_charge() has been removed, so this is a clear-cut ref imbalance.

When moving a task between cgroups, we scan the page tables for pages
and swap entries, and then pre-charge the target group while we're
still allowed to veto the task move (can_attach). In the actual attach
step we then reassign all the pages and swap entries and balance the
books in the cgroup the task emigrated from.

That precharging used to acquire css references for every page charge
and swap entry charge when calling try_charge(). That is gone. Now we
move css references along with the page (move_account), and swap
entries use the mem_cgroup_id references which pin the css indirectly.

Leaving that css_put_many behind in the swap path was an oversight.
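
To make the imbalance concrete, here is a minimal userspace sketch that
treats the css reference count as a plain integer. It is an illustration
under stated assumptions, not kernel code: the model_* names are
hypothetical, and the two flags only model whether the precharge step takes
one reference per swap entry (the old try_charge() behaviour) and whether
__mem_cgroup_clear_mc() still drops moved_swap references.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a css: just its reference count. */
struct model_css {
	long refs;
};

/* Models css_get_many(): take n references at once. */
static void model_css_get_many(struct model_css *css, long n)
{
	css->refs += n;
}

/* Models css_put_many(): drop n references at once. */
static void model_css_put_many(struct model_css *css, long n)
{
	css->refs -= n;
}

/*
 * Model one task move that carries nr_swap swap entries into a target
 * group whose css starts with base_refs legitimate references.
 *
 * precharge_takes_refs: old behaviour, one css ref per precharge.
 * clear_mc_puts_refs:   the clear_mc step still drops nr_swap refs.
 *
 * Returns the reference count left after the move. Anything below
 * base_refs means references owned by someone else were dropped.
 */
static long model_move_final_refs(long base_refs, long nr_swap,
				  bool precharge_takes_refs,
				  bool clear_mc_puts_refs)
{
	struct model_css to = { .refs = base_refs };

	if (precharge_takes_refs)
		model_css_get_many(&to, nr_swap);
	if (clear_mc_puts_refs)
		model_css_put_many(&to, nr_swap);

	return to.refs;
}
```

With gets and puts paired (both flags true, or both false) the count ends
where it started. With the get removed but the put left behind, each move
lowers the count by nr_swap, so the group can be freed while it is still in
use, which matches the crashes described above.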

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

> ---
> Fixes mm-memcontrol-decouple-reference-counting-from-page-accounting.patch
> 
>  mm/memcontrol.c |    2 --
>  1 file changed, 2 deletions(-)
> 
> --- mmotm/mm/memcontrol.c	2020-07-27 18:55:00.700554752 -0700
> +++ linux/mm/memcontrol.c	2020-07-30 12:05:00.640091618 -0700
> @@ -5887,8 +5887,6 @@ static void __mem_cgroup_clear_mc(void)
>  		if (!mem_cgroup_is_root(mc.to))
>  			page_counter_uncharge(&mc.to->memory, mc.moved_swap);
>  
> -		css_put_many(&mc.to->css, mc.moved_swap);
> -
>  		mc.moved_swap = 0;
>  	}
>  	memcg_oom_recover(from);

Roman Gushchin Aug. 1, 2020, 1:20 a.m. UTC | #2
On Thu, Jul 30, 2020 at 08:17:50PM -0700, Hugh Dickins wrote:
> Moving tasks between mem cgroups with memory.move_charge_at_immigrate 3,
> while swapping, crashes soon on mmotm (and so presumably on linux-next):
> for example, spinlock found corrupted when lock_page_memcg() is called.
> It's as if the mem cgroup structures have been freed too early.
> 
> Stab in the dark: what if all the accounting is right, except that the
> css_put_many() in __mem_cgroup_clear_mc() is now (worse than) redundant?
> Removing it fixes the crashes, but that's hardly surprising; and stats
> temporarily hacked into mem_cgroup_css_alloc() and mem_cgroup_css_free()
> showed that mem cgroups were not being leaked with this change.
> 
> Note: this removes the last call to css_put_many() from the tree; and
> mm-memcg-slab-use-a-single-set-of-kmem_caches-for-all-accounted-allocations.patch
> removes the last call to css_get_many(): now that their last references
> have gone, I expect them soon to be freed from include/linux/cgroup.h.
> 
> Signed-off-by: Hugh Dickins <hughd@google.com>
> ---
> Fixes mm-memcontrol-decouple-reference-counting-from-page-accounting.patch
> 
>  mm/memcontrol.c |    2 --
>  1 file changed, 2 deletions(-)
> 
> --- mmotm/mm/memcontrol.c	2020-07-27 18:55:00.700554752 -0700
> +++ linux/mm/memcontrol.c	2020-07-30 12:05:00.640091618 -0700
> @@ -5887,8 +5887,6 @@ static void __mem_cgroup_clear_mc(void)
>  		if (!mem_cgroup_is_root(mc.to))
>  			page_counter_uncharge(&mc.to->memory, mc.moved_swap);
>  
> -		css_put_many(&mc.to->css, mc.moved_swap);
> -
>  		mc.moved_swap = 0;
>  	}
>  	memcg_oom_recover(from);

Acked-by: Roman Gushchin <guro@fb.com>

Good catch!

Thank you, Hugh!

Patch

--- mmotm/mm/memcontrol.c	2020-07-27 18:55:00.700554752 -0700
+++ linux/mm/memcontrol.c	2020-07-30 12:05:00.640091618 -0700
@@ -5887,8 +5887,6 @@ static void __mem_cgroup_clear_mc(void)
 		if (!mem_cgroup_is_root(mc.to))
 			page_counter_uncharge(&mc.to->memory, mc.moved_swap);
 
-		css_put_many(&mc.to->css, mc.moved_swap);
-
 		mc.moved_swap = 0;
 	}
 	memcg_oom_recover(from);