Message ID | 20250310230934.2913113-1-shakeel.butt@linux.dev |
---|---|
State | New |
Series | memcg: drain obj stock on cpu hotplug teardown |
On Mon, Mar 10, 2025 at 04:09:34PM -0700, Shakeel Butt wrote:
> Currently on cpu hotplug teardown, only the memcg stock is drained but we
> need to drain the obj stock as well, otherwise we will miss the stats
> accumulated on the target cpu as well as the cached nr_bytes. The stats
> include MEMCG_KMEM, NR_SLAB_RECLAIMABLE_B & NR_SLAB_UNRECLAIMABLE_B. In
> addition, we are leaking a reference to the struct obj_cgroup object.
>
> Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API")
> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> Cc: <stable@vger.kernel.org>

Wow, that's old. Good catch.

> ---
>  mm/memcontrol.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 4de6acb9b8ec..59dcaf6a3519 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1921,9 +1921,18 @@ void drain_all_stock(struct mem_cgroup *root_memcg)
>  static int memcg_hotplug_cpu_dead(unsigned int cpu)
>  {
>  	struct memcg_stock_pcp *stock;
> +	struct obj_cgroup *old;
> +	unsigned long flags;
>
>  	stock = &per_cpu(memcg_stock, cpu);
> +
> +	/* drain_obj_stock requires stock_lock */
> +	local_lock_irqsave(&memcg_stock.stock_lock, flags);
> +	old = drain_obj_stock(stock);
> +	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
> +
>  	drain_stock(stock);
> +	obj_cgroup_put(old);

It might be better to call drain_local_stock() directly instead. That
would prevent a bug of this type from recurring in the future.
On Tue, Mar 11, 2025 at 11:30:32AM -0400, Johannes Weiner wrote:
> On Mon, Mar 10, 2025 at 04:09:34PM -0700, Shakeel Butt wrote:
> > Currently on cpu hotplug teardown, only the memcg stock is drained but we
> > need to drain the obj stock as well, otherwise we will miss the stats
> > accumulated on the target cpu as well as the cached nr_bytes. The stats
> > include MEMCG_KMEM, NR_SLAB_RECLAIMABLE_B & NR_SLAB_UNRECLAIMABLE_B. In
> > addition, we are leaking a reference to the struct obj_cgroup object.
> >
> > Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API")
> > Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> > Cc: <stable@vger.kernel.org>
>
> Wow, that's old. Good catch.
>
> > ---
> >  mm/memcontrol.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 4de6acb9b8ec..59dcaf6a3519 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -1921,9 +1921,18 @@ void drain_all_stock(struct mem_cgroup *root_memcg)
> >  static int memcg_hotplug_cpu_dead(unsigned int cpu)
> >  {
> >  	struct memcg_stock_pcp *stock;
> > +	struct obj_cgroup *old;
> > +	unsigned long flags;
> >
> >  	stock = &per_cpu(memcg_stock, cpu);
> > +
> > +	/* drain_obj_stock requires stock_lock */
> > +	local_lock_irqsave(&memcg_stock.stock_lock, flags);
> > +	old = drain_obj_stock(stock);
> > +	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
> > +
> >  	drain_stock(stock);
> > +	obj_cgroup_put(old);
>
> It might be better to call drain_local_stock() directly instead. That
> would prevent a bug of this type from recurring in the future.

The issue is that drain_local_stock() works on the local cpu stock, while
here we are working on a remote cpu which is dead (memcg_hotplug_cpu_dead
is in the PREPARE section of hotplug teardown, which runs after the cpu is
dead).

We can safely call drain_stock() on the remote cpu stock here, but
drain_obj_stock() is a bit tricky as it can __refill_stock() to the local
cpu stock and can call __mod_objcg_mlstate() to flush stats. Both of these
require irqs to be disabled on non-RT kernels, and thus I added the
local_lock here.

Anyways, I wanted a simple fix for the backports, and in parallel I am
working on cleaning up all the stock functions as I plan to add multi
memcg support.

Thanks for taking a look.
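To make the local-vs-remote distinction concrete, here is a minimal sketch, written as if it sat inside mm/memcontrol.c. The _sketch names and bodies are illustrative only (the real drain_local_stock() also clears FLUSHING_CACHED_CHARGE and takes a work_struct argument); it only shows why this_cpu_ptr() cannot reach a dead CPU's stock while per_cpu() can.

/*
 * Simplified sketch, not the upstream code: why drain_local_stock()
 * cannot simply be reused from the hotplug teardown callback.
 */

/* Runs via schedule_work_on(), so "local" really is the target CPU. */
static void drain_local_stock_sketch(void)
{
	struct memcg_stock_pcp *stock;
	struct obj_cgroup *old;
	unsigned long flags;

	local_lock_irqsave(&memcg_stock.stock_lock, flags);
	stock = this_cpu_ptr(&memcg_stock);	/* the CPU we are running on */
	old = drain_obj_stock(stock);
	drain_stock(stock);
	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
	obj_cgroup_put(old);
}

/* Hotplug teardown runs on a surviving CPU after @cpu is already dead. */
static int memcg_hotplug_cpu_dead_sketch(unsigned int cpu)
{
	struct memcg_stock_pcp *stock = &per_cpu(memcg_stock, cpu); /* remote */
	struct obj_cgroup *old;
	unsigned long flags;

	/*
	 * drain_obj_stock() may __refill_stock() into the *executing* CPU's
	 * stock and flush pending stats, which needs irqs off on non-RT
	 * kernels, hence the local_lock on the executing CPU.
	 */
	local_lock_irqsave(&memcg_stock.stock_lock, flags);
	old = drain_obj_stock(stock);
	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);

	drain_stock(stock);	/* safe on the remote, dead CPU's stock */
	obj_cgroup_put(old);
	return 0;
}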
On Mon, Mar 10, 2025 at 04:09:34PM -0700, Shakeel Butt wrote:
> Currently on cpu hotplug teardown, only the memcg stock is drained but we
> need to drain the obj stock as well, otherwise we will miss the stats
> accumulated on the target cpu as well as the cached nr_bytes. The stats
> include MEMCG_KMEM, NR_SLAB_RECLAIMABLE_B & NR_SLAB_UNRECLAIMABLE_B. In
> addition, we are leaking a reference to the struct obj_cgroup object.
>
> Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API")
> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> Cc: <stable@vger.kernel.org>

Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev>

Thanks!
On Tue, Mar 11, 2025 at 08:57:54AM -0700, Shakeel Butt wrote:
> On Tue, Mar 11, 2025 at 11:30:32AM -0400, Johannes Weiner wrote:
> > On Mon, Mar 10, 2025 at 04:09:34PM -0700, Shakeel Butt wrote:
> > > Currently on cpu hotplug teardown, only the memcg stock is drained but we
> > > need to drain the obj stock as well, otherwise we will miss the stats
> > > accumulated on the target cpu as well as the cached nr_bytes. The stats
> > > include MEMCG_KMEM, NR_SLAB_RECLAIMABLE_B & NR_SLAB_UNRECLAIMABLE_B. In
> > > addition, we are leaking a reference to the struct obj_cgroup object.
> > >
> > > Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API")
> > > Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> > > Cc: <stable@vger.kernel.org>
> >
> > Wow, that's old. Good catch.
> >
> > > ---
> > >  mm/memcontrol.c | 9 +++++++++
> > >  1 file changed, 9 insertions(+)
> > >
> > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > index 4de6acb9b8ec..59dcaf6a3519 100644
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -1921,9 +1921,18 @@ void drain_all_stock(struct mem_cgroup *root_memcg)
> > >  static int memcg_hotplug_cpu_dead(unsigned int cpu)
> > >  {
> > >  	struct memcg_stock_pcp *stock;
> > > +	struct obj_cgroup *old;
> > > +	unsigned long flags;
> > >
> > >  	stock = &per_cpu(memcg_stock, cpu);
> > > +
> > > +	/* drain_obj_stock requires stock_lock */
> > > +	local_lock_irqsave(&memcg_stock.stock_lock, flags);
> > > +	old = drain_obj_stock(stock);
> > > +	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
> > > +
> > >  	drain_stock(stock);
> > > +	obj_cgroup_put(old);
> >
> > It might be better to call drain_local_stock() directly instead. That
> > would prevent a bug of this type from recurring in the future.
>
> The issue is that drain_local_stock() works on the local cpu stock, while
> here we are working on a remote cpu which is dead (memcg_hotplug_cpu_dead
> is in the PREPARE section of hotplug teardown, which runs after the cpu is
> dead).
>
> We can safely call drain_stock() on the remote cpu stock here, but
> drain_obj_stock() is a bit tricky as it can __refill_stock() to the local
> cpu stock and can call __mod_objcg_mlstate() to flush stats. Both of these
> require irqs to be disabled on non-RT kernels, and thus I added the
> local_lock here.
>
> Anyways, I wanted a simple fix for the backports, and in parallel I am
> working on cleaning up all the stock functions as I plan to add multi
> memcg support.

Really curious to see patches/more details here; I have some ideas here
as well (nothing materialized yet though).

Thanks!
On Tue, Mar 11, 2025 at 08:57:54AM -0700, Shakeel Butt wrote:
> On Tue, Mar 11, 2025 at 11:30:32AM -0400, Johannes Weiner wrote:
> > On Mon, Mar 10, 2025 at 04:09:34PM -0700, Shakeel Butt wrote:
> > > Currently on cpu hotplug teardown, only the memcg stock is drained but we
> > > need to drain the obj stock as well, otherwise we will miss the stats
> > > accumulated on the target cpu as well as the cached nr_bytes. The stats
> > > include MEMCG_KMEM, NR_SLAB_RECLAIMABLE_B & NR_SLAB_UNRECLAIMABLE_B. In
> > > addition, we are leaking a reference to the struct obj_cgroup object.
> > >
> > > Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API")
> > > Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> > > Cc: <stable@vger.kernel.org>
> >
> > Wow, that's old. Good catch.
> >
> > > ---
> > >  mm/memcontrol.c | 9 +++++++++
> > >  1 file changed, 9 insertions(+)
> > >
> > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > index 4de6acb9b8ec..59dcaf6a3519 100644
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -1921,9 +1921,18 @@ void drain_all_stock(struct mem_cgroup *root_memcg)
> > >  static int memcg_hotplug_cpu_dead(unsigned int cpu)
> > >  {
> > >  	struct memcg_stock_pcp *stock;
> > > +	struct obj_cgroup *old;
> > > +	unsigned long flags;
> > >
> > >  	stock = &per_cpu(memcg_stock, cpu);
> > > +
> > > +	/* drain_obj_stock requires stock_lock */
> > > +	local_lock_irqsave(&memcg_stock.stock_lock, flags);
> > > +	old = drain_obj_stock(stock);
> > > +	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
> > > +
> > >  	drain_stock(stock);
> > > +	obj_cgroup_put(old);
> >
> > It might be better to call drain_local_stock() directly instead. That
> > would prevent a bug of this type from recurring in the future.
>
> The issue is that drain_local_stock() works on the local cpu stock, while
> here we are working on a remote cpu which is dead (memcg_hotplug_cpu_dead
> is in the PREPARE section of hotplug teardown, which runs after the cpu is
> dead).
>
> We can safely call drain_stock() on the remote cpu stock here, but
> drain_obj_stock() is a bit tricky as it can __refill_stock() to the local
> cpu stock and can call __mod_objcg_mlstate() to flush stats. Both of these
> require irqs to be disabled on non-RT kernels, and thus I added the
> local_lock here.
>
> Anyways, I wanted a simple fix for the backports, and in parallel I am
> working on cleaning up all the stock functions as I plan to add multi
> memcg support.

True, it can be refactored separately.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
On Tue, Mar 11, 2025 at 04:28:15PM +0000, Roman Gushchin wrote:
[...]
> >
> > Anyways, I wanted a simple fix for the backports, and in parallel I am
> > working on cleaning up all the stock functions as I plan to add multi
> > memcg support.
>
> Really curious to see patches/more details here; I have some ideas here
> as well (nothing materialized yet though).
>

My eventual goal is to add support for a multi-memcg stock to solve the
charging cost of incoming networking traffic in multi-tenant environments.
In the cleanups, my plan is to reduce the number of irq disable operations
in the most common paths and to explore whether, for task context, we can
do without any irq disabling operation (possibly by using separate stocks
for in_task and !in_task contexts).
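A purely hypothetical sketch of the "separate stocks for in_task and !in_task contexts" direction mentioned above. None of these names exist upstream; this is only an illustration of the idea, written as if it lived in mm/memcontrol.c, not a proposed implementation.

/*
 * Hypothetical sketch only. One stock slot per execution context, so the
 * task-context fast path could in principle avoid disabling irqs, while
 * the !in_task slot keeps the stricter protection.
 */
struct memcg_stock_ctx {
	struct mem_cgroup *cached;
	unsigned int nr_pages;
};

struct memcg_stock_pcp_sketch {
	local_lock_t stock_lock;	/* still required for the !in_task slot */
	struct memcg_stock_ctx ctx[2];	/* [0] = in_task, [1] = !in_task */
};

static DEFINE_PER_CPU(struct memcg_stock_pcp_sketch, memcg_stock_sketch) = {
	.stock_lock = INIT_LOCAL_LOCK(stock_lock),
};

/* Pick the stock slot matching the current context. */
static struct memcg_stock_ctx *
memcg_stock_ctx_sketch(struct memcg_stock_pcp_sketch *stock)
{
	return &stock->ctx[!in_task()];
}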
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 4de6acb9b8ec..59dcaf6a3519 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1921,9 +1921,18 @@ void drain_all_stock(struct mem_cgroup *root_memcg)
 static int memcg_hotplug_cpu_dead(unsigned int cpu)
 {
 	struct memcg_stock_pcp *stock;
+	struct obj_cgroup *old;
+	unsigned long flags;
 
 	stock = &per_cpu(memcg_stock, cpu);
+
+	/* drain_obj_stock requires stock_lock */
+	local_lock_irqsave(&memcg_stock.stock_lock, flags);
+	old = drain_obj_stock(stock);
+	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+
 	drain_stock(stock);
+	obj_cgroup_put(old);
 	return 0;
 }
Currently on cpu hotplug teardown, only the memcg stock is drained but we
need to drain the obj stock as well, otherwise we will miss the stats
accumulated on the target cpu as well as the cached nr_bytes. The stats
include MEMCG_KMEM, NR_SLAB_RECLAIMABLE_B & NR_SLAB_UNRECLAIMABLE_B. In
addition, we are leaking a reference to the struct obj_cgroup object.

Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API")
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: <stable@vger.kernel.org>
---
 mm/memcontrol.c | 9 +++++++++
 1 file changed, 9 insertions(+)
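For context on what the missed drain loses, the per-cpu stock carries roughly the following obj-stock state. This is a sketch: the field names approximate mm/memcontrol.c around this commit and may not match the exact definition.

/*
 * Approximate sketch of struct memcg_stock_pcp around this kernel
 * version. Without drain_obj_stock() in the hotplug teardown path,
 * everything below the "obj stock" comment is simply dropped: the
 * sub-page byte remainder, the batched slab/kmem stat deltas, and the
 * reference pinning cached_objcg.
 */
struct memcg_stock_pcp {
	local_lock_t stock_lock;
	struct mem_cgroup *cached;		/* drained by drain_stock() */
	unsigned int nr_pages;

	/* obj stock: only drained by drain_obj_stock() */
	struct obj_cgroup *cached_objcg;	/* holds an objcg reference */
	struct pglist_data *cached_pgdat;
	unsigned int nr_bytes;			/* sub-page charge remainder */
	int nr_slab_reclaimable_b;		/* pending NR_SLAB_RECLAIMABLE_B */
	int nr_slab_unreclaimable_b;		/* pending NR_SLAB_UNRECLAIMABLE_B */

	struct work_struct work;
	unsigned long flags;
};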