
mm/list_lru: don't make them memcg aware if kmem is disabled

Message ID 20201126043023.377343-1-bsingharora@gmail.com (mailing list archive)
State New, archived
Series mm/list_lru: don't make them memcg aware if kmem is disabled

Commit Message

Education Directorate Nov. 26, 2020, 4:30 a.m. UTC
When alloc_super() allocates list_lrus for dentries and inodes,
they are made memcg aware if KMEM is compiled in; we should
also check whether kmem was disabled at runtime.

The overhead is about 32 extra bytes per possible node per
caller of list_lru_init().

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
---
 include/linux/memcontrol.h | 6 ++++++
 mm/list_lru.c              | 9 +++++++++
 mm/memcontrol.c            | 9 +++++++++
 3 files changed, 24 insertions(+)
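
For context, the allocations in question come from alloc_super(); below is a
minimal sketch of that call path (abridged from fs/super.c and
include/linux/list_lru.h of this era - details vary by kernel version):

	/* include/linux/list_lru.h: memcg_aware is hard-wired to true here */
	#define list_lru_init_memcg(lru, shrinker)		\
		__list_lru_init((lru), true, NULL, shrinker)

	/* fs/super.c: every superblock sets up two such lrus */
	static struct super_block *alloc_super(struct file_system_type *type,
					       int flags, struct user_namespace *user_ns)
	{
		/* ... */
		if (list_lru_init_memcg(&s->s_dentry_lru, &s->s_shrink))
			goto fail;
		if (list_lru_init_memcg(&s->s_inode_lru, &s->s_shrink))
			goto fail;
		/* ... */
	}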

Comments

Vlastimil Babka Nov. 26, 2020, 12:10 p.m. UTC | #1
On 11/26/20 5:30 AM, Balbir Singh wrote:
> When alloc_super() allocates list_lrus for dentries and inodes,
> they are made memcg aware if KMEM is compiled in; we should
> also check whether kmem was disabled at runtime.
> 
> The overhead is about 32 extra bytes per possible node per
> caller of list_lru_init().
> 
> Signed-off-by: Balbir Singh <bsingharora@gmail.com>

I'd rather export cgroup_memory_nokmem and make cgroup_kmem_disabled() inline, 
put it next to memcg_kmem_enabled() and explain in comments what each means.

And ideally, the current memcg_kmem_enabled() should be named e.g. 
memcg_kmem_active(), and then the new cgroup_kmem_disabled() could be named 
memcg_kmem_enabled(). But that's churn and potential future backport hazard, so 
dunno.
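
A sketch of how that could look (hypothetical, assuming cgroup_memory_nokmem
is made non-static in mm/memcontrol.c and declared in the header):

	/* include/linux/memcontrol.h */

	/* set at boot via cgroup.memory=nokmem */
	extern bool cgroup_memory_nokmem;

	/*
	 * kmem accounting was disabled on the kernel command line;
	 * independent of whether any memcg ever activated kmem accounting.
	 */
	static inline bool cgroup_kmem_disabled(void)
	{
		return cgroup_memory_nokmem;
	}

	/*
	 * at least one non-root memcg has (or had) kmem accounting active,
	 * so the kmem charging paths must be taken.
	 */
	static inline bool memcg_kmem_enabled(void)
	{
		return static_branch_likely(&memcg_kmem_enabled_key);
	}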

> ---
>   include/linux/memcontrol.h | 6 ++++++
>   mm/list_lru.c              | 9 +++++++++
>   mm/memcontrol.c            | 9 +++++++++
>   3 files changed, 24 insertions(+)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 4a4df9a23a8b..4285741a855d 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -1636,6 +1636,7 @@ static inline bool memcg_kmem_enabled(void)
>   	return static_branch_likely(&memcg_kmem_enabled_key);
>   }
>   
> +extern bool cgroup_kmem_disabled(void);
>   static inline int memcg_kmem_charge_page(struct page *page, gfp_t gfp,
>   					 int order)
>   {
> @@ -1691,6 +1692,11 @@ static inline bool memcg_kmem_enabled(void)
>   	return false;
>   }
>   
> +static inline bool cgroup_kmem_disabled(void)
> +{
> +	return false;
> +}
> +
>   static inline int memcg_cache_id(struct mem_cgroup *memcg)
>   {
>   	return -1;
> diff --git a/mm/list_lru.c b/mm/list_lru.c
> index 5aa6e44bc2ae..478386aa9852 100644
> --- a/mm/list_lru.c
> +++ b/mm/list_lru.c
> @@ -439,6 +439,15 @@ static int memcg_init_list_lru(struct list_lru *lru, bool memcg_aware)
>   {
>   	int i;
>   
> +	/*
> +	 * Override memcg_aware here, otherwise callers
> +	 * will allocate memcg aware lrus; that wastes
> +	 * memory, and the amount scales with the number of
> +	 * nodes and caller contexts.
> +	 */
> +	if (cgroup_kmem_disabled())
> +		memcg_aware = false;
> +
>   	lru->memcg_aware = memcg_aware;
>   
>   	if (!memcg_aware)
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 15f2396f1605..8edf3ada8f4e 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -99,6 +99,15 @@ static bool do_memsw_account(void)
>   	return !cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_noswap;
>   }
>   
> +/*
> + * NOTE: This is not the logical opposite of what
> + * memcg_kmem_enabled() does
> + */
> +bool cgroup_kmem_disabled(void)
> +{
> +	return cgroup_memory_nokmem;
> +}
> +
>   #define THRESHOLDS_EVENTS_TARGET 128
>   #define SOFTLIMIT_EVENTS_TARGET 1024
>   
>
Education Directorate Nov. 27, 2020, 9:21 a.m. UTC | #2
On Thu, Nov 26, 2020 at 01:10:18PM +0100, Vlastimil Babka wrote:
> On 11/26/20 5:30 AM, Balbir Singh wrote:
> > When alloc_super() allocates list_lrus for dentries and inodes,
> > they are made memcg aware if KMEM is compiled in; we should
> > also check whether kmem was disabled at runtime.
> > 
> > The overhead is about 32 extra bytes per possible node per
> > caller of list_lru_init().
> > 
> > Signed-off-by: Balbir Singh <bsingharora@gmail.com>
> 
> I'd rather export cgroup_memory_nokmem and make cgroup_kmem_disabled()
> inline, put it next to memcg_kmem_enabled() and explain in comments what
> each means.
> 
> And ideally, the current memcg_kmem_enabled() should be named e.g.
> memcg_kmem_active(), and then the new cgroup_kmem_disabled() could be named
> memcg_kmem_enabled(). But that's churn and potential future backport hazard,
> so dunno.

Yes, I am happy with whatever approach works to fast track the patches

Andrew, thoughts/comments?

Balbir Singh.
Andrew Morton Nov. 28, 2020, 3:49 a.m. UTC | #3
On Fri, 27 Nov 2020 20:21:11 +1100 Balbir Singh <bsingharora@gmail.com> wrote:

> On Thu, Nov 26, 2020 at 01:10:18PM +0100, Vlastimil Babka wrote:
> > On 11/26/20 5:30 AM, Balbir Singh wrote:
> > > When alloc_super() allocates list_lrus for dentries and inodes,
> > > they are made memcg aware if KMEM is compiled in; we should
> > > also check whether kmem was disabled at runtime.
> > > 
> > > The overhead is about 32 extra bytes per possible node per
> > > caller of list_lru_init().
> > > 
> > > Signed-off-by: Balbir Singh <bsingharora@gmail.com>
> > 
> > I'd rather export cgroup_memory_nokmem and make cgroup_kmem_disabled()
> > inline, put it next to memcg_kmem_enabled() and explain in comments what
> > each means.
> > 
> > And ideally, the current memcg_kmem_enabled() should be named e.g.
> > memcg_kmem_active(), and then the new cgroup_kmem_disabled() could be named
> > memcg_kmem_enabled(). But that's churn and potential future backport hazard,
> > so dunno.
> 
> Yes, I am happy with whatever approach works to fast track the patches
> 
> Andrew, thoughts/comments?
> 

Your original changelog doesn't make the case that the patch should be
fast tracked, so it looks like there's missing information.  Please
don't miss information ;)

If we're looking for a backportable quickfix for the
not-yet-really-explained problem then yes, Vlastimil's suggestion (as a
high-priority patch and a separate low-priority patch) sounds good.

It is rather a twisty maze of identifiers, so please do comment
everything well.
Education Directorate Nov. 28, 2020, 11:47 a.m. UTC | #4
On Fri, Nov 27, 2020 at 07:49:00PM -0800, Andrew Morton wrote:
> On Fri, 27 Nov 2020 20:21:11 +1100 Balbir Singh <bsingharora@gmail.com> wrote:
> 
> > On Thu, Nov 26, 2020 at 01:10:18PM +0100, Vlastimil Babka wrote:
> > > On 11/26/20 5:30 AM, Balbir Singh wrote:
> > > > When alloc_super() allocates list_lrus for dentries and inodes,
> > > > they are made memcg aware if KMEM is compiled in; we should
> > > > also check whether kmem was disabled at runtime.
> > > > 
> > > > The overhead is about 32 extra bytes per possible node per
> > > > caller of list_lru_init().
> > > > 
> > > > Signed-off-by: Balbir Singh <bsingharora@gmail.com>
> > > 
> > > I'd rather export cgroup_memory_nokmem and make cgroup_kmem_disabled()
> > > inline, put it next to memcg_kmem_enabled() and explain in comments what
> > > each means.
> > > 
> > > And ideally, the current memcg_kmem_enabled() should be named e.g.
> > > memcg_kmem_active(), and then the new cgroup_kmem_disabled() could be named
> > > memcg_kmem_enabled(). But that's churn and potential future backport hazard,
> > > so dunno.
> > 
> > Yes, I am happy with whatever approach works to fast track the patches
> > 
> > Andrew, thoughts/comments?
> > 
> 
> Your original changelog doesn't make the case that the patch should be
> fast tracked, so it looks like there's missing information.  Please
> don't miss information ;)
>

I just wanted to get the patches done :) Nothing needs to be seriously
backported. I can refactor the code; I've seen cases where, even with kmem
disabled, kvmalloc under memory pressure leads to wasted memory from
allocating memcg-aware list_lrus.

With older kernels (5.4 and before), where creation of a new namespace
always caused these allocations to occur via pid_ns_prepare_proc(), the issue
is quite obviously exposed - the waste is a function of

wasted_memory = (sizeof(list_lru_memcg) + 4) * number of nodes
                * number of memcgs with kmem enabled  (per clone/container)  -- (1)

That's not too bad today, but I am afraid it's useless memory being
allocated, and it can grow as the number of possible nodes increases.
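
To put an illustrative number on (1) - every value below is an assumption,
not a measurement: taking sizeof(list_lru_memcg) + 4 as 36 bytes on a 64-bit
kernel, with 4 possible nodes and 100 kmem-enabled memcgs:

	/* illustrative arithmetic only; all three inputs are assumed values */
	#define LRU_MEMCG_BYTES	36	/* assumed sizeof(list_lru_memcg) + 4 */
	#define NR_NODES	4	/* possible NUMA nodes */
	#define NR_MEMCGS	100	/* memcgs with kmem enabled */

	/* ~14KB wasted per memcg-aware list_lru, per clone/container */
	unsigned long wasted = LRU_MEMCG_BYTES * NR_NODES * NR_MEMCGS;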

> If we're looking for a backportable quickfix for the
> not-yet-really-explained problem then yes, Vlastimil's suggestion (as a
> high-priority patch and a separate low-priority patch) sounds good.
> 
> It is rather a twisty maze of identifiers, so please do comment
> everything well.

Yes, I am going to refactor things a bit and resend the patches.

Balbir Singh.

Patch

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 4a4df9a23a8b..4285741a855d 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1636,6 +1636,7 @@ static inline bool memcg_kmem_enabled(void)
 	return static_branch_likely(&memcg_kmem_enabled_key);
 }
 
+extern bool cgroup_kmem_disabled(void);
 static inline int memcg_kmem_charge_page(struct page *page, gfp_t gfp,
 					 int order)
 {
@@ -1691,6 +1692,11 @@ static inline bool memcg_kmem_enabled(void)
 	return false;
 }
 
+static inline bool cgroup_kmem_disabled(void)
+{
+	return false;
+}
+
 static inline int memcg_cache_id(struct mem_cgroup *memcg)
 {
 	return -1;
diff --git a/mm/list_lru.c b/mm/list_lru.c
index 5aa6e44bc2ae..478386aa9852 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -439,6 +439,15 @@ static int memcg_init_list_lru(struct list_lru *lru, bool memcg_aware)
 {
 	int i;
 
+	/*
+	 * Override memcg_aware here, otherwise callers
+	 * will allocate memcg aware lrus; that wastes
+	 * memory, and the amount scales with the number of
+	 * nodes and caller contexts.
+	 */
+	if (cgroup_kmem_disabled())
+		memcg_aware = false;
+
 	lru->memcg_aware = memcg_aware;
 
 	if (!memcg_aware)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 15f2396f1605..8edf3ada8f4e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -99,6 +99,15 @@ static bool do_memsw_account(void)
 	return !cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_noswap;
 }
 
+/*
+ * NOTE: This is not the logical opposite of what
+ * memcg_kmem_enabled() does
+ */
+bool cgroup_kmem_disabled(void)
+{
+	return cgroup_memory_nokmem;
+}
+
 #define THRESHOLDS_EVENTS_TARGET 128
 #define SOFTLIMIT_EVENTS_TARGET 1024
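
For reference, cgroup_memory_nokmem - the flag cgroup_kmem_disabled() returns -
is set by the existing kernel command-line parser in mm/memcontrol.c (abridged
from kernels of this era):

	static int __init cgroup_memory(char *s)
	{
		char *token;

		while ((token = strsep(&s, ",")) != NULL) {
			if (!*token)
				continue;
			if (!strcmp(token, "nosocket"))
				cgroup_memory_nosocket = true;
			if (!strcmp(token, "nokmem"))
				cgroup_memory_nokmem = true;
		}
		return 0;
	}
	__setup("cgroup.memory=", cgroup_memory);

So booting with cgroup.memory=nokmem is what makes the new check in
memcg_init_list_lru() take effect.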