Message ID | 20200608230654.828134-14-guro@fb.com (mailing list archive)
---|---
State | New, archived
Series | The new cgroup slab memory controller
On Mon, Jun 8, 2020 at 4:07 PM Roman Gushchin <guro@fb.com> wrote:
>
> Because the number of non-root kmem_caches doesn't depend on the
> number of memory cgroups anymore and is generally not very big,
> there is no more need for a dedicated workqueue.
>
> Also, as there is no more need to pass any arguments to the
> memcg_create_kmem_cache() except the root kmem_cache, it's
> possible to just embed the work structure into the kmem_cache
> and avoid the dynamic allocation of the work structure.
>
> This will also simplify the synchronization: for each root kmem_cache
> there is only one work. So there will be no more concurrent attempts
> to create a non-root kmem_cache for a root kmem_cache: the second and
> all following attempts to queue the work will fail.
>
> On the kmem_cache destruction path there is no more need to call the
> expensive flush_workqueue() and wait for all pending works to be
> finished. Instead, cancel_work_sync() can be used to cancel/wait for
> only one work.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>

Why not pre-allocate the non-root kmem_cache at the kmem_cache
creation time? No need for work_struct, queue_work() or
cancel_work_sync() at all.
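For readers following along: the serialization described in the commit
message falls out of the workqueue API itself. queue_work() only queues a
work item whose pending bit is not already set, and returns false
otherwise, so one embedded work_struct per root kmem_cache is enough to
deduplicate concurrent creation requests. Below is a minimal kernel-style
sketch of that embedded-work pattern; the foo_cache type and every
identifier in it are illustrative stand-ins, not code from this patch.

```c
#include <linux/kernel.h>
#include <linux/workqueue.h>

/* Illustrative object that embeds its one-and-only creation work item. */
struct foo_cache {
	const char *name;
	struct work_struct create_work;	/* embedded: no kmalloc() needed */
};

static void foo_create_func(struct work_struct *work)
{
	/* Recover the enclosing object from the embedded member. */
	struct foo_cache *fc = container_of(work, struct foo_cache,
					    create_work);

	pr_info("creating child cache for %s\n", fc->name);
}

/* Called once when the object is set up. */
static void foo_cache_init(struct foo_cache *fc, const char *name)
{
	fc->name = name;
	INIT_WORK(&fc->create_work, foo_create_func);
}

/*
 * Fast path, safe to call concurrently: queue_work() atomically tests
 * and sets the pending bit, so while the item is queued but not yet
 * running, every further attempt returns false and is a cheap no-op.
 */
static bool foo_cache_request_create(struct foo_cache *fc)
{
	return queue_work(system_wq, &fc->create_work);
}
```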
On Mon, Jun 22, 2020 at 10:29:29AM -0700, Shakeel Butt wrote:
> On Mon, Jun 8, 2020 at 4:07 PM Roman Gushchin <guro@fb.com> wrote:
> >
> > Because the number of non-root kmem_caches doesn't depend on the
> > number of memory cgroups anymore and is generally not very big,
> > there is no more need for a dedicated workqueue.
> >
> > Also, as there is no more need to pass any arguments to the
> > memcg_create_kmem_cache() except the root kmem_cache, it's
> > possible to just embed the work structure into the kmem_cache
> > and avoid the dynamic allocation of the work structure.
> >
> > This will also simplify the synchronization: for each root kmem_cache
> > there is only one work. So there will be no more concurrent attempts
> > to create a non-root kmem_cache for a root kmem_cache: the second and
> > all following attempts to queue the work will fail.
> >
> > On the kmem_cache destruction path there is no more need to call the
> > expensive flush_workqueue() and wait for all pending works to be
> > finished. Instead, cancel_work_sync() can be used to cancel/wait for
> > only one work.
> >
> > Signed-off-by: Roman Gushchin <guro@fb.com>
> > Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>
> Why not pre-allocate the non-root kmem_cache at the kmem_cache
> creation time? No need for work_struct, queue_work() or
> cancel_work_sync() at all.

Simply because some kmem_caches are created very early, so we don't
even know at that time if we will need memcg slab caches. But this
code is likely going away if we're going with a single set for all
allocations.

Thanks!
On Mon, Jun 22, 2020 at 10:40 AM Roman Gushchin <guro@fb.com> wrote:
>
> On Mon, Jun 22, 2020 at 10:29:29AM -0700, Shakeel Butt wrote:
> > On Mon, Jun 8, 2020 at 4:07 PM Roman Gushchin <guro@fb.com> wrote:
> > >
> > > Because the number of non-root kmem_caches doesn't depend on the
> > > number of memory cgroups anymore and is generally not very big,
> > > there is no more need for a dedicated workqueue.
> > >
> > > Also, as there is no more need to pass any arguments to the
> > > memcg_create_kmem_cache() except the root kmem_cache, it's
> > > possible to just embed the work structure into the kmem_cache
> > > and avoid the dynamic allocation of the work structure.
> > >
> > > This will also simplify the synchronization: for each root kmem_cache
> > > there is only one work. So there will be no more concurrent attempts
> > > to create a non-root kmem_cache for a root kmem_cache: the second and
> > > all following attempts to queue the work will fail.
> > >
> > > On the kmem_cache destruction path there is no more need to call the
> > > expensive flush_workqueue() and wait for all pending works to be
> > > finished. Instead, cancel_work_sync() can be used to cancel/wait for
> > > only one work.
> > >
> > > Signed-off-by: Roman Gushchin <guro@fb.com>
> > > Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> >
> > Why not pre-allocate the non-root kmem_cache at the kmem_cache
> > creation time? No need for work_struct, queue_work() or
> > cancel_work_sync() at all.
>
> Simply because some kmem_caches are created very early, so we don't
> even know at that time if we will need memcg slab caches. But this
> code is likely going away if we're going with a single set for all
> allocations.
>

LGTM.

Reviewed-by: Shakeel Butt <shakeelb@google.com>
```diff
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index e2c4d54aa1f6..ed0d2ac6a5d2 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1418,7 +1418,6 @@ int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size);
 void obj_cgroup_uncharge(struct obj_cgroup *objcg, size_t size);
 
 extern struct static_key_false memcg_kmem_enabled_key;
-extern struct workqueue_struct *memcg_kmem_cache_wq;
 
 extern int memcg_nr_cache_ids;
 void memcg_get_cache_ids(void);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 995204f65217..2695cdc15baa 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -399,8 +399,6 @@ void memcg_put_cache_ids(void)
  */
 DEFINE_STATIC_KEY_FALSE(memcg_kmem_enabled_key);
 EXPORT_SYMBOL(memcg_kmem_enabled_key);
-
-struct workqueue_struct *memcg_kmem_cache_wq;
 #endif
 
 static int memcg_shrinker_map_size;
@@ -2902,39 +2900,6 @@ static void memcg_free_cache_id(int id)
 	ida_simple_remove(&memcg_cache_ida, id);
 }
 
-struct memcg_kmem_cache_create_work {
-	struct kmem_cache *cachep;
-	struct work_struct work;
-};
-
-static void memcg_kmem_cache_create_func(struct work_struct *w)
-{
-	struct memcg_kmem_cache_create_work *cw =
-		container_of(w, struct memcg_kmem_cache_create_work, work);
-	struct kmem_cache *cachep = cw->cachep;
-
-	memcg_create_kmem_cache(cachep);
-
-	kfree(cw);
-}
-
-/*
- * Enqueue the creation of a per-memcg kmem_cache.
- */
-static void memcg_schedule_kmem_cache_create(struct kmem_cache *cachep)
-{
-	struct memcg_kmem_cache_create_work *cw;
-
-	cw = kmalloc(sizeof(*cw), GFP_NOWAIT | __GFP_NOWARN);
-	if (!cw)
-		return;
-
-	cw->cachep = cachep;
-	INIT_WORK(&cw->work, memcg_kmem_cache_create_func);
-
-	queue_work(memcg_kmem_cache_wq, &cw->work);
-}
-
 /**
  * memcg_kmem_get_cache: select memcg or root cache for allocation
  * @cachep: the original global kmem cache
@@ -2951,7 +2916,7 @@ struct kmem_cache *memcg_kmem_get_cache(struct kmem_cache *cachep)
 
 	memcg_cachep = READ_ONCE(cachep->memcg_params.memcg_cache);
 	if (unlikely(!memcg_cachep)) {
-		memcg_schedule_kmem_cache_create(cachep);
+		queue_work(system_wq, &cachep->memcg_params.work);
 		return cachep;
 	}
 
@@ -7062,17 +7027,6 @@ static int __init mem_cgroup_init(void)
 {
 	int cpu, node;
 
-#ifdef CONFIG_MEMCG_KMEM
-	/*
-	 * Kmem cache creation is mostly done with the slab_mutex held,
-	 * so use a workqueue with limited concurrency to avoid stalling
-	 * all worker threads in case lots of cgroups are created and
-	 * destroyed simultaneously.
-	 */
-	memcg_kmem_cache_wq = alloc_workqueue("memcg_kmem_cache", 0, 1);
-	BUG_ON(!memcg_kmem_cache_wq);
-#endif
-
 	cpuhp_setup_state_nocalls(CPUHP_MM_MEMCQ_DEAD, "mm/memctrl:dead", NULL,
 				  memcg_hotplug_cpu_dead);
 
diff --git a/mm/slab.h b/mm/slab.h
index 8f8552df5675..c6c7987dfd85 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -45,12 +45,14 @@ struct kmem_cache {
  * @memcg_cache: pointer to memcg kmem cache, used by all non-root memory
  *		 cgroups.
  * @root_caches_node: list node for slab_root_caches list.
+ * @work: work struct used to create the non-root cache.
  */
 struct memcg_cache_params {
 	struct kmem_cache *root_cache;
 
 	struct kmem_cache *memcg_cache;
 	struct list_head __root_caches_node;
+	struct work_struct work;
 };
 #endif /* CONFIG_SLOB */
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index e9deaafddbb6..10aa2acb84ca 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -132,10 +132,18 @@ int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t nr,
 
 LIST_HEAD(slab_root_caches);
 
+static void memcg_kmem_cache_create_func(struct work_struct *work)
+{
+	struct kmem_cache *cachep = container_of(work, struct kmem_cache,
+						 memcg_params.work);
+	memcg_create_kmem_cache(cachep);
+}
+
 void slab_init_memcg_params(struct kmem_cache *s)
 {
 	s->memcg_params.root_cache = NULL;
 	s->memcg_params.memcg_cache = NULL;
+	INIT_WORK(&s->memcg_params.work, memcg_kmem_cache_create_func);
 }
 
 static void init_memcg_params(struct kmem_cache *s,
@@ -584,15 +592,9 @@ static int shutdown_memcg_caches(struct kmem_cache *s)
 	return 0;
 }
 
-static void flush_memcg_workqueue(struct kmem_cache *s)
+static void cancel_memcg_cache_creation(struct kmem_cache *s)
 {
-	/*
-	 * SLAB and SLUB create memcg kmem_caches through workqueue and SLUB
-	 * deactivates the memcg kmem_caches through workqueue. Make sure all
-	 * previous workitems on workqueue are processed.
-	 */
-	if (likely(memcg_kmem_cache_wq))
-		flush_workqueue(memcg_kmem_cache_wq);
+	cancel_work_sync(&s->memcg_params.work);
 }
 #else
@@ -600,7 +602,7 @@ static inline int shutdown_memcg_caches(struct kmem_cache *s)
 {
 	return 0;
 }
-static inline void flush_memcg_workqueue(struct kmem_cache *s)
+static inline void cancel_memcg_cache_creation(struct kmem_cache *s)
 {
 }
 #endif /* CONFIG_MEMCG_KMEM */
@@ -619,7 +621,7 @@ void kmem_cache_destroy(struct kmem_cache *s)
 	if (unlikely(!s))
 		return;
 
-	flush_memcg_workqueue(s);
+	cancel_memcg_cache_creation(s);
 
 	get_online_cpus();
 	get_online_mems();
```
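On the destruction side, the same embedded work pairs with
cancel_work_sync(): it removes a still-pending item, or waits for a
running instance to finish, without stalling on unrelated items the way
flush_workqueue() does. Continuing the illustrative foo_cache sketch from
above (again, not code from the patch):

```c
/*
 * Teardown: cancel just this object's work instead of flushing a whole
 * workqueue. After cancel_work_sync() returns, foo_create_func() is not
 * running and can no longer run, so the object may be freed safely.
 */
static void foo_cache_destroy(struct foo_cache *fc)
{
	cancel_work_sync(&fc->create_work);
	/* fc and its fields can be torn down now */
}
```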