[v2,5/6] mm: flush memcg percpu stats and events before releasing

Message ID 20190312223404.28665-6-guro@fb.com (mailing list archive)
State New, archived
Series mm: reduce the memory footprint of dying memory cgroups

Commit Message

Roman Gushchin March 12, 2019, 10:34 p.m. UTC
Flush percpu stats and events data to the corresponding atomic counters
before releasing percpu memory.

Although per-cpu stats are never exactly precise, dropping them on the
floor regularly may lead to an accumulated error. So, it's safer to
flush them before releasing.

To minimize the number of atomic updates, let's sum all stats/events
on all cpus locally, and then make a single update per entry.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 mm/memcontrol.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)
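
For illustration only, and not part of the patch: a minimal userspace
sketch of the "sum locally, then make a single update per entry" idea
from the commit message. It assumes C11 atomics in place of the kernel's
atomic_long_t, and plain arrays in place of real percpu memory. The
names NR_CPUS, NR_ITEMS, struct counters and flush_percpu() are
hypothetical and do not exist in mm/memcontrol.c.

```c
/*
 * Userspace sketch only; not kernel code. NR_CPUS, NR_ITEMS,
 * struct counters and flush_percpu() are hypothetical names.
 */
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS  4
#define NR_ITEMS 3

struct counters {
	long percpu[NR_CPUS][NR_ITEMS];	/* stand-in for the percpu deltas */
	atomic_long total[NR_ITEMS];	/* stand-in for the shared atomics */
};

/*
 * Fold every per-cpu delta into the shared counters. Returns the number
 * of atomic updates performed, which is at most NR_ITEMS.
 */
static int flush_percpu(struct counters *c)
{
	int cpu, i, updates = 0;
	long x;

	assert(c);

	for (i = 0; i < NR_ITEMS; i++) {
		/* Sum locally first: no atomic operation per cpu. */
		x = 0;
		for (cpu = 0; cpu < NR_CPUS; cpu++)
			x += c->percpu[cpu][i];

		/* One atomic update per entry, skipped if the sum is zero. */
		if (x) {
			atomic_fetch_add(&c->total[i], x);
			updates++;
		}
	}

	return updates;
}
```

With N cpus and M entries this performs at most M atomic updates
instead of up to N * M. Like the patch, the sketch leaves the per-cpu
values untouched after the flush, on the assumption that the backing
memory is released immediately afterwards.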

Comments

Johannes Weiner March 13, 2019, 4 p.m. UTC | #1
On Tue, Mar 12, 2019 at 03:34:02PM -0700, Roman Gushchin wrote:
> Flush percpu stats and events data to corresponding before releasing
> percpu memory.
> 
> Although per-cpu stats are never exactly precise, dropping them on
> floor regularly may lead to an accumulation of an error. So, it's
> safer to flush them before releasing.
> 
> To minimize the number of atomic updates, let's sum all stats/events
> on all cpus locally, and then make a single update per entry.
> 
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Do you mind merging 6/6 into this one? That would make it easier to
verify that the code added in this patch and the code removed in 6/6
are indeed functionally equivalent.

Roman Gushchin March 13, 2019, 6:23 p.m. UTC | #2
On Wed, Mar 13, 2019 at 12:00:17PM -0400, Johannes Weiner wrote:
> On Tue, Mar 12, 2019 at 03:34:02PM -0700, Roman Gushchin wrote:
> > Flush percpu stats and events data to corresponding before releasing
> > percpu memory.
> > 
> > Although per-cpu stats are never exactly precise, dropping them on
> > floor regularly may lead to an accumulation of an error. So, it's
> > safer to flush them before releasing.
> > 
> > To minimize the number of atomic updates, let's sum all stats/events
> > on all cpus locally, and then make a single update per entry.
> > 
> > Signed-off-by: Roman Gushchin <guro@fb.com>
> 
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> 
> Do you mind merging 6/6 into this one? That would make it easier to
> verify that the code added in this patch and the code removed in 6/6
> are indeed functionally equivalent.
> 

I did try, but the result is a mess of added and removed lines
that are *almost* the same but differ slightly (e.g. tabs).
So it's much easier to review them as two separate patches.

Thanks!
Patch

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 1b5fe826d6d0..0f18bf2afea8 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2119,6 +2119,56 @@  static void drain_all_stock(struct mem_cgroup *root_memcg)
 	mutex_unlock(&percpu_charge_mutex);
 }
 
+/*
+ * Flush all per-cpu stats and events into atomics.
+ * Try to minimize the number of atomic writes by gathering data from
+ * all cpus locally, and then making one atomic update per counter.
+ * No locking is required, because no one has access to
+ * the offlined percpu data.
+ */
+static void memcg_flush_offline_percpu(struct mem_cgroup *memcg)
+{
+	struct memcg_vmstats_percpu __percpu *vmstats_percpu;
+	struct lruvec_stat __percpu *lruvec_stat_cpu;
+	struct mem_cgroup_per_node *pn;
+	int cpu, i;
+	long x;
+
+	vmstats_percpu = memcg->vmstats_percpu_offlined;
+
+	for (i = 0; i < MEMCG_NR_STAT; i++) {
+		int nid;
+
+		x = 0;
+		for_each_possible_cpu(cpu)
+			x += per_cpu(vmstats_percpu->stat[i], cpu);
+		if (x)
+			atomic_long_add(x, &memcg->vmstats[i]);
+
+		if (i >= NR_VM_NODE_STAT_ITEMS)
+			continue;
+
+		for_each_node(nid) {
+			pn = mem_cgroup_nodeinfo(memcg, nid);
+			lruvec_stat_cpu = pn->lruvec_stat_cpu_offlined;
+
+			x = 0;
+			for_each_possible_cpu(cpu)
+				x += per_cpu(lruvec_stat_cpu->count[i], cpu);
+			if (x)
+				atomic_long_add(x, &pn->lruvec_stat[i]);
+		}
+	}
+
+	for (i = 0; i < NR_VM_EVENT_ITEMS; i++) {
+		x = 0;
+		for_each_possible_cpu(cpu)
+			x += per_cpu(vmstats_percpu->events[i], cpu);
+		if (x)
+			atomic_long_add(x, &memcg->vmevents[i]);
+	}
+}
+
 static int memcg_hotplug_cpu_dead(unsigned int cpu)
 {
 	struct memcg_vmstats_percpu __percpu *vmstats_percpu;
@@ -4618,6 +4668,8 @@  static void percpu_rcu_free(struct rcu_head *rcu)
 	struct mem_cgroup *memcg = container_of(rcu, struct mem_cgroup, rcu);
 	int node;
 
+	memcg_flush_offline_percpu(memcg);
+
 	for_each_node(node) {
 		struct mem_cgroup_per_node *pn = memcg->nodeinfo[node];