diff mbox series

[2/2] memcg: dump memory.stat during cgroup OOM for v1

Message ID 20230426133919.1342942-3-yosryahmed@google.com (mailing list archive)
State New
Headers show
Series memcg: OOM log improvements | expand

Commit Message

Yosry Ahmed April 26, 2023, 1:39 p.m. UTC
Commit c8713d0b2312 ("mm: memcontrol: dump memory.stat during cgroup
OOM") made sure we dump all the stats in memory.stat during a cgroup
OOM, but it also introduced a slight behavioral change. The code used to
print the non-hierarchical v1 cgroup stats for the entire cgroup
subtree, not it only prints the v2 cgroup stats for the cgroup under
OOM.

Although v2 stats are a superset of v1 stats, some of them have
different naming. We also lost the non-hierarchical stats for the cgroup
under OOM in v1.

Move the v2 specific formatting from memory_stat_format() to
memcg_stat_format(), add memcg1_stat_format() for v1, and make
memory_stat_format() select between them based on cgroup version.
Since memory_stat_show() now works for both v1 & v2, drop
memcg_stat_show().

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
---
 mm/memcontrol.c | 60 ++++++++++++++++++++++++++++---------------------
 1 file changed, 35 insertions(+), 25 deletions(-)

Comments

Michal Hocko April 26, 2023, 3:27 p.m. UTC | #1
On Wed 26-04-23 13:39:19, Yosry Ahmed wrote:
> Commit c8713d0b2312 ("mm: memcontrol: dump memory.stat during cgroup
> OOM") made sure we dump all the stats in memory.stat during a cgroup
> OOM, but it also introduced a slight behavioral change. The code used to
> print the non-hierarchical v1 cgroup stats for the entire cgroup
> subtree, not it only prints the v2 cgroup stats for the cgroup under
> OOM.
> 
> Although v2 stats are a superset of v1 stats, some of them have
> different naming. We also lost the non-hierarchical stats for the cgroup
> under OOM in v1.

Why is that a problem worth solving? It would be also nice to add an
example of the oom report before and after the patch.
Yosry Ahmed April 27, 2023, 9:21 a.m. UTC | #2
On Wed, Apr 26, 2023 at 8:27 AM Michal Hocko <mhocko@suse.com> wrote:
>
> On Wed 26-04-23 13:39:19, Yosry Ahmed wrote:
> > Commit c8713d0b2312 ("mm: memcontrol: dump memory.stat during cgroup
> > OOM") made sure we dump all the stats in memory.stat during a cgroup
> > OOM, but it also introduced a slight behavioral change. The code used to
> > print the non-hierarchical v1 cgroup stats for the entire cgroup
> > subtree, not it only prints the v2 cgroup stats for the cgroup under
> > OOM.
> >
> > Although v2 stats are a superset of v1 stats, some of them have
> > different naming. We also lost the non-hierarchical stats for the cgroup
> > under OOM in v1.
>
> Why is that a problem worth solving? It would be also nice to add an
> example of the oom report before and after the patch.
> --
> Michal Hocko
> SUSE Labs

Thanks for taking a look!

The problem is that when upgrading to a kernel that contains
c8713d0b2312 on cgroup v1, the OOM logs suddenly change. The stats
names become different, a couple of stats are gone, and the
non-hierarchical stats disappear.

The non-hierarchical stats are important to identify if a memcg OOM'd
because of the memory consumption of its own processes or its
descendants. In the example below, I created a parent memcg "a", and a
child memcg "b". A process in "a" itself ("tail" in this case) is
hogging memory and causing an OOM, not the processes in the child "b"
(the "sleep" processes). With non-hierarchical stats, it's clear that
this is the case.

Also, it is generally nice to keep things consistent as much as
possible. The sudden change of the OOM log with the kernel upgrade is
confusing, especially that the memcg stats in the OOM logs in cgroup
v1 now look different from the stats in memory.stat. This patch
restores the consistency for cgroup v1, without affecting cgroup v2.
IMO, it's also a nice cleanup to have the stats formatting code be
consistent across cgroup v1 and v2. I personally didn't like the
memory_stat_format() vs. memcg_stat_show() distinction.

Here is a sample of the OOM logs from the scenario described above:

Before:
[   88.339330] memory: usage 10240kB, limit 10240kB, failcnt 54
[   88.339340] memory+swap: usage 10240kB, limit 9007199254740988kB, failcnt 0
[   88.339347] kmem: usage 552kB, limit 9007199254740988kB, failcnt 0
[   88.339348] Memory cgroup stats for /a:
[   88.339458] anon 9900032
[   88.339483] file 0
[   88.339483] kernel 565248
[   88.339484] kernel_stack 0
[   88.339485] pagetables 294912
[   88.339486] sec_pagetables 0
[   88.339486] percpu 15584
[   88.339487] sock 0
[   88.339487] vmalloc 0
[   88.339488] shmem 0
[   88.339488] zswap 0
[   88.339489] zswapped 0
[   88.339489] file_mapped 0
[   88.339490] file_dirty 0
[   88.339490] file_writeback 0
[   88.339491] swapcached 0
[   88.339491] anon_thp 2097152
[   88.339492] file_thp 0
[   88.339492] shmem_thp 0
[   88.339497] inactive_anon 9797632
[   88.339498] active_anon 45056
[   88.339498] inactive_file 0
[   88.339499] active_file 0
[   88.339499] unevictable 0
[   88.339500] slab_reclaimable 19888
[   88.339500] slab_unreclaimable 42752
[   88.339501] slab 62640
[   88.339501] workingset_refault_anon 0
[   88.339502] workingset_refault_file 0
[   88.339502] workingset_activate_anon 0
[   88.339503] workingset_activate_file 0
[   88.339503] workingset_restore_anon 0
[   88.339504] workingset_restore_file 0
[   88.339504] workingset_nodereclaim 0
[   88.339505] pgscan 0
[   88.339505] pgsteal 0
[   88.339506] pgscan_kswapd 0
[   88.339506] pgscan_direct 0
[   88.339507] pgscan_khugepaged 0
[   88.339507] pgsteal_kswapd 0
[   88.339508] pgsteal_direct 0
[   88.339508] pgsteal_khugepaged 0
[   88.339509] pgfault 2750
[   88.339509] pgmajfault 0
[   88.339510] pgrefill 0
[   88.339510] pgactivate 1
[   88.339511] pgdeactivate 0
[   88.339511] pglazyfree 0
[   88.339512] pglazyfreed 0
[   88.339512] zswpin 0
[   88.339513] zswpout 0
[   88.339513] thp_fault_alloc 0
[   88.339514] thp_collapse_alloc 1
[   88.339514] Tasks state (memory values in pages):
[   88.339515] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes
swapents oom_score_adj name
[   88.339516] [    108]     0   108     2986     2624    61440
0             0 tail
[   88.339525] [     97]     0    97      724      352    32768
0             0 sleep
[   88.339538] [     99]     0    99      724      352    32768
0             0 sleep
[   88.339541] [     98]     0    98      724      320    32768
0             0 sleep
[   88.339542] [    101]     0   101      724      320    32768
0             0 sleep
[   88.339544] [    102]     0   102      724      352    32768
0             0 sleep
[   88.339546] [    103]     0   103      724      352    32768
0             0 sleep
[   88.339548] [    104]     0   104      724      352    32768
0             0 sleep
[   88.339549] [    105]     0   105      724      352    32768
0             0 sleep
[   88.339551] [    100]     0   100      724      352    32768
0             0 sleep
[   88.339558] [    106]     0   106      724      352    32768
0             0 sleep
[   88.339563] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0-2,oom_memcg=/a,task_memcg=/a,task=tail,pid=108,uid0
[   88.339588] Memory cgroup out of memory: Killed process 108 (tail)
total-vm:11944kB, anon-rss:9216kB, file-rss:0kB, shmem-rss:1280kB,
UID:00


After:
[   74.447997] memory: usage 10240kB, limit 10240kB, failcnt 116
[   74.447998] memory+swap: usage 10240kB, limit 9007199254740988kB, failcnt 0
[   74.448000] kmem: usage 548kB, limit 9007199254740988kB, failcnt 0
[   74.448001] Memory cgroup stats for /a:
[   74.448103] cache 0
[   74.448104] rss 9433088
[   74.448105] rss_huge 2097152
[   74.448105] shmem 0
[   74.448106] mapped_file 0
[   74.448106] dirty 0
[   74.448107] writeback 0
[   74.448107] workingset_refault_anon 0
[   74.448108] workingset_refault_file 0
[   74.448109] swap 0
[   74.448109] pgpgin 2304
[   74.448110] pgpgout 512
[   74.448111] pgfault 2332
[   74.448111] pgmajfault 0
[   74.448112] inactive_anon 9388032
[   74.448112] active_anon 4096
[   74.448113] inactive_file 0
[   74.448113] active_file 0
[   74.448114] unevictable 0
[   74.448114] hierarchical_memory_limit 10485760
[   74.448115] hierarchical_memsw_limit 9223372036854771712
[   74.448116] total_cache 0
[   74.448116] total_rss 9818112
[   74.448117] total_rss_huge 2097152
[   74.448118] total_shmem 0
[   74.448118] total_mapped_file 0
[   74.448119] total_dirty 0
[   74.448119] total_writeback 0
[   74.448120] total_workingset_refault_anon 0
[   74.448120] total_workingset_refault_file 0
[   74.448121] total_swap 0
[   74.448121] total_pgpgin 2407
[   74.448121] total_pgpgout 521
[   74.448122] total_pgfault 2734
[   74.448122] total_pgmajfault 0
[   74.448123] total_inactive_anon 9715712
[   74.448123] total_active_anon 45056
[   74.448124] total_inactive_file 0
[   74.448124] total_active_file 0
[   74.448125] total_unevictable 0
[   74.448125] Tasks state (memory values in pages):
[   74.448126] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes
swapents oom_score_adj name
[   74.448127] [    107]     0   107     2982     2592    61440
0             0 tail
[   74.448131] [     97]     0    97      724      352    32768
0             0 sleep
[   74.448134] [     98]     0    98      724      352    32768
0             0 sleep
[   74.448136] [     99]     0    99      724      352    32768
0             0 sleep
[   74.448137] [    101]     0   101      724      352    32768
0             0 sleep
[   74.448139] [    102]     0   102      724      352    32768
0             0 sleep
[   74.448141] [    103]     0   103      724      352    28672
0             0 sleep
[   74.448143] [    104]     0   104      724      352    32768
0             0 sleep
[   74.448144] [    105]     0   105      724      352    32768
0             0 sleep
[   74.448146] [    106]     0   106      724      352    32768
0             0 sleep
[   74.448148] [    100]     0   100      724      352    32768
0             0 sleep
[   74.448155] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0-2,oom_memcg=/a,task_memcg=/a,task=tail,pid=107,uid0
[   74.448178] Memory cgroup out of memory: Killed process 107 (tail)
total-vm:11928kB, anon-rss:9088kB, file-rss:0kB, shmem-rss:1280kB,
UID:00
Michal Hocko April 27, 2023, 2:06 p.m. UTC | #3
On Thu 27-04-23 02:21:30, Yosry Ahmed wrote:
> On Wed, Apr 26, 2023 at 8:27 AM Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Wed 26-04-23 13:39:19, Yosry Ahmed wrote:
> > > Commit c8713d0b2312 ("mm: memcontrol: dump memory.stat during cgroup
> > > OOM") made sure we dump all the stats in memory.stat during a cgroup
> > > OOM, but it also introduced a slight behavioral change. The code used to
> > > print the non-hierarchical v1 cgroup stats for the entire cgroup
> > > subtree, not it only prints the v2 cgroup stats for the cgroup under
> > > OOM.
> > >
> > > Although v2 stats are a superset of v1 stats, some of them have
> > > different naming. We also lost the non-hierarchical stats for the cgroup
> > > under OOM in v1.
> >
> > Why is that a problem worth solving? It would be also nice to add an
> > example of the oom report before and after the patch.
> > --
> > Michal Hocko
> > SUSE Labs
> 
> Thanks for taking a look!
> 
> The problem is that when upgrading to a kernel that contains
> c8713d0b2312 on cgroup v1, the OOM logs suddenly change. The stats
> names become different, a couple of stats are gone, and the
> non-hierarchical stats disappear.
> 
> The non-hierarchical stats are important to identify if a memcg OOM'd
> because of the memory consumption of its own processes or its
> descendants. In the example below, I created a parent memcg "a", and a
> child memcg "b". A process in "a" itself ("tail" in this case) is
> hogging memory and causing an OOM, not the processes in the child "b"
> (the "sleep" processes). With non-hierarchical stats, it's clear that
> this is the case.

Is this difference really important from the OOM POV. There is no group
oom semantic in v1 and so it always boils down to a specific process
that gets selected. Which memcg it is sitting in shouldn't matter all
that much. Or does it really matter?

> Also, it is generally nice to keep things consistent as much as
> possible. The sudden change of the OOM log with the kernel upgrade is
> confusing, especially that the memcg stats in the OOM logs in cgroup
> v1 now look different from the stats in memory.stat.

Generally speaking oom report is not carved into stone. While we
shouldn't make changes just nilly willy it might change for
implementation specific reasons.

In this particular case I would agree that the new output is more
confusing than helpful. Just look at 
> [   88.339505] pgscan 0
> [   88.339505] pgsteal 0
> [   88.339506] pgscan_kswapd 0
> [   88.339506] pgscan_direct 0
> [   88.339507] pgscan_khugepaged 0
> [   88.339507] pgsteal_kswapd 0
> [   88.339508] pgsteal_direct 0
> [   88.339508] pgsteal_khugepaged 0

These stats are actively misleading because it would suggest there was
no memory reclaim done before oom was hit and that would imply a
potentially premature OOM killer invocation (thus a bug). There are
likely other stats which are not tracked in v1 yet they are reported
that might add to the confusion. I believe this would be a sound
justification to get back to the original reporting.
Yosry Ahmed April 27, 2023, 10:12 p.m. UTC | #4
On Thu, Apr 27, 2023 at 7:06 AM Michal Hocko <mhocko@suse.com> wrote:
>
> On Thu 27-04-23 02:21:30, Yosry Ahmed wrote:
> > On Wed, Apr 26, 2023 at 8:27 AM Michal Hocko <mhocko@suse.com> wrote:
> > >
> > > On Wed 26-04-23 13:39:19, Yosry Ahmed wrote:
> > > > Commit c8713d0b2312 ("mm: memcontrol: dump memory.stat during cgroup
> > > > OOM") made sure we dump all the stats in memory.stat during a cgroup
> > > > OOM, but it also introduced a slight behavioral change. The code used to
> > > > print the non-hierarchical v1 cgroup stats for the entire cgroup
> > > > subtree, not it only prints the v2 cgroup stats for the cgroup under
> > > > OOM.
> > > >
> > > > Although v2 stats are a superset of v1 stats, some of them have
> > > > different naming. We also lost the non-hierarchical stats for the cgroup
> > > > under OOM in v1.
> > >
> > > Why is that a problem worth solving? It would be also nice to add an
> > > example of the oom report before and after the patch.
> > > --
> > > Michal Hocko
> > > SUSE Labs
> >
> > Thanks for taking a look!
> >
> > The problem is that when upgrading to a kernel that contains
> > c8713d0b2312 on cgroup v1, the OOM logs suddenly change. The stats
> > names become different, a couple of stats are gone, and the
> > non-hierarchical stats disappear.
> >
> > The non-hierarchical stats are important to identify if a memcg OOM'd
> > because of the memory consumption of its own processes or its
> > descendants. In the example below, I created a parent memcg "a", and a
> > child memcg "b". A process in "a" itself ("tail" in this case) is
> > hogging memory and causing an OOM, not the processes in the child "b"
> > (the "sleep" processes). With non-hierarchical stats, it's clear that
> > this is the case.
>
> Is this difference really important from the OOM POV. There is no group
> oom semantic in v1 and so it always boils down to a specific process
> that gets selected. Which memcg it is sitting in shouldn't matter all
> that much. Or does it really matter?

It is useful information that we already have and used to dump. It's
not really about which memcg the victim lived in, but which memcg is
not behaving as expected causing an OOM. If you have processes running
in the OOMing memcg itself as well as its descendants, then it's nice
to get an idea of which memcg's usage is not as expected.

>
> > Also, it is generally nice to keep things consistent as much as
> > possible. The sudden change of the OOM log with the kernel upgrade is
> > confusing, especially that the memcg stats in the OOM logs in cgroup
> > v1 now look different from the stats in memory.stat.
>
> Generally speaking oom report is not carved into stone. While we
> shouldn't make changes just nilly willy it might change for
> implementation specific reasons.

Agreed, but for this case there is really no reason to change the
behavior, and imo making restoring the behavior makes the code cleaner
/ more consistent.

>
> In this particular case I would agree that the new output is more
> confusing than helpful. Just look at
> > [   88.339505] pgscan 0
> > [   88.339505] pgsteal 0
> > [   88.339506] pgscan_kswapd 0
> > [   88.339506] pgscan_direct 0
> > [   88.339507] pgscan_khugepaged 0
> > [   88.339507] pgsteal_kswapd 0
> > [   88.339508] pgsteal_direct 0
> > [   88.339508] pgsteal_khugepaged 0
>
> These stats are actively misleading because it would suggest there was
> no memory reclaim done before oom was hit and that would imply a
> potentially premature OOM killer invocation (thus a bug). There are
> likely other stats which are not tracked in v1 yet they are reported
> that might add to the confusion. I believe this would be a sound
> justification to get back to the original reporting.

I don't think the collection of these stats is v2-specific, I was just
testing in a VM without swap, and the process had 100% anonymous
memory. Testing with swap gives sane stats:
[   68.851724] pgscan 248801
[   68.851725] pgsteal 123081
[   68.851725] pgscan_kswapd 0
[   68.851726] pgscan_direct 248801
[   68.851727] pgscan_khugepaged 0
[   68.851727] pgsteal_kswapd 0
[   68.851728] pgsteal_direct 123081
[   68.851728] pgsteal_khugepaged 0

However, I still think this change is valuable. Like you mentioned,
the OOM log is not set in stone, but we shouldn't just change it for
no reason. In this case, for cgroup v1 users, the OOM log changed for
no reason beyond a side effect of another patch. Upon upgrading our
kernel we noticed the behavior change. This patch restores the old
behavior without any cost really, and it makes the code a tiny bit
more consistent.

> --
> Michal Hocko
> SUSE Labs
Michal Hocko April 28, 2023, 9:44 a.m. UTC | #5
On Thu 27-04-23 15:12:46, Yosry Ahmed wrote:
[...]
> However, I still think this change is valuable. Like you mentioned,
> the OOM log is not set in stone, but we shouldn't just change it for
> no reason. In this case, for cgroup v1 users, the OOM log changed for
> no reason beyond a side effect of another patch. Upon upgrading our
> kernel we noticed the behavior change. This patch restores the old
> behavior without any cost really, and it makes the code a tiny bit
> more consistent.

Fair enough. Just make sure you go into more details about why this is
causing problems/inconveniences. I am slightly worried this might cause
problems to other people who would like to have the same report for both
v1 and v2 so we should at least have some solid argumetns to revert
rather than "it used has changed and we liked it more that way".

I personally do not care all that much. It kinda sucks to dump counters
that are not tracked or fully tracked in v1 because that can mislead
people and that would be a bigger problem from my POV.
Yosry Ahmed April 28, 2023, 1:05 p.m. UTC | #6
On Fri, Apr 28, 2023 at 2:44 AM Michal Hocko <mhocko@suse.com> wrote:
>
> On Thu 27-04-23 15:12:46, Yosry Ahmed wrote:
> [...]
> > However, I still think this change is valuable. Like you mentioned,
> > the OOM log is not set in stone, but we shouldn't just change it for
> > no reason. In this case, for cgroup v1 users, the OOM log changed for
> > no reason beyond a side effect of another patch. Upon upgrading our
> > kernel we noticed the behavior change. This patch restores the old
> > behavior without any cost really, and it makes the code a tiny bit
> > more consistent.
>
> Fair enough. Just make sure you go into more details about why this is
> causing problems/inconveniences. I am slightly worried this might cause
> problems to other people who would like to have the same report for both
> v1 and v2 so we should at least have some solid argumetns to revert
> rather than "it used has changed and we liked it more that way".
>
> I personally do not care all that much. It kinda sucks to dump counters
> that are not tracked or fully tracked in v1 because that can mislead
> people and that would be a bigger problem from my POV.

Great point, let me send a v2 rephrasing the commit log of this patch
and adding the Ack's on the first one.

Thanks Michal!

> --
> Michal Hocko
> SUSE Labs
diff mbox series

Patch

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5922940f92c9..2b492f8d540c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1551,7 +1551,7 @@  static inline unsigned long memcg_page_state_output(struct mem_cgroup *memcg,
 	return memcg_page_state(memcg, item) * memcg_page_state_unit(item);
 }
 
-static void memory_stat_format(struct mem_cgroup *memcg, struct seq_buf *s)
+static void memcg_stat_format(struct mem_cgroup *memcg, struct seq_buf *s)
 {
 	int i;
 
@@ -1604,6 +1604,17 @@  static void memory_stat_format(struct mem_cgroup *memcg, struct seq_buf *s)
 	WARN_ON_ONCE(seq_buf_has_overflowed(s));
 }
 
+static void memcg1_stat_format(struct mem_cgroup *memcg, struct seq_buf *s);
+
+static void memory_stat_format(struct mem_cgroup *memcg, struct seq_buf *s)
+{
+	if (cgroup_subsys_on_dfl(memory_cgrp_subsys))
+		memcg_stat_format(memcg, s);
+	else
+		memcg1_stat_format(memcg, s);
+	WARN_ON_ONCE(seq_buf_has_overflowed(s));
+}
+
 #define K(x) ((x) << (PAGE_SHIFT-10))
 /**
  * mem_cgroup_print_oom_context: Print OOM information relevant to
@@ -4078,9 +4089,8 @@  static const unsigned int memcg1_events[] = {
 	PGMAJFAULT,
 };
 
-static int memcg_stat_show(struct seq_file *m, void *v)
+static void memcg1_stat_format(struct mem_cgroup *memcg, struct seq_buf *s)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_seq(m);
 	unsigned long memory, memsw;
 	struct mem_cgroup *mi;
 	unsigned int i;
@@ -4095,18 +4105,18 @@  static int memcg_stat_show(struct seq_file *m, void *v)
 		if (memcg1_stats[i] == MEMCG_SWAP && !do_memsw_account())
 			continue;
 		nr = memcg_page_state_local(memcg, memcg1_stats[i]);
-		seq_printf(m, "%s %lu\n", memcg1_stat_names[i],
+		seq_buf_printf(s, "%s %lu\n", memcg1_stat_names[i],
 			   nr * memcg_page_state_unit(memcg1_stats[i]));
 	}
 
 	for (i = 0; i < ARRAY_SIZE(memcg1_events); i++)
-		seq_printf(m, "%s %lu\n", vm_event_name(memcg1_events[i]),
-			   memcg_events_local(memcg, memcg1_events[i]));
+		seq_buf_printf(s, "%s %lu\n", vm_event_name(memcg1_events[i]),
+			       memcg_events_local(memcg, memcg1_events[i]));
 
 	for (i = 0; i < NR_LRU_LISTS; i++)
-		seq_printf(m, "%s %lu\n", lru_list_name(i),
-			   memcg_page_state_local(memcg, NR_LRU_BASE + i) *
-			   PAGE_SIZE);
+		seq_buf_printf(s, "%s %lu\n", lru_list_name(i),
+			       memcg_page_state_local(memcg, NR_LRU_BASE + i) *
+			       PAGE_SIZE);
 
 	/* Hierarchical information */
 	memory = memsw = PAGE_COUNTER_MAX;
@@ -4114,11 +4124,11 @@  static int memcg_stat_show(struct seq_file *m, void *v)
 		memory = min(memory, READ_ONCE(mi->memory.max));
 		memsw = min(memsw, READ_ONCE(mi->memsw.max));
 	}
-	seq_printf(m, "hierarchical_memory_limit %llu\n",
-		   (u64)memory * PAGE_SIZE);
+	seq_buf_printf(s, "hierarchical_memory_limit %llu\n",
+		       (u64)memory * PAGE_SIZE);
 	if (do_memsw_account())
-		seq_printf(m, "hierarchical_memsw_limit %llu\n",
-			   (u64)memsw * PAGE_SIZE);
+		seq_buf_printf(s, "hierarchical_memsw_limit %llu\n",
+			       (u64)memsw * PAGE_SIZE);
 
 	for (i = 0; i < ARRAY_SIZE(memcg1_stats); i++) {
 		unsigned long nr;
@@ -4126,19 +4136,19 @@  static int memcg_stat_show(struct seq_file *m, void *v)
 		if (memcg1_stats[i] == MEMCG_SWAP && !do_memsw_account())
 			continue;
 		nr = memcg_page_state(memcg, memcg1_stats[i]);
-		seq_printf(m, "total_%s %llu\n", memcg1_stat_names[i],
+		seq_buf_printf(s, "total_%s %llu\n", memcg1_stat_names[i],
 			   (u64)nr * memcg_page_state_unit(memcg1_stats[i]));
 	}
 
 	for (i = 0; i < ARRAY_SIZE(memcg1_events); i++)
-		seq_printf(m, "total_%s %llu\n",
-			   vm_event_name(memcg1_events[i]),
-			   (u64)memcg_events(memcg, memcg1_events[i]));
+		seq_buf_printf(s, "total_%s %llu\n",
+			       vm_event_name(memcg1_events[i]),
+			       (u64)memcg_events(memcg, memcg1_events[i]));
 
 	for (i = 0; i < NR_LRU_LISTS; i++)
-		seq_printf(m, "total_%s %llu\n", lru_list_name(i),
-			   (u64)memcg_page_state(memcg, NR_LRU_BASE + i) *
-			   PAGE_SIZE);
+		seq_buf_printf(s, "total_%s %llu\n", lru_list_name(i),
+			       (u64)memcg_page_state(memcg, NR_LRU_BASE + i) *
+			       PAGE_SIZE);
 
 #ifdef CONFIG_DEBUG_VM
 	{
@@ -4153,12 +4163,10 @@  static int memcg_stat_show(struct seq_file *m, void *v)
 			anon_cost += mz->lruvec.anon_cost;
 			file_cost += mz->lruvec.file_cost;
 		}
-		seq_printf(m, "anon_cost %lu\n", anon_cost);
-		seq_printf(m, "file_cost %lu\n", file_cost);
+		seq_buf_printf(s, "anon_cost %lu\n", anon_cost);
+		seq_buf_printf(s, "file_cost %lu\n", file_cost);
 	}
 #endif
-
-	return 0;
 }
 
 static u64 mem_cgroup_swappiness_read(struct cgroup_subsys_state *css,
@@ -4998,6 +5006,8 @@  static int mem_cgroup_slab_show(struct seq_file *m, void *p)
 }
 #endif
 
+static int memory_stat_show(struct seq_file *m, void *v);
+
 static struct cftype mem_cgroup_legacy_files[] = {
 	{
 		.name = "usage_in_bytes",
@@ -5030,7 +5040,7 @@  static struct cftype mem_cgroup_legacy_files[] = {
 	},
 	{
 		.name = "stat",
-		.seq_show = memcg_stat_show,
+		.seq_show = memory_stat_show,
 	},
 	{
 		.name = "force_empty",