[v2,6/7] mm, proc: add KReclaimable to /proc/meminfo

Message ID 20180618091808.4419-7-vbabka@suse.cz (mailing list archive)
State New, archived

Commit Message

Vlastimil Babka June 18, 2018, 9:18 a.m. UTC
The vmstat NR_KERNEL_MISC_RECLAIMABLE counter is for kernel non-slab
allocations that can be reclaimed via shrinker. In /proc/meminfo, we can show
the sum of all reclaimable kernel allocations (including slab) as
"KReclaimable". Add the same counter also to per-node meminfo under /sys.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 Documentation/filesystems/proc.txt |  4 ++++
 drivers/base/node.c                | 19 ++++++++++++-------
 fs/proc/meminfo.c                  | 16 ++++++++--------
 3 files changed, 24 insertions(+), 15 deletions(-)

Comments

Andrew Morton June 18, 2018, 9:33 p.m. UTC | #1
On Mon, 18 Jun 2018 11:18:07 +0200 Vlastimil Babka <vbabka@suse.cz> wrote:

> The vmstat NR_KERNEL_MISC_RECLAIMABLE counter is for kernel non-slab
> allocations that can be reclaimed via shrinker. In /proc/meminfo, we can show
> the sum of all reclaimable kernel allocations (including slab) as
> "KReclaimable". Add the same counter also to per-node meminfo under /sys

Why do you consider this useful enough to justify adding it to
/proc/meminfo?  How will people use it, what benefit will they see, etc?


Maybe you've undersold this whole patchset, but I'm struggling a bit to
see what the end-user benefits are.  What would be wrong with just
sticking with what we have now?

Vlastimil Babka June 19, 2018, 7:30 a.m. UTC | #2
On 06/18/2018 11:33 PM, Andrew Morton wrote:
> On Mon, 18 Jun 2018 11:18:07 +0200 Vlastimil Babka <vbabka@suse.cz> wrote:
> 
>> The vmstat NR_KERNEL_MISC_RECLAIMABLE counter is for kernel non-slab
>> allocations that can be reclaimed via shrinker. In /proc/meminfo, we can show
>> the sum of all reclaimable kernel allocations (including slab) as
>> "KReclaimable". Add the same counter also to per-node meminfo under /sys
> 
> Why do you consider this useful enough to justify adding it to
> /proc/meminfo?  How will people use it, what benefit will they see, etc?

Let's add this:

With this counter, users will have more complete information about
kernel memory usage. Non-slab reclaimable pages (currently just the ION
allocator) will no longer be missing from /proc/meminfo, where their
absence leaves users wondering where part of their memory went. More
precisely, they already appear in MemAvailable, but without the new
counter, it's not obvious why the value in MemAvailable doesn't fully
correspond with the sum of other counters participating in it.
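
For illustration, here is a minimal userspace sketch of how the new field
could be used to see the non-slab part: it parses /proc/meminfo-style text
and subtracts SReclaimable from KReclaimable. The helper names
meminfo_field_kb() and misc_reclaimable_kb() are made up for this example
and are not part of the patch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up one "Name:   value kB" field in a buffer holding
 * /proc/meminfo-style text. Returns the value in kB, or -1 if the
 * field is absent or malformed. (Hypothetical helper.)
 */
long meminfo_field_kb(const char *text, const char *name)
{
	size_t len = strlen(name);
	const char *line = text;

	while (line && *line) {
		/* Match the name only at the start of a line, followed by ':'. */
		if (strncmp(line, name, len) == 0 && line[len] == ':') {
			long val;

			if (sscanf(line + len + 1, "%ld", &val) == 1)
				return val;
			return -1;
		}
		/* Advance to the character after the next newline, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/*
 * Non-slab reclaimable kernel memory in kB, i.e.
 * KReclaimable - SReclaimable. Returns -1 when either field is
 * missing, e.g. on a kernel without KReclaimable. (Hypothetical helper.)
 */
long misc_reclaimable_kb(const char *text)
{
	long kreclaimable = meminfo_field_kb(text, "KReclaimable");
	long sreclaimable = meminfo_field_kb(text, "SReclaimable");

	if (kreclaimable < 0 || sreclaimable < 0)
		return -1;
	return kreclaimable - sreclaimable;
}
```

With the sample values from the proc.txt hunk of this patch (KReclaimable
168048 kB, SReclaimable 159856 kB), the difference is 8192 kB.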

> Maybe you've undersold this whole patchset, but I'm struggling a bit to
> see what the end-user benefits are.  What would be wrong with just
> sticking with what we have now?

Fair enough, I will add more info in reply to the cover letter.

Minchan Kim June 19, 2018, 8:13 a.m. UTC | #3
On Tue, Jun 19, 2018 at 09:30:03AM +0200, Vlastimil Babka wrote:
> On 06/18/2018 11:33 PM, Andrew Morton wrote:
> > On Mon, 18 Jun 2018 11:18:07 +0200 Vlastimil Babka <vbabka@suse.cz> wrote:
> > 
> >> The vmstat NR_KERNEL_MISC_RECLAIMABLE counter is for kernel non-slab
> >> allocations that can be reclaimed via shrinker. In /proc/meminfo, we can show
> >> the sum of all reclaimable kernel allocations (including slab) as
> >> "KReclaimable". Add the same counter also to per-node meminfo under /sys
> > 
> > Why do you consider this useful enough to justify adding it to
> > /proc/meminfo?  How will people use it, what benefit will they see, etc?
> 
> Let's add this:
> 
> With this counter, users will have more complete information about
> kernel memory usage. Non-slab reclaimable pages (currently just the ION
> allocator) will no longer be missing from /proc/meminfo, where their
> absence leaves users wondering where part of their memory went. More
> precisely, they already appear in MemAvailable, but without the new
> counter, it's not obvious why the value in MemAvailable doesn't fully
> correspond with the sum of other counters participating in it.

Hmm, if MemAvailable could be obtained just by summing the other counters
that participate in it, MemAvailable wouldn't be meaningful. IMO,
MemAvailable doesn't need to match the other counters.

The real-world benefit of counting ION pages in KReclaimable: there have
been bug reports of sluggishness under memory pressure where the ION page
pool turned out to be too large and was not being shrunk. In that case,
this meminfo field would help show that something was broken in the system.

From that point of view, my concern is that if we account more kinds of
pages as KReclaimable (e.g., binder is a candidate), we end up unable to
tell which of the caches is too large. That means we will need a
KReclaimableInfo (like slabinfo) in the future to show each type's
KReclaimable pages.

Anyway, it's good as a first step.

> 
> > Maybe you've undersold this whole patchset, but I'm struggling a bit to
> > see what the end-user benefits are.  What would be wrong with just
> > sticking with what we have now?
> 
> Fair enough, I will add more info in reply to the cover letter.

Vlastimil Babka June 19, 2018, 12:44 p.m. UTC | #4
On 06/19/2018 10:13 AM, Minchan Kim wrote:
> On Tue, Jun 19, 2018 at 09:30:03AM +0200, Vlastimil Babka wrote:
>> On 06/18/2018 11:33 PM, Andrew Morton wrote:
>>> On Mon, 18 Jun 2018 11:18:07 +0200 Vlastimil Babka <vbabka@suse.cz> wrote:
>>>
>>>> The vmstat NR_KERNEL_MISC_RECLAIMABLE counter is for kernel non-slab
>>>> allocations that can be reclaimed via shrinker. In /proc/meminfo, we can show
>>>> the sum of all reclaimable kernel allocations (including slab) as
>>>> "KReclaimable". Add the same counter also to per-node meminfo under /sys
>>>
>>> Why do you consider this useful enough to justify adding it to
>>> /proc/meminfo?  How will people use it, what benefit will they see, etc?
>>
>> Let's add this:
>>
>> With this counter, users will have more complete information about
>> kernel memory usage. Non-slab reclaimable pages (currently just the ION
>> allocator) will no longer be missing from /proc/meminfo, where their
>> absence leaves users wondering where part of their memory went. More
>> precisely, they already appear in MemAvailable, but without the new
>> counter, it's not obvious why the value in MemAvailable doesn't fully
>> correspond with the sum of other counters participating in it.
> 
> Hmm, if MemAvailable could be obtained just by summing the other counters
> that participate in it, MemAvailable wouldn't be meaningful. IMO,
> MemAvailable doesn't need to match the other counters.

MemAvailable is meant as a "shortcut" for users, so they don't have to
remember which counters to count and add them up manually. It's also not
an exact sum, because there are some assumptions that part of
reclaimable memory might be pinned etc. Still, missing KReclaimable in
/proc/meminfo would be an odd exception wrt the other counters, IMHO.
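
To make the "not an exact sum" point concrete, here is a simplified
userspace model of such an estimate. It is only a sketch under stated
assumptions: the function and parameter names are hypothetical, all values
are plain page counts supplied by the caller, and the discount rule (keep
back up to half of each pool, capped at the low watermark) is an
assumption of this model rather than a copy of the kernel's code.

```c
/* Smaller of two page counts. */
long min_pages(long a, long b)
{
	return a < b ? a : b;
}

/*
 * Rough "available memory" estimate, in pages.
 *
 * free         - free pages
 * reserve      - pages assumed to be held back as reserves
 * file_lru     - page cache pages on the file LRU lists
 * kreclaimable - reclaimable kernel pages (slab plus non-slab)
 * wmark_low    - low watermark, used as a cap for the discounts
 */
long estimate_available(long free, long reserve, long file_lru,
			long kreclaimable, long wmark_low)
{
	long available = free - reserve;

	/* Assume part of the page cache has to stay resident. */
	available += file_lru - min_pages(file_lru / 2, wmark_low);

	/* Same discount for reclaimable kernel memory; some may be pinned. */
	available += kreclaimable - min_pages(kreclaimable / 2, wmark_low);

	/* Never report a negative amount. */
	return available < 0 ? 0 : available;
}
```

In this model the result is always somewhat less than free + file_lru +
kreclaimable, which is why adding up the individual fields by hand does
not reproduce the estimate exactly.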

> The real-world benefit of counting ION pages in KReclaimable: there have
> been bug reports of sluggishness under memory pressure where the ION page
> pool turned out to be too large and was not being shrunk. In that case,
> this meminfo field would help show that something was broken in the system.

Right.

> From that point of view, my concern is that if we account more kinds of
> pages as KReclaimable (e.g., binder is a candidate), we end up unable to
> tell which of the caches is too large. That means we will need a
> KReclaimableInfo (like slabinfo) in the future to show each type's
> KReclaimable pages.

Yeah there are more direct kernel allocations that can eat significant
amounts of memory, without being visible in /proc/meminfo, and not
necessarily reclaimable. E.g. unless that changed, I recall XFS page
buffers. Striking a good balance of how detailed the accounting should
be is not easy.

BTW at some point I proposed MemUnaccounted to make it more obvious
(without adding up fields manually) that there is some memory consumed
by kernel allocations not visible in the other meminfo fields.

Patch

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 520f6a84cf50..6a255f960ab5 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -858,6 +858,7 @@  Writeback:           0 kB
 AnonPages:      861800 kB
 Mapped:         280372 kB
 Shmem:             644 kB
+KReclaimable:   168048 kB
 Slab:           284364 kB
 SReclaimable:   159856 kB
 SUnreclaim:     124508 kB
@@ -921,6 +922,9 @@  AnonHugePages: Non-file backed huge pages mapped into userspace page tables
 ShmemHugePages: Memory used by shared memory (shmem) and tmpfs allocated
               with huge pages
 ShmemPmdMapped: Shared memory mapped into userspace with huge pages
+KReclaimable: Kernel allocations that the kernel will attempt to reclaim
+              under memory pressure. Includes SReclaimable (below), and other
+              direct allocations with a shrinker.
         Slab: in-kernel data structures cache
 SReclaimable: Part of Slab, that might be reclaimed, such as caches
   SUnreclaim: Part of Slab, that cannot be reclaimed on memory pressure
diff --git a/drivers/base/node.c b/drivers/base/node.c
index a5e821d09656..81cef8031eae 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -67,8 +67,11 @@  static ssize_t node_read_meminfo(struct device *dev,
 	int nid = dev->id;
 	struct pglist_data *pgdat = NODE_DATA(nid);
 	struct sysinfo i;
+	unsigned long sreclaimable, sunreclaimable;
 
 	si_meminfo_node(&i, nid);
+	sreclaimable = node_page_state(pgdat, NR_SLAB_RECLAIMABLE);
+	sunreclaimable = node_page_state(pgdat, NR_SLAB_UNRECLAIMABLE);
 	n = sprintf(buf,
 		       "Node %d MemTotal:       %8lu kB\n"
 		       "Node %d MemFree:        %8lu kB\n"
@@ -118,6 +121,7 @@  static ssize_t node_read_meminfo(struct device *dev,
 		       "Node %d NFS_Unstable:   %8lu kB\n"
 		       "Node %d Bounce:         %8lu kB\n"
 		       "Node %d WritebackTmp:   %8lu kB\n"
+		       "Node %d KReclaimable:   %8lu kB\n"
 		       "Node %d Slab:           %8lu kB\n"
 		       "Node %d SReclaimable:   %8lu kB\n"
 		       "Node %d SUnreclaim:     %8lu kB\n"
@@ -138,20 +142,21 @@  static ssize_t node_read_meminfo(struct device *dev,
 		       nid, K(node_page_state(pgdat, NR_UNSTABLE_NFS)),
 		       nid, K(sum_zone_node_page_state(nid, NR_BOUNCE)),
 		       nid, K(node_page_state(pgdat, NR_WRITEBACK_TEMP)),
-		       nid, K(node_page_state(pgdat, NR_SLAB_RECLAIMABLE) +
-			      node_page_state(pgdat, NR_SLAB_UNRECLAIMABLE)),
-		       nid, K(node_page_state(pgdat, NR_SLAB_RECLAIMABLE)),
+		       nid, K(sreclaimable +
+			      node_page_state(pgdat, NR_KERNEL_MISC_RECLAIMABLE)),
+		       nid, K(sreclaimable + sunreclaimable),
+		       nid, K(sreclaimable),
+		       nid, K(sunreclaimable)
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-		       nid, K(node_page_state(pgdat, NR_SLAB_UNRECLAIMABLE)),
+		       ,
 		       nid, K(node_page_state(pgdat, NR_ANON_THPS) *
 				       HPAGE_PMD_NR),
 		       nid, K(node_page_state(pgdat, NR_SHMEM_THPS) *
 				       HPAGE_PMD_NR),
 		       nid, K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED) *
-				       HPAGE_PMD_NR));
-#else
-		       nid, K(node_page_state(pgdat, NR_SLAB_UNRECLAIMABLE)));
+				       HPAGE_PMD_NR)
 #endif
+		       );
 	n += hugetlb_report_node_meminfo(nid, buf + n);
 	return n;
 }
diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 2fb04846ed11..61a18477bc07 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -37,6 +37,7 @@  static int meminfo_proc_show(struct seq_file *m, void *v)
 	long cached;
 	long available;
 	unsigned long pages[NR_LRU_LISTS];
+	unsigned long sreclaimable, sunreclaim;
 	int lru;
 
 	si_meminfo(&i);
@@ -52,6 +53,8 @@  static int meminfo_proc_show(struct seq_file *m, void *v)
 		pages[lru] = global_node_page_state(NR_LRU_BASE + lru);
 
 	available = si_mem_available();
+	sreclaimable = global_node_page_state(NR_SLAB_RECLAIMABLE);
+	sunreclaim = global_node_page_state(NR_SLAB_UNRECLAIMABLE);
 
 	show_val_kb(m, "MemTotal:       ", i.totalram);
 	show_val_kb(m, "MemFree:        ", i.freeram);
@@ -93,14 +96,11 @@  static int meminfo_proc_show(struct seq_file *m, void *v)
 	show_val_kb(m, "Mapped:         ",
 		    global_node_page_state(NR_FILE_MAPPED));
 	show_val_kb(m, "Shmem:          ", i.sharedram);
-	show_val_kb(m, "Slab:           ",
-		    global_node_page_state(NR_SLAB_RECLAIMABLE) +
-		    global_node_page_state(NR_SLAB_UNRECLAIMABLE));
-
-	show_val_kb(m, "SReclaimable:   ",
-		    global_node_page_state(NR_SLAB_RECLAIMABLE));
-	show_val_kb(m, "SUnreclaim:     ",
-		    global_node_page_state(NR_SLAB_UNRECLAIMABLE));
+	show_val_kb(m, "KReclaimable:   ", sreclaimable +
+		    global_node_page_state(NR_KERNEL_MISC_RECLAIMABLE));
+	show_val_kb(m, "Slab:           ", sreclaimable + sunreclaim);
+	show_val_kb(m, "SReclaimable:   ", sreclaimable);
+	show_val_kb(m, "SUnreclaim:     ", sunreclaim);
 	seq_printf(m, "KernelStack:    %8lu kB\n",
 		   global_zone_page_state(NR_KERNEL_STACK_KB));
 	show_val_kb(m, "PageTables:     ",