diff mbox series

mm/vmstat: Fix placement of per-node stats in /proc/zoneinfo

Message ID 20240725105626.824-1-shivankg@amd.com (mailing list archive)
State New
Headers show
Series mm/vmstat: Fix placement of per-node stats in /proc/zoneinfo | expand

Commit Message

Shivank Garg July 25, 2024, 10:56 a.m. UTC
The per-node stats in /proc/zoneinfo output are incorrectly inserted
between the first populated zone title and its specific stats. This
creates confusion while reading or parsing its output.

Output of /proc/zoneinfo

Node 0, zone      DMA
  per-node stats
      nr_inactive_anon 2096
...
  pages free     2814
        boost    0
...
      nr_free_pages 2814
...
Node 0, zone    DMA32
  pages free     357384
        boost    0
...
...
Node 1, zone   Normal
  per-node stats
      nr_inactive_anon 87606
...
  pages free     51836334
        boost    0
...
Node 1, zone  Movable
  pages free     0
        boost    0

This patch relocates the per-node stats at the beginning for each node,
followed by the individual zone statistics. This fix results in a clearer
and more consistent output format.

/proc/zoneinfo output with proposed fix:

Node 0, per-node stats
      nr_inactive_anon 2783
      nr_active_anon 36
...
Node 0, zone      DMA
  pages free     3584
        boost    0
        min      46
...
...

Remove is_zone_first_populated() as it is not used anywhere else.

Fixes: e2ecc8a79ed4 ("mm, vmstat: print node-based stats in zoneinfo file")
Signed-off-by: Shivank Garg <shivankg@amd.com>
---
 mm/vmstat.c | 36 +++++++++++-------------------------
 1 file changed, 11 insertions(+), 25 deletions(-)

Comments

Andrew Morton July 25, 2024, 7:47 p.m. UTC | #1
On Thu, 25 Jul 2024 16:26:26 +0530 Shivank Garg <shivankg@amd.com> wrote:

> The per-node stats in /proc/zoneinfo output are incorrectly inserted
> between the first populated zone title and its specific stats. This
> creates confusion while reading or parsing its output.

I pity anyone who has to write a parser for that mess.

> This patch relocates the per-node stats at the beginning for each node,
> followed by the individual zone statistics. This fix results in a clearer
> and more consistent output format.

> Fixes: e2ecc8a79ed4 ("mm, vmstat: print node-based stats in zoneinfo file")

It's been this way since 2016?  Surely there's a risk of breaking
existing userspace parsers?
Shivank Garg July 26, 2024, 4:16 a.m. UTC | #2
On 7/26/2024 1:17 AM, Andrew Morton wrote:
> On Thu, 25 Jul 2024 16:26:26 +0530 Shivank Garg <shivankg@amd.com> wrote:
> 
>> The per-node stats in /proc/zoneinfo output are incorrectly inserted
>> between the first populated zone title and its specific stats. This
>> creates confusion while reading or parsing its output.
> 
> I pity anyone who has to write a parser for that mess.

Some userspace parser have faced an issue like this:
https://github.com/prometheus/procfs/issues/386
In the fix commit, they are ignoring per-node stats section.

> 
>> This patch relocates the per-node stats at the beginning for each node,
>> followed by the individual zone statistics. This fix results in a clearer
>> and more consistent output format.
> 
>> Fixes: e2ecc8a79ed4 ("mm, vmstat: print node-based stats in zoneinfo file")
> 
> It's been this way since 2016?  Surely there's a risk of breaking
> existing userspace parsers?

For some of the per-node stats, some application may probe the
/sys/devices/system/node/node*/meminfo (in KB)

There may be a possibility of breaking scripts, but it will make the
output more consistent for future scripts.

Thanks,
Shivank
Shivank Garg Aug. 2, 2024, 5:18 a.m. UTC | #3
On 7/26/2024 9:46 AM, Garg, Shivank wrote:
> 
> 
> On 7/26/2024 1:17 AM, Andrew Morton wrote:
>> On Thu, 25 Jul 2024 16:26:26 +0530 Shivank Garg <shivankg@amd.com> wrote:
>>
>>> The per-node stats in /proc/zoneinfo output are incorrectly inserted
>>> between the first populated zone title and its specific stats. This
>>> creates confusion while reading or parsing its output.
>>
>> I pity anyone who has to write a parser for that mess.
> 
> Some userspace parser have faced an issue like this:
> https://github.com/prometheus/procfs/issues/386
> In the fix commit, they are ignoring per-node stats section.
> 
>>
>>> This patch relocates the per-node stats at the beginning for each node,
>>> followed by the individual zone statistics. This fix results in a clearer
>>> and more consistent output format.
>>
>>> Fixes: e2ecc8a79ed4 ("mm, vmstat: print node-based stats in zoneinfo file")
>>
>> It's been this way since 2016?  Surely there's a risk of breaking
>> existing userspace parsers?
> 
> For some of the per-node stats, some application may probe the
> /sys/devices/system/node/node*/meminfo (in KB)



As per this commit,

https://github.com/torvalds/linux/commit/e2ecc8a79ed49f7838b4fdf352c4c48cec9424ac 
"There are a number of stats that were previously accessible via zoneinfo
that are now invisible.  While it is possible to create a new file for
the node stats, this may be missed by users.  Instead this patch prints
the stats under the first populated zone in /proc/zoneinfo."

per node stats were added to zoneinfo since some stats were missing.


However I find that all stats are in fact present in 
/sys/devices/system/node/node*/vmstat
hence I wonder why should we continue to have per-node stats in zoneinfo.


> 
> There may be a possibility of breaking scripts, but it will make the
> output more consistent for future scripts.
> 
> Thanks,
> Shivank
> 
>
diff mbox series

Patch

diff --git a/mm/vmstat.c b/mm/vmstat.c
index 8507c497218b..1b32fab265e1 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1663,36 +1663,11 @@  static const struct seq_operations pagetypeinfo_op = {
 	.show	= pagetypeinfo_show,
 };
 
-static bool is_zone_first_populated(pg_data_t *pgdat, struct zone *zone)
-{
-	int zid;
-
-	for (zid = 0; zid < MAX_NR_ZONES; zid++) {
-		struct zone *compare = &pgdat->node_zones[zid];
-
-		if (populated_zone(compare))
-			return zone == compare;
-	}
-
-	return false;
-}
-
 static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
 							struct zone *zone)
 {
 	int i;
 	seq_printf(m, "Node %d, zone %8s", pgdat->node_id, zone->name);
-	if (is_zone_first_populated(pgdat, zone)) {
-		seq_printf(m, "\n  per-node stats");
-		for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
-			unsigned long pages = node_page_state_pages(pgdat, i);
-
-			if (vmstat_item_print_in_thp(i))
-				pages /= HPAGE_PMD_NR;
-			seq_printf(m, "\n      %-12s %lu", node_stat_name(i),
-				   pages);
-		}
-	}
 	seq_printf(m,
 		   "\n  pages free     %lu"
 		   "\n        boost    %lu"
@@ -1773,7 +1748,18 @@  static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
  */
 static int zoneinfo_show(struct seq_file *m, void *arg)
 {
+	int i;
 	pg_data_t *pgdat = (pg_data_t *)arg;
+
+	seq_printf(m, "Node %d, per-node stats", pgdat->node_id);
+	for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
+		unsigned long pages = node_page_state_pages(pgdat, i);
+
+		if (vmstat_item_print_in_thp(i))
+			pages /= HPAGE_PMD_NR;
+		seq_printf(m, "\n      %-12s %lu", node_stat_name(i), pages);
+	}
+	seq_putc(m, '\n');
 	walk_zones_in_node(m, pgdat, false, false, zoneinfo_show_print);
 	return 0;
 }