Message ID | 20240725105626.824-1-shivankg@amd.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm/vmstat: Fix placement of per-node stats in /proc/zoneinfo | expand |
On Thu, 25 Jul 2024 16:26:26 +0530 Shivank Garg <shivankg@amd.com> wrote: > The per-node stats in /proc/zoneinfo output are incorrectly inserted > between the first populated zone title and its specific stats. This > creates confusion while reading or parsing its output. I pity anyone who has to write a parser for that mess. > This patch relocates the per-node stats at the beginning for each node, > followed by the individual zone statistics. This fix results in a clearer > and more consistent output format. > Fixes: e2ecc8a79ed4 ("mm, vmstat: print node-based stats in zoneinfo file") It's been this way since 2016? Surely there's a risk of breaking existing userspace parsers?
On 7/26/2024 1:17 AM, Andrew Morton wrote: > On Thu, 25 Jul 2024 16:26:26 +0530 Shivank Garg <shivankg@amd.com> wrote: > >> The per-node stats in /proc/zoneinfo output are incorrectly inserted >> between the first populated zone title and its specific stats. This >> creates confusion while reading or parsing its output. > > I pity anyone who has to write a parser for that mess. Some userspace parser have faced an issue like this: https://github.com/prometheus/procfs/issues/386 In the fix commit, they are ignoring per-node stats section. > >> This patch relocates the per-node stats at the beginning for each node, >> followed by the individual zone statistics. This fix results in a clearer >> and more consistent output format. > >> Fixes: e2ecc8a79ed4 ("mm, vmstat: print node-based stats in zoneinfo file") > > It's been this way since 2016? Surely there's a risk of breaking > existing userspace parsers? For some of the per-node stats, some application may probe the /sys/devices/system/node/node*/meminfo (in KB) There may be a possibility of breaking scripts, but it will make the output more consistent for future scripts. Thanks, Shivank
On 7/26/2024 9:46 AM, Garg, Shivank wrote: > > > On 7/26/2024 1:17 AM, Andrew Morton wrote: >> On Thu, 25 Jul 2024 16:26:26 +0530 Shivank Garg <shivankg@amd.com> wrote: >> >>> The per-node stats in /proc/zoneinfo output are incorrectly inserted >>> between the first populated zone title and its specific stats. This >>> creates confusion while reading or parsing its output. >> >> I pity anyone who has to write a parser for that mess. > > Some userspace parser have faced an issue like this: > https://github.com/prometheus/procfs/issues/386 > In the fix commit, they are ignoring per-node stats section. > >> >>> This patch relocates the per-node stats at the beginning for each node, >>> followed by the individual zone statistics. This fix results in a clearer >>> and more consistent output format. >> >>> Fixes: e2ecc8a79ed4 ("mm, vmstat: print node-based stats in zoneinfo file") >> >> It's been this way since 2016? Surely there's a risk of breaking >> existing userspace parsers? > > For some of the per-node stats, some application may probe the > /sys/devices/system/node/node*/meminfo (in KB) As per this commit, https://github.com/torvalds/linux/commit/e2ecc8a79ed49f7838b4fdf352c4c48cec9424ac "There are a number of stats that were previously accessible via zoneinfo that are now invisible. While it is possible to create a new file for the node stats, this may be missed by users. Instead this patch prints the stats under the first populated zone in /proc/zoneinfo." per node stats were added to zoneinfo since some stats were missing. However I find that all stats are in fact present in /sys/devices/system/node/node*/vmstat hence I wonder why should we continue to have per-node stats in zoneinfo. > > There may be a possibility of breaking scripts, but it will make the > output more consistent for future scripts. > > Thanks, > Shivank > >
diff --git a/mm/vmstat.c b/mm/vmstat.c index 8507c497218b..1b32fab265e1 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -1663,36 +1663,11 @@ static const struct seq_operations pagetypeinfo_op = { .show = pagetypeinfo_show, }; -static bool is_zone_first_populated(pg_data_t *pgdat, struct zone *zone) -{ - int zid; - - for (zid = 0; zid < MAX_NR_ZONES; zid++) { - struct zone *compare = &pgdat->node_zones[zid]; - - if (populated_zone(compare)) - return zone == compare; - } - - return false; -} - static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat, struct zone *zone) { int i; seq_printf(m, "Node %d, zone %8s", pgdat->node_id, zone->name); - if (is_zone_first_populated(pgdat, zone)) { - seq_printf(m, "\n per-node stats"); - for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) { - unsigned long pages = node_page_state_pages(pgdat, i); - - if (vmstat_item_print_in_thp(i)) - pages /= HPAGE_PMD_NR; - seq_printf(m, "\n %-12s %lu", node_stat_name(i), - pages); - } - } seq_printf(m, "\n pages free %lu" "\n boost %lu" @@ -1773,7 +1748,18 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat, */ static int zoneinfo_show(struct seq_file *m, void *arg) { + int i; pg_data_t *pgdat = (pg_data_t *)arg; + + seq_printf(m, "Node %d, per-node stats", pgdat->node_id); + for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) { + unsigned long pages = node_page_state_pages(pgdat, i); + + if (vmstat_item_print_in_thp(i)) + pages /= HPAGE_PMD_NR; + seq_printf(m, "\n %-12s %lu", node_stat_name(i), pages); + } + seq_putc(m, '\n'); walk_zones_in_node(m, pgdat, false, false, zoneinfo_show_print); return 0; }
The per-node stats in /proc/zoneinfo output are incorrectly inserted between the first populated zone title and its specific stats. This creates confusion while reading or parsing its output. Output of /proc/zoneinfo Node 0, zone DMA per-node stats nr_inactive_anon 2096 ... pages free 2814 boost 0 ... nr_free_pages 2814 ... Node 0, zone DMA32 pages free 357384 boost 0 ... ... Node 1, zone Normal per-node stats nr_inactive_anon 87606 ... pages free 51836334 boost 0 ... Node 1, zone Movable pages free 0 boost 0 This patch relocates the per-node stats at the beginning for each node, followed by the individual zone statistics. This fix results in a clearer and more consistent output format. /proc/zoneinfo output with proposed fix: Node 0, per-node stats nr_inactive_anon 2783 nr_active_anon 36 ... Node 0, zone DMA pages free 3584 boost 0 min 46 ... ... Remove is_zone_first_populated() as it is not used anywhere else. Fixes: e2ecc8a79ed4 ("mm, vmstat: print node-based stats in zoneinfo file") Signed-off-by: Shivank Garg <shivankg@amd.com> --- mm/vmstat.c | 36 +++++++++++------------------------- 1 file changed, 11 insertions(+), 25 deletions(-)