Message ID | alpine.LSU.2.11.2103012158540.7549@eggly.anvils (mailing list archive) |
---|---|
State | New, archived |
On Mon, Mar 01, 2021 at 10:03:26PM -0800, Hugh Dickins wrote:
> vmstat_refresh() can occasionally catch nr_zone_write_pending and
> nr_writeback when they are transiently negative. The reason is partly
> that the interrupt which decrements them in test_clear_page_writeback()
> can come in before __test_set_page_writeback() got to increment them;
> but transient negatives are still seen even when that is prevented, and
> I am not yet certain why (but see Roman's note below). Those stats are
> not buggy, they have never been seen to drift away from 0 permanently:
> so just avoid the annoyance of showing a warning on them.
>
> Similarly avoid showing a warning on nr_free_cma: CMA users have seen
> that one reported negative from /proc/sys/vm/stat_refresh too, but it
> does drift away permanently: I believe that's because its incrementation
> and decrementation are decided by page migratetype, but the migratetype
> of a pageblock is not guaranteed to be constant.
>
> Roman Gushchin points out:
> For performance reasons, vmstat counters are incremented and decremented
> using per-cpu batches. vmstat_refresh() flushes the per-cpu batches on
> all CPUs, to get values as accurate as possible; but this method is not
> atomic, so the resulting value is not always precise. As a consequence,
> for those counters whose actual value is close to 0, a small negative
> value may occasionally be reported. If the value is small and the state
> is transient, it is not an indication of an error.
>
> Link: https://lore.kernel.org/linux-mm/20200714173747.3315771-1-guro@fb.com/
> Reported-by: Roman Gushchin <guro@fb.com>
> Signed-off-by: Hugh Dickins <hughd@google.com>
> ---

Oh, sorry, it looks like I missed to ack it.

Thank you for updating the commit log!

Acked-by: Roman Gushchin <guro@fb.com>
--- vmstat2/mm/vmstat.c	2021-02-25 11:56:18.000000000 -0800
+++ vmstat3/mm/vmstat.c	2021-02-25 12:42:15.000000000 -0800
@@ -1840,6 +1840,14 @@ int vmstat_refresh(struct ctl_table *tab
 	if (err)
 		return err;
 	for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++) {
+		/*
+		 * Skip checking stats known to go negative occasionally.
+		 */
+		switch (i) {
+		case NR_ZONE_WRITE_PENDING:
+		case NR_FREE_CMA_PAGES:
+			continue;
+		}
 		val = atomic_long_read(&vm_zone_stat[i]);
 		if (val < 0) {
 			pr_warn("%s: %s %ld\n",
@@ -1856,6 +1864,13 @@ int vmstat_refresh(struct ctl_table *tab
 	}
 #endif
 	for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
+		/*
+		 * Skip checking stats known to go negative occasionally.
+		 */
+		switch (i) {
+		case NR_WRITEBACK:
+			continue;
+		}
 		val = atomic_long_read(&vm_node_stat[i]);
 		if (val < 0) {
 			pr_warn("%s: %s %ld\n",
vmstat_refresh() can occasionally catch nr_zone_write_pending and
nr_writeback when they are transiently negative. The reason is partly
that the interrupt which decrements them in test_clear_page_writeback()
can come in before __test_set_page_writeback() got to increment them;
but transient negatives are still seen even when that is prevented, and
I am not yet certain why (but see Roman's note below). Those stats are
not buggy, they have never been seen to drift away from 0 permanently:
so just avoid the annoyance of showing a warning on them.

Similarly avoid showing a warning on nr_free_cma: CMA users have seen
that one reported negative from /proc/sys/vm/stat_refresh too, but it
does drift away permanently: I believe that's because its incrementation
and decrementation are decided by page migratetype, but the migratetype
of a pageblock is not guaranteed to be constant.

Roman Gushchin points out:
For performance reasons, vmstat counters are incremented and decremented
using per-cpu batches. vmstat_refresh() flushes the per-cpu batches on
all CPUs, to get values as accurate as possible; but this method is not
atomic, so the resulting value is not always precise. As a consequence,
for those counters whose actual value is close to 0, a small negative
value may occasionally be reported. If the value is small and the state
is transient, it is not an indication of an error.

Link: https://lore.kernel.org/linux-mm/20200714173747.3315771-1-guro@fb.com/
Reported-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
---
 mm/vmstat.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)