[RFC,v2,0/6] mm: working set reporting

Message ID 20230621180454.973862-1-yuanchu@google.com (mailing list archive)

Message

Yuanchu Xie June 21, 2023, 6:04 p.m. UTC
RFC v1: https://lore.kernel.org/linux-mm/20230509185419.1088297-1-yuanchu@google.com/
For background and interfaces, see the RFC v1 posting.

Changes from v1 -> v2:
- Refactored the patches into smaller pieces
- Renamed interfaces and functions from wss to wsr (Working Set Reporting)
- Fixed build errors when CONFIG_WSR is not set
- Changed working_set_num_bins to u8 for virtio-balloon
- Added support for per-NUMA node reporting for virtio-balloon

The RFC adds CONFIG_WSR and requires MGLRU to function. T.J. and I aim to support
the active/inactive LRU and working set estimation from userspace as well.
This series should be built with the following configs:
CONFIG_LRU_GEN=y
CONFIG_LRU_GEN_ENABLED=y
CONFIG_VIRTIO_BALLOON=y
CONFIG_WSR=y

TODO list:
- There's a hack in mm/vmscan.c that calls into the virtio-balloon driver,
  which doesn't work if CONFIG_VIRTIO_BALLOON=m. T.J. Alumbaugh (talumbau@google.com)
  and I plan on solving this problem with a working set notification mechanism
  that would allow multiple consumers to subscribe to working set changes.
- memory.reaccess.histogram does not consider swapped-out pages to be reaccessed.
  I plan to implement this with the shadow entry computed from mm/workingset.c.
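
To make the first TODO item concrete: a direct call from built-in code into
the balloon driver cannot resolve when the driver is a module, whereas a
registration interface lets each consumer attach itself at load time. The
following is only a userspace sketch of that subscribe/notify shape, not the
mechanism planned for this series. Every name in it (wsr_report,
wsr_subscribe, wsr_notify_all, wsr_example_consumer) is hypothetical, and it
omits the locking and unregistration that kernel code would need.

```c
#include <stddef.h>

/* All names below are hypothetical; none come from the patch series. */
#define WSR_MAX_SUBSCRIBERS 8

/* Illustrative payload only; the real report format is not specified here. */
struct wsr_report {
	int nid;                /* NUMA node the report describes */
	unsigned long nr_pages; /* placeholder for histogram data */
};

typedef void (*wsr_notify_fn)(const struct wsr_report *report, void *data);

struct wsr_subscriber {
	wsr_notify_fn fn;
	void *data;
};

static struct wsr_subscriber wsr_subscribers[WSR_MAX_SUBSCRIBERS];
static size_t wsr_nr_subscribers;

/* Register a consumer. Returns 0 on success, -1 if fn is NULL or the table is full. */
static int wsr_subscribe(wsr_notify_fn fn, void *data)
{
	if (!fn || wsr_nr_subscribers == WSR_MAX_SUBSCRIBERS)
		return -1;
	wsr_subscribers[wsr_nr_subscribers].fn = fn;
	wsr_subscribers[wsr_nr_subscribers].data = data;
	wsr_nr_subscribers++;
	return 0;
}

/* Deliver one report to every registered consumer; returns how many were called. */
static size_t wsr_notify_all(const struct wsr_report *report)
{
	size_t i;

	for (i = 0; i < wsr_nr_subscribers; i++)
		wsr_subscribers[i].fn(report, wsr_subscribers[i].data);
	return wsr_nr_subscribers;
}

/* Example consumer: adds the reported page count to a caller-owned counter. */
static void wsr_example_consumer(const struct wsr_report *report, void *data)
{
	unsigned long *total = data;

	*total += report->nr_pages;
}
```

In this shape the producer only ever calls wsr_notify_all(), so it needs no
link-time reference to any particular consumer.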

QEMU device implementation:
https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg06617.html

virtio-dev spec proposal v1 (v2 to be posted by T.J.):
https://lore.kernel.org/virtio-dev/CABmGT5Hv6Jd_F9EoQqVMDo4w5=7wJYmS4wwYDqXK3wov44Tf=w@mail.gmail.com/

LSF/MM discussion slides:
https://lore.kernel.org/linux-mm/CABmGT5HK9xHz=E4q4sECCD8XodP9DUcH0dMeQ8kznUQB5HTQhQ@mail.gmail.com/

T.J. Alumbaugh (1):
  virtio-balloon: Add Working Set reporting

Yuanchu Xie (5):
  mm: aggregate working set information into histograms
  mm: add working set refresh threshold to rate-limit aggregation
  mm: report working set when under memory pressure
  mm: extend working set reporting to memcgs
  mm: add per-memcg reaccess histogram

 drivers/base/node.c                 |   3 +
 drivers/virtio/virtio_balloon.c     | 288 +++++++++++++++++
 include/linux/balloon_compaction.h  |   3 +
 include/linux/memcontrol.h          |   6 +
 include/linux/mmzone.h              |   5 +
 include/linux/wsr.h                 | 114 +++++++
 include/uapi/linux/virtio_balloon.h |  33 ++
 mm/Kconfig                          |   7 +
 mm/Makefile                         |   1 +
 mm/internal.h                       |  12 +
 mm/memcontrol.c                     | 351 ++++++++++++++++++++-
 mm/mmzone.c                         |   3 +
 mm/vmscan.c                         | 194 +++++++++++-
 mm/wsr.c                            | 464 ++++++++++++++++++++++++++++
 14 files changed, 1480 insertions(+), 4 deletions(-)
 create mode 100644 include/linux/wsr.h
 create mode 100644 mm/wsr.c

Comments

Yu Zhao June 21, 2023, 6:48 p.m. UTC | #1
On Wed, Jun 21, 2023 at 12:16 PM Yuanchu Xie <yuanchu@google.com> wrote:
>
> RFC v1: https://lore.kernel.org/linux-mm/20230509185419.1088297-1-yuanchu@google.com/
> For background and interfaces, see the RFC v1 posting.

v1 only mentioned one use case (ballooning), but we both know there
are at least two solid use cases (the other being job
scheduling/binpacking, e.g., for Kubernetes [1]).

Please do a survey, as thoroughly as possible, of use cases.
* What's the significance of WSR to the landscape, in terms of server
and client use cases?
* How would userspace tools, e.g., a PMU-based memory profiler,
leverage the infra provided by WSR?
* Would those who register slab shrinkers, e.g., DMA buffs [2], want
to report their working sets?
* Does this effort intersect with memory placement with NUMA and CXL.mem?

[1] https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
[2] https://lore.kernel.org/linux-mm/20230123191728.2928839-1-tjmercier@google.com/