
[RFC,v1,00/14] mm/block: add bdi sysfs knobs

Message ID 20221011010044.851537-1-shr@devkernel.io (mailing list archive)

Message

Stefan Roesch Oct. 11, 2022, 1 a.m. UTC
At Meta, network block devices (nbd) are used to implement remote block storage. In testing
and in production we have observed that these network block devices can consume a huge
portion of the dirty writeback cache, and that writeback can take a considerable time.

To give stricter limits, I'm proposing the following changes and new sysfs knobs:

1) strictlimit knob
  Currently the max_ratio knob exists to limit the dirty memory. However, this knob
  only takes effect once (dirty_ratio + dirty_background_ratio) / 2 has been reached.
  With the BDI_CAP_STRICTLIMIT flag, max_ratio is applied without reaching that
  threshold. This change exposes that flag as a knob.

  This knob can also be useful for NFS, fuse filesystems and USB devices.
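
  The condition described in 1) can be sketched as a small model. This is an
  illustration only, not the kernel implementation: the struct and function names
  are hypothetical, all quantities are in pages, and the midpoint formula is taken
  directly from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified model, not kernel code. All names here are hypothetical. */
struct dirty_limits {
    uint64_t dirtyable_pages;        /* memory that may hold dirty pages */
    unsigned dirty_ratio;            /* percent */
    unsigned dirty_background_ratio; /* percent */
};

/* Midpoint (dirty_ratio + dirty_background_ratio) / 2, expressed in pages. */
uint64_t freerun_ceiling(const struct dirty_limits *l)
{
    assert(l->dirty_ratio <= 100 && l->dirty_background_ratio <= 100);
    return l->dirtyable_pages *
           (l->dirty_ratio + l->dirty_background_ratio) / 200;
}

/* Does the per-device max_ratio limit take effect at this dirty level? */
bool max_ratio_enforced(const struct dirty_limits *l,
                        uint64_t global_dirty_pages, bool strictlimit)
{
    if (strictlimit)
        return true; /* strict limit: enforced regardless of global state */
    return global_dirty_pages >= freerun_ceiling(l);
}
```

  In this model, with dirty_ratio = 20 and dirty_background_ratio = 10, a device
  without the strict limit is not throttled by max_ratio until 15% of the
  dirtyable memory is dirty, while a device with the strict limit is always
  subject to it.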

2) Part of 10000 internal calculation
  The max_ratio is based on percentages. With current machine sizes, a single
  percentage point can be very large (1% of 256GB of main memory is already about 2.5GB).
  This change uses parts of 10000 instead of percentages for the internal calculations.
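
  To make the granularity argument in 2) concrete, the following sketch computes
  how many bytes one step of a ratio represents for a given scale. The helper name
  and the GIB macro are hypothetical, and the memory size is treated as 256 GiB
  for the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

#define GIB (1024ULL * 1024 * 1024)

/* Bytes represented by one unit of a ratio expressed in 1/scale steps. */
uint64_t ratio_step_bytes(uint64_t total_bytes, uint64_t scale)
{
    assert(scale > 0);
    return total_bytes / scale;
}
```

  For 256 GiB, ratio_step_bytes(256 * GIB, 100) is 2748779069 bytes (about
  2.56 GiB per percentage point), while ratio_step_bytes(256 * GIB, 10000) is
  27487790 bytes (about 26 MiB per step).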

3) Introduce two new knobs: min_bytes and max_bytes.
  Currently all calculations are based on ratios, but for a user it is often more
  convenient to specify a limit in bytes. The new knobs will not store byte values;
  instead they will translate the byte value to a corresponding ratio. As the internal
  values are now parts of 10000, the ratio is closer to the specified value. However,
  the value should be seen more as an approximation, as it can fluctuate over time.
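
  The byte-to-ratio translation in 3) and its approximate nature can be sketched
  as follows. This is a minimal model under stated assumptions, not the code from
  the series: the function names and RATIO_SCALE are hypothetical, and
  "dirtyable_bytes" stands in for whatever memory total the ratio is applied to.

```c
#include <assert.h>
#include <stdint.h>

#define RATIO_SCALE 10000ULL /* "part of 10000", as in the cover letter */

/* Convert a byte limit into parts per RATIO_SCALE of the dirtyable memory.
 * The multiplication is safe for byte values below roughly 2^64 / 10000. */
uint64_t bytes_to_ratio(uint64_t bytes, uint64_t dirtyable_bytes)
{
    assert(dirtyable_bytes > 0 && bytes <= dirtyable_bytes);
    return bytes * RATIO_SCALE / dirtyable_bytes; /* rounds down */
}

/* Convert a stored ratio back into bytes for the current memory total. */
uint64_t ratio_to_bytes(uint64_t ratio, uint64_t dirtyable_bytes)
{
    assert(ratio <= RATIO_SCALE);
    return dirtyable_bytes * ratio / RATIO_SCALE;
}
```

  With 64 GiB of dirtyable memory, a requested limit of 1 GiB maps to a ratio of
  156 (156.25 rounded down), which reads back as 1072023837 bytes rather than
  1073741824. Because only the ratio is kept, the effective byte value also
  changes whenever the underlying memory total changes.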


Stefan Roesch (14):
  mm: add bdi_set_strict_limit() function
  mm: Add new knob /sys/class/bdi/<bdi>/strict_limit
  mm: document new /sys/class/bdi/<bdi>/strict_limit knob
  mm: Use part per 10000 for bdi ratios.
  mm: add bdi_get_max_bytes() function
  mm: split off __bdi_set_max_ratio() function
  mm: add bdi_set_max_bytes() function.
  mm: Add new knob /sys/class/bdi/<bdi>/max_bytes
  mm: document new /sys/class/bdi/<bdi>/max_bytes knob
  mm: add bdi_get_min_bytes() function.
  mm: split off __bdi_set_min_ratio() function
  mm: add bdi_set_min_bytes() function
  mm: add new /sys/class/bdi/<bdi>/min_bytes knob
  mm: document new /sys/class/bdi/<bdi>/min_bytes knob

 Documentation/ABI/testing/sysfs-class-bdi |  40 +++++++
 include/linux/backing-dev.h               |   8 ++
 mm/backing-dev.c                          |  93 +++++++++++++++-
 mm/page-writeback.c                       | 126 ++++++++++++++++++++--
 4 files changed, 253 insertions(+), 14 deletions(-)


base-commit: e2302539dd4f1c62d96651c07ddb05aa2461d29c

Comments

Matthew Wilcox Oct. 11, 2022, 6:20 p.m. UTC | #1
On Mon, Oct 10, 2022 at 06:00:30PM -0700, Stefan Roesch wrote:
> 2) Part of 10000 internal calculation
>   The max_ratio is based on percentage. With the current machine sizes percentage
>   values can be very high (1% of a 256GB main memory is already 2.5GB). This change
>   uses part of 10000 instead of percentages for the internal calculations.

Why 10,000?  If you need better accuracy than 1/1000, the next step
should normally be parts per million.

Stefan Roesch Oct. 13, 2022, 8:11 p.m. UTC | #2
Matthew Wilcox <willy@infradead.org> writes:

> On Mon, Oct 10, 2022 at 06:00:30PM -0700, Stefan Roesch wrote:
>> 2) Part of 10000 internal calculation
>>   The max_ratio is based on percentage. With the current machine sizes percentage
>>   values can be very high (1% of a 256GB main memory is already 2.5GB). This change
>>   uses part of 10000 instead of percentages for the internal calculations.
>
> Why 10,000?  If you need better accuracy than 1/1000, the next step
> should normally be parts per million.

For current main memory sizes 1000 is enough. I wanted to give some
additional headroom. Parts per million is too much. In the next version
of the patch, I'll change it to parts per 1000.

Christoph Hellwig Oct. 17, 2022, 7:45 a.m. UTC | #3
On Mon, Oct 10, 2022 at 06:00:30PM -0700, Stefan Roesch wrote:
> At meta network block devices (nbd) are used to implement remote block storage. In testing

FYI, this mail is pretty much misformatted and really hard to read.