
mm/page-writeback: Raise wb_thresh to prevent write blocking with strictlimit

Message ID 20241023100032.62952-1-jimzhao.ai@gmail.com (mailing list archive)
State New
Series mm/page-writeback: Raise wb_thresh to prevent write blocking with strictlimit

Commit Message

Jim Zhao Oct. 23, 2024, 10 a.m. UTC
With the strictlimit flag, wb_thresh acts as a hard limit in
balance_dirty_pages() and wb_position_ratio(). When a device's write
operations have been inactive, wb_thresh can drop to 0, causing writes
to be blocked. The issue occurs occasionally on FUSE filesystems,
particularly those with network backends, where the write thread is
blocked frequently for a period of time. To address this, the patch
raises the minimum wb_thresh to a controllable level, similar to the
non-strictlimit case.

Signed-off-by: Jim Zhao <jimzhao.ai@gmail.com>
---
 mm/page-writeback.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)
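
For context, a minimal userspace sketch (not kernel code) of the arithmetic
that leads to the problem: the per-writeback threshold is scaled by the
device's share of recently completed writeback, so a device that has been
idle computes a share of ~0 and, with BDI_CAP_STRICTLIMIT, its writers are
throttled almost immediately. All names and numbers below are assumptions
for illustration only; the real computation lives in __wb_calc_thresh().

#include <stdio.h>
#include <stdint.h>

/*
 * Toy stand-in for __wb_calc_thresh(): the global threshold scaled by
 * the wb's fraction of recent writeback completions
 * (numerator/denominator).
 */
static uint64_t toy_wb_calc_thresh(uint64_t thresh,
				   uint64_t numerator, uint64_t denominator)
{
	return thresh * numerator / denominator;
}

int main(void)
{
	uint64_t thresh = 100000;	/* global dirty threshold, in pages */

	/* Active device: e.g. 40% of recent completions. */
	printf("active wb_thresh: %llu pages\n",
	       (unsigned long long)toy_wb_calc_thresh(thresh, 400, 1000));

	/* Idle device: ~0% of recent completions. */
	uint64_t idle = toy_wb_calc_thresh(thresh, 0, 1000);
	printf("idle   wb_thresh: %llu pages\n", (unsigned long long)idle);

	/*
	 * Under strictlimit, balance_dirty_pages() treats wb_thresh as a
	 * hard limit, so the first pages dirtied against the idle device
	 * are already over the limit and the writing task blocks.
	 */
	return 0;
}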

Comments

Andrew Morton Oct. 23, 2024, 11:24 p.m. UTC | #1
On Wed, 23 Oct 2024 18:00:32 +0800 Jim Zhao <jimzhao.ai@gmail.com> wrote:

> With the strictlimit flag, wb_thresh acts as a hard limit in
> balance_dirty_pages() and wb_position_ratio(). When a device's write
> operations have been inactive, wb_thresh can drop to 0, causing writes
> to be blocked. The issue occurs occasionally on FUSE filesystems,
> particularly those with network backends, where the write thread is
> blocked frequently for a period of time. To address this, the patch
> raises the minimum wb_thresh to a controllable level, similar to the
> non-strictlimit case.

Please tell us more about the userspace-visible effects of this.  It
*sounds* like a serious (but occasional) problem, but that is unclear.

And, very much relatedly, do you feel this fix is needed in earlier
(-stable) kernels?

Patch

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 72a5d8836425..f21d856c408b 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -917,7 +917,9 @@  static unsigned long __wb_calc_thresh(struct dirty_throttle_control *dtc,
 				      unsigned long thresh)
 {
 	struct wb_domain *dom = dtc_dom(dtc);
+	struct bdi_writeback *wb = dtc->wb;
 	u64 wb_thresh;
+	u64 wb_max_thresh;
 	unsigned long numerator, denominator;
 	unsigned long wb_min_ratio, wb_max_ratio;
 
@@ -931,11 +933,28 @@  static unsigned long __wb_calc_thresh(struct dirty_throttle_control *dtc,
 	wb_thresh *= numerator;
 	wb_thresh = div64_ul(wb_thresh, denominator);
 
-	wb_min_max_ratio(dtc->wb, &wb_min_ratio, &wb_max_ratio);
+	wb_min_max_ratio(wb, &wb_min_ratio, &wb_max_ratio);
 
 	wb_thresh += (thresh * wb_min_ratio) / (100 * BDI_RATIO_SCALE);
-	if (wb_thresh > (thresh * wb_max_ratio) / (100 * BDI_RATIO_SCALE))
-		wb_thresh = thresh * wb_max_ratio / (100 * BDI_RATIO_SCALE);
+	wb_max_thresh = thresh * wb_max_ratio / (100 * BDI_RATIO_SCALE);
+	if (wb_thresh > wb_max_thresh)
+		wb_thresh = wb_max_thresh;
+
+	/*
+	 * With strictlimit flag, the wb_thresh is treated as
+	 * a hard limit in balance_dirty_pages() and wb_position_ratio().
+	 * It's possible that wb_thresh is close to zero, not because
+	 * the device is slow, but because it has been inactive.
+	 * To prevent occasional writes from being blocked, we raise wb_thresh.
+	 */
+	if (unlikely(wb->bdi->capabilities & BDI_CAP_STRICTLIMIT)) {
+		unsigned long limit = hard_dirty_limit(dom, dtc->thresh);
+		u64 wb_scale_thresh = 0;
+
+		if (limit > dtc->dirty)
+			wb_scale_thresh = (limit - dtc->dirty) / 100;
+		wb_thresh = max(wb_thresh, min(wb_scale_thresh, wb_max_thresh / 4));
+	}
 
 	return wb_thresh;
 }
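
To make the effect of the new floor concrete, a worked example with assumed
numbers (illustrative values, not taken from the patch). Take thresh = 100000
pages, a wb max ratio of 100% (so wb_max_thresh = 100000 pages),
limit = 110000 pages and dtc->dirty = 2000 pages, for an idle strictlimit wb
whose proportional wb_thresh came out as 0:

	wb_max_thresh   = 100000
	wb_scale_thresh = (110000 - 2000) / 100  = 1080
	floor           = min(1080, 100000 / 4)  = 1080
	wb_thresh       = max(0, 1080)           = 1080

So the wb gets roughly 1% of the remaining dirtyable headroom, capped at a
quarter of its maximum share, before balance_dirty_pages() starts throttling,
instead of blocking on the very first dirtied pages.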