[v2] block: limit request dispatch loop duration


Message ID

20220318022641.133484-1-shinichiro.kawasaki@wdc.com (mailing list archive)

State

New, archived

Headers

From: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
To: linux-block@vger.kernel.org, Jens Axboe <axboe@kernel.dk>,
        Ming Lei <ming.lei@redhat.com>
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>,
        Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Subject: [PATCH v2] block: limit request dispatch loop duration
Date: Fri, 18 Mar 2022 11:26:41 +0900
Message-Id: <20220318022641.133484-1-shinichiro.kawasaki@wdc.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk

Series

[v2] block: limit request dispatch loop duration

Commit Message

Shin'ichiro Kawasaki March 18, 2022, 2:26 a.m. UTC

When IO requests are issued continuously and the target block device
handles requests faster than they arrive, the request dispatch loop
keeps dispatching the arriving requests for a very long time, more than
a minute. Since the loop runs as a workqueue worker task, the very long
loop duration triggers a workqueue watchdog timeout and BUG [1].

To avoid the very long loop duration, break the loop periodically. While
there is still an opportunity to dispatch requests, check need_resched().
If need_resched() returns true, the dispatch loop has already consumed
its time slice, so reschedule the dispatch work and break the loop. With
heavy IO load, need_resched() does not return true for 20 to 30 seconds.
To cover such cases, also check the time spent in the dispatch loop using
jiffies. If more than 1 second has been spent, reschedule the dispatch
work and break the loop.
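
The bounded loop described above can be modeled in userspace as a rough
sketch. Everything here is an assumption made for illustration and is not
kernel code: dispatch_batch() stands in for __blk_mq_do_dispatch_sched(),
should_yield() stands in for need_resched(), a monotonic clock replaces
jiffies, and the "reschedule" step is reduced to a return value that
tells the caller to requeue the work.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for __blk_mq_do_dispatch_sched(): returns 1 while
 * more requests may be dispatched, 0 once the queue is drained. */
static int dispatch_batch(long *remaining)
{
	if (*remaining <= 0)
		return 0;
	(*remaining)--;
	return 1;
}

/* Hypothetical stand-in for need_resched(). It never asks to yield,
 * which mimics the heavy-load case where only the deadline helps. */
static bool should_yield(void)
{
	return false;
}

/* Monotonic time in seconds; plays the role of jiffies. */
static double now_sec(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Dispatch until drained, asked to yield, or out of budget.
 * Returns true when the loop stopped early and must be requeued. */
static bool dispatch_bounded(long *remaining, double budget_sec)
{
	double end = now_sec() + budget_sec;

	assert(remaining != NULL);
	for (;;) {
		if (dispatch_batch(remaining) != 1)
			return false;	/* nothing left to dispatch */
		if (should_yield() || now_sec() > end)
			return true;	/* caller reschedules the work */
	}
}
```

A caller would requeue the work item whenever dispatch_bounded() returns
true, so one long run becomes a series of runs of at most budget_sec each.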

[1]

[  609.691437] BUG: workqueue lockup - pool cpus=10 node=1 flags=0x0 nice=-20 stuck for 35s!
[  609.701820] Showing busy workqueues and worker pools:
[  609.707915] workqueue events: flags=0x0
[  609.712615]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[  609.712626]     pending: drm_fb_helper_damage_work [drm_kms_helper]
[  609.712687] workqueue events_freezable: flags=0x4
[  609.732943]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[  609.732952]     pending: pci_pme_list_scan
[  609.732968] workqueue events_power_efficient: flags=0x80
[  609.751947]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[  609.751955]     pending: neigh_managed_work
[  609.752018] workqueue kblockd: flags=0x18
[  609.769480]   pwq 21: cpus=10 node=1 flags=0x0 nice=-20 active=3/256 refcnt=4
[  609.769488]     in-flight: 1020:blk_mq_run_work_fn
[  609.769498]     pending: blk_mq_timeout_work, blk_mq_run_work_fn
[  609.769744] pool 21: cpus=10 node=1 flags=0x0 nice=-20 hung=35s workers=2 idle: 67
[  639.899730] BUG: workqueue lockup - pool cpus=10 node=1 flags=0x0 nice=-20 stuck for 66s!
[  639.909513] Showing busy workqueues and worker pools:
[  639.915404] workqueue events: flags=0x0
[  639.920197]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[  639.920215]     pending: drm_fb_helper_damage_work [drm_kms_helper]
[  639.920365] workqueue kblockd: flags=0x18
[  639.939932]   pwq 21: cpus=10 node=1 flags=0x0 nice=-20 active=3/256 refcnt=4
[  639.939942]     in-flight: 1020:blk_mq_run_work_fn
[  639.939955]     pending: blk_mq_timeout_work, blk_mq_run_work_fn
[  639.940212] pool 21: cpus=10 node=1 flags=0x0 nice=-20 hung=66s workers=2 idle: 67

Fixes: 6e6fcbc27e778 ("blk-mq: support batching dispatch in case of io")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Cc: stable@vger.kernel.org # v5.10+
Link: https://lore.kernel.org/linux-block/20220310091649.zypaem5lkyfadymg@shindev/
---
Changes from v1:
* Inverted logic of jiffies and end time comparison
* Improved readability per comment on the list

 block/blk-mq-sched.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

Jens Axboe March 18, 2022, 2:40 a.m. UTC | #1

On Fri, 18 Mar 2022 11:26:41 +0900, Shin'ichiro Kawasaki wrote:
> When IO requests are issued continuously and the target block device
> handles requests faster than they arrive, the request dispatch loop
> keeps dispatching the arriving requests for a very long time, more than
> a minute. Since the loop runs as a workqueue worker task, the very long
> loop duration triggers a workqueue watchdog timeout and BUG [1].
> 
> To avoid the very long loop duration, break the loop periodically. While
> there is still an opportunity to dispatch requests, check need_resched().
> If need_resched() returns true, the dispatch loop has already consumed
> its time slice, so reschedule the dispatch work and break the loop. With
> heavy IO load, need_resched() does not return true for 20 to 30 seconds.
> To cover such cases, also check the time spent in the dispatch loop using
> jiffies. If more than 1 second has been spent, reschedule the dispatch
> work and break the loop.
> 
> [...]

Applied, thanks!

[1/1] block: limit request dispatch loop duration
      commit: 572299f03afd676dd4e20669cdaf5ed0fe1379d4

Best regards,

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 55488ba97823..80e0eb26b697 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -180,11 +180,18 @@  static int __blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
 
 static int blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
 {
+	unsigned long end = jiffies + HZ;
 	int ret;
 
 	do {
 		ret = __blk_mq_do_dispatch_sched(hctx);
-	} while (ret == 1);
+		if (ret != 1)
+			break;
+		if (need_resched() || time_is_before_jiffies(end)) {
+			blk_mq_delay_run_hw_queue(hctx, 0);
+			break;
+		}
+	} while (1);
 
 	return ret;
 }
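
The patch above checks the deadline with time_is_before_jiffies(end)
rather than a plain "jiffies > end" comparison. In the kernel that macro
expands to time_after(jiffies, end), which compares through a signed
difference so that the result stays correct when the counter wraps. The
following is a userspace model of that idea, not the kernel headers; the
names tick_after(), deadline_passed(), deadline_passed_naive() and
wrap_example() are made up for this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Model of time_after(a, b): true if a is later than b. Valid while the
 * two values are less than half the counter range apart. The cast of an
 * out-of-range unsigned value to long is implementation-defined in ISO C,
 * but wraps as expected on common two's-complement compilers. */
static bool tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Model of time_is_before_jiffies(deadline): the deadline lies before
 * "now", meaning it has already passed. */
static bool deadline_passed(unsigned long now, unsigned long deadline)
{
	return tick_after(now, deadline);
}

/* Naive comparison, shown only to illustrate how it fails on wraparound. */
static bool deadline_passed_naive(unsigned long now, unsigned long deadline)
{
	return now > deadline;
}

/* Worked example: a deadline 100 ticks ahead of a counter that is about
 * to wrap around. */
static void wrap_example(void)
{
	unsigned long now = ULONG_MAX - 10;
	unsigned long deadline = now + 100;	/* wraps around to 89 */

	assert(!deadline_passed(now, deadline));	/* correct: not yet */
	assert(deadline_passed_naive(now, deadline));	/* wrong: fires early */
	assert(deadline_passed(deadline + 1, deadline));	/* one tick late */
}
```

With the naive form, a loop started just before the counter wraps would
break on its first iteration instead of after the intended interval.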
