
[v4.13-rc,BUG] system lockup when running big buffered write(4M) to IB SRP via mpath

Message ID 1502233843.2686.4.camel@wdc.com (mailing list archive)
State New, archived

Commit Message

Bart Van Assche Aug. 8, 2017, 11:10 p.m. UTC
On Tue, 2017-08-08 at 22:17 +0800, Ming Lei wrote:
> Laurence and I see a system lockup issue when running concurrent
> big buffered write(4M bytes) to IB SRP on v4.13-rc3.
> [ ... ] 
> 	#cat hammer_write.sh
> 	#!/bin/bash
> 	while true; do
> 		dd if=/dev/zero of=/dev/mapper/$1 bs=4096k count=800
> 	done

Hello Laurence,

Is your goal perhaps to simulate a DDN workload? In that case I think you
need to add oflag=direct to the dd argument list so that the page cache
writeback code does not alter the size of the write requests. Anyway, this
test should not trigger a lockup. Can you check whether the patch below
makes the soft lockup complaints disappear (without changing the
hammer_write.sh test script)?

Thanks,

Bart.
----------------------------------------------------------------------------
[PATCH] block: Make blk_mq_delay_kick_requeue_list() rerun the queue at a
quiet time

Drivers like dm-mpath requeue requests if no paths are available and
if configured to do so. If the queue depth is sufficiently high and
the queue rerunning delay is sufficiently short, .requeue_work can
be queued so often that other work items queued on the same work queue
do not get executed. Prevent this by rerunning the queue only after no
blk_mq_delay_kick_requeue_list() calls have occurred for @msecs
milliseconds. Since the device mapper core is the only user of
blk_mq_delay_kick_requeue_list(), modify the implementation of this
function instead of creating a new function.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
---
 block/blk-mq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Patch

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 041f7b7fa0d6..8bfea36e92f9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -679,8 +679,8 @@  EXPORT_SYMBOL(blk_mq_kick_requeue_list);
 void blk_mq_delay_kick_requeue_list(struct request_queue *q,
 				    unsigned long msecs)
 {
-	kblockd_schedule_delayed_work(&q->requeue_work,
-				      msecs_to_jiffies(msecs));
+	kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND, &q->requeue_work,
+				    msecs_to_jiffies(msecs));
 }
 EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
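
To illustrate the "quiet time" behavior described in the commit message, here
is a minimal user-space sketch. It is not kernel code: struct sim_dwork and
all sim_*() helpers are hypothetical names invented for this example, and time
is modeled as integer ticks. The sketch assumes the usual workqueue semantics,
namely that queueing a delayed work item that is already pending leaves its
timer untouched, while modifying it pushes the expiry time out again.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a delayed work item; not the kernel struct. */
struct sim_dwork {
	bool pending;	/* a timer is armed for this item */
	long expires;	/* tick at which the item runs */
};

/* Old behavior (assumed): arm the timer only if the item is idle. */
static bool sim_schedule_delayed(struct sim_dwork *w, long now, long delay)
{
	if (w->pending)
		return false;	/* already pending, timer left unchanged */
	w->pending = true;
	w->expires = now + delay;
	return true;
}

/* New behavior (assumed): always restart the timer from "now". */
static bool sim_mod_delayed(struct sim_dwork *w, long now, long delay)
{
	bool was_pending = w->pending;

	w->pending = true;
	w->expires = now + delay;
	return was_pending;
}

typedef bool (*sim_kick_fn)(struct sim_dwork *w, long now, long delay);

/*
 * Kick the item once per tick for the first kick_ticks ticks and count
 * how often it runs within total_ticks ticks.
 */
static int sim_count_runs(sim_kick_fn kick, long kick_ticks, long delay,
			  long total_ticks)
{
	struct sim_dwork w = { false, 0 };
	int runs = 0;

	assert(delay > 0);
	for (long t = 0; t < total_ticks; t++) {
		if (w.pending && w.expires <= t) {
			w.pending = false;	/* the work item executes */
			runs++;
		}
		if (t < kick_ticks)
			kick(&w, t, delay);
	}
	return runs;
}
```

With 20 ticks of back-to-back kicks, a delay of 5 ticks and a 40-tick window,
sim_count_runs(sim_schedule_delayed, 20, 5, 40) returns 4, whereas
sim_count_runs(sim_mod_delayed, 20, 5, 40) returns 1: in the second case the
item runs only once, 5 ticks after the last kick.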