Patchwork [V2,1/5] dm-mpath: don't call blk_mq_delay_run_hw_queue() in case of BLK_STS_RESOURCE

Submitter Ming Lei
Date Nov. 27, 2017, 5:07 a.m.
Message ID <20171127050721.5884-2-ming.lei@redhat.com>
Permalink /patch/10075765/
State Superseded, archived
Delegated to: Mike Snitzer

Comments

Ming Lei - Nov. 27, 2017, 5:07 a.m.
If .queue_rq() returns BLK_STS_RESOURCE, blk-mq will rerun the queue in
the following three situations:

1) if BLK_MQ_S_SCHED_RESTART is set
- queue is rerun after one rq is completed, see blk_mq_sched_restart()
which is run from blk_mq_free_request()

2) if the queue runs out of driver tags
- queue is rerun after one tag is freed

3) otherwise
- queue is run immediately in blk_mq_dispatch_rq_list()

The arbitrary delay before rerunning the hw queue was introduced by commit
6077c2d706097c0 (dm rq: Avoid that request processing stalls sporadically),
which claimed to fix a request processing stall but never explained the
idea behind the fix; it is a workaround at most. Nobody has explained the
underlying problem in the recent discussion either.

Also, calling blk_mq_delay_run_hw_queue() inside .queue_rq() is a horrible
hack because it prevents BLK_MQ_S_SCHED_RESTART from working and degrades
I/O performance a lot.

Finally, this patch makes sure that dm-rq returns BLK_STS_RESOURCE to blk-mq
only when the underlying queue is out of resources, so
multipath_clone_and_map() now returns DM_MAPIO_DELAY_REQUEUE if either
MPATHF_QUEUE_IO or MPATHF_PG_INIT_REQUIRED is set.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/md/dm-mpath.c | 4 +---
 drivers/md/dm-rq.c    | 1 -
 2 files changed, 1 insertion(+), 4 deletions(-)
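
The three rerun situations listed in the description can be modeled as a
small decision function. This is a standalone userspace sketch, not kernel
code: the enum, the function name, and both parameters are hypothetical, and
the order in which the conditions are checked is an assumption taken from the
order of the list, not from blk_mq_dispatch_rq_list() itself.

```c
#include <stdbool.h>

/* Hypothetical labels for what eventually triggers the next queue run. */
enum rerun_trigger {
	RERUN_ON_RQ_COMPLETION,	/* case 1: BLK_MQ_S_SCHED_RESTART is set */
	RERUN_ON_TAG_FREE,	/* case 2: out of driver tags */
	RERUN_IMMEDIATELY,	/* case 3: otherwise */
};

/*
 * Models which event reruns the hw queue after .queue_rq() has returned
 * BLK_STS_RESOURCE. Checking the restart flag first is an assumption.
 */
static enum rerun_trigger rerun_after_resource(bool sched_restart_set,
					       bool out_of_driver_tag)
{
	if (sched_restart_set)
		return RERUN_ON_RQ_COMPLETION;
	if (out_of_driver_tag)
		return RERUN_ON_TAG_FREE;
	return RERUN_IMMEDIATELY;
}
```

In this model none of the three cases needs a timer, which is the point the
description makes about the 100 ms blk_mq_delay_run_hw_queue() call.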
Bart Van Assche - Nov. 27, 2017, 5:14 p.m.
On Mon, 2017-11-27 at 13:07 +0800, Ming Lei wrote:
> If .queue_rq() returns BLK_STS_RESOURCE, blk-mq will rerun the queue in
> the following three situations:
> 
> 1) if BLK_MQ_S_SCHED_RESTART is set
> - queue is rerun after one rq is completed, see blk_mq_sched_restart()
> which is run from blk_mq_free_request()
> 
> 2) if the queue runs out of driver tags
> - queue is rerun after one tag is freed
> 
> 3) otherwise
> - queue is run immediately in blk_mq_dispatch_rq_list()
> 
> The arbitrary delay before rerunning the hw queue was introduced by commit
> 6077c2d706097c0 (dm rq: Avoid that request processing stalls sporadically),
> which claimed to fix a request processing stall but never explained the
> idea behind the fix; it is a workaround at most. Nobody has explained the
> underlying problem in the recent discussion either.
> 
> Also, calling blk_mq_delay_run_hw_queue() inside .queue_rq() is a horrible
> hack because it prevents BLK_MQ_S_SCHED_RESTART from working and degrades
> I/O performance a lot.
> 
> Finally, this patch makes sure that dm-rq returns BLK_STS_RESOURCE to blk-mq
> only when the underlying queue is out of resources, so
> multipath_clone_and_map() now returns DM_MAPIO_DELAY_REQUEUE if either
> MPATHF_QUEUE_IO or MPATHF_PG_INIT_REQUIRED is set.

Sorry but in my opinion the above description shows that you don't understand
the dm-mpath driver completely.

> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> index c8faa2b85842..8fe3f45407ce 100644
> --- a/drivers/md/dm-mpath.c
> +++ b/drivers/md/dm-mpath.c
> @@ -484,9 +484,7 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
>  		return DM_MAPIO_KILL;
>  	} else if (test_bit(MPATHF_QUEUE_IO, &m->flags) ||
>  		   test_bit(MPATHF_PG_INIT_REQUIRED, &m->flags)) {
> -		if (pg_init_all_paths(m))
> -			return DM_MAPIO_DELAY_REQUEUE;
> -		return DM_MAPIO_REQUEUE;
> +		return DM_MAPIO_DELAY_REQUEUE;
>  	}

This patch removes a pg_init_all_paths() call but you don't explain why you
think it is allowed to remove that call. Did you perhaps remove that call by
mistake?

Bart.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Ming Lei - Dec. 1, 2017, 2:01 a.m.
On Mon, Nov 27, 2017 at 05:14:46PM +0000, Bart Van Assche wrote:
> On Mon, 2017-11-27 at 13:07 +0800, Ming Lei wrote:
> > If .queue_rq() returns BLK_STS_RESOURCE, blk-mq will rerun the queue in
> > the following three situations:
> > 
> > 1) if BLK_MQ_S_SCHED_RESTART is set
> > - queue is rerun after one rq is completed, see blk_mq_sched_restart()
> > which is run from blk_mq_free_request()
> > 
> > 2) if the queue runs out of driver tags
> > - queue is rerun after one tag is freed
> > 
> > 3) otherwise
> > - queue is run immediately in blk_mq_dispatch_rq_list()
> > 
> > The arbitrary delay before rerunning the hw queue was introduced by commit
> > 6077c2d706097c0 (dm rq: Avoid that request processing stalls sporadically),
> > which claimed to fix a request processing stall but never explained the
> > idea behind the fix; it is a workaround at most. Nobody has explained the
> > underlying problem in the recent discussion either.
> > 
> > Also, calling blk_mq_delay_run_hw_queue() inside .queue_rq() is a horrible
> > hack because it prevents BLK_MQ_S_SCHED_RESTART from working and degrades
> > I/O performance a lot.
> > 
> > Finally, this patch makes sure that dm-rq returns BLK_STS_RESOURCE to blk-mq
> > only when the underlying queue is out of resources, so
> > multipath_clone_and_map() now returns DM_MAPIO_DELAY_REQUEUE if either
> > MPATHF_QUEUE_IO or MPATHF_PG_INIT_REQUIRED is set.
> 
> Sorry but in my opinion the above description shows that you don't understand
> the dm-mpath driver completely.

I have to treat your comment above as a no-op since you provided no explanation.

Also, I don't think it is wrong to handle MPATHF_QUEUE_IO/MPATHF_PG_INIT_REQUIRED
via DM_MAPIO_DELAY_REQUEUE, since both conditions occur rarely and the delay
won't cause a performance issue.

The idea behind this change is that this patchset returns BLK_STS_RESOURCE
to blk-mq only when we run out of resources, and the two conditions above
(MPATHF_QUEUE_IO and MPATHF_PG_INIT_REQUIRED) don't count as 'running out of
resources'.
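
The split described above can be sketched as a standalone userspace model.
Nothing here is kernel code: the enum values, the struct, and classify() are
hypothetical stand-ins for the DM_MAPIO_* and BLK_STS_* outcomes, and the
underlying_queue_busy field is an assumed simplification of "the underlying
queue is out of resources".

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the outcomes discussed in this thread. */
enum outcome {
	OUT_DISPATCHED,		/* request was mapped and sent down */
	OUT_DELAY_REQUEUE,	/* models DM_MAPIO_DELAY_REQUEUE */
	OUT_STS_RESOURCE,	/* models BLK_STS_RESOURCE returned to blk-mq */
};

struct path_state {
	bool queue_io;			/* models MPATHF_QUEUE_IO */
	bool pg_init_required;		/* models MPATHF_PG_INIT_REQUIRED */
	bool underlying_queue_busy;	/* assumed: underlying queue has no resources */
};

/*
 * Path-state conditions are not resource shortages, so they map to a
 * delayed requeue; only a busy underlying queue maps to the resource status.
 */
static enum outcome classify(const struct path_state *s)
{
	if (s->queue_io || s->pg_init_required)
		return OUT_DELAY_REQUEUE;
	if (s->underlying_queue_busy)
		return OUT_STS_RESOURCE;
	return OUT_DISPATCHED;
}
```

The model leaves out the pg_init_all_paths() call discussed below, since it
only classifies return values and does not start path-group initialization.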

> 
> > diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> > index c8faa2b85842..8fe3f45407ce 100644
> > --- a/drivers/md/dm-mpath.c
> > +++ b/drivers/md/dm-mpath.c
> > @@ -484,9 +484,7 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
> >  		return DM_MAPIO_KILL;
> >  	} else if (test_bit(MPATHF_QUEUE_IO, &m->flags) ||
> >  		   test_bit(MPATHF_PG_INIT_REQUIRED, &m->flags)) {
> > -		if (pg_init_all_paths(m))
> > -			return DM_MAPIO_DELAY_REQUEUE;
> > -		return DM_MAPIO_REQUEUE;
> > +		return DM_MAPIO_DELAY_REQUEUE;
> >  	}
> 
> This patch removes a pg_init_all_paths() call but you don't explain why you
> think it is allowed to remove that call. Did you perhaps remove that call by
> mistake?

OK, that is a problem, will fix it in V2.

Patch

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index c8faa2b85842..8fe3f45407ce 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -484,9 +484,7 @@  static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
 		return DM_MAPIO_KILL;
 	} else if (test_bit(MPATHF_QUEUE_IO, &m->flags) ||
 		   test_bit(MPATHF_PG_INIT_REQUIRED, &m->flags)) {
-		if (pg_init_all_paths(m))
-			return DM_MAPIO_DELAY_REQUEUE;
-		return DM_MAPIO_REQUEUE;
+		return DM_MAPIO_DELAY_REQUEUE;
 	}
 
 	memset(mpio, 0, sizeof(*mpio));
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 9d32f25489c2..cbe8a06ef8b0 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -758,7 +758,6 @@  static blk_status_t dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
 		/* Undo dm_start_request() before requeuing */
 		rq_end_stats(md, rq);
 		rq_completed(md, rq_data_dir(rq), false);
-		blk_mq_delay_run_hw_queue(hctx, 100/*ms*/);
 		return BLK_STS_RESOURCE;
 	}