diff mbox series

dm-rq: don't call blk_mq_queue_stopped in dm_stop_queue()

Message ID 20200619084214.337449-1-ming.lei@redhat.com (mailing list archive)
State New, archived
Headers show
Series dm-rq: don't call blk_mq_queue_stopped in dm_stop_queue() | expand

Commit Message

Ming Lei June 19, 2020, 8:42 a.m. UTC
dm-rq won't stop queue, meantime blk-mq won't stop one queue too, so
remove the check.

dm_stop_queue() actually tries to quiesce hw queues via blk_mq_quiesce_queue(),
we can't check via blk_queue_quiesced for avoiding unnecessary queue
quiesce because the flag is set before synchronize_rcu() and dm_stop_queue
may be called when synchronize_rcu from another blk_mq_quiesce_queue is
in-progress.

Cc: linux-block@vger.kernel.org
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/md/dm-rq.c | 3 ---
 1 file changed, 3 deletions(-)

Comments

Mike Snitzer June 19, 2020, 9:42 a.m. UTC | #1
Hi Ming,

Thanks for the patch!  But I'm having a hard time understanding what
you've written in the patch header,

On Fri, Jun 19 2020 at  4:42am -0400,
Ming Lei <ming.lei@redhat.com> wrote:

> dm-rq won't stop queue, meantime blk-mq won't stop one queue too, so
> remove the check.

It'd be helpful if you could unpack this with more detail before going on
to explain why using blk_queue_quiesced, despite dm-rq using
blk_mq_queue_stopped, would also be ineffective.

SO:

> dm-rq won't stop queue

1) why won't dm-rq stop the queue?  Do you mean it won't reliably
   _always_ stop the queue because of the blk_mq_queue_stopped() check?

> meantime blk-mq won't stop one queue too, so remove the check.

2) Meaning?: blk_mq_queue_stopped() will return true even if only one hw
queue is stopped, given blk-mq must stop all hw queues a positive return
from this blk_mq_queue_stopped() check is incorrectly assuming it meanss
all hw queues are stopped.

> dm_stop_queue() actually tries to quiesce hw queues via blk_mq_quiesce_queue(),
> we can't check via blk_queue_quiesced for avoiding unnecessary queue
> quiesce because the flag is set before synchronize_rcu() and dm_stop_queue
> may be called when synchronize_rcu from another blk_mq_quiesce_queue is
> in-progress.

But I'm left with questions/confusion on this too:

1) you mention blk_queue_quiesced instead of blk_mq_queue_stopped, so I
   assume you mean that: not only is blk_mq_queue_stopped()
   ineffective, blk_queue_quiesced() would be too?

2) the race you detail (with competing blk_mq_quiesce_queue) relative to
   synchronize_rcu() and testing "the flag" is very detailed yet vague.

Anyway, once we get this heaader cleaned up a bit more I'll be happy to
get this staged as a stable@ fix for 5.8 inclusion ASAP.

Thanks!
Mike

> 
> Cc: linux-block@vger.kernel.org
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/md/dm-rq.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> index f60c02512121..ed4d5ea66ccc 100644
> --- a/drivers/md/dm-rq.c
> +++ b/drivers/md/dm-rq.c
> @@ -70,9 +70,6 @@ void dm_start_queue(struct request_queue *q)
>  
>  void dm_stop_queue(struct request_queue *q)
>  {
> -	if (blk_mq_queue_stopped(q))
> -		return;
> -
>  	blk_mq_quiesce_queue(q);
>  }
>  
> -- 
> 2.25.2
>
Ming Lei June 19, 2020, 10:11 a.m. UTC | #2
Hi Mike,

On Fri, Jun 19, 2020 at 05:42:50AM -0400, Mike Snitzer wrote:
> Hi Ming,
> 
> Thanks for the patch!  But I'm having a hard time understanding what
> you've written in the patch header,
> 
> On Fri, Jun 19 2020 at  4:42am -0400,
> Ming Lei <ming.lei@redhat.com> wrote:
> 
> > dm-rq won't stop queue, meantime blk-mq won't stop one queue too, so
> > remove the check.
> 
> It'd be helpful if you could unpack this with more detail before going on
> to explain why using blk_queue_quiesced, despite dm-rq using
> blk_mq_queue_stopped, would also be ineffective.
> 
> SO:
> 
> > dm-rq won't stop queue
> 
> 1) why won't dm-rq stop the queue?  Do you mean it won't reliably
>    _always_ stop the queue because of the blk_mq_queue_stopped() check?

device mapper doesn't call blk_mq_stop_hw_queue or blk_mq_stop_hw_queues.

> 
> > meantime blk-mq won't stop one queue too, so remove the check.
> 
> 2) Meaning?: blk_mq_queue_stopped() will return true even if only one hw
> queue is stopped, given blk-mq must stop all hw queues a positive return
> from this blk_mq_queue_stopped() check is incorrectly assuming it meanss
> all hw queues are stopped.

blk-mq won't call blk_mq_stop_hw_queue or blk_mq_stop_hw_queues for
dm-rq's queue too, so dm-rq's hw queue won't be stopped.

BTW blk_mq_stop_hw_queue or blk_mq_stop_hw_queues are supposed to be
used for throttling queue.

> 
> > dm_stop_queue() actually tries to quiesce hw queues via blk_mq_quiesce_queue(),
> > we can't check via blk_queue_quiesced for avoiding unnecessary queue
> > quiesce because the flag is set before synchronize_rcu() and dm_stop_queue
> > may be called when synchronize_rcu from another blk_mq_quiesce_queue is
> > in-progress.
> 
> But I'm left with questions/confusion on this too:
> 
> 1) you mention blk_queue_quiesced instead of blk_mq_queue_stopped, so I
>    assume you mean that: not only is blk_mq_queue_stopped()
>    ineffective, blk_queue_quiesced() would be too?

blk_mq_queue_stopped isn't necessary because dm-rq's hw queue won't be
stopped by anyone, meantime replacing it with blk_queue_quiesced() is wrong.

> 
> 2) the race you detail (with competing blk_mq_quiesce_queue) relative to
>    synchronize_rcu() and testing "the flag" is very detailed yet vague.

If two code paths are calling dm_stop_queue() at the same time, one path may
return immediately and it is wrong, sine synchronize_rcu() from another path
may not be done.

> 
> Anyway, once we get this heaader cleaned up a bit more I'll be happy to
> get this staged as a stable@ fix for 5.8 inclusion ASAP.

This patch isn't a fix, and it shouldn't be related with rhel8's issue.

Thanks,
Ming

> 
> Thanks!
> Mike
> 
> > 
> > Cc: linux-block@vger.kernel.org
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  drivers/md/dm-rq.c | 3 ---
> >  1 file changed, 3 deletions(-)
> > 
> > diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> > index f60c02512121..ed4d5ea66ccc 100644
> > --- a/drivers/md/dm-rq.c
> > +++ b/drivers/md/dm-rq.c
> > @@ -70,9 +70,6 @@ void dm_start_queue(struct request_queue *q)
> >  
> >  void dm_stop_queue(struct request_queue *q)
> >  {
> > -	if (blk_mq_queue_stopped(q))
> > -		return;
> > -
> >  	blk_mq_quiesce_queue(q);
> >  }
> >  
> > -- 
> > 2.25.2
> >
Mike Snitzer June 19, 2020, 4:06 p.m. UTC | #3
On Fri, Jun 19 2020 at  6:11am -0400,
Ming Lei <ming.lei@redhat.com> wrote:

> Hi Mike,
> 
> On Fri, Jun 19, 2020 at 05:42:50AM -0400, Mike Snitzer wrote:
> > Hi Ming,
> > 
> > Thanks for the patch!  But I'm having a hard time understanding what
> > you've written in the patch header,
> > 
> > On Fri, Jun 19 2020 at  4:42am -0400,
> > Ming Lei <ming.lei@redhat.com> wrote:
> > 
> > > dm-rq won't stop queue, meantime blk-mq won't stop one queue too, so
> > > remove the check.
> > 
> > It'd be helpful if you could unpack this with more detail before going on
> > to explain why using blk_queue_quiesced, despite dm-rq using
> > blk_mq_queue_stopped, would also be ineffective.
> > 
> > SO:
> > 
> > > dm-rq won't stop queue
> > 
> > 1) why won't dm-rq stop the queue?  Do you mean it won't reliably
> >    _always_ stop the queue because of the blk_mq_queue_stopped() check?
> 
> device mapper doesn't call blk_mq_stop_hw_queue or blk_mq_stop_hw_queues.
> 
> > 
> > > meantime blk-mq won't stop one queue too, so remove the check.
> > 
> > 2) Meaning?: blk_mq_queue_stopped() will return true even if only one hw
> > queue is stopped, given blk-mq must stop all hw queues a positive return
> > from this blk_mq_queue_stopped() check is incorrectly assuming it meanss
> > all hw queues are stopped.
> 
> blk-mq won't call blk_mq_stop_hw_queue or blk_mq_stop_hw_queues for
> dm-rq's queue too, so dm-rq's hw queue won't be stopped.
> 
> BTW blk_mq_stop_hw_queue or blk_mq_stop_hw_queues are supposed to be
> used for throttling queue.

I'm going to look at actually stopping the queue (using one of these
interfaces).  I didn't realize I wasn't actually stopping the queue.
The intent was to do so.

In speaking with Jens yesterday about freeze vs stop: it is clear that
dm-rq needs to still be able to allocate new requests, but _not_ call
the queue_rq to issue the requests, while "stopped" (due to dm-mpath
potentially deferring retries of failed requests because of path failure
while quiescing the queue during DM device suspend).  But that freezing
the queue goes too far because it won't allow such request allocation.

> > > dm_stop_queue() actually tries to quiesce hw queues via blk_mq_quiesce_queue(),
> > > we can't check via blk_queue_quiesced for avoiding unnecessary queue
> > > quiesce because the flag is set before synchronize_rcu() and dm_stop_queue
> > > may be called when synchronize_rcu from another blk_mq_quiesce_queue is
> > > in-progress.
> > 
> > But I'm left with questions/confusion on this too:
> > 
> > 1) you mention blk_queue_quiesced instead of blk_mq_queue_stopped, so I
> >    assume you mean that: not only is blk_mq_queue_stopped()
> >    ineffective, blk_queue_quiesced() would be too?
> 
> blk_mq_queue_stopped isn't necessary because dm-rq's hw queue won't be
> stopped by anyone, meantime replacing it with blk_queue_quiesced() is wrong.
> 
> > 
> > 2) the race you detail (with competing blk_mq_quiesce_queue) relative to
> >    synchronize_rcu() and testing "the flag" is very detailed yet vague.
> 
> If two code paths are calling dm_stop_queue() at the same time, one path may
> return immediately and it is wrong, sine synchronize_rcu() from another path
> may not be done.
> 
> > 
> > Anyway, once we get this heaader cleaned up a bit more I'll be happy to
> > get this staged as a stable@ fix for 5.8 inclusion ASAP.
> 
> This patch isn't a fix, and it shouldn't be related with rhel8's issue.

I realize that now.  I've changed the patch header to be a bit clearer
and staged it for 5.9, see:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.9&id=06e788ed59e0095b679bdce9e39c1a251032ae62

Thanks,
Mike
Mike Snitzer June 19, 2020, 5:40 p.m. UTC | #4
On Fri, Jun 19 2020 at 12:06pm -0400,
Mike Snitzer <snitzer@redhat.com> wrote:

> On Fri, Jun 19 2020 at  6:11am -0400,
> Ming Lei <ming.lei@redhat.com> wrote:
> 
> > Hi Mike,
> > 
> > On Fri, Jun 19, 2020 at 05:42:50AM -0400, Mike Snitzer wrote:
> > > Hi Ming,
> > > 
> > > Thanks for the patch!  But I'm having a hard time understanding what
> > > you've written in the patch header,
> > > 
> > > On Fri, Jun 19 2020 at  4:42am -0400,
> > > Ming Lei <ming.lei@redhat.com> wrote:
> > > 
> > > > dm-rq won't stop queue, meantime blk-mq won't stop one queue too, so
> > > > remove the check.
> > > 
> > > It'd be helpful if you could unpack this with more detail before going on
> > > to explain why using blk_queue_quiesced, despite dm-rq using
> > > blk_mq_queue_stopped, would also be ineffective.
> > > 
> > > SO:
> > > 
> > > > dm-rq won't stop queue
> > > 
> > > 1) why won't dm-rq stop the queue?  Do you mean it won't reliably
> > >    _always_ stop the queue because of the blk_mq_queue_stopped() check?
> > 
> > device mapper doesn't call blk_mq_stop_hw_queue or blk_mq_stop_hw_queues.
> > 
> > > 
> > > > meantime blk-mq won't stop one queue too, so remove the check.
> > > 
> > > 2) Meaning?: blk_mq_queue_stopped() will return true even if only one hw
> > > queue is stopped, given blk-mq must stop all hw queues a positive return
> > > from this blk_mq_queue_stopped() check is incorrectly assuming it meanss
> > > all hw queues are stopped.
> > 
> > blk-mq won't call blk_mq_stop_hw_queue or blk_mq_stop_hw_queues for
> > dm-rq's queue too, so dm-rq's hw queue won't be stopped.
> > 
> > BTW blk_mq_stop_hw_queue or blk_mq_stop_hw_queues are supposed to be
> > used for throttling queue.
> 
> I'm going to look at actually stopping the queue (using one of these
> interfaces).  I didn't realize I wasn't actually stopping the queue.
> The intent was to do so.
> 
> In speaking with Jens yesterday about freeze vs stop: it is clear that
> dm-rq needs to still be able to allocate new requests, but _not_ call
> the queue_rq to issue the requests, while "stopped" (due to dm-mpath
> potentially deferring retries of failed requests because of path failure
> while quiescing the queue during DM device suspend).  But that freezing
> the queue goes too far because it won't allow such request allocation.

Seems I'm damned if I do (stop) or damned if I don't (new reports of
requests completing after DM device suspend's
blk_mq_quiesce_queue()+dm_wait_for_completion()).

I'm left at something of a loss about what to do!  Bart? Jens? Ming?

Looking closer at the git history, commit 7b17c2f7292ba takes center
stage:

commit 7b17c2f7292ba1f3f98dae3f7077f9e569653276
Author: Bart Van Assche <bart.vanassche@sandisk.com>
Date:   Fri Oct 28 17:22:16 2016 -0700

    dm: Fix a race condition related to stopping and starting queues

    Ensure that all ongoing dm_mq_queue_rq() and dm_mq_requeue_request()
    calls have stopped before setting the "queue stopped" flag. This
    allows to remove the "queue stopped" test from dm_mq_queue_rq() and
    dm_mq_requeue_request(). This patch fixes a race condition because
    dm_mq_queue_rq() is called without holding the queue lock and hence
    BLK_MQ_S_STOPPED can be set at any time while dm_mq_queue_rq() is
    in progress. This patch prevents that the following hang occurs
    sporadically when using dm-mq:

    INFO: task systemd-udevd:10111 blocked for more than 480 seconds.
    Call Trace:
     [<ffffffff8161f397>] schedule+0x37/0x90
     [<ffffffff816239ef>] schedule_timeout+0x27f/0x470
     [<ffffffff8161e76f>] io_schedule_timeout+0x9f/0x110
     [<ffffffff8161fb36>] bit_wait_io+0x16/0x60
     [<ffffffff8161f929>] __wait_on_bit_lock+0x49/0xa0
     [<ffffffff8114fe69>] __lock_page+0xb9/0xc0
     [<ffffffff81165d90>] truncate_inode_pages_range+0x3e0/0x760
     [<ffffffff81166120>] truncate_inode_pages+0x10/0x20
     [<ffffffff81212a20>] kill_bdev+0x30/0x40
     [<ffffffff81213d41>] __blkdev_put+0x71/0x360
     [<ffffffff81214079>] blkdev_put+0x49/0x170
     [<ffffffff812141c0>] blkdev_close+0x20/0x30
     [<ffffffff811d48e8>] __fput+0xe8/0x1f0
     [<ffffffff811d4a29>] ____fput+0x9/0x10
     [<ffffffff810842d3>] task_work_run+0x83/0xb0
     [<ffffffff8106606e>] do_exit+0x3ee/0xc40
     [<ffffffff8106694b>] do_group_exit+0x4b/0xc0
     [<ffffffff81073d9a>] get_signal+0x2ca/0x940
     [<ffffffff8101bf43>] do_signal+0x23/0x660
     [<ffffffff810022b3>] exit_to_usermode_loop+0x73/0xb0
     [<ffffffff81002cb0>] syscall_return_slowpath+0xb0/0xc0
     [<ffffffff81624e33>] entry_SYSCALL_64_fastpath+0xa6/0xa8

    Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
    Acked-by: Mike Snitzer <snitzer@redhat.com>
    Reviewed-by: Hannes Reinecke <hare@suse.com>
    Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jens Axboe <axboe@fb.com>

diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 09c958b6f038..8b92e066bb69 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -102,7 +102,7 @@ static void dm_mq_stop_queue(struct request_queue *q)
        if (blk_mq_queue_stopped(q))
                return;

-       blk_mq_stop_hw_queues(q);
+       blk_mq_quiesce_queue(q);
 }

 void dm_stop_queue(struct request_queue *q)
@@ -880,17 +880,6 @@ static int dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
                dm_put_live_table(md, srcu_idx);
        }

-       /*
-        * On suspend dm_stop_queue() handles stopping the blk-mq
-        * request_queue BUT: even though the hw_queues are marked
-        * BLK_MQ_S_STOPPED at that point there is still a race that
-        * is allowing block/blk-mq.c to call ->queue_rq against a
-        * hctx that it really shouldn't.  The following check guards
-        * against this rarity (albeit _not_ race-free).
-        */
-       if (unlikely(test_bit(BLK_MQ_S_STOPPED, &hctx->state)))
-               return BLK_MQ_RQ_QUEUE_BUSY;
-
        if (ti->type->busy && ti->type->busy(ti))
                return BLK_MQ_RQ_QUEUE_BUSY;
Ming Lei June 19, 2020, 10:23 p.m. UTC | #5
On Fri, Jun 19, 2020 at 12:06:57PM -0400, Mike Snitzer wrote:
> On Fri, Jun 19 2020 at  6:11am -0400,
> Ming Lei <ming.lei@redhat.com> wrote:
> 
> > Hi Mike,
> > 
> > On Fri, Jun 19, 2020 at 05:42:50AM -0400, Mike Snitzer wrote:
> > > Hi Ming,
> > > 
> > > Thanks for the patch!  But I'm having a hard time understanding what
> > > you've written in the patch header,
> > > 
> > > On Fri, Jun 19 2020 at  4:42am -0400,
> > > Ming Lei <ming.lei@redhat.com> wrote:
> > > 
> > > > dm-rq won't stop queue, meantime blk-mq won't stop one queue too, so
> > > > remove the check.
> > > 
> > > It'd be helpful if you could unpack this with more detail before going on
> > > to explain why using blk_queue_quiesced, despite dm-rq using
> > > blk_mq_queue_stopped, would also be ineffective.
> > > 
> > > SO:
> > > 
> > > > dm-rq won't stop queue
> > > 
> > > 1) why won't dm-rq stop the queue?  Do you mean it won't reliably
> > >    _always_ stop the queue because of the blk_mq_queue_stopped() check?
> > 
> > device mapper doesn't call blk_mq_stop_hw_queue or blk_mq_stop_hw_queues.
> > 
> > > 
> > > > meantime blk-mq won't stop one queue too, so remove the check.
> > > 
> > > 2) Meaning?: blk_mq_queue_stopped() will return true even if only one hw
> > > queue is stopped, given blk-mq must stop all hw queues a positive return
> > > from this blk_mq_queue_stopped() check is incorrectly assuming it meanss
> > > all hw queues are stopped.
> > 
> > blk-mq won't call blk_mq_stop_hw_queue or blk_mq_stop_hw_queues for
> > dm-rq's queue too, so dm-rq's hw queue won't be stopped.
> > 
> > BTW blk_mq_stop_hw_queue or blk_mq_stop_hw_queues are supposed to be
> > used for throttling queue.
> 
> I'm going to look at actually stopping the queue (using one of these
> interfaces).  I didn't realize I wasn't actually stopping the queue.
> The intent was to do so.
> 
> In speaking with Jens yesterday about freeze vs stop: it is clear that
> dm-rq needs to still be able to allocate new requests, but _not_ call
> the queue_rq to issue the requests, while "stopped" (due to dm-mpath
> potentially deferring retries of failed requests because of path failure
> while quiescing the queue during DM device suspend).  But that freezing
> the queue goes too far because it won't allow such request allocation.

Freezing shouldn't be a good choice for driver usually, and quiesce is
exactly what you expect: request allocation is allowed, meantime, no
.queue_rq is possible.

> 
> > > > dm_stop_queue() actually tries to quiesce hw queues via blk_mq_quiesce_queue(),
> > > > we can't check via blk_queue_quiesced for avoiding unnecessary queue
> > > > quiesce because the flag is set before synchronize_rcu() and dm_stop_queue
> > > > may be called when synchronize_rcu from another blk_mq_quiesce_queue is
> > > > in-progress.
> > > 
> > > But I'm left with questions/confusion on this too:
> > > 
> > > 1) you mention blk_queue_quiesced instead of blk_mq_queue_stopped, so I
> > >    assume you mean that: not only is blk_mq_queue_stopped()
> > >    ineffective, blk_queue_quiesced() would be too?
> > 
> > blk_mq_queue_stopped isn't necessary because dm-rq's hw queue won't be
> > stopped by anyone, meantime replacing it with blk_queue_quiesced() is wrong.
> > 
> > > 
> > > 2) the race you detail (with competing blk_mq_quiesce_queue) relative to
> > >    synchronize_rcu() and testing "the flag" is very detailed yet vague.
> > 
> > If two code paths are calling dm_stop_queue() at the same time, one path may
> > return immediately and it is wrong, sine synchronize_rcu() from another path
> > may not be done.
> > 
> > > 
> > > Anyway, once we get this heaader cleaned up a bit more I'll be happy to
> > > get this staged as a stable@ fix for 5.8 inclusion ASAP.
> > 
> > This patch isn't a fix, and it shouldn't be related with rhel8's issue.
> 
> I realize that now.  I've changed the patch header to be a bit clearer
> and staged it for 5.9, see:
> https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.9&id=06e788ed59e0095b679bdce9e39c1a251032ae62

Thanks!
Ming Lei June 19, 2020, 10:37 p.m. UTC | #6
On Fri, Jun 19, 2020 at 01:40:41PM -0400, Mike Snitzer wrote:
> On Fri, Jun 19 2020 at 12:06pm -0400,
> Mike Snitzer <snitzer@redhat.com> wrote:
> 
> > On Fri, Jun 19 2020 at  6:11am -0400,
> > Ming Lei <ming.lei@redhat.com> wrote:
> > 
> > > Hi Mike,
> > > 
> > > On Fri, Jun 19, 2020 at 05:42:50AM -0400, Mike Snitzer wrote:
> > > > Hi Ming,
> > > > 
> > > > Thanks for the patch!  But I'm having a hard time understanding what
> > > > you've written in the patch header,
> > > > 
> > > > On Fri, Jun 19 2020 at  4:42am -0400,
> > > > Ming Lei <ming.lei@redhat.com> wrote:
> > > > 
> > > > > dm-rq won't stop queue, meantime blk-mq won't stop one queue too, so
> > > > > remove the check.
> > > > 
> > > > It'd be helpful if you could unpack this with more detail before going on
> > > > to explain why using blk_queue_quiesced, despite dm-rq using
> > > > blk_mq_queue_stopped, would also be ineffective.
> > > > 
> > > > SO:
> > > > 
> > > > > dm-rq won't stop queue
> > > > 
> > > > 1) why won't dm-rq stop the queue?  Do you mean it won't reliably
> > > >    _always_ stop the queue because of the blk_mq_queue_stopped() check?
> > > 
> > > device mapper doesn't call blk_mq_stop_hw_queue or blk_mq_stop_hw_queues.
> > > 
> > > > 
> > > > > meantime blk-mq won't stop one queue too, so remove the check.
> > > > 
> > > > 2) Meaning?: blk_mq_queue_stopped() will return true even if only one hw
> > > > queue is stopped, given blk-mq must stop all hw queues a positive return
> > > > from this blk_mq_queue_stopped() check is incorrectly assuming it meanss
> > > > all hw queues are stopped.
> > > 
> > > blk-mq won't call blk_mq_stop_hw_queue or blk_mq_stop_hw_queues for
> > > dm-rq's queue too, so dm-rq's hw queue won't be stopped.
> > > 
> > > BTW blk_mq_stop_hw_queue or blk_mq_stop_hw_queues are supposed to be
> > > used for throttling queue.
> > 
> > I'm going to look at actually stopping the queue (using one of these
> > interfaces).  I didn't realize I wasn't actually stopping the queue.
> > The intent was to do so.
> > 
> > In speaking with Jens yesterday about freeze vs stop: it is clear that
> > dm-rq needs to still be able to allocate new requests, but _not_ call
> > the queue_rq to issue the requests, while "stopped" (due to dm-mpath
> > potentially deferring retries of failed requests because of path failure
> > while quiescing the queue during DM device suspend).  But that freezing
> > the queue goes too far because it won't allow such request allocation.
> 
> Seems I'm damned if I do (stop) or damned if I don't (new reports of
> requests completing after DM device suspend's
> blk_mq_quiesce_queue()+dm_wait_for_completion()).

request(but not new) completing is possible after blk_mq_quiesce_queue()+
dm_wait_for_completion, because blk_mq_rq_inflight() only checks INFLIGHT
request. If all requests are marked as MQ_RQ_COMPLETE, blk_mq_rq_inflight()
still may return false. However, MQ_RQ_COMPLETE is one transient state.

So what does dm-rq expect from dm_wait_for_completion()?

If it is just no new request entering dm_queue_rq(), there shouldn't be
issue.

If dm-rq hopes there aren't any real inflight request(MQ_RQ_COMPLETE &
MQ_RQ_INFLIGHT), we can change blk_mq_rq_inflight to support that.


Thanks, 
Ming
Ming Lei June 19, 2020, 10:52 p.m. UTC | #7
On Sat, Jun 20, 2020 at 06:37:44AM +0800, Ming Lei wrote:
> On Fri, Jun 19, 2020 at 01:40:41PM -0400, Mike Snitzer wrote:
> > On Fri, Jun 19 2020 at 12:06pm -0400,
> > Mike Snitzer <snitzer@redhat.com> wrote:
> > 
> > > On Fri, Jun 19 2020 at  6:11am -0400,
> > > Ming Lei <ming.lei@redhat.com> wrote:
> > > 
> > > > Hi Mike,
> > > > 
> > > > On Fri, Jun 19, 2020 at 05:42:50AM -0400, Mike Snitzer wrote:
> > > > > Hi Ming,
> > > > > 
> > > > > Thanks for the patch!  But I'm having a hard time understanding what
> > > > > you've written in the patch header,
> > > > > 
> > > > > On Fri, Jun 19 2020 at  4:42am -0400,
> > > > > Ming Lei <ming.lei@redhat.com> wrote:
> > > > > 
> > > > > > dm-rq won't stop queue, meantime blk-mq won't stop one queue too, so
> > > > > > remove the check.
> > > > > 
> > > > > It'd be helpful if you could unpack this with more detail before going on
> > > > > to explain why using blk_queue_quiesced, despite dm-rq using
> > > > > blk_mq_queue_stopped, would also be ineffective.
> > > > > 
> > > > > SO:
> > > > > 
> > > > > > dm-rq won't stop queue
> > > > > 
> > > > > 1) why won't dm-rq stop the queue?  Do you mean it won't reliably
> > > > >    _always_ stop the queue because of the blk_mq_queue_stopped() check?
> > > > 
> > > > device mapper doesn't call blk_mq_stop_hw_queue or blk_mq_stop_hw_queues.
> > > > 
> > > > > 
> > > > > > meantime blk-mq won't stop one queue too, so remove the check.
> > > > > 
> > > > > 2) Meaning?: blk_mq_queue_stopped() will return true even if only one hw
> > > > > queue is stopped, given blk-mq must stop all hw queues a positive return
> > > > > from this blk_mq_queue_stopped() check is incorrectly assuming it meanss
> > > > > all hw queues are stopped.
> > > > 
> > > > blk-mq won't call blk_mq_stop_hw_queue or blk_mq_stop_hw_queues for
> > > > dm-rq's queue too, so dm-rq's hw queue won't be stopped.
> > > > 
> > > > BTW blk_mq_stop_hw_queue or blk_mq_stop_hw_queues are supposed to be
> > > > used for throttling queue.
> > > 
> > > I'm going to look at actually stopping the queue (using one of these
> > > interfaces).  I didn't realize I wasn't actually stopping the queue.
> > > The intent was to do so.
> > > 
> > > In speaking with Jens yesterday about freeze vs stop: it is clear that
> > > dm-rq needs to still be able to allocate new requests, but _not_ call
> > > the queue_rq to issue the requests, while "stopped" (due to dm-mpath
> > > potentially deferring retries of failed requests because of path failure
> > > while quiescing the queue during DM device suspend).  But that freezing
> > > the queue goes too far because it won't allow such request allocation.
> > 
> > Seems I'm damned if I do (stop) or damned if I don't (new reports of
> > requests completing after DM device suspend's
> > blk_mq_quiesce_queue()+dm_wait_for_completion()).
> 
> request(but not new) completing is possible after blk_mq_quiesce_queue()+
> dm_wait_for_completion, because blk_mq_rq_inflight() only checks INFLIGHT
> request. If all requests are marked as MQ_RQ_COMPLETE, blk_mq_rq_inflight()
> still may return false. However, MQ_RQ_COMPLETE is one transient state.
> 
> So what does dm-rq expect from dm_wait_for_completion()?
> 
> If it is just no new request entering dm_queue_rq(), there shouldn't be
> issue.
> 
> If dm-rq hopes there aren't any real inflight request(MQ_RQ_COMPLETE &
> MQ_RQ_INFLIGHT), we can change blk_mq_rq_inflight to support that.

Hi Mike,

Please test the following patch and see if the issue can be fixed:

From faf0f9f15627446e8c35db518e37a4a2e4323eb2 Mon Sep 17 00:00:00 2001
From: Ming Lei <ming.lei@redhat.com>
Date: Sat, 20 Jun 2020 06:45:49 +0800
Subject: [PATCH] blk-mq: cover request of MQ_RQ_COMPLETE as inflight in
 blk_mq_rq_inflight

When request is marked as MQ_RQ_COMPLETE, ->complete isn't called & done
yet, and driver may expect that there isn't any driver related activity since
blk_mq_queue_inflight() returns.

Fixes it by covering request of MQ_RQ_COMPLETE as inflight in blk_mq_rq_inflight().

Cc: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4f57d27bfa73..7bc084b5bc37 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -831,7 +831,7 @@ static bool blk_mq_rq_inflight(struct blk_mq_hw_ctx *hctx, struct request *rq,
 	 * If we find a request that is inflight and the queue matches,
 	 * we know the queue is busy. Return false to stop the iteration.
 	 */
-	if (rq->state == MQ_RQ_IN_FLIGHT && rq->q == hctx->queue) {
+	if (rq->state != MQ_RQ_IDLE && rq->q == hctx->queue) {
 		bool *busy = priv;
 
 		*busy = true;
Mike Snitzer June 19, 2020, 11:04 p.m. UTC | #8
On Fri, Jun 19 2020 at  6:52pm -0400,
Ming Lei <ming.lei@redhat.com> wrote:

> On Sat, Jun 20, 2020 at 06:37:44AM +0800, Ming Lei wrote:
> > On Fri, Jun 19, 2020 at 01:40:41PM -0400, Mike Snitzer wrote:
> > > On Fri, Jun 19 2020 at 12:06pm -0400,
> > > Mike Snitzer <snitzer@redhat.com> wrote:
> > > 
> > > > On Fri, Jun 19 2020 at  6:11am -0400,
> > > > Ming Lei <ming.lei@redhat.com> wrote:
> > > > 
> > > > > Hi Mike,
> > > > > 
> > > > > On Fri, Jun 19, 2020 at 05:42:50AM -0400, Mike Snitzer wrote:
> > > > > > Hi Ming,
> > > > > > 
> > > > > > Thanks for the patch!  But I'm having a hard time understanding what
> > > > > > you've written in the patch header,
> > > > > > 
> > > > > > On Fri, Jun 19 2020 at  4:42am -0400,
> > > > > > Ming Lei <ming.lei@redhat.com> wrote:
> > > > > > 
> > > > > > > dm-rq won't stop queue, meantime blk-mq won't stop one queue too, so
> > > > > > > remove the check.
> > > > > > 
> > > > > > It'd be helpful if you could unpack this with more detail before going on
> > > > > > to explain why using blk_queue_quiesced, despite dm-rq using
> > > > > > blk_mq_queue_stopped, would also be ineffective.
> > > > > > 
> > > > > > SO:
> > > > > > 
> > > > > > > dm-rq won't stop queue
> > > > > > 
> > > > > > 1) why won't dm-rq stop the queue?  Do you mean it won't reliably
> > > > > >    _always_ stop the queue because of the blk_mq_queue_stopped() check?
> > > > > 
> > > > > device mapper doesn't call blk_mq_stop_hw_queue or blk_mq_stop_hw_queues.
> > > > > 
> > > > > > 
> > > > > > > meantime blk-mq won't stop one queue too, so remove the check.
> > > > > > 
> > > > > > 2) Meaning?: blk_mq_queue_stopped() will return true even if only one hw
> > > > > > queue is stopped, given blk-mq must stop all hw queues a positive return
> > > > > > from this blk_mq_queue_stopped() check is incorrectly assuming it meanss
> > > > > > all hw queues are stopped.
> > > > > 
> > > > > blk-mq won't call blk_mq_stop_hw_queue or blk_mq_stop_hw_queues for
> > > > > dm-rq's queue too, so dm-rq's hw queue won't be stopped.
> > > > > 
> > > > > BTW blk_mq_stop_hw_queue or blk_mq_stop_hw_queues are supposed to be
> > > > > used for throttling queue.
> > > > 
> > > > I'm going to look at actually stopping the queue (using one of these
> > > > interfaces).  I didn't realize I wasn't actually stopping the queue.
> > > > The intent was to do so.
> > > > 
> > > > In speaking with Jens yesterday about freeze vs stop: it is clear that
> > > > dm-rq needs to still be able to allocate new requests, but _not_ call
> > > > the queue_rq to issue the requests, while "stopped" (due to dm-mpath
> > > > potentially deferring retries of failed requests because of path failure
> > > > while quiescing the queue during DM device suspend).  But that freezing
> > > > the queue goes too far because it won't allow such request allocation.
> > > 
> > > Seems I'm damned if I do (stop) or damned if I don't (new reports of
> > > requests completing after DM device suspend's
> > > blk_mq_quiesce_queue()+dm_wait_for_completion()).
> > 
> > request(but not new) completing is possible after blk_mq_quiesce_queue()+
> > dm_wait_for_completion, because blk_mq_rq_inflight() only checks INFLIGHT
> > request. If all requests are marked as MQ_RQ_COMPLETE, blk_mq_rq_inflight()
> > still may return false. However, MQ_RQ_COMPLETE is one transient state.
> > 
> > So what does dm-rq expect from dm_wait_for_completion()?
> > 
> > If it is just no new request entering dm_queue_rq(), there shouldn't be
> > issue.
> > 
> > If dm-rq hopes there aren't any real inflight request(MQ_RQ_COMPLETE &
> > MQ_RQ_INFLIGHT), we can change blk_mq_rq_inflight to support that.
> 
> Hi Mike,
> 
> Please test the following patch and see if the issue can be fixed:
> 
> From faf0f9f15627446e8c35db518e37a4a2e4323eb2 Mon Sep 17 00:00:00 2001
> From: Ming Lei <ming.lei@redhat.com>
> Date: Sat, 20 Jun 2020 06:45:49 +0800
> Subject: [PATCH] blk-mq: cover request of MQ_RQ_COMPLETE as inflight in
>  blk_mq_rq_inflight
> 
> When request is marked as MQ_RQ_COMPLETE, ->complete isn't called & done
> yet, and driver may expect that there isn't any driver related activity since
> blk_mq_queue_inflight() returns.
> 
> Fixes it by covering request of MQ_RQ_COMPLETE as inflight in blk_mq_rq_inflight().
> 
> Cc: Mike Snitzer <snitzer@redhat.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  block/blk-mq.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 4f57d27bfa73..7bc084b5bc37 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -831,7 +831,7 @@ static bool blk_mq_rq_inflight(struct blk_mq_hw_ctx *hctx, struct request *rq,
>  	 * If we find a request that is inflight and the queue matches,
>  	 * we know the queue is busy. Return false to stop the iteration.
>  	 */
> -	if (rq->state == MQ_RQ_IN_FLIGHT && rq->q == hctx->queue) {
> +	if (rq->state != MQ_RQ_IDLE && rq->q == hctx->queue) {
>  		bool *busy = priv;
>  
>  		*busy = true;
> -- 
> 2.25.2
> 

I was going to ask if being more explit would be better:

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4f57d27bfa73..96816ce57eb1 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -828,10 +828,11 @@ static bool blk_mq_rq_inflight(struct blk_mq_hw_ctx *hctx, struct request *rq,
                                void *priv, bool reserved)
 {
         /*
-         * If we find a request that is inflight and the queue matches,
+         * If we find a request that is inflight or complete and the queue matches,
          * we know the queue is busy. Return false to stop the iteration.
          */
-        if (rq->state == MQ_RQ_IN_FLIGHT && rq->q == hctx->queue) {
+        if ((rq->state == MQ_RQ_IN_FLIGHT || rq->state == MQ_RQ_COMPLETE) &&
+            rq->q == hctx->queue) {
                 bool *busy = priv;

                 *busy = true;

But is your patch more forgiving of any future blk-mq states that might
also consistitute outstanding work?  Seems likely.

Thanks,
Mike
Ming Lei June 19, 2020, 11:14 p.m. UTC | #9
On Fri, Jun 19, 2020 at 07:04:05PM -0400, Mike Snitzer wrote:
> On Fri, Jun 19 2020 at  6:52pm -0400,
> Ming Lei <ming.lei@redhat.com> wrote:
> 
> > On Sat, Jun 20, 2020 at 06:37:44AM +0800, Ming Lei wrote:
> > > On Fri, Jun 19, 2020 at 01:40:41PM -0400, Mike Snitzer wrote:
> > > > On Fri, Jun 19 2020 at 12:06pm -0400,
> > > > Mike Snitzer <snitzer@redhat.com> wrote:
> > > > 
> > > > > On Fri, Jun 19 2020 at  6:11am -0400,
> > > > > Ming Lei <ming.lei@redhat.com> wrote:
> > > > > 
> > > > > > Hi Mike,
> > > > > > 
> > > > > > On Fri, Jun 19, 2020 at 05:42:50AM -0400, Mike Snitzer wrote:
> > > > > > > Hi Ming,
> > > > > > > 
> > > > > > > Thanks for the patch!  But I'm having a hard time understanding what
> > > > > > > you've written in the patch header,
> > > > > > > 
> > > > > > > On Fri, Jun 19 2020 at  4:42am -0400,
> > > > > > > Ming Lei <ming.lei@redhat.com> wrote:
> > > > > > > 
> > > > > > > > dm-rq won't stop queue, meantime blk-mq won't stop one queue too, so
> > > > > > > > remove the check.
> > > > > > > 
> > > > > > > It'd be helpful if you could unpack this with more detail before going on
> > > > > > > to explain why using blk_queue_quiesced, despite dm-rq using
> > > > > > > blk_mq_queue_stopped, would also be ineffective.
> > > > > > > 
> > > > > > > SO:
> > > > > > > 
> > > > > > > > dm-rq won't stop queue
> > > > > > > 
> > > > > > > 1) why won't dm-rq stop the queue?  Do you mean it won't reliably
> > > > > > >    _always_ stop the queue because of the blk_mq_queue_stopped() check?
> > > > > > 
> > > > > > device mapper doesn't call blk_mq_stop_hw_queue or blk_mq_stop_hw_queues.
> > > > > > 
> > > > > > > 
> > > > > > > > meantime blk-mq won't stop one queue too, so remove the check.
> > > > > > > 
> > > > > > > 2) Meaning?: blk_mq_queue_stopped() will return true even if only one hw
> > > > > > > queue is stopped, given blk-mq must stop all hw queues a positive return
> > > > > > > from this blk_mq_queue_stopped() check is incorrectly assuming it meanss
> > > > > > > all hw queues are stopped.
> > > > > > 
> > > > > > blk-mq won't call blk_mq_stop_hw_queue or blk_mq_stop_hw_queues for
> > > > > > dm-rq's queue too, so dm-rq's hw queue won't be stopped.
> > > > > > 
> > > > > > BTW blk_mq_stop_hw_queue or blk_mq_stop_hw_queues are supposed to be
> > > > > > used for throttling queue.
> > > > > 
> > > > > I'm going to look at actually stopping the queue (using one of these
> > > > > interfaces).  I didn't realize I wasn't actually stopping the queue.
> > > > > The intent was to do so.
> > > > > 
> > > > > In speaking with Jens yesterday about freeze vs stop: it is clear that
> > > > > dm-rq needs to still be able to allocate new requests, but _not_ call
> > > > > the queue_rq to issue the requests, while "stopped" (due to dm-mpath
> > > > > potentially deferring retries of failed requests because of path failure
> > > > > while quiescing the queue during DM device suspend).  But that freezing
> > > > > the queue goes too far because it won't allow such request allocation.
> > > > 
> > > > Seems I'm damned if I do (stop) or damned if I don't (new reports of
> > > > requests completing after DM device suspend's
> > > > blk_mq_quiesce_queue()+dm_wait_for_completion()).
> > > 
> > > request(but not new) completing is possible after blk_mq_quiesce_queue()+
> > > dm_wait_for_completion, because blk_mq_rq_inflight() only checks INFLIGHT
> > > request. If all requests are marked as MQ_RQ_COMPLETE, blk_mq_rq_inflight()
> > > still may return false. However, MQ_RQ_COMPLETE is one transient state.
> > > 
> > > So what does dm-rq expect from dm_wait_for_completion()?
> > > 
> > > If it is just no new request entering dm_queue_rq(), there shouldn't be
> > > issue.
> > > 
> > > If dm-rq hopes there aren't any real inflight request(MQ_RQ_COMPLETE &
> > > MQ_RQ_INFLIGHT), we can change blk_mq_rq_inflight to support that.
> > 
> > Hi Mike,
> > 
> > Please test the following patch and see if the issue can be fixed:
> > 
> > From faf0f9f15627446e8c35db518e37a4a2e4323eb2 Mon Sep 17 00:00:00 2001
> > From: Ming Lei <ming.lei@redhat.com>
> > Date: Sat, 20 Jun 2020 06:45:49 +0800
> > Subject: [PATCH] blk-mq: cover request of MQ_RQ_COMPLETE as inflight in
> >  blk_mq_rq_inflight
> > 
> > When request is marked as MQ_RQ_COMPLETE, ->complete isn't called & done
> > yet, and driver may expect that there isn't any driver related activity since
> > blk_mq_queue_inflight() returns.
> > 
> > Fixes it by covering request of MQ_RQ_COMPLETE as inflight in blk_mq_rq_inflight().
> > 
> > Cc: Mike Snitzer <snitzer@redhat.com>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  block/blk-mq.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index 4f57d27bfa73..7bc084b5bc37 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -831,7 +831,7 @@ static bool blk_mq_rq_inflight(struct blk_mq_hw_ctx *hctx, struct request *rq,
> >  	 * If we find a request that is inflight and the queue matches,
> >  	 * we know the queue is busy. Return false to stop the iteration.
> >  	 */
> > -	if (rq->state == MQ_RQ_IN_FLIGHT && rq->q == hctx->queue) {
> > +	if (rq->state != MQ_RQ_IDLE && rq->q == hctx->queue) {
> >  		bool *busy = priv;
> >  
> >  		*busy = true;
> > -- 
> > 2.25.2
> > 
> 
> I was going to ask if being more explit would be better:
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 4f57d27bfa73..96816ce57eb1 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -828,10 +828,11 @@ static bool blk_mq_rq_inflight(struct blk_mq_hw_ctx *hctx, struct request *rq,
>                                 void *priv, bool reserved)
>  {
>          /*
> -         * If we find a request that is inflight and the queue matches,
> +         * If we find a request that is inflight or complete and the queue matches,
>           * we know the queue is busy. Return false to stop the iteration.
>           */
> -        if (rq->state == MQ_RQ_IN_FLIGHT && rq->q == hctx->queue) {
> +        if ((rq->state == MQ_RQ_IN_FLIGHT || rq->state == MQ_RQ_COMPLETE) &&
> +            rq->q == hctx->queue) {
>                  bool *busy = priv;
> 
>                  *busy = true;
> 
> But is your patch more forgiving of any future blk-mq states that might
> also consistitute outstanding work?  Seems likely.

I am fine with either way since it is called in slow path.

But what matters is if it can fix this issue?

Thanks,
Ming
Ming Lei June 19, 2020, 11:37 p.m. UTC | #10
On Sat, Jun 20, 2020 at 07:14:51AM +0800, Ming Lei wrote:
> On Fri, Jun 19, 2020 at 07:04:05PM -0400, Mike Snitzer wrote:
> > On Fri, Jun 19 2020 at  6:52pm -0400,
> > Ming Lei <ming.lei@redhat.com> wrote:
> > 
> > > On Sat, Jun 20, 2020 at 06:37:44AM +0800, Ming Lei wrote:
> > > > On Fri, Jun 19, 2020 at 01:40:41PM -0400, Mike Snitzer wrote:
> > > > > On Fri, Jun 19 2020 at 12:06pm -0400,
> > > > > Mike Snitzer <snitzer@redhat.com> wrote:
> > > > > 
> > > > > > On Fri, Jun 19 2020 at  6:11am -0400,
> > > > > > Ming Lei <ming.lei@redhat.com> wrote:
> > > > > > 
> > > > > > > Hi Mike,
> > > > > > > 
> > > > > > > On Fri, Jun 19, 2020 at 05:42:50AM -0400, Mike Snitzer wrote:
> > > > > > > > Hi Ming,
> > > > > > > > 
> > > > > > > > Thanks for the patch!  But I'm having a hard time understanding what
> > > > > > > > you've written in the patch header,
> > > > > > > > 
> > > > > > > > On Fri, Jun 19 2020 at  4:42am -0400,
> > > > > > > > Ming Lei <ming.lei@redhat.com> wrote:
> > > > > > > > 
> > > > > > > > > dm-rq won't stop queue, meantime blk-mq won't stop one queue too, so
> > > > > > > > > remove the check.
> > > > > > > > 
> > > > > > > > It'd be helpful if you could unpack this with more detail before going on
> > > > > > > > to explain why using blk_queue_quiesced, despite dm-rq using
> > > > > > > > blk_mq_queue_stopped, would also be ineffective.
> > > > > > > > 
> > > > > > > > SO:
> > > > > > > > 
> > > > > > > > > dm-rq won't stop queue
> > > > > > > > 
> > > > > > > > 1) why won't dm-rq stop the queue?  Do you mean it won't reliably
> > > > > > > >    _always_ stop the queue because of the blk_mq_queue_stopped() check?
> > > > > > > 
> > > > > > > device mapper doesn't call blk_mq_stop_hw_queue or blk_mq_stop_hw_queues.
> > > > > > > 
> > > > > > > > 
> > > > > > > > > meantime blk-mq won't stop one queue too, so remove the check.
> > > > > > > > 
> > > > > > > > 2) Meaning?: blk_mq_queue_stopped() will return true even if only one hw
> > > > > > > > queue is stopped, given blk-mq must stop all hw queues a positive return
> > > > > > > > from this blk_mq_queue_stopped() check is incorrectly assuming it meanss
> > > > > > > > all hw queues are stopped.
> > > > > > > 
> > > > > > > blk-mq won't call blk_mq_stop_hw_queue or blk_mq_stop_hw_queues for
> > > > > > > dm-rq's queue too, so dm-rq's hw queue won't be stopped.
> > > > > > > 
> > > > > > > BTW blk_mq_stop_hw_queue or blk_mq_stop_hw_queues are supposed to be
> > > > > > > used for throttling queue.
> > > > > > 
> > > > > > I'm going to look at actually stopping the queue (using one of these
> > > > > > interfaces).  I didn't realize I wasn't actually stopping the queue.
> > > > > > The intent was to do so.
> > > > > > 
> > > > > > In speaking with Jens yesterday about freeze vs stop: it is clear that
> > > > > > dm-rq needs to still be able to allocate new requests, but _not_ call
> > > > > > the queue_rq to issue the requests, while "stopped" (due to dm-mpath
> > > > > > potentially deferring retries of failed requests because of path failure
> > > > > > while quiescing the queue during DM device suspend).  But that freezing
> > > > > > the queue goes too far because it won't allow such request allocation.
> > > > > 
> > > > > Seems I'm damned if I do (stop) or damned if I don't (new reports of
> > > > > requests completing after DM device suspend's
> > > > > blk_mq_quiesce_queue()+dm_wait_for_completion()).
> > > > 
> > > > request(but not new) completing is possible after blk_mq_quiesce_queue()+
> > > > dm_wait_for_completion, because blk_mq_rq_inflight() only checks INFLIGHT
> > > > request. If all requests are marked as MQ_RQ_COMPLETE, blk_mq_rq_inflight()
> > > > still may return false. However, MQ_RQ_COMPLETE is one transient state.
> > > > 
> > > > So what does dm-rq expect from dm_wait_for_completion()?
> > > > 
> > > > If it is just no new request entering dm_queue_rq(), there shouldn't be
> > > > issue.
> > > > 
> > > > If dm-rq hopes there aren't any real inflight request(MQ_RQ_COMPLETE &
> > > > MQ_RQ_INFLIGHT), we can change blk_mq_rq_inflight to support that.
> > > 
> > > Hi Mike,
> > > 
> > > Please test the following patch and see if the issue can be fixed:
> > > 
> > > From faf0f9f15627446e8c35db518e37a4a2e4323eb2 Mon Sep 17 00:00:00 2001
> > > From: Ming Lei <ming.lei@redhat.com>
> > > Date: Sat, 20 Jun 2020 06:45:49 +0800
> > > Subject: [PATCH] blk-mq: cover request of MQ_RQ_COMPLETE as inflight in
> > >  blk_mq_rq_inflight
> > > 
> > > When request is marked as MQ_RQ_COMPLETE, ->complete isn't called & done
> > > yet, and driver may expect that there isn't any driver related activity since
> > > blk_mq_queue_inflight() returns.
> > > 
> > > Fixes it by covering request of MQ_RQ_COMPLETE as inflight in blk_mq_rq_inflight().
> > > 
> > > Cc: Mike Snitzer <snitzer@redhat.com>
> > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > ---
> > >  block/blk-mq.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > > index 4f57d27bfa73..7bc084b5bc37 100644
> > > --- a/block/blk-mq.c
> > > +++ b/block/blk-mq.c
> > > @@ -831,7 +831,7 @@ static bool blk_mq_rq_inflight(struct blk_mq_hw_ctx *hctx, struct request *rq,
> > >  	 * If we find a request that is inflight and the queue matches,
> > >  	 * we know the queue is busy. Return false to stop the iteration.
> > >  	 */
> > > -	if (rq->state == MQ_RQ_IN_FLIGHT && rq->q == hctx->queue) {
> > > +	if (rq->state != MQ_RQ_IDLE && rq->q == hctx->queue) {
> > >  		bool *busy = priv;
> > >  
> > >  		*busy = true;
> > > -- 
> > > 2.25.2
> > > 
> > 
> > I was going to ask if being more explit would be better:
> > 
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index 4f57d27bfa73..96816ce57eb1 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -828,10 +828,11 @@ static bool blk_mq_rq_inflight(struct blk_mq_hw_ctx *hctx, struct request *rq,
> >                                 void *priv, bool reserved)
> >  {
> >          /*
> > -         * If we find a request that is inflight and the queue matches,
> > +         * If we find a request that is inflight or complete and the queue matches,
> >           * we know the queue is busy. Return false to stop the iteration.
> >           */
> > -        if (rq->state == MQ_RQ_IN_FLIGHT && rq->q == hctx->queue) {
> > +        if ((rq->state == MQ_RQ_IN_FLIGHT || rq->state == MQ_RQ_COMPLETE) &&
> > +            rq->q == hctx->queue) {
> >                  bool *busy = priv;
> > 
> >                  *busy = true;
> > 
> > But is your patch more forgiving of any future blk-mq states that might
> > also consistitute outstanding work?  Seems likely.
> 
> I am fine with either way since it is called in slow path.

BTW, another candidate is to use blk_mq_request_started().

Thanks,
Ming
diff mbox series

Patch

diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index f60c02512121..ed4d5ea66ccc 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -70,9 +70,6 @@  void dm_start_queue(struct request_queue *q)
 
 void dm_stop_queue(struct request_queue *q)
 {
-	if (blk_mq_queue_stopped(q))
-		return;
-
 	blk_mq_quiesce_queue(q);
 }