diff mbox series

[3/5] blk-mq: add helper of blk_mq_global_quiesce_wait()

Message ID 20211119021849.2259254-4-ming.lei@redhat.com (mailing list archive)
State Superseded
Headers show
Series blk-mq: quiesce improvement | expand

Commit Message

Ming Lei Nov. 19, 2021, 2:18 a.m. UTC
Add helper of blk_mq_global_quiesce_wait() for supporting to quiesce
queues in parallel, then we can just wait once if global quiesce wait
is allowed.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 include/linux/blk-mq.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

Comments

Sagi Grimberg Nov. 22, 2021, 7:56 a.m. UTC | #1
> Add helper of blk_mq_global_quiesce_wait() for supporting to quiesce
> queues in parallel, then we can just wait once if global quiesce wait
> is allowed.

blk_mq_global_quiesce_wait() is a poor name... global is scope-less and
obviously it has a scope.


> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>   include/linux/blk-mq.h | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
> 
> diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
> index 5cc7fc1ea863..a9fecda2507e 100644
> --- a/include/linux/blk-mq.h
> +++ b/include/linux/blk-mq.h
> @@ -777,6 +777,19 @@ static inline bool blk_mq_add_to_batch(struct request *req,
>   	return true;
>   }
>   
> +/*
> + * If the queue has allocated & used srcu to quiesce queue, quiesce wait is
> + * done via the synchronize_srcu(q->rcu), otherwise it is done via global
> + * synchronize_rcu().
> + *
> + * This helper can help us to support quiescing queue in parallel, so just
> + * one quiesce wait is enough if global quiesce wait is allowed.
> + */
> +static inline bool blk_mq_global_quiesce_wait(struct request_queue *q)
> +{
> +	return !q->alloc_srcu;
> +}
> +
>   void blk_mq_requeue_request(struct request *rq, bool kick_requeue_list);
>   void blk_mq_kick_requeue_list(struct request_queue *q);
>   void blk_mq_delay_kick_requeue_list(struct request_queue *q, unsigned long msecs);
>
Sagi Grimberg Nov. 22, 2021, 8 a.m. UTC | #2
On 11/22/21 9:56 AM, Sagi Grimberg wrote:
> 
>> Add helper of blk_mq_global_quiesce_wait() for supporting to quiesce
>> queues in parallel, then we can just wait once if global quiesce wait
>> is allowed.
> 
> blk_mq_global_quiesce_wait() is a poor name... global is scope-less and
> obviously it has a scope.
> 
> 
>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>> ---
>>   include/linux/blk-mq.h | 13 +++++++++++++
>>   1 file changed, 13 insertions(+)
>>
>> diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
>> index 5cc7fc1ea863..a9fecda2507e 100644
>> --- a/include/linux/blk-mq.h
>> +++ b/include/linux/blk-mq.h
>> @@ -777,6 +777,19 @@ static inline bool blk_mq_add_to_batch(struct 
>> request *req,
>>       return true;
>>   }
>> +/*
>> + * If the queue has allocated & used srcu to quiesce queue, quiesce 
>> wait is
>> + * done via the synchronize_srcu(q->rcu), otherwise it is done via 
>> global
>> + * synchronize_rcu().
>> + *
>> + * This helper can help us to support quiescing queue in parallel, so 
>> just
>> + * one quiesce wait is enough if global quiesce wait is allowed.
>> + */
>> +static inline bool blk_mq_global_quiesce_wait(struct request_queue *q)
>> +{
>> +    return !q->alloc_srcu;

	return !q->srcu ?
Ming Lei Nov. 22, 2021, 1:26 p.m. UTC | #3
On Mon, Nov 22, 2021 at 09:56:10AM +0200, Sagi Grimberg wrote:
> 
> > Add helper of blk_mq_global_quiesce_wait() for supporting to quiesce
> > queues in parallel, then we can just wait once if global quiesce wait
> > is allowed.
> 
> blk_mq_global_quiesce_wait() is a poor name... global is scope-less and
> obviously it has a scope.

How about blk_mq_shared_quiesce_wait()? or any suggestion?


Thanks,
Ming
Sagi Grimberg Nov. 22, 2021, 1:55 p.m. UTC | #4
>>> Add helper of blk_mq_global_quiesce_wait() for supporting to quiesce
>>> queues in parallel, then we can just wait once if global quiesce wait
>>> is allowed.
>>
>> blk_mq_global_quiesce_wait() is a poor name... global is scope-less and
>> obviously it has a scope.
> 
> How about blk_mq_shared_quiesce_wait()? or any suggestion?

Shared between what?

Maybe if the queue has a non-blocking tagset, it can have a "quiesced"
flag that is cleared in unquiesce? then the callers can just continue
to iterate but will only wait the rcu grace period once.
Ming Lei Nov. 23, 2021, 12:17 a.m. UTC | #5
On Mon, Nov 22, 2021 at 03:55:36PM +0200, Sagi Grimberg wrote:
> 
> > > > Add helper of blk_mq_global_quiesce_wait() for supporting to quiesce
> > > > queues in parallel, then we can just wait once if global quiesce wait
> > > > is allowed.
> > > 
> > > blk_mq_global_quiesce_wait() is a poor name... global is scope-less and
> > > obviously it has a scope.
> > 
> > How about blk_mq_shared_quiesce_wait()? or any suggestion?
> 
> Shared between what?

All request queues in one host-wide, both scsi and nvme has such
requirement.

> 
> Maybe if the queue has a non-blocking tagset, it can have a "quiesced"
> flag that is cleared in unquiesce? then the callers can just continue
> to iterate but will only wait the rcu grace period once.

Yeah, that is what these patches try to implement.


Thanks
Ming
Sagi Grimberg Nov. 23, 2021, 9 a.m. UTC | #6
>>>>> Add helper of blk_mq_global_quiesce_wait() for supporting to quiesce
>>>>> queues in parallel, then we can just wait once if global quiesce wait
>>>>> is allowed.
>>>>
>>>> blk_mq_global_quiesce_wait() is a poor name... global is scope-less and
>>>> obviously it has a scope.
>>>
>>> How about blk_mq_shared_quiesce_wait()? or any suggestion?
>>
>> Shared between what?
> 
> All request queues in one host-wide, both scsi and nvme has such
> requirement.
> 
>>
>> Maybe if the queue has a non-blocking tagset, it can have a "quiesced"
>> flag that is cleared in unquiesce? then the callers can just continue
>> to iterate but will only wait the rcu grace period once.
> 
> Yeah, that is what these patches try to implement.

I was suggesting to "hide" it in the interface.
Maybe something like:
--
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8799fa73ef34..627b631db1f9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -263,14 +263,18 @@ void blk_mq_wait_quiesce_done(struct request_queue *q)
         unsigned int i;
         bool rcu = false;

+       if (!q->has_srcu && q->quiesced)
+               return;
         queue_for_each_hw_ctx(q, hctx, i) {
                 if (hctx->flags & BLK_MQ_F_BLOCKING)
                         synchronize_srcu(hctx->srcu);
                 else
                         rcu = true;
         }
-       if (rcu)
+       if (rcu) {
                 synchronize_rcu();
+               q->quiesced = true;
+       }
  }
  EXPORT_SYMBOL_GPL(blk_mq_wait_quiesce_done);

@@ -308,6 +312,7 @@ void blk_mq_unquiesce_queue(struct request_queue *q)
         } else if (!--q->quiesce_depth) {
                 blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q);
                 run_queue = true;
+               q->quiesced = false;
         }
         spin_unlock_irqrestore(&q->queue_lock, flags);
--
Ming Lei Nov. 30, 2021, 2:33 a.m. UTC | #7
On Tue, Nov 23, 2021 at 11:00:45AM +0200, Sagi Grimberg wrote:
> 
> > > > > > Add helper of blk_mq_global_quiesce_wait() for supporting to quiesce
> > > > > > queues in parallel, then we can just wait once if global quiesce wait
> > > > > > is allowed.
> > > > > 
> > > > > blk_mq_global_quiesce_wait() is a poor name... global is scope-less and
> > > > > obviously it has a scope.
> > > > 
> > > > How about blk_mq_shared_quiesce_wait()? or any suggestion?
> > > 
> > > Shared between what?
> > 
> > All request queues in one host-wide, both scsi and nvme has such
> > requirement.
> > 
> > > 
> > > Maybe if the queue has a non-blocking tagset, it can have a "quiesced"
> > > flag that is cleared in unquiesce? then the callers can just continue
> > > to iterate but will only wait the rcu grace period once.
> > 
> > Yeah, that is what these patches try to implement.
> 
> I was suggesting to "hide" it in the interface.
> Maybe something like:
> --
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 8799fa73ef34..627b631db1f9 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -263,14 +263,18 @@ void blk_mq_wait_quiesce_done(struct request_queue *q)
>         unsigned int i;
>         bool rcu = false;
> 
> +       if (!q->has_srcu && q->quiesced)
> +               return;
>         queue_for_each_hw_ctx(q, hctx, i) {
>                 if (hctx->flags & BLK_MQ_F_BLOCKING)
>                         synchronize_srcu(hctx->srcu);
>                 else
>                         rcu = true;
>         }
> -       if (rcu)
> +       if (rcu) {
>                 synchronize_rcu();
> +               q->quiesced = true;
> +       }
>  }
>  EXPORT_SYMBOL_GPL(blk_mq_wait_quiesce_done);
> 
> @@ -308,6 +312,7 @@ void blk_mq_unquiesce_queue(struct request_queue *q)
>         } else if (!--q->quiesce_depth) {
>                 blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q);
>                 run_queue = true;
> +               q->quiesced = false;

Different request queues are passed to blk_mq_wait_quiesce_done() during
the iteration, so marking 'quiesced' doesn't make any difference here.


Thanks,
Ming
Sagi Grimberg Dec. 8, 2021, 12:49 p.m. UTC | #8
On 11/30/21 4:33 AM, Ming Lei wrote:
> On Tue, Nov 23, 2021 at 11:00:45AM +0200, Sagi Grimberg wrote:
>>
>>>>>>> Add helper of blk_mq_global_quiesce_wait() for supporting to quiesce
>>>>>>> queues in parallel, then we can just wait once if global quiesce wait
>>>>>>> is allowed.
>>>>>>
>>>>>> blk_mq_global_quiesce_wait() is a poor name... global is scope-less and
>>>>>> obviously it has a scope.
>>>>>
>>>>> How about blk_mq_shared_quiesce_wait()? or any suggestion?
>>>>
>>>> Shared between what?
>>>
>>> All request queues in one host-wide, both scsi and nvme has such
>>> requirement.
>>>
>>>>
>>>> Maybe if the queue has a non-blocking tagset, it can have a "quiesced"
>>>> flag that is cleared in unquiesce? then the callers can just continue
>>>> to iterate but will only wait the rcu grace period once.
>>>
>>> Yeah, that is what these patches try to implement.
>>
>> I was suggesting to "hide" it in the interface.
>> Maybe something like:
>> --
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index 8799fa73ef34..627b631db1f9 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -263,14 +263,18 @@ void blk_mq_wait_quiesce_done(struct request_queue *q)
>>          unsigned int i;
>>          bool rcu = false;
>>
>> +       if (!q->has_srcu && q->quiesced)
>> +               return;
>>          queue_for_each_hw_ctx(q, hctx, i) {
>>                  if (hctx->flags & BLK_MQ_F_BLOCKING)
>>                          synchronize_srcu(hctx->srcu);
>>                  else
>>                          rcu = true;
>>          }
>> -       if (rcu)
>> +       if (rcu) {
>>                  synchronize_rcu();
>> +               q->quiesced = true;
>> +       }
>>   }
>>   EXPORT_SYMBOL_GPL(blk_mq_wait_quiesce_done);
>>
>> @@ -308,6 +312,7 @@ void blk_mq_unquiesce_queue(struct request_queue *q)
>>          } else if (!--q->quiesce_depth) {
>>                  blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q);
>>                  run_queue = true;
>> +               q->quiesced = false;
> 
> Different request queues are passed to blk_mq_wait_quiesce_done() during
> the iteration, so marking 'quiesced' doesn't make any difference here.

I actually meant q->tag_set->quiesced, such that the flag will be
used in the tag_set reference. This way this sharing will be kept
hidden from the callers.
Ming Lei Dec. 10, 2021, 2:02 a.m. UTC | #9
On Wed, Dec 08, 2021 at 02:49:25PM +0200, Sagi Grimberg wrote:
> 
> 
> On 11/30/21 4:33 AM, Ming Lei wrote:
> > On Tue, Nov 23, 2021 at 11:00:45AM +0200, Sagi Grimberg wrote:
> > > 
> > > > > > > > Add helper of blk_mq_global_quiesce_wait() for supporting to quiesce
> > > > > > > > queues in parallel, then we can just wait once if global quiesce wait
> > > > > > > > is allowed.
> > > > > > > 
> > > > > > > blk_mq_global_quiesce_wait() is a poor name... global is scope-less and
> > > > > > > obviously it has a scope.
> > > > > > 
> > > > > > How about blk_mq_shared_quiesce_wait()? or any suggestion?
> > > > > 
> > > > > Shared between what?
> > > > 
> > > > All request queues in one host-wide, both scsi and nvme has such
> > > > requirement.
> > > > 
> > > > > 
> > > > > Maybe if the queue has a non-blocking tagset, it can have a "quiesced"
> > > > > flag that is cleared in unquiesce? then the callers can just continue
> > > > > to iterate but will only wait the rcu grace period once.
> > > > 
> > > > Yeah, that is what these patches try to implement.
> > > 
> > > I was suggesting to "hide" it in the interface.
> > > Maybe something like:
> > > --
> > > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > > index 8799fa73ef34..627b631db1f9 100644
> > > --- a/block/blk-mq.c
> > > +++ b/block/blk-mq.c
> > > @@ -263,14 +263,18 @@ void blk_mq_wait_quiesce_done(struct request_queue *q)
> > >          unsigned int i;
> > >          bool rcu = false;
> > > 
> > > +       if (!q->has_srcu && q->quiesced)
> > > +               return;
> > >          queue_for_each_hw_ctx(q, hctx, i) {
> > >                  if (hctx->flags & BLK_MQ_F_BLOCKING)
> > >                          synchronize_srcu(hctx->srcu);
> > >                  else
> > >                          rcu = true;
> > >          }
> > > -       if (rcu)
> > > +       if (rcu) {
> > >                  synchronize_rcu();
> > > +               q->quiesced = true;
> > > +       }
> > >   }
> > >   EXPORT_SYMBOL_GPL(blk_mq_wait_quiesce_done);
> > > 
> > > @@ -308,6 +312,7 @@ void blk_mq_unquiesce_queue(struct request_queue *q)
> > >          } else if (!--q->quiesce_depth) {
> > >                  blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q);
> > >                  run_queue = true;
> > > +               q->quiesced = false;
> > 
> > Different request queues are passed to blk_mq_wait_quiesce_done() during
> > the iteration, so marking 'quiesced' doesn't make any difference here.
> 
> I actually meant q->tag_set->quiesced, such that the flag will be
> used in the tag_set reference. This way this sharing will be kept
> hidden from the callers.

That looks easy, but not true really from API viewpoint, q->tag_set has different
lifetime which is covered by driver, not mention other request queues may touch
q->tag_set->quiesced from other code path, so things will become much complicated,
also what is the meaning of quiesced state for tagset wide?

Here I prefer to philosophy of 'Worse is better'[1], and simplicity over perfection
by adding one simple helper of blk_mq_shared_quiesce_wait() for improving the case
very easily.

[1] ihttps://ipfs.fleek.co/ipfs/QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco/wiki/Unix_philosophy.html


Thanks,
Ming
diff mbox series

Patch

diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 5cc7fc1ea863..a9fecda2507e 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -777,6 +777,19 @@  static inline bool blk_mq_add_to_batch(struct request *req,
 	return true;
 }
 
+/*
+ * If the queue has allocated & used srcu to quiesce queue, quiesce wait is
+ * done via the synchronize_srcu(q->rcu), otherwise it is done via global
+ * synchronize_rcu().
+ *
+ * This helper can help us to support quiescing queue in parallel, so just
+ * one quiesce wait is enough if global quiesce wait is allowed.
+ */
+static inline bool blk_mq_global_quiesce_wait(struct request_queue *q)
+{
+	return !q->alloc_srcu;
+}
+
 void blk_mq_requeue_request(struct request *rq, bool kick_requeue_list);
 void blk_mq_kick_requeue_list(struct request_queue *q);
 void blk_mq_delay_kick_requeue_list(struct request_queue *q, unsigned long msecs);