Message ID | 51274C76.8080007@acm.org (mailing list archive)
---|---
State | Deferred, archived
Hello, Bart.

On Fri, Feb 22, 2013 at 11:46:14AM +0100, Bart Van Assche wrote:
> Some block drivers, e.g. dm and SCSI, need to trigger a queue
> run from inside functions that may be invoked by their request_fn()
> implementation. Make sure that invoking blk_run_queue() instead
> of blk_run_queue_async() from such functions does not trigger
> recursion. Making blk_run_queue() skip queue processing when
> invoked recursively is safe because the only two affected
> request_fn() implementations (dm and SCSI) guarantee that the
> request queue will be reexamined sooner or later before returning
> from their request_fn() implementation.
>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: James Bottomley <JBottomley@parallels.com>
> Cc: Alasdair G Kergon <agk@redhat.com>
> Cc: Mike Snitzer <snitzer@redhat.com>
> Cc: <stable@vger.kernel.org>
> ---
>  block/blk-core.c | 21 +++++++++++++--------
>  1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index c973249..cf26e3a 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -304,19 +304,24 @@ EXPORT_SYMBOL(blk_sync_queue);
>   * This variant runs the queue whether or not the queue has been
>   * stopped. Must be called with the queue lock held and interrupts
>   * disabled. See also @blk_run_queue.
> + *
> + * Note:
> + * Request handling functions are allowed to invoke __blk_run_queue() or
> + * blk_run_queue() directly or indirectly. This will not result in a
> + * recursive call of the request handler. However, such request handling
> + * functions must, before they return, either reexamine the request queue
> + * or invoke blk_delay_queue() to avoid that queue processing stops.
> + *
> + * Some request handler implementations, e.g. scsi_request_fn() and
> + * dm_request_fn(), unlock the queue lock internally. Returning immediately
> + * if q->request_fn_active > 0 avoids that for the same queue multiple
> + * threads execute the request handling function concurrently.
>   */
>  inline void __blk_run_queue_uncond(struct request_queue *q)
>  {
> -	if (unlikely(blk_queue_dead(q)))
> +	if (unlikely(blk_queue_dead(q) || q->request_fn_active))
>  		return;

Hmmm... I can't say I like this. Silently ignoring an explicit
blk_run_queue() call like this is likely to introduce nasty queue hang
bugs later on, even if it's okay for the current users, and there isn't
even a debugging aid to detect what's going on. If somebody invokes
blk_run_queue(), the block layer had better guarantee that the queue
will be run at least once afterwards no matter what, so please don't
do it this way.

Thanks.
On 02/22/13 19:14, Tejun Heo wrote:
> Hello, Bart.
>
> On Fri, Feb 22, 2013 at 11:46:14AM +0100, Bart Van Assche wrote:
>> [...]
>>  inline void __blk_run_queue_uncond(struct request_queue *q)
>>  {
>> -	if (unlikely(blk_queue_dead(q)))
>> +	if (unlikely(blk_queue_dead(q) || q->request_fn_active))
>>  		return;
>
> Hmmm... I can't say I like this. Silently ignoring an explicit
> blk_run_queue() call like this is likely to introduce nasty queue hang
> bugs later on, even if it's okay for the current users, and there isn't
> even a debugging aid to detect what's going on. If somebody invokes
> blk_run_queue(), the block layer had better guarantee that the queue
> will be run at least once afterwards no matter what, so please don't
> do it this way.

How about returning to an approach similar to the one introduced in
commit a538cd03 (April 2009)? This is how the last part of
__blk_run_queue() looked after that commit:

	/*
	 * Only recurse once to avoid overrunning the stack, let the unplug
	 * handling reinvoke the handler shortly if we already got there.
	 */
	if (!queue_flag_test_and_set(QUEUE_FLAG_REENTER, q)) {
		q->request_fn(q);
		queue_flag_clear(QUEUE_FLAG_REENTER, q);
	} else {
		queue_flag_set(QUEUE_FLAG_PLUGGED, q);
		kblockd_schedule_work(q, &q->unplug_work);
	}

Bart.

-- 
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
On Fri, Feb 22 2013, Bart Van Assche wrote:
> On 02/22/13 19:14, Tejun Heo wrote:
> > [...]
> > Hmmm... I can't say I like this. Silently ignoring an explicit
> > blk_run_queue() call like this is likely to introduce nasty queue hang
> > bugs later on, even if it's okay for the current users, and there isn't
> > even a debugging aid to detect what's going on. If somebody invokes
> > blk_run_queue(), the block layer had better guarantee that the queue
> > will be run at least once afterwards no matter what, so please don't
> > do it this way.
>
> How about returning to an approach similar to the one introduced in
> commit a538cd03 (April 2009)? This is how the last part of
> __blk_run_queue() looked after that commit:
>
> 	/*
> 	 * Only recurse once to avoid overrunning the stack, let the unplug
> 	 * handling reinvoke the handler shortly if we already got there.
> 	 */
> 	if (!queue_flag_test_and_set(QUEUE_FLAG_REENTER, q)) {
> 		q->request_fn(q);
> 		queue_flag_clear(QUEUE_FLAG_REENTER, q);
> 	} else {
> 		queue_flag_set(QUEUE_FLAG_PLUGGED, q);
> 		kblockd_schedule_work(q, &q->unplug_work);
> 	}

That'd be fine. I agree with Tejun that just returning is error prone
and could cause hard-to-debug hangs or stalls. So something ala

	if (blk_queue_dead(q))
		return;
	else if (q->request_fn_active) {
		blk_delay_queue(q, 0);
		return;
	}

would work. Alternatively, you could mark the queue as needing a re-run,
so any existing runner of it would notice and clear this flag and re-run
the queue. But that's somewhat more fragile, and since it isn't a
hot/performance path, I suspect the simple "just re-run me soon" is good
enough.
On 02/22/13 20:01, Jens Axboe wrote:
> That'd be fine. I agree with Tejun that just returning is error prone
> and could cause hard-to-debug hangs or stalls. So something ala
>
> 	if (blk_queue_dead(q))
> 		return;
> 	else if (q->request_fn_active) {
> 		blk_delay_queue(q, 0);
> 		return;
> 	}
>
> would work. Alternatively, you could mark the queue as needing a re-run,
> so any existing runner of it would notice and clear this flag and re-run
> the queue. But that's somewhat more fragile, and since it isn't a
> hot/performance path, I suspect the simple "just re-run me soon" is good
> enough.

I'm worried that in the device mapper the approach using
blk_delay_queue() could trigger a queue run after the final dm_put().
And I'm also afraid that that approach could trigger several needless
context switches before q->request_fn_active finally drops to zero. So
the approach with the "need re-run" flag seems preferable to me. But
maybe I'm overlooking something?

Bart.
diff --git a/block/blk-core.c b/block/blk-core.c
index c973249..cf26e3a 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -304,19 +304,24 @@ EXPORT_SYMBOL(blk_sync_queue);
  * This variant runs the queue whether or not the queue has been
  * stopped. Must be called with the queue lock held and interrupts
  * disabled. See also @blk_run_queue.
+ *
+ * Note:
+ * Request handling functions are allowed to invoke __blk_run_queue() or
+ * blk_run_queue() directly or indirectly. This will not result in a
+ * recursive call of the request handler. However, such request handling
+ * functions must, before they return, either reexamine the request queue
+ * or invoke blk_delay_queue() to avoid that queue processing stops.
+ *
+ * Some request handler implementations, e.g. scsi_request_fn() and
+ * dm_request_fn(), unlock the queue lock internally. Returning immediately
+ * if q->request_fn_active > 0 avoids that for the same queue multiple
+ * threads execute the request handling function concurrently.
  */
 inline void __blk_run_queue_uncond(struct request_queue *q)
 {
-	if (unlikely(blk_queue_dead(q)))
+	if (unlikely(blk_queue_dead(q) || q->request_fn_active))
 		return;
-	/*
-	 * Some request_fn implementations, e.g. scsi_request_fn(), unlock
-	 * the queue lock internally. As a result multiple threads may be
-	 * running such a request function concurrently. Keep track of the
-	 * number of active request_fn invocations such that blk_drain_queue()
-	 * can wait until all these request_fn calls have finished.
-	 */
 	q->request_fn_active++;
 	q->request_fn(q);
 	q->request_fn_active--;
Some block drivers, e.g. dm and SCSI, need to trigger a queue run from
inside functions that may be invoked by their request_fn()
implementation. Make sure that invoking blk_run_queue() instead of
blk_run_queue_async() from such functions does not trigger recursion.
Making blk_run_queue() skip queue processing when invoked recursively
is safe because the only two affected request_fn() implementations (dm
and SCSI) guarantee that the request queue will be reexamined sooner or
later before returning from their request_fn() implementation.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: James Bottomley <JBottomley@parallels.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: <stable@vger.kernel.org>
---
 block/blk-core.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)