
bfq-mq: cause deadlock by executing exit_icq body immediately

Message ID 20170208171713.GA7811@vader (mailing list archive)
State New, archived

Commit Message

Omar Sandoval Feb. 8, 2017, 5:17 p.m. UTC
On Wed, Feb 08, 2017 at 11:39:24AM +0100, Paolo Valente wrote:
> 
> > On 08 Feb 2017, at 11:33, Omar Sandoval <osandov@osandov.com> wrote:
> > 
> > On Wed, Feb 08, 2017 at 11:03:01AM +0100, Paolo Valente wrote:
> >> 
> >>> On 07 Feb 2017, at 22:45, Omar Sandoval <osandov@osandov.com> wrote:
> >>> 
> >>> On Tue, Feb 07, 2017 at 06:33:46PM +0100, Paolo Valente wrote:
> >>>> Hi,
> >>>> this patch is meant to show that, if the body of the hook exit_icq is executed
> >>>> from inside that hook, and not as deferred work, then a circular deadlock
> >>>> occurs.
> >>>> 
> >>>> It happens if, on a CPU
> >>>> - the body of exit_icq takes the scheduler lock,
> >>>> - it does so from inside the exit_icq hook, which is invoked with the queue
> >>>> lock held
> >>>> 
> >>>> while, on another CPU
> >>>> - bfq_bio_merge, after taking the scheduler lock, invokes bfq_bic_lookup,
> >>>> which, in turn, takes the queue lock. bfq_bic_lookup needs to take such a
> >>>> lock, because it invokes ioc_lookup_icq.
> >>>> 
> >>>> For more details, here is a lockdep report taken right before the deadlock occurred.
> >>>> 
> >>>> [   44.059877] ======================================================
> >>>> [   44.124922] [ INFO: possible circular locking dependency detected ]
> >>>> [   44.125795] 4.10.0-rc5-bfq-mq+ #38 Not tainted
> >>>> [   44.126414] -------------------------------------------------------
> >>>> [   44.127291] sync/2043 is trying to acquire lock:
> >>>> [   44.128918]  (&(&bfqd->lock)->rlock){-.-...}, at: [<ffffffff90484195>] bfq_exit_icq_bfqq+0x55/0x140
> >>>> [   44.134052]
> >>>> [   44.134052] but task is already holding lock:
> >>>> [   44.134868]  (&(&q->__queue_lock)->rlock){-.....}, at: [<ffffffff9044738e>] put_io_context_active+0x6e/0xc0
> >>> 
> >>> Hey, Paolo,
> >>> 
> >>> I only briefly skimmed the code, but what are you using the queue_lock
> >>> for? You should just use your scheduler lock everywhere. blk-mq doesn't
> >>> use the queue lock, so the scheduler is the only thing you need mutual
> >>> exclusion against.
> >> 
> >> Hi Omar,
> >> the cause of the problem is that the hook functions bfq_request_merge
> >> and bfq_allow_bio_merge invoke, directly or through other functions,
> >> the function bfq_bic_lookup, which, in turn, invokes
> >> ioc_lookup_icq.  The latter must be invoked with the queue lock held.
> >> In particular the offending lines in bfq_bic_lookup are:
> >> 
> >> 		spin_lock_irqsave(q->queue_lock, flags);
> >> 		icq = icq_to_bic(ioc_lookup_icq(ioc, q));
> >> 		spin_unlock_irqrestore(q->queue_lock, flags);
> >> 
> >> Maybe I'm missing something and we can avoid taking this lock?
> > 
> > Ah, I didn't realize we still used the q->queue_lock for the icq stuff.
> > You're right, you still need that lock for ioc_lookup_icq(). Unless
> > there's something else I'm forgetting, that should be the only thing you
> > need it for in the core code, and you should use your scheduler lock for
> > everything else. What else are you using q->queue_lock for? 
> 
> Nothing.  The deadlock follows from the fact that bfq_request_merge gets called
> with the scheduler lock already held.  Problematic paths start from:
> bfq_bio_merge and bfq_insert_request.
> 
> I'm trying to understand whether I/we can reorder operations in some
> way that avoids the nested locking, but to no avail so far.
> 
> Thanks,
> Paolo

Okay, I understand what you're saying now. It was all in the first email
but I didn't see it right away, sorry about that.

I don't think it makes sense for ->exit_icq() to be invoked while
holding q->queue_lock for blk-mq -- we don't hold that lock for any of
the other hooks. Could you try the below? I haven't convinced myself
that there isn't a circular locking dependency between bfqd->lock and
ioc->lock now, but it's probably easiest for you to just try it.

Comments

Paolo Valente Feb. 10, 2017, 1 p.m. UTC | #1
> On 08 Feb 2017, at 18:17, Omar Sandoval <osandov@osandov.com> wrote:
> 
> On Wed, Feb 08, 2017 at 11:39:24AM +0100, Paolo Valente wrote:
>> 
>>> On 08 Feb 2017, at 11:33, Omar Sandoval <osandov@osandov.com> wrote:
>>> 
>>> On Wed, Feb 08, 2017 at 11:03:01AM +0100, Paolo Valente wrote:
>>>> 
>>>>> On 07 Feb 2017, at 22:45, Omar Sandoval <osandov@osandov.com> wrote:
>>>>> 
>>>>> On Tue, Feb 07, 2017 at 06:33:46PM +0100, Paolo Valente wrote:
>>>>>> Hi,
>>>>>> this patch is meant to show that, if the body of the hook exit_icq is executed
>>>>>> from inside that hook, and not as deferred work, then a circular deadlock
>>>>>> occurs.
>>>>>> 
>>>>>> It happens if, on a CPU
>>>>>> - the body of exit_icq takes the scheduler lock,
>>>>>> - it does so from inside the exit_icq hook, which is invoked with the queue
>>>>>> lock held
>>>>>> 
>>>>>> while, on another CPU
>>>>>> - bfq_bio_merge, after taking the scheduler lock, invokes bfq_bic_lookup,
>>>>>> which, in turn, takes the queue lock. bfq_bic_lookup needs to take such a
>>>>>> lock, because it invokes ioc_lookup_icq.
>>>>>> 
>>>>>> For more details, here is a lockdep report taken right before the deadlock occurred.
>>>>>> 
>>>>>> [   44.059877] ======================================================
>>>>>> [   44.124922] [ INFO: possible circular locking dependency detected ]
>>>>>> [   44.125795] 4.10.0-rc5-bfq-mq+ #38 Not tainted
>>>>>> [   44.126414] -------------------------------------------------------
>>>>>> [   44.127291] sync/2043 is trying to acquire lock:
>>>>>> [   44.128918]  (&(&bfqd->lock)->rlock){-.-...}, at: [<ffffffff90484195>] bfq_exit_icq_bfqq+0x55/0x140
>>>>>> [   44.134052]
>>>>>> [   44.134052] but task is already holding lock:
>>>>>> [   44.134868]  (&(&q->__queue_lock)->rlock){-.....}, at: [<ffffffff9044738e>] put_io_context_active+0x6e/0xc0
>>>>> 
>>>>> Hey, Paolo,
>>>>> 
>>>>> I only briefly skimmed the code, but what are you using the queue_lock
>>>>> for? You should just use your scheduler lock everywhere. blk-mq doesn't
>>>>> use the queue lock, so the scheduler is the only thing you need mutual
>>>>> exclusion against.
>>>> 
>>>> Hi Omar,
>>>> the cause of the problem is that the hook functions bfq_request_merge
>>>> and bfq_allow_bio_merge invoke, directly or through other functions,
>>>> the function bfq_bic_lookup, which, in turn, invokes
>>>> ioc_lookup_icq.  The latter must be invoked with the queue lock held.
>>>> In particular the offending lines in bfq_bic_lookup are:
>>>> 
>>>> 		spin_lock_irqsave(q->queue_lock, flags);
>>>> 		icq = icq_to_bic(ioc_lookup_icq(ioc, q));
>>>> 		spin_unlock_irqrestore(q->queue_lock, flags);
>>>> 
>>>> Maybe I'm missing something and we can avoid taking this lock?
>>> 
>>> Ah, I didn't realize we still used the q->queue_lock for the icq stuff.
>>> You're right, you still need that lock for ioc_lookup_icq(). Unless
>>> there's something else I'm forgetting, that should be the only thing you
>>> need it for in the core code, and you should use your scheduler lock for
>>> everything else. What else are you using q->queue_lock for? 
>> 
>> Nothing.  The deadlock follows from the fact that bfq_request_merge gets called
>> with the scheduler lock already held.  Problematic paths start from:
>> bfq_bio_merge and bfq_insert_request.
>> 
>> I'm trying to understand whether I/we can reorder operations in some
>> way that avoids the nested locking, but to no avail so far.
>> 
>> Thanks,
>> Paolo
> 
> Okay, I understand what you're saying now. It was all in the first email
> but I didn't see it right away, sorry about that.
> 
> I don't think it makes sense for ->exit_icq() to be invoked while
> holding q->queue_lock for blk-mq -- we don't hold that lock for any of
> the other hooks. Could you try the below? I haven't convinced myself
> that there isn't a circular locking dependency between bfqd->lock and
> ioc->lock now, but it's probably easiest for you to just try it.
> 

Just passed the last of a heavy batch of tests!

I have updated the bfq-mq branch [1], adding a temporary commit that
contains your diffs (while waiting for your final patch or the like).

Looking forward to some feedback on the other locking issue I have raised,
and to some review of the current version of bfq-mq, before proceeding with
cgroups support.

Thanks,
Paolo

[1] https://github.com/Algodev-github/bfq-mq

> diff --git a/block/blk-ioc.c b/block/blk-ioc.c
> index fe186a9eade9..b12f9c87b4c3 100644
> --- a/block/blk-ioc.c
> +++ b/block/blk-ioc.c
> @@ -35,7 +35,10 @@ static void icq_free_icq_rcu(struct rcu_head *head)
> 	kmem_cache_free(icq->__rcu_icq_cache, icq);
> }
> 
> -/* Exit an icq. Called with both ioc and q locked. */
> +/*
> + * Exit an icq. Called with both ioc and q locked for sq, only ioc locked for
> + * mq.
> + */
> static void ioc_exit_icq(struct io_cq *icq)
> {
> 	struct elevator_type *et = icq->q->elevator->type;
> @@ -166,6 +169,7 @@ EXPORT_SYMBOL(put_io_context);
>  */
> void put_io_context_active(struct io_context *ioc)
> {
> +	struct elevator_type *et;
> 	unsigned long flags;
> 	struct io_cq *icq;
> 
> @@ -184,13 +188,19 @@ void put_io_context_active(struct io_context *ioc)
> 	hlist_for_each_entry(icq, &ioc->icq_list, ioc_node) {
> 		if (icq->flags & ICQ_EXITED)
> 			continue;
> -		if (spin_trylock(icq->q->queue_lock)) {
> +
> +		et = icq->q->elevator->type;
> +		if (et->uses_mq) {
> 			ioc_exit_icq(icq);
> -			spin_unlock(icq->q->queue_lock);
> 		} else {
> -			spin_unlock_irqrestore(&ioc->lock, flags);
> -			cpu_relax();
> -			goto retry;
> +			if (spin_trylock(icq->q->queue_lock)) {
> +				ioc_exit_icq(icq);
> +				spin_unlock(icq->q->queue_lock);
> +			} else {
> +				spin_unlock_irqrestore(&ioc->lock, flags);
> +				cpu_relax();
> +				goto retry;
> +			}
> 		}
> 	}
> 	spin_unlock_irqrestore(&ioc->lock, flags);
Jens Axboe Feb. 10, 2017, 4:09 p.m. UTC | #2
On 02/10/2017 06:00 AM, Paolo Valente wrote:
> 
>> On 08 Feb 2017, at 18:17, Omar Sandoval <osandov@osandov.com> wrote:
>>
>> On Wed, Feb 08, 2017 at 11:39:24AM +0100, Paolo Valente wrote:
>>>
>>>> On 08 Feb 2017, at 11:33, Omar Sandoval <osandov@osandov.com> wrote:
>>>>
>>>> On Wed, Feb 08, 2017 at 11:03:01AM +0100, Paolo Valente wrote:
>>>>>
>>>>>> On 07 Feb 2017, at 22:45, Omar Sandoval <osandov@osandov.com> wrote:
>>>>>>
>>>>>> On Tue, Feb 07, 2017 at 06:33:46PM +0100, Paolo Valente wrote:
>>>>>>> Hi,
>>>>>>> this patch is meant to show that, if the body of the hook exit_icq is executed
>>>>>>> from inside that hook, and not as deferred work, then a circular deadlock
>>>>>>> occurs.
>>>>>>>
>>>>>>> It happens if, on a CPU
>>>>>>> - the body of exit_icq takes the scheduler lock,
>>>>>>> - it does so from inside the exit_icq hook, which is invoked with the queue
>>>>>>> lock held
>>>>>>>
>>>>>>> while, on another CPU
>>>>>>> - bfq_bio_merge, after taking the scheduler lock, invokes bfq_bic_lookup,
>>>>>>> which, in turn, takes the queue lock. bfq_bic_lookup needs to take such a
>>>>>>> lock, because it invokes ioc_lookup_icq.
>>>>>>>
>>>>>>> For more details, here is a lockdep report taken right before the deadlock occurred.
>>>>>>>
>>>>>>> [   44.059877] ======================================================
>>>>>>> [   44.124922] [ INFO: possible circular locking dependency detected ]
>>>>>>> [   44.125795] 4.10.0-rc5-bfq-mq+ #38 Not tainted
>>>>>>> [   44.126414] -------------------------------------------------------
>>>>>>> [   44.127291] sync/2043 is trying to acquire lock:
>>>>>>> [   44.128918]  (&(&bfqd->lock)->rlock){-.-...}, at: [<ffffffff90484195>] bfq_exit_icq_bfqq+0x55/0x140
>>>>>>> [   44.134052]
>>>>>>> [   44.134052] but task is already holding lock:
>>>>>>> [   44.134868]  (&(&q->__queue_lock)->rlock){-.....}, at: [<ffffffff9044738e>] put_io_context_active+0x6e/0xc0
>>>>>>
>>>>>> Hey, Paolo,
>>>>>>
>>>>>> I only briefly skimmed the code, but what are you using the queue_lock
>>>>>> for? You should just use your scheduler lock everywhere. blk-mq doesn't
>>>>>> use the queue lock, so the scheduler is the only thing you need mutual
>>>>>> exclusion against.
>>>>>
>>>>> Hi Omar,
>>>>> the cause of the problem is that the hook functions bfq_request_merge
>>>>> and bfq_allow_bio_merge invoke, directly or through other functions,
>>>>> the function bfq_bic_lookup, which, in turn, invokes
>>>>> ioc_lookup_icq.  The latter must be invoked with the queue lock held.
>>>>> In particular the offending lines in bfq_bic_lookup are:
>>>>>
>>>>> 		spin_lock_irqsave(q->queue_lock, flags);
>>>>> 		icq = icq_to_bic(ioc_lookup_icq(ioc, q));
>>>>> 		spin_unlock_irqrestore(q->queue_lock, flags);
>>>>>
>>>>> Maybe I'm missing something and we can avoid taking this lock?
>>>>
>>>> Ah, I didn't realize we still used the q->queue_lock for the icq stuff.
>>>> You're right, you still need that lock for ioc_lookup_icq(). Unless
>>>> there's something else I'm forgetting, that should be the only thing you
>>>> need it for in the core code, and you should use your scheduler lock for
>>>> everything else. What else are you using q->queue_lock for? 
>>>
>>> Nothing.  The deadlock follows from the fact that bfq_request_merge gets called
>>> with the scheduler lock already held.  Problematic paths start from:
>>> bfq_bio_merge and bfq_insert_request.
>>>
>>> I'm trying to understand whether I/we can reorder operations in some
>>> way that avoids the nested locking, but to no avail so far.
>>>
>>> Thanks,
>>> Paolo
>>
>> Okay, I understand what you're saying now. It was all in the first email
>> but I didn't see it right away, sorry about that.
>>
>> I don't think it makes sense for ->exit_icq() to be invoked while
>> holding q->queue_lock for blk-mq -- we don't hold that lock for any of
>> the other hooks. Could you try the below? I haven't convinced myself
>> that there isn't a circular locking dependency between bfqd->lock and
>> ioc->lock now, but it's probably easiest for you to just try it.
>>
> 
> Just passed the last of a heavy batch of tests!

Omar, care to turn this into a real patch and submit it?

Patch

diff --git a/block/blk-ioc.c b/block/blk-ioc.c
index fe186a9eade9..b12f9c87b4c3 100644
--- a/block/blk-ioc.c
+++ b/block/blk-ioc.c
@@ -35,7 +35,10 @@  static void icq_free_icq_rcu(struct rcu_head *head)
 	kmem_cache_free(icq->__rcu_icq_cache, icq);
 }
 
-/* Exit an icq. Called with both ioc and q locked. */
+/*
+ * Exit an icq. Called with both ioc and q locked for sq, only ioc locked for
+ * mq.
+ */
 static void ioc_exit_icq(struct io_cq *icq)
 {
 	struct elevator_type *et = icq->q->elevator->type;
@@ -166,6 +169,7 @@  EXPORT_SYMBOL(put_io_context);
  */
 void put_io_context_active(struct io_context *ioc)
 {
+	struct elevator_type *et;
 	unsigned long flags;
 	struct io_cq *icq;
 
@@ -184,13 +188,19 @@  void put_io_context_active(struct io_context *ioc)
 	hlist_for_each_entry(icq, &ioc->icq_list, ioc_node) {
 		if (icq->flags & ICQ_EXITED)
 			continue;
-		if (spin_trylock(icq->q->queue_lock)) {
+
+		et = icq->q->elevator->type;
+		if (et->uses_mq) {
 			ioc_exit_icq(icq);
-			spin_unlock(icq->q->queue_lock);
 		} else {
-			spin_unlock_irqrestore(&ioc->lock, flags);
-			cpu_relax();
-			goto retry;
+			if (spin_trylock(icq->q->queue_lock)) {
+				ioc_exit_icq(icq);
+				spin_unlock(icq->q->queue_lock);
+			} else {
+				spin_unlock_irqrestore(&ioc->lock, flags);
+				cpu_relax();
+				goto retry;
+			}
 		}
 	}
 	spin_unlock_irqrestore(&ioc->lock, flags);
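
For context, once ->exit_icq() is no longer called under q->queue_lock for
blk-mq, the body of the hook can run immediately and take the scheduler lock
itself, instead of being deferred to a work item. The sketch below shows
roughly what the bfq-mq side could then look like; icq_to_bic() appears earlier
in the thread, while bic_to_bfqq(), bfq_exit_bfqq() and the bfqq->bfqd field are
assumed helpers and may differ from the actual bfq-mq code:

static void bfq_exit_icq_bfqq(struct bfq_io_cq *bic, bool is_sync)
{
	struct bfq_queue *bfqq = bic_to_bfqq(bic, is_sync);	/* assumed helper */

	if (bfqq) {
		struct bfq_data *bfqd = bfqq->bfqd;		/* assumed field */
		unsigned long flags;

		/*
		 * Taking the scheduler lock here is safe for blk-mq after
		 * the change above: the exit_icq hook is no longer invoked
		 * with q->queue_lock held, so this path can no longer
		 * produce the queue_lock -> bfqd->lock ordering from the
		 * lockdep report.
		 */
		spin_lock_irqsave(&bfqd->lock, flags);
		bfq_exit_bfqq(bfqd, bfqq);			/* assumed helper */
		spin_unlock_irqrestore(&bfqd->lock, flags);
	}
}

static void bfq_exit_icq(struct io_cq *icq)
{
	struct bfq_io_cq *bic = icq_to_bic(icq);

	bfq_exit_icq_bfqq(bic, true);	/* sync queue  */
	bfq_exit_icq_bfqq(bic, false);	/* async queue */
}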