Message ID: 80de4554-a14b-4ce6-7c6e-ec66f70b14b1@fb.com (mailing list archive)
State: New, archived
> On 2 Mar 2017, at 16:13, Jens Axboe <axboe@fb.com> wrote:
>
> On 03/02/2017 08:00 AM, Jens Axboe wrote:
>> On 03/02/2017 03:28 AM, Paolo Valente wrote:
>>>
>>>> On 15 Feb 2017, at 19:04, Jens Axboe <axboe@fb.com> wrote:
>>>>
>>>> On 02/15/2017 10:58 AM, Jens Axboe wrote:
>>>>> On 02/15/2017 10:24 AM, Paolo Valente wrote:
>>>>>>
>>>>>>> On 10 Feb 2017, at 19:32, Omar Sandoval <osandov@osandov.com> wrote:
>>>>>>>
>>>>>>> From: Omar Sandoval <osandov@fb.com>
>>>>>>>
>>>>>>> None of the other blk-mq elevator hooks are called with this lock held.
>>>>>>> Additionally, it can lead to circular locking dependencies between
>>>>>>> queue_lock and the private scheduler lock.
>>>>>>
>>>>>> Hi Omar,
>>>>>> I'm sorry, but it seems that a new potential deadlock has shown up.
>>>>>> See the lockdep splat below.
>>>>>>
>>>>>> I've tried to think of solutions other than going back to
>>>>>> deferring the body of exit_icq, but to no avail.
>>>>>
>>>>> Looks like an interaction between bfqd->lock and q->queue_lock. Since the
>>>>> core has no notion of your bfqd->lock, the natural dependency here
>>>>> would be to nest bfqd->lock inside q->queue_lock. Is that possible for
>>>>> you?
>>>>>
>>>>> Looking at the code a bit, maybe it'd just be simpler to get rid of
>>>>> holding the queue lock for that spot. For the mq scheduler, we really
>>>>> don't want places where we are invoked with that held already. Does the
>>>>> below work for you?
>>>>
>>>> Would need to remove one more lockdep assert. And only test this for
>>>> the mq parts; we'd need to spread a bit of love on the classic
>>>> scheduling icq exit path for this to work on that side.
>>>
>>> Jens,
>>> here is the reply I anticipated in my previous email: after rebasing
>>> against master, I'm getting again the deadlock that this patch of
>>> yours solved (together with some changes in bfq-mq). I thought you had
>>> by now added a sort of equivalent commit to the mainline branch. Am I wrong?
>>
>> The patch I posted was never pulled to completion; it wasn't clear
>> to me whether it fixed your issue or not. Maybe I missed a reply on
>> that?
>>
>> Let me take another stab at it today; I'll send you a version to test
>> on top of my for-linus branch.
>
> I took a look at my last posting, and I think it was only missing a
> lock grab in cfq, since we don't hold that for icq exit anymore. Paolo,
> it'd be great if you could verify that this works. Not for bfq-mq (we
> already know it does), but for CFQ.

No luck, sorry :(

It looks like we simply have to not take q->queue_lock in cfq_exit_icq.

[    9.329501] =============================================
[    9.336132] [ INFO: possible recursive locking detected ]
[    9.343801] 4.10.0-bfq-mq+ #56 Not tainted
[    9.353359] ---------------------------------------------
[    9.354101] ata_id/160 is trying to acquire lock:
[    9.354750]  (&(&q->__queue_lock)->rlock){..-...}, at: [<ffffffffaf48501a>] cfq_exit_icq+0x2a/0x80
[    9.355986]
[    9.355986] but task is already holding lock:
[    9.364221]  (&(&q->__queue_lock)->rlock){..-...}, at: [<ffffffffaf45a686>] put_io_context_active+0x86/0xe0
[    9.382980]
[    9.382980] other info that might help us debug this:
[    9.386460]  Possible unsafe locking scenario:
[    9.386460]
[    9.397070]        CPU0
[    9.397468]        ----
[    9.397811]   lock(&(&q->__queue_lock)->rlock);
[    9.398440]   lock(&(&q->__queue_lock)->rlock);
[    9.406265]
[    9.406265]  *** DEADLOCK ***
[    9.406265]
[    9.409589]  May be due to missing lock nesting notation
[    9.409589]
[    9.413040] 2 locks held by ata_id/160:
[    9.413576]  #0:  (&(&ioc->lock)->rlock/1){......}, at: [<ffffffffaf45a638>] put_io_context_active+0x38/0xe0
[    9.418069]  #1:  (&(&q->__queue_lock)->rlock){..-...}, at: [<ffffffffaf45a686>] put_io_context_active+0x86/0xe0
[    9.441118]
[    9.441118] stack backtrace:
[    9.445113] CPU: 0 PID: 160 Comm: ata_id Not tainted 4.10.0-bfq-mq+ #56
[    9.446210] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[    9.457223] Call Trace:
[    9.457636]  dump_stack+0x85/0xc2
[    9.458183]  __lock_acquire+0x1834/0x1890
[    9.458839]  ? __lock_acquire+0x4af/0x1890
[    9.464206]  lock_acquire+0x11b/0x220
[    9.471703]  ? cfq_exit_icq+0x2a/0x80
[    9.472225]  _raw_spin_lock+0x3d/0x80
[    9.472784]  ? cfq_exit_icq+0x2a/0x80
[    9.476336]  cfq_exit_icq+0x2a/0x80
[    9.486750]  ioc_exit_icq+0x38/0x50
[    9.487245]  put_io_context_active+0x92/0xe0
[    9.488049]  exit_io_context+0x48/0x50
[    9.488423]  do_exit+0x7a8/0xc80
[    9.488797]  do_group_exit+0x50/0xd0
[    9.496655]  SyS_exit_group+0x14/0x20
[    9.497170]  entry_SYSCALL_64_fastpath+0x23/0xc6
[    9.497808] RIP: 0033:0x7f224d1c1b98
[    9.498302] RSP: 002b:00007ffd342666f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[    9.499360] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f224d1c1b98
[    9.511706] RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
[    9.516390] RBP: 00007f224db02690 R08: 00000000000000e7 R09: ffffffffffffff90
[    9.523044] R10: 00007f224d6d7250 R11: 0000000000000246 R12: 0000000000000016
[    9.524015] R13: 0000000000000001 R14: 000055938a1c3010 R15: 00007ffd34266170

Thanks,
Paolo

> Run with lockdep enabled and see if
> it spits out anything. We'll hit this exit path very quickly, so the
> test doesn't need to be anything exhaustive.
>
> diff --git a/block/blk-ioc.c b/block/blk-ioc.c
> index b12f9c87b4c3..546ff8f81ede 100644
> --- a/block/blk-ioc.c
> +++ b/block/blk-ioc.c
> @@ -54,7 +54,7 @@ static void ioc_exit_icq(struct io_cq *icq)
>  	icq->flags |= ICQ_EXITED;
>  }
>  
> -/* Release an icq.  Called with both ioc and q locked. */
> +/* Release an icq.  Called with ioc locked. */
>  static void ioc_destroy_icq(struct io_cq *icq)
>  {
>  	struct io_context *ioc = icq->ioc;
> @@ -62,7 +62,6 @@ static void ioc_destroy_icq(struct io_cq *icq)
>  	struct elevator_type *et = q->elevator->type;
>  
>  	lockdep_assert_held(&ioc->lock);
> -	lockdep_assert_held(q->queue_lock);
>  
>  	radix_tree_delete(&ioc->icq_tree, icq->q->id);
>  	hlist_del_init(&icq->ioc_node);
> @@ -222,25 +221,34 @@ void exit_io_context(struct task_struct *task)
>  	put_io_context_active(ioc);
>  }
>  
> +static void __ioc_clear_queue(struct list_head *icq_list)
> +{
> +	while (!list_empty(icq_list)) {
> +		struct io_cq *icq = list_entry(icq_list->next,
> +					       struct io_cq, q_node);
> +		struct io_context *ioc = icq->ioc;
> +
> +		spin_lock_irq(&ioc->lock);
> +		ioc_destroy_icq(icq);
> +		spin_unlock_irq(&ioc->lock);
> +	}
> +}
> +
>  /**
>   * ioc_clear_queue - break any ioc association with the specified queue
>   * @q: request_queue being cleared
>   *
> - * Walk @q->icq_list and exit all io_cq's.  Must be called with @q locked.
> + * Walk @q->icq_list and exit all io_cq's.
>   */
>  void ioc_clear_queue(struct request_queue *q)
>  {
> -	lockdep_assert_held(q->queue_lock);
> +	LIST_HEAD(icq_list);
>  
> -	while (!list_empty(&q->icq_list)) {
> -		struct io_cq *icq = list_entry(q->icq_list.next,
> -					       struct io_cq, q_node);
> -		struct io_context *ioc = icq->ioc;
> +	spin_lock_irq(q->queue_lock);
> +	list_splice_init(&q->icq_list, &icq_list);
> +	spin_unlock_irq(q->queue_lock);
>  
> -		spin_lock(&ioc->lock);
> -		ioc_destroy_icq(icq);
> -		spin_unlock(&ioc->lock);
> -	}
> +	__ioc_clear_queue(&icq_list);
>  }
>  
>  int create_task_io_context(struct task_struct *task, gfp_t gfp_flags, int node)
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index 002af836aa87..c44b321335f3 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -815,9 +815,7 @@ static void blk_release_queue(struct kobject *kobj)
>  	blkcg_exit_queue(q);
>  
>  	if (q->elevator) {
> -		spin_lock_irq(q->queue_lock);
>  		ioc_clear_queue(q);
> -		spin_unlock_irq(q->queue_lock);
>  		elevator_exit(q->elevator);
>  	}
>  
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index 137944777859..4a71c79de037 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -3658,6 +3658,8 @@ static void cfq_exit_icq(struct io_cq *icq)
>  	struct cfq_io_cq *cic = icq_to_cic(icq);
>  	struct cfq_data *cfqd = cic_to_cfqd(cic);
>  
> +	spin_lock(cfqd->queue->queue_lock);
> +
>  	if (cic_to_cfqq(cic, false)) {
>  		cfq_exit_cfqq(cfqd, cic_to_cfqq(cic, false));
>  		cic_set_cfqq(cic, NULL, false);
> @@ -3667,6 +3669,8 @@ static void cfq_exit_icq(struct io_cq *icq)
>  		cfq_exit_cfqq(cfqd, cic_to_cfqq(cic, true));
>  		cic_set_cfqq(cic, NULL, true);
>  	}
> +
> +	spin_unlock(cfqd->queue->queue_lock);
>  }
>  
>  static void cfq_init_prio_data(struct cfq_queue *cfqq, struct cfq_io_cq *cic)
> diff --git a/block/elevator.c b/block/elevator.c
> index ac1c9f481a98..01139f549b5b 100644
> --- a/block/elevator.c
> +++ b/block/elevator.c
> @@ -983,9 +983,7 @@ static int elevator_switch(struct request_queue *q, struct elevator_type *new_e)
>  		if (old_registered)
>  			elv_unregister_queue(q);
>  
> -		spin_lock_irq(q->queue_lock);
>  		ioc_clear_queue(q);
> -		spin_unlock_irq(q->queue_lock);
>  	}
>  
>  	/* allocate, init and register new elevator */
>
> --
> Jens Axboe
> On 2 Mar 2017, at 17:13, Jens Axboe <axboe@fb.com> wrote:
>
> On 03/02/2017 09:07 AM, Paolo Valente wrote:
>>
>> [...]
>>
>> No luck, sorry :(
>>
>> It looks like we simply have to not take q->queue_lock in cfq_exit_icq.
>
> I was worried about that. How about the below? We need to grab the lock
> at some point for legacy scheduling, but the ordering should be correct.

It seems to be: no more crashes or lockdep complaints after several tests, boots and shutdowns :)

Thanks,
Paolo

> diff --git a/block/blk-ioc.c b/block/blk-ioc.c
> index b12f9c87b4c3..6fd633b5d567 100644
> --- a/block/blk-ioc.c
> +++ b/block/blk-ioc.c
> @@ -54,7 +54,7 @@ static void ioc_exit_icq(struct io_cq *icq)
>  	icq->flags |= ICQ_EXITED;
>  }
>  
> -/* Release an icq.  Called with both ioc and q locked. */
> +/* Release an icq.  Called with ioc locked. */
>  static void ioc_destroy_icq(struct io_cq *icq)
>  {
>  	struct io_context *ioc = icq->ioc;
> @@ -62,7 +62,6 @@ static void ioc_destroy_icq(struct io_cq *icq)
>  	struct elevator_type *et = q->elevator->type;
>  
>  	lockdep_assert_held(&ioc->lock);
> -	lockdep_assert_held(q->queue_lock);
>  
>  	radix_tree_delete(&ioc->icq_tree, icq->q->id);
>  	hlist_del_init(&icq->ioc_node);
> @@ -222,24 +221,40 @@ void exit_io_context(struct task_struct *task)
>  	put_io_context_active(ioc);
>  }
>  
> +static void __ioc_clear_queue(struct list_head *icq_list)
> +{
> +	unsigned long flags;
> +
> +	while (!list_empty(icq_list)) {
> +		struct io_cq *icq = list_entry(icq_list->next,
> +					       struct io_cq, q_node);
> +		struct io_context *ioc = icq->ioc;
> +
> +		spin_lock_irqsave(&ioc->lock, flags);
> +		ioc_destroy_icq(icq);
> +		spin_unlock_irqrestore(&ioc->lock, flags);
> +	}
> +}
> +
>  /**
>   * ioc_clear_queue - break any ioc association with the specified queue
>   * @q: request_queue being cleared
>   *
> - * Walk @q->icq_list and exit all io_cq's.  Must be called with @q locked.
> + * Walk @q->icq_list and exit all io_cq's.
>   */
>  void ioc_clear_queue(struct request_queue *q)
>  {
> -	lockdep_assert_held(q->queue_lock);
> +	LIST_HEAD(icq_list);
>  
> -	while (!list_empty(&q->icq_list)) {
> -		struct io_cq *icq = list_entry(q->icq_list.next,
> -					       struct io_cq, q_node);
> -		struct io_context *ioc = icq->ioc;
> +	spin_lock_irq(q->queue_lock);
> +	list_splice_init(&q->icq_list, &icq_list);
>  
> -		spin_lock(&ioc->lock);
> -		ioc_destroy_icq(icq);
> -		spin_unlock(&ioc->lock);
> +	if (q->mq_ops) {
> +		spin_unlock_irq(q->queue_lock);
> +		__ioc_clear_queue(&icq_list);
> +	} else {
> +		__ioc_clear_queue(&icq_list);
> +		spin_unlock_irq(q->queue_lock);
>  	}
>  }
>  
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index 002af836aa87..c44b321335f3 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -815,9 +815,7 @@ static void blk_release_queue(struct kobject *kobj)
>  	blkcg_exit_queue(q);
>  
>  	if (q->elevator) {
> -		spin_lock_irq(q->queue_lock);
>  		ioc_clear_queue(q);
> -		spin_unlock_irq(q->queue_lock);
>  		elevator_exit(q->elevator);
>  	}
>  
> diff --git a/block/elevator.c b/block/elevator.c
> index ac1c9f481a98..01139f549b5b 100644
> --- a/block/elevator.c
> +++ b/block/elevator.c
> @@ -983,9 +983,7 @@ static int elevator_switch(struct request_queue *q, struct elevator_type *new_e)
>  		if (old_registered)
>  			elv_unregister_queue(q);
>  
> -		spin_lock_irq(q->queue_lock);
>  		ioc_clear_queue(q);
> -		spin_unlock_irq(q->queue_lock);
>  	}
>  
>  	/* allocate, init and register new elevator */
>
> --
> Jens Axboe
On 03/02/2017 11:07 AM, Paolo Valente wrote:
>
>> On 2 Mar 2017, at 17:13, Jens Axboe <axboe@fb.com> wrote:
>>
>> [...]
>>
>> I was worried about that. How about the below? We need to grab the lock
>> at some point for legacy scheduling, but the ordering should be correct.
>
> It seems to be: no more crashes or lockdep complaints after several tests, boots and shutdowns :)

Great! Can I add your Tested-by?
> On 2 Mar 2017, at 19:25, Jens Axboe <axboe@fb.com> wrote:
>
> On 03/02/2017 11:07 AM, Paolo Valente wrote:
>>
>> [...]
>>
>> It seems to be: no more crashes or lockdep complaints after several tests, boots and shutdowns :)
>
> Great! Can I add your Tested-by?

Sure!

Thanks,
Paolo

> --
> Jens Axboe
diff --git a/block/blk-ioc.c b/block/blk-ioc.c
index b12f9c87b4c3..546ff8f81ede 100644
--- a/block/blk-ioc.c
+++ b/block/blk-ioc.c
@@ -54,7 +54,7 @@ static void ioc_exit_icq(struct io_cq *icq)
 	icq->flags |= ICQ_EXITED;
 }
 
-/* Release an icq.  Called with both ioc and q locked. */
+/* Release an icq.  Called with ioc locked. */
 static void ioc_destroy_icq(struct io_cq *icq)
 {
 	struct io_context *ioc = icq->ioc;
@@ -62,7 +62,6 @@ static void ioc_destroy_icq(struct io_cq *icq)
 	struct elevator_type *et = q->elevator->type;
 
 	lockdep_assert_held(&ioc->lock);
-	lockdep_assert_held(q->queue_lock);
 
 	radix_tree_delete(&ioc->icq_tree, icq->q->id);
 	hlist_del_init(&icq->ioc_node);
@@ -222,25 +221,34 @@ void exit_io_context(struct task_struct *task)
 	put_io_context_active(ioc);
 }
 
+static void __ioc_clear_queue(struct list_head *icq_list)
+{
+	while (!list_empty(icq_list)) {
+		struct io_cq *icq = list_entry(icq_list->next,
+					       struct io_cq, q_node);
+		struct io_context *ioc = icq->ioc;
+
+		spin_lock_irq(&ioc->lock);
+		ioc_destroy_icq(icq);
+		spin_unlock_irq(&ioc->lock);
+	}
+}
+
 /**
  * ioc_clear_queue - break any ioc association with the specified queue
  * @q: request_queue being cleared
  *
- * Walk @q->icq_list and exit all io_cq's.  Must be called with @q locked.
+ * Walk @q->icq_list and exit all io_cq's.
  */
 void ioc_clear_queue(struct request_queue *q)
 {
-	lockdep_assert_held(q->queue_lock);
+	LIST_HEAD(icq_list);
 
-	while (!list_empty(&q->icq_list)) {
-		struct io_cq *icq = list_entry(q->icq_list.next,
-					       struct io_cq, q_node);
-		struct io_context *ioc = icq->ioc;
+	spin_lock_irq(q->queue_lock);
+	list_splice_init(&q->icq_list, &icq_list);
+	spin_unlock_irq(q->queue_lock);
 
-		spin_lock(&ioc->lock);
-		ioc_destroy_icq(icq);
-		spin_unlock(&ioc->lock);
-	}
+	__ioc_clear_queue(&icq_list);
 }
 
 int create_task_io_context(struct task_struct *task, gfp_t gfp_flags, int node)
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 002af836aa87..c44b321335f3 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -815,9 +815,7 @@ static void blk_release_queue(struct kobject *kobj)
 	blkcg_exit_queue(q);
 
 	if (q->elevator) {
-		spin_lock_irq(q->queue_lock);
 		ioc_clear_queue(q);
-		spin_unlock_irq(q->queue_lock);
 		elevator_exit(q->elevator);
 	}
 
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 137944777859..4a71c79de037 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -3658,6 +3658,8 @@ static void cfq_exit_icq(struct io_cq *icq)
 	struct cfq_io_cq *cic = icq_to_cic(icq);
 	struct cfq_data *cfqd = cic_to_cfqd(cic);
 
+	spin_lock(cfqd->queue->queue_lock);
+
 	if (cic_to_cfqq(cic, false)) {
 		cfq_exit_cfqq(cfqd, cic_to_cfqq(cic, false));
 		cic_set_cfqq(cic, NULL, false);
@@ -3667,6 +3669,8 @@ static void cfq_exit_icq(struct io_cq *icq)
 		cfq_exit_cfqq(cfqd, cic_to_cfqq(cic, true));
 		cic_set_cfqq(cic, NULL, true);
 	}
+
+	spin_unlock(cfqd->queue->queue_lock);
 }
 
 static void cfq_init_prio_data(struct cfq_queue *cfqq, struct cfq_io_cq *cic)
diff --git a/block/elevator.c b/block/elevator.c
index ac1c9f481a98..01139f549b5b 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -983,9 +983,7 @@ static int elevator_switch(struct request_queue *q, struct elevator_type *new_e)
 		if (old_registered)
 			elv_unregister_queue(q);
 
-		spin_lock_irq(q->queue_lock);
 		ioc_clear_queue(q);
-		spin_unlock_irq(q->queue_lock);
 	}
 
 	/* allocate, init and register new elevator */