Message ID | 1498095412-18731-1-git-send-email-junxiao.bi@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Boris & Juergen, Could you help review this patch? This is a race and will cause the io hung. Thanks, Junxiao. On 06/22/2017 09:36 AM, Junxiao Bi wrote: > When ring buf full, hw queue will be stopped. While blkif interrupt consume > request and make free space in ring buf, hw queue will be started again. > But since start queue is protected by spin lock while stop not, that will > cause a race. > > interrupt: process: > blkif_interrupt() blkif_queue_rq() > kick_pending_request_queues_locked() > blk_mq_start_stopped_hw_queues() > clear_bit(BLK_MQ_S_STOPPED, &hctx->state) > blk_mq_stop_hw_queue(hctx) > blk_mq_run_hw_queue(hctx, async) > > If ring buf is made empty in this case, interrupt will never come, then the > hw queue will be stopped forever, all processes waiting for the pending io > in the queue will hung. > > Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> > Reviewed-by: Ankur Arora <ankur.a.arora@oracle.com> > --- > drivers/block/xen-blkfront.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c > index 8bb160cd00e1..4767b82b2cf6 100644 > --- a/drivers/block/xen-blkfront.c > +++ b/drivers/block/xen-blkfront.c > @@ -912,8 +912,8 @@ out_err: > return BLK_MQ_RQ_QUEUE_ERROR; > > out_busy: > - spin_unlock_irqrestore(&rinfo->ring_lock, flags); > blk_mq_stop_hw_queue(hctx); > + spin_unlock_irqrestore(&rinfo->ring_lock, flags); > return BLK_MQ_RQ_QUEUE_BUSY; > } > >
On 06/23/2017 12:58 AM, Junxiao Bi wrote: > Hi Boris & Juergen, > > Could you help review this patch? This is a race and will cause the io hung. > > Thanks, > Junxiao. > > On 06/22/2017 09:36 AM, Junxiao Bi wrote: >> When ring buf full, hw queue will be stopped. While blkif interrupt consume >> request and make free space in ring buf, hw queue will be started again. >> But since start queue is protected by spin lock while stop not, that will >> cause a race. >> >> interrupt: process: >> blkif_interrupt() blkif_queue_rq() >> kick_pending_request_queues_locked() >> blk_mq_start_stopped_hw_queues() >> clear_bit(BLK_MQ_S_STOPPED, &hctx->state) >> blk_mq_stop_hw_queue(hctx) >> blk_mq_run_hw_queue(hctx, async) >> >> If ring buf is made empty in this case, interrupt will never come, then the >> hw queue will be stopped forever, all processes waiting for the pending io >> in the queue will hung. >> >> Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> >> Reviewed-by: Ankur Arora <ankur.a.arora@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> but I think Roger (copied) needs to Ack it. -boris >> --- >> drivers/block/xen-blkfront.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c >> index 8bb160cd00e1..4767b82b2cf6 100644 >> --- a/drivers/block/xen-blkfront.c >> +++ b/drivers/block/xen-blkfront.c >> @@ -912,8 +912,8 @@ out_err: >> return BLK_MQ_RQ_QUEUE_ERROR; >> >> out_busy: >> - spin_unlock_irqrestore(&rinfo->ring_lock, flags); >> blk_mq_stop_hw_queue(hctx); >> + spin_unlock_irqrestore(&rinfo->ring_lock, flags); >> return BLK_MQ_RQ_QUEUE_BUSY; >> } >> >> >
On Thu, Jun 22, 2017 at 09:36:52AM +0800, Junxiao Bi wrote: > When ring buf full, hw queue will be stopped. While blkif interrupt consume > request and make free space in ring buf, hw queue will be started again. > But since start queue is protected by spin lock while stop not, that will > cause a race. > > interrupt: process: > blkif_interrupt() blkif_queue_rq() > kick_pending_request_queues_locked() > blk_mq_start_stopped_hw_queues() > clear_bit(BLK_MQ_S_STOPPED, &hctx->state) > blk_mq_stop_hw_queue(hctx) > blk_mq_run_hw_queue(hctx, async) > > If ring buf is made empty in this case, interrupt will never come, then the > hw queue will be stopped forever, all processes waiting for the pending io > in the queue will hung. > > Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> > Reviewed-by: Ankur Arora <ankur.a.arora@oracle.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Thanks, Roger.
Hi Roger, On 06/23/2017 08:57 PM, Roger Pau Monné wrote: > On Thu, Jun 22, 2017 at 09:36:52AM +0800, Junxiao Bi wrote: >> When ring buf full, hw queue will be stopped. While blkif interrupt consume >> request and make free space in ring buf, hw queue will be started again. >> But since start queue is protected by spin lock while stop not, that will >> cause a race. >> >> interrupt: process: >> blkif_interrupt() blkif_queue_rq() >> kick_pending_request_queues_locked() >> blk_mq_start_stopped_hw_queues() >> clear_bit(BLK_MQ_S_STOPPED, &hctx->state) >> blk_mq_stop_hw_queue(hctx) >> blk_mq_run_hw_queue(hctx, async) >> >> If ring buf is made empty in this case, interrupt will never come, then the >> hw queue will be stopped forever, all processes waiting for the pending io >> in the queue will hung. >> >> Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> >> Reviewed-by: Ankur Arora <ankur.a.arora@oracle.com> > > Acked-by: Roger Pau Monné <roger.pau@citrix.com> Looks patch not in mainline. Can you please help merge it? Thanks, Junxiao. > > Thanks, Roger. > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > https://lists.xen.org/xen-devel >
On Wed, Jul 19, 2017 at 09:19:49AM +0800, Junxiao Bi wrote: > Hi Roger, > > On 06/23/2017 08:57 PM, Roger Pau Monné wrote: > > On Thu, Jun 22, 2017 at 09:36:52AM +0800, Junxiao Bi wrote: > >> When ring buf full, hw queue will be stopped. While blkif interrupt consume > >> request and make free space in ring buf, hw queue will be started again. > >> But since start queue is protected by spin lock while stop not, that will > >> cause a race. > >> > >> interrupt: process: > >> blkif_interrupt() blkif_queue_rq() > >> kick_pending_request_queues_locked() > >> blk_mq_start_stopped_hw_queues() > >> clear_bit(BLK_MQ_S_STOPPED, &hctx->state) > >> blk_mq_stop_hw_queue(hctx) > >> blk_mq_run_hw_queue(hctx, async) > >> > >> If ring buf is made empty in this case, interrupt will never come, then the > >> hw queue will be stopped forever, all processes waiting for the pending io > >> in the queue will hung. > >> > >> Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> > >> Reviewed-by: Ankur Arora <ankur.a.arora@oracle.com> > > > > Acked-by: Roger Pau Monné <roger.pau@citrix.com> > Looks patch not in mainline. Can you please help merge it? I'm afraid this needs to be done by Konrad or one of the Linux maintainers, I don't have an account on kernel.org in order to send pull requests to Jens. Roger.
Hi Konrad, On 07/19/2017 03:37 PM, Roger Pau Monné wrote: > On Wed, Jul 19, 2017 at 09:19:49AM +0800, Junxiao Bi wrote: >> Hi Roger, >> >> On 06/23/2017 08:57 PM, Roger Pau Monné wrote: >>> On Thu, Jun 22, 2017 at 09:36:52AM +0800, Junxiao Bi wrote: >>>> When ring buf full, hw queue will be stopped. While blkif interrupt consume >>>> request and make free space in ring buf, hw queue will be started again. >>>> But since start queue is protected by spin lock while stop not, that will >>>> cause a race. >>>> >>>> interrupt: process: >>>> blkif_interrupt() blkif_queue_rq() >>>> kick_pending_request_queues_locked() >>>> blk_mq_start_stopped_hw_queues() >>>> clear_bit(BLK_MQ_S_STOPPED, &hctx->state) >>>> blk_mq_stop_hw_queue(hctx) >>>> blk_mq_run_hw_queue(hctx, async) >>>> >>>> If ring buf is made empty in this case, interrupt will never come, then the >>>> hw queue will be stopped forever, all processes waiting for the pending io >>>> in the queue will hung. >>>> >>>> Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> >>>> Reviewed-by: Ankur Arora <ankur.a.arora@oracle.com> >>> >>> Acked-by: Roger Pau Monné <roger.pau@citrix.com> >> Looks patch not in mainline. Can you please help merge it? > > I'm afraid this needs to be done by Konrad or one of the Linux > maintainers, I don't have an account on kernel.org in order to send > pull requests to Jens. Can you pls help merge it? Thanks, Junxiao. > > Roger. >
On Wed, Jul 19, 2017 at 03:51:48PM +0800, Junxiao Bi wrote: > Hi Konrad, > > On 07/19/2017 03:37 PM, Roger Pau Monné wrote: > > On Wed, Jul 19, 2017 at 09:19:49AM +0800, Junxiao Bi wrote: > >> Hi Roger, > >> > >> On 06/23/2017 08:57 PM, Roger Pau Monné wrote: > >>> On Thu, Jun 22, 2017 at 09:36:52AM +0800, Junxiao Bi wrote: > >>>> When ring buf full, hw queue will be stopped. While blkif interrupt consume > >>>> request and make free space in ring buf, hw queue will be started again. > >>>> But since start queue is protected by spin lock while stop not, that will > >>>> cause a race. > >>>> > >>>> interrupt: process: > >>>> blkif_interrupt() blkif_queue_rq() > >>>> kick_pending_request_queues_locked() > >>>> blk_mq_start_stopped_hw_queues() > >>>> clear_bit(BLK_MQ_S_STOPPED, &hctx->state) > >>>> blk_mq_stop_hw_queue(hctx) > >>>> blk_mq_run_hw_queue(hctx, async) > >>>> > >>>> If ring buf is made empty in this case, interrupt will never come, then the > >>>> hw queue will be stopped forever, all processes waiting for the pending io > >>>> in the queue will hung. > >>>> > >>>> Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> > >>>> Reviewed-by: Ankur Arora <ankur.a.arora@oracle.com> > >>> > >>> Acked-by: Roger Pau Monné <roger.pau@citrix.com> > >> Looks patch not in mainline. Can you please help merge it? > > > > I'm afraid this needs to be done by Konrad or one of the Linux > > maintainers, I don't have an account on kernel.org in order to send > > pull requests to Jens. > Can you pls help merge it? Could you kindly repost it with the updated tags _and_ against Linus's latest branch? I get: [konrad@char linux]$ git am -s < /tmp/a Applying: xen-blkfront: fix mq start/stop race error: patch failed: drivers/block/xen-blkfront.c:912 error: drivers/block/xen-blkfront.c: patch does not apply Patch failed at 0001 xen-blkfront: fix mq start/stop race The copy of the patch that failed is found in: .git/rebase-apply/patch When you have resolved this problem, run "git am --continue". If you prefer to skip this patch, run "git am --skip" instead. To restore the original branch and stop patching, run "git am --abort". > > Thanks, > Junxiao. > > > > Roger. > > >
On 07/19/2017 10:08 PM, Konrad Rzeszutek Wilk wrote: > On Wed, Jul 19, 2017 at 03:51:48PM +0800, Junxiao Bi wrote: >> Hi Konrad, >> >> On 07/19/2017 03:37 PM, Roger Pau Monné wrote: >>> On Wed, Jul 19, 2017 at 09:19:49AM +0800, Junxiao Bi wrote: >>>> Hi Roger, >>>> >>>> On 06/23/2017 08:57 PM, Roger Pau Monné wrote: >>>>> On Thu, Jun 22, 2017 at 09:36:52AM +0800, Junxiao Bi wrote: >>>>>> When ring buf full, hw queue will be stopped. While blkif interrupt consume >>>>>> request and make free space in ring buf, hw queue will be started again. >>>>>> But since start queue is protected by spin lock while stop not, that will >>>>>> cause a race. >>>>>> >>>>>> interrupt: process: >>>>>> blkif_interrupt() blkif_queue_rq() >>>>>> kick_pending_request_queues_locked() >>>>>> blk_mq_start_stopped_hw_queues() >>>>>> clear_bit(BLK_MQ_S_STOPPED, &hctx->state) >>>>>> blk_mq_stop_hw_queue(hctx) >>>>>> blk_mq_run_hw_queue(hctx, async) >>>>>> >>>>>> If ring buf is made empty in this case, interrupt will never come, then the >>>>>> hw queue will be stopped forever, all processes waiting for the pending io >>>>>> in the queue will hung. >>>>>> >>>>>> Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> >>>>>> Reviewed-by: Ankur Arora <ankur.a.arora@oracle.com> >>>>> >>>>> Acked-by: Roger Pau Monné <roger.pau@citrix.com> >>>> Looks patch not in mainline. Can you please help merge it? >>> >>> I'm afraid this needs to be done by Konrad or one of the Linux >>> maintainers, I don't have an account on kernel.org in order to send >>> pull requests to Jens. >> Can you pls help merge it? > > Could you kindly repost it with the updated tags _and_ against Linus's latest > branch? Sure, v2 sent. Please check. Thanks, Junxiao. > > I get: > [konrad@char linux]$ git am -s < /tmp/a > Applying: xen-blkfront: fix mq start/stop race > error: patch failed: drivers/block/xen-blkfront.c:912 > error: drivers/block/xen-blkfront.c: patch does not apply > Patch failed at 0001 xen-blkfront: fix mq start/stop race > The copy of the patch that failed is found in: .git/rebase-apply/patch > When you have resolved this problem, run "git am --continue". > If you prefer to skip this patch, run "git am --skip" instead. > To restore the original branch and stop patching, run "git am --abort". > > >> >> Thanks, >> Junxiao. >>> >>> Roger. >>> >>
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c index 8bb160cd00e1..4767b82b2cf6 100644 --- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -912,8 +912,8 @@ out_err: return BLK_MQ_RQ_QUEUE_ERROR; out_busy: - spin_unlock_irqrestore(&rinfo->ring_lock, flags); blk_mq_stop_hw_queue(hctx); + spin_unlock_irqrestore(&rinfo->ring_lock, flags); return BLK_MQ_RQ_QUEUE_BUSY; }