Message ID | 20190712024726.1227-3-ming.lei@redhat.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | blk-mq: improvement on handling IO during CPU hotplug | expand |
在 12/07/2019 10:47, Ming Lei 写道: > We will stop hw queue and wait for completion of in-flight requests > when one hctx is becoming dead in the following patch. This way may > cause dead-lock for some stacking blk-mq drivers, such as dm-rq and > loop. > > Add blk-mq flag of BLK_MQ_F_NO_MANAGED_IRQ and mark it for dm-rq and > loop, so we needn't to wait for completion of in-flight requests of > dm-rq & loop, then the potential dead-lock can be avoided. Wouldn't it make more sense to have the flag name be like BLK_MQ_F_DONT_DRAIN_STOPPED_HCTX? I did not think that blk-mq is specifically concerned with managed interrupts, but only their indirect effect. > > Cc: Bart Van Assche <bvanassche@acm.org> > Cc: Hannes Reinecke <hare@suse.com> > Cc: Christoph Hellwig <hch@lst.de> > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: Keith Busch <keith.busch@intel.com> > Signed-off-by: Ming Lei <ming.lei@redhat.com> > --- > block/blk-mq-debugfs.c | 1 + > drivers/block/loop.c | 2 +- > drivers/md/dm-rq.c | 2 +- > include/linux/blk-mq.h | 1 + > 4 files changed, 4 insertions(+), 2 deletions(-) > > diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c > index af40a02c46ee..24fff8c90942 100644 > --- a/block/blk-mq-debugfs.c > +++ b/block/blk-mq-debugfs.c > @@ -240,6 +240,7 @@ static const char *const hctx_flag_name[] = { > HCTX_FLAG_NAME(TAG_SHARED), > HCTX_FLAG_NAME(BLOCKING), > HCTX_FLAG_NAME(NO_SCHED), > + HCTX_FLAG_NAME(NO_MANAGED_IRQ), > }; > #undef HCTX_FLAG_NAME > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c > index 44c9985f352a..199d76e8bf46 100644 > --- a/drivers/block/loop.c > +++ b/drivers/block/loop.c > @@ -1986,7 +1986,7 @@ static int loop_add(struct loop_device **l, int i) > lo->tag_set.queue_depth = 128; > lo->tag_set.numa_node = NUMA_NO_NODE; > lo->tag_set.cmd_size = sizeof(struct loop_cmd); > - lo->tag_set.flags = BLK_MQ_F_SHOULD_MERGE; > + lo->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_NO_MANAGED_IRQ; nit: at this point in the series you're setting a flag which is never checked. > lo->tag_set.driver_data = lo; > > err = blk_mq_alloc_tag_set(&lo->tag_set); > diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c > index 5f7063f05ae0..f7fbef2d3cd7 100644 > --- a/drivers/md/dm-rq.c > +++ b/drivers/md/dm-rq.c > @@ -546,7 +546,7 @@ int dm_mq_init_request_queue(struct mapped_device *md, struct dm_table *t) > md->tag_set->ops = &dm_mq_ops; > md->tag_set->queue_depth = dm_get_blk_mq_queue_depth(); > md->tag_set->numa_node = md->numa_node_id; > - md->tag_set->flags = BLK_MQ_F_SHOULD_MERGE; > + md->tag_set->flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_NO_MANAGED_IRQ; > md->tag_set->nr_hw_queues = dm_get_blk_mq_nr_hw_queues(); > md->tag_set->driver_data = md; > > diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h > index 3a731c3c0762..911cdc6479dc 100644 > --- a/include/linux/blk-mq.h > +++ b/include/linux/blk-mq.h > @@ -219,6 +219,7 @@ struct blk_mq_ops { > enum { > BLK_MQ_F_SHOULD_MERGE = 1 << 0, > BLK_MQ_F_TAG_SHARED = 1 << 1, > + BLK_MQ_F_NO_MANAGED_IRQ = 1 << 2, > BLK_MQ_F_BLOCKING = 1 << 5, > BLK_MQ_F_NO_SCHED = 1 << 6, > BLK_MQ_F_ALLOC_POLICY_START_BIT = 8, >
On Tue, Jul 16, 2019 at 10:53:00AM +0800, John Garry wrote: > 在 12/07/2019 10:47, Ming Lei 写道: > > We will stop hw queue and wait for completion of in-flight requests > > when one hctx is becoming dead in the following patch. This way may > > cause dead-lock for some stacking blk-mq drivers, such as dm-rq and > > loop. > > > > Add blk-mq flag of BLK_MQ_F_NO_MANAGED_IRQ and mark it for dm-rq and > > loop, so we needn't to wait for completion of in-flight requests of > > dm-rq & loop, then the potential dead-lock can be avoided. > > Wouldn't it make more sense to have the flag name be like > BLK_MQ_F_DONT_DRAIN_STOPPED_HCTX? > > I did not think that blk-mq is specifically concerned with managed > interrupts, but only their indirect effect. I am fine with either one, however it is easier for drivers to recognize if this flag should be set, given BLK_MQ_F_NO_MANAGED_IRQ is self-explained. Also on the other side, this patchset serves a generic blk-mq fix for managed IRQ issue, so it is reasonable for all drivers which don't use managed IRQ to set the flag. > > > > > Cc: Bart Van Assche <bvanassche@acm.org> > > Cc: Hannes Reinecke <hare@suse.com> > > Cc: Christoph Hellwig <hch@lst.de> > > Cc: Thomas Gleixner <tglx@linutronix.de> > > Cc: Keith Busch <keith.busch@intel.com> > > Signed-off-by: Ming Lei <ming.lei@redhat.com> > > --- > > block/blk-mq-debugfs.c | 1 + > > drivers/block/loop.c | 2 +- > > drivers/md/dm-rq.c | 2 +- > > include/linux/blk-mq.h | 1 + > > 4 files changed, 4 insertions(+), 2 deletions(-) > > > > diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c > > index af40a02c46ee..24fff8c90942 100644 > > --- a/block/blk-mq-debugfs.c > > +++ b/block/blk-mq-debugfs.c > > @@ -240,6 +240,7 @@ static const char *const hctx_flag_name[] = { > > HCTX_FLAG_NAME(TAG_SHARED), > > HCTX_FLAG_NAME(BLOCKING), > > HCTX_FLAG_NAME(NO_SCHED), > > + HCTX_FLAG_NAME(NO_MANAGED_IRQ), > > }; > > #undef HCTX_FLAG_NAME > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c > > index 44c9985f352a..199d76e8bf46 100644 > > --- a/drivers/block/loop.c > > +++ b/drivers/block/loop.c > > @@ -1986,7 +1986,7 @@ static int loop_add(struct loop_device **l, int i) > > lo->tag_set.queue_depth = 128; > > lo->tag_set.numa_node = NUMA_NO_NODE; > > lo->tag_set.cmd_size = sizeof(struct loop_cmd); > > - lo->tag_set.flags = BLK_MQ_F_SHOULD_MERGE; > > + lo->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_NO_MANAGED_IRQ; > > nit: at this point in the series you're setting a flag which is never > checked. Yeah, I see, and this way is a bit easier for review purpose. thanks, Ming
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c index af40a02c46ee..24fff8c90942 100644 --- a/block/blk-mq-debugfs.c +++ b/block/blk-mq-debugfs.c @@ -240,6 +240,7 @@ static const char *const hctx_flag_name[] = { HCTX_FLAG_NAME(TAG_SHARED), HCTX_FLAG_NAME(BLOCKING), HCTX_FLAG_NAME(NO_SCHED), + HCTX_FLAG_NAME(NO_MANAGED_IRQ), }; #undef HCTX_FLAG_NAME diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 44c9985f352a..199d76e8bf46 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1986,7 +1986,7 @@ static int loop_add(struct loop_device **l, int i) lo->tag_set.queue_depth = 128; lo->tag_set.numa_node = NUMA_NO_NODE; lo->tag_set.cmd_size = sizeof(struct loop_cmd); - lo->tag_set.flags = BLK_MQ_F_SHOULD_MERGE; + lo->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_NO_MANAGED_IRQ; lo->tag_set.driver_data = lo; err = blk_mq_alloc_tag_set(&lo->tag_set); diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c index 5f7063f05ae0..f7fbef2d3cd7 100644 --- a/drivers/md/dm-rq.c +++ b/drivers/md/dm-rq.c @@ -546,7 +546,7 @@ int dm_mq_init_request_queue(struct mapped_device *md, struct dm_table *t) md->tag_set->ops = &dm_mq_ops; md->tag_set->queue_depth = dm_get_blk_mq_queue_depth(); md->tag_set->numa_node = md->numa_node_id; - md->tag_set->flags = BLK_MQ_F_SHOULD_MERGE; + md->tag_set->flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_NO_MANAGED_IRQ; md->tag_set->nr_hw_queues = dm_get_blk_mq_nr_hw_queues(); md->tag_set->driver_data = md; diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index 3a731c3c0762..911cdc6479dc 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -219,6 +219,7 @@ struct blk_mq_ops { enum { BLK_MQ_F_SHOULD_MERGE = 1 << 0, BLK_MQ_F_TAG_SHARED = 1 << 1, + BLK_MQ_F_NO_MANAGED_IRQ = 1 << 2, BLK_MQ_F_BLOCKING = 1 << 5, BLK_MQ_F_NO_SCHED = 1 << 6, BLK_MQ_F_ALLOC_POLICY_START_BIT = 8,
We will stop hw queue and wait for completion of in-flight requests when one hctx is becoming dead in the following patch. This way may cause dead-lock for some stacking blk-mq drivers, such as dm-rq and loop. Add blk-mq flag of BLK_MQ_F_NO_MANAGED_IRQ and mark it for dm-rq and loop, so we needn't to wait for completion of in-flight requests of dm-rq & loop, then the potential dead-lock can be avoided. Cc: Bart Van Assche <bvanassche@acm.org> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Keith Busch <keith.busch@intel.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> --- block/blk-mq-debugfs.c | 1 + drivers/block/loop.c | 2 +- drivers/md/dm-rq.c | 2 +- include/linux/blk-mq.h | 1 + 4 files changed, 4 insertions(+), 2 deletions(-)