Message ID | 20170714231601.14444-2-ming.lei@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Sat, Jul 15, 2017 at 07:15:56AM +0800, Ming Lei wrote: > stopping queue may cause race and may not stop the queue really > after the API returns, and we have improved quiescing > interface and it really can block dispatching once it returns. > > So switch to quiesce/unquiece like what we did on other drivers > (NVMe, NBD, mtip32xx, ...) > > The blk_mq_stop_hw_queues() and blk_mq_start_stopped_hw_queues() > used in blkif_queue_rq() and blkif_interrupt() are for congestion > control, we leave it as it is since it is safe for this usage. Again I yet don't understand the difference between those two, neither why start/stop is not fixed instead of introducing quiesce/unquiece. Not to mention that start/stop is not documented, which makes all this even more fun. Anyway I would like to ask, is the way to re-start a stopped queue the same way to unquiece? If not I would rather prefer that start/stop or quiece/unquiece is used exclusively, in order to not make the code even more complex. It seems fairly easy to mess up and call "start" on a "quiesced" queue (or the other way around). Roger.
On Mon, Jul 17, 2017 at 12:20:56PM +0100, Roger Pau Monné wrote: > On Sat, Jul 15, 2017 at 07:15:56AM +0800, Ming Lei wrote: > > stopping queue may cause race and may not stop the queue really > > after the API returns, and we have improved quiescing > > interface and it really can block dispatching once it returns. > > > > So switch to quiesce/unquiece like what we did on other drivers > > (NVMe, NBD, mtip32xx, ...) > > > > The blk_mq_stop_hw_queues() and blk_mq_start_stopped_hw_queues() > > used in blkif_queue_rq() and blkif_interrupt() are for congestion > > control, we leave it as it is since it is safe for this usage. > > Again I yet don't understand the difference between those two, neither > why start/stop is not fixed instead of introducing quiesce/unquiece. There are two usages covered by start/stop now: - congestion control for handling queue busy(BLK_STS_RESOURCE), now only xen-blkfront and virtio-blk use that - other usages, such as in xlvbd_release_gendisk(), for blocking IO to driver/device For the 1st case, it is usually fine to use stop/start For the 2nd case, stop queue isn't enough, and we can't guarantee no IO is dispatched to device/driver after returning from stop queue, for details. Most of this usage have been fixed by Sagi Grimberg: http://marc.info/?t=149927415900006&r=1&w=2 start/stop is a bad name for 2nd usage too, what we really want is to block IO to driver/devices, so we should use quiesce/unquiesce. xen-blkfront is missed in Sagi's patchset which has been merged to linus tree already, so this patch just fixes xen-blkfront simply like other patches. We can't use quiesce/unquiesce for replacing stop/start in the case of BLK_STS_RESOURCE, because quiesce may sleep, and we needn't block IO for this usage actually. > Not to mention that start/stop is not documented, which makes all this > even more fun. Did you read comment of blk_mq_stop_hw_queue() and blk_mq_stop_hw_queues() in linus tree? > > Anyway I would like to ask, is the way to re-start a stopped queue the > same way to unquiece? I don't know what your exact question, but it is definitely that unquiesce is counter part of quiesce, and quiesce/unquiesce doesn't depend on 'stopped' state any more if you take a look at the code. > > If not I would rather prefer that start/stop or quiece/unquiece is > used exclusively, in order to not make the code even more complex. It I do not think the code becomes more complex, please see the line change of this patch: 1 file changed, 8 insertions(+), 14 deletions(-) then see the change of the whole patchset: 8 files changed, 54 insertions(+), 129 deletions(-) It is really a cleanup and simplifying. > seems fairly easy to mess up and call "start" on a "quiesced" queue > (or the other way around). Definitely it shouldn't be worried because start/stop is removed in this patchset.
On Mon, Jul 17, 2017 at 11:06:28PM +0800, Ming Lei wrote: > On Mon, Jul 17, 2017 at 12:20:56PM +0100, Roger Pau Monné wrote: > > On Sat, Jul 15, 2017 at 07:15:56AM +0800, Ming Lei wrote: > > > stopping queue may cause race and may not stop the queue really > > > after the API returns, and we have improved quiescing > > > interface and it really can block dispatching once it returns. > > > > > > So switch to quiesce/unquiece like what we did on other drivers > > > (NVMe, NBD, mtip32xx, ...) > > > > > > The blk_mq_stop_hw_queues() and blk_mq_start_stopped_hw_queues() > > > used in blkif_queue_rq() and blkif_interrupt() are for congestion > > > control, we leave it as it is since it is safe for this usage. > > > > Again I yet don't understand the difference between those two, neither > > why start/stop is not fixed instead of introducing quiesce/unquiece. > > There are two usages covered by start/stop now: > > - congestion control for handling queue busy(BLK_STS_RESOURCE), now > only xen-blkfront and virtio-blk use that > > - other usages, such as in xlvbd_release_gendisk(), for blocking > IO to driver/device > > For the 1st case, it is usually fine to use stop/start > > For the 2nd case, stop queue isn't enough, and we can't guarantee > no IO is dispatched to device/driver after returning from stop queue, > for details. Most of this usage have been fixed by Sagi Grimberg: OK, so basically after calling stop the queue might still be running. > We can't use quiesce/unquiesce for replacing stop/start in the > case of BLK_STS_RESOURCE, because quiesce may sleep, and we needn't > block IO for this usage actually. Do you mean that quiesce/unquiesce cannot be used while holding a spinlock? > > > Not to mention that start/stop is not documented, which makes all this > > even more fun. > > Did you read comment of blk_mq_stop_hw_queue() and > blk_mq_stop_hw_queues() in linus tree? OK, this has been added very recently. > > > > Anyway I would like to ask, is the way to re-start a stopped queue the > > same way to unquiece? > > I don't know what your exact question, but it is definitely that > unquiesce is counter part of quiesce, and quiesce/unquiesce doesn't > depend on 'stopped' state any more if you take a look at the code. > > > > > If not I would rather prefer that start/stop or quiece/unquiece is > > used exclusively, in order to not make the code even more complex. It > > I do not think the code becomes more complex, please see the line change > of this patch: Before this patch blkfront used: blk_mq_stop_hw_queues blk_mq_start_stopped_hw_queues blk_mq_kick_requeue_list After the patch it uses: blk_mq_quiesce_queue blk_mq_unquiesce_queue blk_mq_stop_hw_queues blk_mq_start_stopped_hw_queues blk_mq_kick_requeue_list blk_mq_run_hw_queues It's not about line changes, but the amount of interfaces blkfront has to use. Apart from introducing the quiesce/unquiesce, you also introduce a call to blk_mq_run_hw_queues, which is not documented in the commit message. > > seems fairly easy to mess up and call "start" on a "quiesced" queue > > (or the other way around). > > Definitely it shouldn't be worried because start/stop is removed > in this patchset. Hm, how is that? I haven't seen any patch to blkfront to remove the usage of start/stop, am I missing something? Roger.
On Mon, Jul 17, 2017 at 05:02:27PM +0100, Roger Pau Monné wrote: > On Mon, Jul 17, 2017 at 11:06:28PM +0800, Ming Lei wrote: > > On Mon, Jul 17, 2017 at 12:20:56PM +0100, Roger Pau Monné wrote: > > > On Sat, Jul 15, 2017 at 07:15:56AM +0800, Ming Lei wrote: > > > > stopping queue may cause race and may not stop the queue really > > > > after the API returns, and we have improved quiescing > > > > interface and it really can block dispatching once it returns. > > > > > > > > So switch to quiesce/unquiece like what we did on other drivers > > > > (NVMe, NBD, mtip32xx, ...) > > > > > > > > The blk_mq_stop_hw_queues() and blk_mq_start_stopped_hw_queues() > > > > used in blkif_queue_rq() and blkif_interrupt() are for congestion > > > > control, we leave it as it is since it is safe for this usage. > > > > > > Again I yet don't understand the difference between those two, neither > > > why start/stop is not fixed instead of introducing quiesce/unquiece. > > > > There are two usages covered by start/stop now: > > > > - congestion control for handling queue busy(BLK_STS_RESOURCE), now > > only xen-blkfront and virtio-blk use that > > > > - other usages, such as in xlvbd_release_gendisk(), for blocking > > IO to driver/device > > > > For the 1st case, it is usually fine to use stop/start > > > > For the 2nd case, stop queue isn't enough, and we can't guarantee > > no IO is dispatched to device/driver after returning from stop queue, > > for details. Most of this usage have been fixed by Sagi Grimberg: > > OK, so basically after calling stop the queue might still be running. > > > We can't use quiesce/unquiesce for replacing stop/start in the > > case of BLK_STS_RESOURCE, because quiesce may sleep, and we needn't > > block IO for this usage actually. > > Do you mean that quiesce/unquiesce cannot be used while holding a > spinlock? Yes. > > > > > > Not to mention that start/stop is not documented, which makes all this > > > even more fun. > > > > Did you read comment of blk_mq_stop_hw_queue() and > > blk_mq_stop_hw_queues() in linus tree? > > OK, this has been added very recently. > > > > > > > Anyway I would like to ask, is the way to re-start a stopped queue the > > > same way to unquiece? > > > > I don't know what your exact question, but it is definitely that > > unquiesce is counter part of quiesce, and quiesce/unquiesce doesn't > > depend on 'stopped' state any more if you take a look at the code. > > > > > > > > If not I would rather prefer that start/stop or quiece/unquiece is > > > used exclusively, in order to not make the code even more complex. It > > > > I do not think the code becomes more complex, please see the line change > > of this patch: > > Before this patch blkfront used: > blk_mq_stop_hw_queues > blk_mq_start_stopped_hw_queues > blk_mq_kick_requeue_list > > After the patch it uses: > blk_mq_quiesce_queue > blk_mq_unquiesce_queue > blk_mq_stop_hw_queues > blk_mq_start_stopped_hw_queues > blk_mq_kick_requeue_list > blk_mq_run_hw_queues > > It's not about line changes, but the amount of interfaces blkfront has > to use. Apart from introducing the quiesce/unquiesce, you also > introduce a call to blk_mq_run_hw_queues, which is not documented in > the commit message. > > > > seems fairly easy to mess up and call "start" on a "quiesced" queue > > > (or the other way around). > > > > Definitely it shouldn't be worried because start/stop is removed > > in this patchset. > > Hm, how is that? I haven't seen any patch to blkfront to remove the > usage of start/stop, am I missing something? http://marc.info/?l=linux-block&m=150007418321196&w=2
On Tue, Jul 18, 2017 at 08:53:28AM +0800, Ming Lei wrote: > On Mon, Jul 17, 2017 at 05:02:27PM +0100, Roger Pau Monné wrote: > > On Mon, Jul 17, 2017 at 11:06:28PM +0800, Ming Lei wrote: > > > On Mon, Jul 17, 2017 at 12:20:56PM +0100, Roger Pau Monné wrote: > > > > seems fairly easy to mess up and call "start" on a "quiesced" queue > > > > (or the other way around). > > > > > > Definitely it shouldn't be worried because start/stop is removed > > > in this patchset. > > > > Hm, how is that? I haven't seen any patch to blkfront to remove the > > usage of start/stop, am I missing something? > > http://secure-web.cisco.com/19iIG2bc3Ce2_t8mSF4YQ2LopepIvjAOqnPAXQ_QahSNOEvLBmechzNZ0pQOPfSRhy3hyC4t6L-JJzu81FvWRyLoBmhtq7F9uEk7XZsTt-XNxLN1KdEJvEeUAWelYSd9NthbvpGad6DmpJ0QCHSOgq4mRcDaqlz5mRmNkTvxls-ri1qCqy6am0jTDdraGINb_BUyV0894BtaFOMGJGEKLjcrBfFfT6C5XHSEdsiuB12ZLIUyaRVyU0TCUz6Sm4uhD/http%3A%2F%2Fmarc.info%2F%3Fl%3Dlinux-block%26m%3D150007418321196%26w%3D2 Sadly I have not been CCed or otherwise I've lost patch #5. Related to that patch AFAICT kick_pending_request_queues can be removed, since it's only used by one caller (or it can even be removed in this patch, #1). Thanks, Roger.
On Tue, Jul 18, 2017 at 08:40:18AM +0100, Roger Pau Monné wrote: > On Tue, Jul 18, 2017 at 08:53:28AM +0800, Ming Lei wrote: > > On Mon, Jul 17, 2017 at 05:02:27PM +0100, Roger Pau Monné wrote: > > > On Mon, Jul 17, 2017 at 11:06:28PM +0800, Ming Lei wrote: > > > > On Mon, Jul 17, 2017 at 12:20:56PM +0100, Roger Pau Monné wrote: > > > > > seems fairly easy to mess up and call "start" on a "quiesced" queue > > > > > (or the other way around). > > > > > > > > Definitely it shouldn't be worried because start/stop is removed > > > > in this patchset. > > > > > > Hm, how is that? I haven't seen any patch to blkfront to remove the > > > usage of start/stop, am I missing something? > > > > http://secure-web.cisco.com/19iIG2bc3Ce2_t8mSF4YQ2LopepIvjAOqnPAXQ_QahSNOEvLBmechzNZ0pQOPfSRhy3hyC4t6L-JJzu81FvWRyLoBmhtq7F9uEk7XZsTt-XNxLN1KdEJvEeUAWelYSd9NthbvpGad6DmpJ0QCHSOgq4mRcDaqlz5mRmNkTvxls-ri1qCqy6am0jTDdraGINb_BUyV0894BtaFOMGJGEKLjcrBfFfT6C5XHSEdsiuB12ZLIUyaRVyU0TCUz6Sm4uhD/http%3A%2F%2Fmarc.info%2F%3Fl%3Dlinux-block%26m%3D150007418321196%26w%3D2 > > Sadly I have not been CCed or otherwise I've lost patch #5. Related to > that patch AFAICT kick_pending_request_queues can be removed, since > it's only used by one caller (or it can even be removed in this patch, > #1). There are two callers: blkfront_connect() and blkif_restart_queue().
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c index c852ed3c01d5..1578befda635 100644 --- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -1187,7 +1187,7 @@ static void xlvbd_release_gendisk(struct blkfront_info *info) return; /* No more blkif_request(). */ - blk_mq_stop_hw_queues(info->rq); + blk_mq_quiesce_queue(info->rq); for (i = 0; i < info->nr_rings; i++) { struct blkfront_ring_info *rinfo = &info->rinfo[i]; @@ -1216,8 +1216,10 @@ static void xlvbd_release_gendisk(struct blkfront_info *info) /* Already hold rinfo->ring_lock. */ static inline void kick_pending_request_queues_locked(struct blkfront_ring_info *rinfo) { - if (!RING_FULL(&rinfo->ring)) + if (!RING_FULL(&rinfo->ring)) { blk_mq_start_stopped_hw_queues(rinfo->dev_info->rq, true); + blk_mq_kick_requeue_list(rinfo->dev_info->rq); + } } static void kick_pending_request_queues(struct blkfront_ring_info *rinfo) @@ -1225,7 +1227,8 @@ static void kick_pending_request_queues(struct blkfront_ring_info *rinfo) unsigned long flags; spin_lock_irqsave(&rinfo->ring_lock, flags); - kick_pending_request_queues_locked(rinfo); + if (!RING_FULL(&rinfo->ring)) + blk_mq_run_hw_queues(rinfo->dev_info->rq, true); spin_unlock_irqrestore(&rinfo->ring_lock, flags); } @@ -1346,7 +1349,7 @@ static void blkif_free(struct blkfront_info *info, int suspend) BLKIF_STATE_SUSPENDED : BLKIF_STATE_DISCONNECTED; /* No more blkif_request(). */ if (info->rq) - blk_mq_stop_hw_queues(info->rq); + blk_mq_quiesce_queue(info->rq); for (i = 0; i < info->nr_rings; i++) blkif_free_ring(&info->rinfo[i]); @@ -2018,22 +2021,13 @@ static int blkif_recover(struct blkfront_info *info) /* Now safe for us to use the shared ring */ info->connected = BLKIF_STATE_CONNECTED; - for (r_index = 0; r_index < info->nr_rings; r_index++) { - struct blkfront_ring_info *rinfo; - - rinfo = &info->rinfo[r_index]; - /* Kick any other new requests queued since we resumed */ - kick_pending_request_queues(rinfo); - } - list_for_each_entry_safe(req, n, &info->requests, queuelist) { /* Requeue pending requests (flush or discard) */ list_del_init(&req->queuelist); BUG_ON(req->nr_phys_segments > segs); blk_mq_requeue_request(req, false); } - blk_mq_start_stopped_hw_queues(info->rq, true); - blk_mq_kick_requeue_list(info->rq); + blk_mq_unquiesce_queue(info->rq); while ((bio = bio_list_pop(&info->bio_list)) != NULL) { /* Traverse the list of pending bios and re-queue them */
stopping queue may cause race and may not stop the queue really after the API returns, and we have improved quiescing interface and it really can block dispatching once it returns. So switch to quiesce/unquiece like what we did on other drivers (NVMe, NBD, mtip32xx, ...) The blk_mq_stop_hw_queues() and blk_mq_start_stopped_hw_queues() used in blkif_queue_rq() and blkif_interrupt() are for congestion control, we leave it as it is since it is safe for this usage. Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: "Roger Pau Monné" <roger.pau@citrix.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: xen-devel@lists.xenproject.org Signed-off-by: Ming Lei <ming.lei@redhat.com> --- drivers/block/xen-blkfront.c | 22 ++++++++-------------- 1 file changed, 8 insertions(+), 14 deletions(-)