Message ID | 9c372b04-a194-58c4-a64f-b155b52a5244@sandisk.com (mailing list archive) |
---|---|
State | Superseded |
> @@ -2079,11 +2075,15 @@ EXPORT_SYMBOL_GPL(nvme_kill_queues);
>  void nvme_stop_queues(struct nvme_ctrl *ctrl)
>  {
>  	struct nvme_ns *ns;
> +	struct request_queue *q;
>
>  	mutex_lock(&ctrl->namespaces_mutex);
>  	list_for_each_entry(ns, &ctrl->namespaces, list) {
> -		blk_mq_cancel_requeue_work(ns->queue);
> -		blk_mq_stop_hw_queues(ns->queue);
> +		q = ns->queue;
> +		blk_quiesce_queue(q);
> +		blk_mq_cancel_requeue_work(q);
> +		blk_mq_stop_hw_queues(q);
> +		blk_resume_queue(q);
>  	}
>  	mutex_unlock(&ctrl->namespaces_mutex);

Hey Bart, should nvme_stop_queues() really be resuming the blk queue?

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 09/27/2016 09:31 AM, Steve Wise wrote:
>> @@ -2079,11 +2075,15 @@ EXPORT_SYMBOL_GPL(nvme_kill_queues);
>>  void nvme_stop_queues(struct nvme_ctrl *ctrl)
>>  {
>>  	struct nvme_ns *ns;
>> +	struct request_queue *q;
>>
>>  	mutex_lock(&ctrl->namespaces_mutex);
>>  	list_for_each_entry(ns, &ctrl->namespaces, list) {
>> -		blk_mq_cancel_requeue_work(ns->queue);
>> -		blk_mq_stop_hw_queues(ns->queue);
>> +		q = ns->queue;
>> +		blk_quiesce_queue(q);
>> +		blk_mq_cancel_requeue_work(q);
>> +		blk_mq_stop_hw_queues(q);
>> +		blk_resume_queue(q);
>>  	}
>>  	mutex_unlock(&ctrl->namespaces_mutex);
>
> Hey Bart, should nvme_stop_queues() really be resuming the blk queue?

Hello Steve,

Would you perhaps prefer that blk_resume_queue(q) is called from
nvme_start_queues()? I think that would make the NVMe code harder to
review. The above code won't cause any unexpected side effects if an
NVMe namespace is removed after nvme_stop_queues() has been called and
before nvme_start_queues() is called. Moving the blk_resume_queue(q)
call into nvme_start_queues() will only work as expected if no
namespaces are added nor removed between the nvme_stop_queues() and
nvme_start_queues() calls. I'm not familiar enough with the NVMe code to
know whether or not this change is safe ...

Bart.
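The pairing argument above can be illustrated with a toy user-space model. This is only a sketch and not kernel code: every name (toy_ns, toy_ctrl, toy_stop_paired() and so on) is hypothetical, and it assumes that quiesce and resume behave like a per-queue depth counter that has to return to zero. Whether blk_quiesce_queue() really counts this way is an assumption made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_NS 8

/* Toy user-space model; every name here is hypothetical, not kernel API. */
struct toy_ns {
	int id;
	int quiesce_depth;	/* assumption: > 0 means submitters are held off */
};

struct toy_ctrl {
	struct toy_ns ns[TOY_MAX_NS];
	size_t count;
};

/* Append a namespace that starts out fully resumed (depth 0). */
void toy_add_ns(struct toy_ctrl *c, int id)
{
	assert(c->count < TOY_MAX_NS);
	c->ns[c->count].id = id;
	c->ns[c->count].quiesce_depth = 0;
	c->count++;
}

/* Variant A: quiesce and resume are paired inside the stop path. */
void toy_stop_paired(struct toy_ctrl *c)
{
	for (size_t i = 0; i < c->count; i++) {
		c->ns[i].quiesce_depth++;	/* quiesce */
		/* ... the hardware queues would be stopped here ... */
		c->ns[i].quiesce_depth--;	/* resume */
	}
}

/* Variant B: quiesce in the stop path ... */
void toy_stop_split(struct toy_ctrl *c)
{
	for (size_t i = 0; i < c->count; i++)
		c->ns[i].quiesce_depth++;
}

/* ... and resume in the start path, over whatever the list holds by then. */
void toy_start_split(struct toy_ctrl *c)
{
	for (size_t i = 0; i < c->count; i++)
		c->ns[i].quiesce_depth--;
}

/* Count namespaces whose quiesce/resume calls did not balance out. */
size_t toy_unbalanced(const struct toy_ctrl *c)
{
	size_t n = 0;

	for (size_t i = 0; i < c->count; i++)
		if (c->ns[i].quiesce_depth != 0)
			n++;
	return n;
}
```

In this model, adding a namespace between toy_stop_split() and toy_start_split() leaves the new entry with a depth of -1, while toy_stop_paired() stays balanced no matter how the list changes afterwards.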
On Tue, 2016-09-27 at 09:43 -0700, Bart Van Assche wrote:
> On 09/27/2016 09:31 AM, Steve Wise wrote:
> > > @@ -2079,11 +2075,15 @@ EXPORT_SYMBOL_GPL(nvme_kill_queues);
> > >  void nvme_stop_queues(struct nvme_ctrl *ctrl)
> > >  {
> > >  	struct nvme_ns *ns;
> > > +	struct request_queue *q;
> > >
> > >  	mutex_lock(&ctrl->namespaces_mutex);
> > >  	list_for_each_entry(ns, &ctrl->namespaces, list) {
> > > -		blk_mq_cancel_requeue_work(ns->queue);
> > > -		blk_mq_stop_hw_queues(ns->queue);
> > > +		q = ns->queue;
> > > +		blk_quiesce_queue(q);
> > > +		blk_mq_cancel_requeue_work(q);
> > > +		blk_mq_stop_hw_queues(q);
> > > +		blk_resume_queue(q);
> > >  	}
> > >  	mutex_unlock(&ctrl->namespaces_mutex);
> >
> > Hey Bart, should nvme_stop_queues() really be resuming the blk
> > queue?
>
> Hello Steve,
>
> Would you perhaps prefer that blk_resume_queue(q) is called from
> nvme_start_queues()? I think that would make the NVMe code harder to
> review. The above code won't cause any unexpected side effects if an
> NVMe namespace is removed after nvme_stop_queues() has been called
> and before nvme_start_queues() is called. Moving the
> blk_resume_queue(q) call into nvme_start_queues() will only work as
> expected if no namespaces are added nor removed between the
> nvme_stop_queues() and nvme_start_queues() calls. I'm not familiar
> enough with the NVMe code to know whether or not this change is safe
> ...

It's something that looks obviously wrong, so explain why you need to
do it, preferably in a comment above the function.

James
> On 09/27/2016 09:31 AM, Steve Wise wrote:
> >> @@ -2079,11 +2075,15 @@ EXPORT_SYMBOL_GPL(nvme_kill_queues);
> >>  void nvme_stop_queues(struct nvme_ctrl *ctrl)
> >>  {
> >>  	struct nvme_ns *ns;
> >> +	struct request_queue *q;
> >>
> >>  	mutex_lock(&ctrl->namespaces_mutex);
> >>  	list_for_each_entry(ns, &ctrl->namespaces, list) {
> >> -		blk_mq_cancel_requeue_work(ns->queue);
> >> -		blk_mq_stop_hw_queues(ns->queue);
> >> +		q = ns->queue;
> >> +		blk_quiesce_queue(q);
> >> +		blk_mq_cancel_requeue_work(q);
> >> +		blk_mq_stop_hw_queues(q);
> >> +		blk_resume_queue(q);
> >>  	}
> >>  	mutex_unlock(&ctrl->namespaces_mutex);
> >
> > Hey Bart, should nvme_stop_queues() really be resuming the blk queue?
>
> Hello Steve,
>
> Would you perhaps prefer that blk_resume_queue(q) is called from
> nvme_start_queues()? I think that would make the NVMe code harder to
> review.

I'm still learning the blk code (and nvme code :)), but I would think
blk_resume_queue() would cause requests to start being submitted on the
NVME queues, which I believe shouldn't happen when they are stopped. I'm
currently debugging a problem where requests are submitted to the
nvme-rdma driver while it has supposedly stopped all the nvme and blk
mqs. I tried your series at Christoph's request to see if it resolved my
problem, but it didn't.

> The above code won't cause any unexpected side effects if an
> NVMe namespace is removed after nvme_stop_queues() has been called and
> before nvme_start_queues() is called. Moving the blk_resume_queue(q)
> call into nvme_start_queues() will only work as expected if no
> namespaces are added nor removed between the nvme_stop_queues() and
> nvme_start_queues() calls. I'm not familiar enough with the NVMe code to
> know whether or not this change is safe ...
>

I'll have to look and see if new namespaces can be added/deleted while a
nvme controller is in the RECONNECTING state. In the meantime, I'm going
to move the blk_resume_queue() to nvme_start_queues() and see if it
helps my problem.

Christoph: Thoughts?

Steve.
On 09/27/2016 09:56 AM, James Bottomley wrote:
> On Tue, 2016-09-27 at 09:43 -0700, Bart Van Assche wrote:
>> On 09/27/2016 09:31 AM, Steve Wise wrote:
>>>> @@ -2079,11 +2075,15 @@ EXPORT_SYMBOL_GPL(nvme_kill_queues);
>>>>  void nvme_stop_queues(struct nvme_ctrl *ctrl)
>>>>  {
>>>>  	struct nvme_ns *ns;
>>>> +	struct request_queue *q;
>>>>
>>>>  	mutex_lock(&ctrl->namespaces_mutex);
>>>>  	list_for_each_entry(ns, &ctrl->namespaces, list) {
>>>> -		blk_mq_cancel_requeue_work(ns->queue);
>>>> -		blk_mq_stop_hw_queues(ns->queue);
>>>> +		q = ns->queue;
>>>> +		blk_quiesce_queue(q);
>>>> +		blk_mq_cancel_requeue_work(q);
>>>> +		blk_mq_stop_hw_queues(q);
>>>> +		blk_resume_queue(q);
>>>>  	}
>>>>  	mutex_unlock(&ctrl->namespaces_mutex);
>>>
>>> Hey Bart, should nvme_stop_queues() really be resuming the blk
>>> queue?
>>
>> Hello Steve,
>>
>> Would you perhaps prefer that blk_resume_queue(q) is called from
>> nvme_start_queues()? I think that would make the NVMe code harder to
>> review. The above code won't cause any unexpected side effects if an
>> NVMe namespace is removed after nvme_stop_queues() has been called
>> and before nvme_start_queues() is called. Moving the
>> blk_resume_queue(q) call into nvme_start_queues() will only work as
>> expected if no namespaces are added nor removed between the
>> nvme_stop_queues() and nvme_start_queues() calls. I'm not familiar
>> enough with the NVMe code to know whether or not this change is safe
>> ...
>
> It's something that looks obviously wrong, so explain why you need to
> do it, preferably in a comment above the function.

Hello James and Steve,

I will add a comment.

Please note that the above patch does not change the behavior of
nvme_stop_queues() except that it causes nvme_stop_queues() to wait
until any ongoing nvme_queue_rq() calls have finished.
blk_resume_queue() does not affect the value of the BLK_MQ_S_STOPPED bit
that has been set by blk_mq_stop_hw_queues(). All it does is to resume
pending blk_queue_enter() calls and to ensure that future
blk_queue_enter() calls do not block. Even after blk_resume_queue() has
been called if a new request is queued queue_rq() won't be invoked
because the BLK_MQ_S_STOPPED bit is still set. Patch "dm: Fix a race
condition related to stopping and starting queues" realizes a similar
change in the dm driver and that change has been tested extensively.

Bart.
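The explanation above describes two independent gates, which can be sketched with a toy user-space model. This is only an illustration of the description given in this message, not block layer code: all names (toy_queue, toy_quiesce(), toy_stop_hw() and so on) are hypothetical, the quiesced flag stands in for whatever blk_quiesce_queue()/blk_resume_queue() toggle, and the model refuses a submission where blk_queue_enter() would block.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model; every name here is hypothetical, not kernel API. */
struct toy_queue {
	bool quiesced;		/* stands in for the quiesce/resume state */
	bool hw_stopped;	/* stands in for the BLK_MQ_S_STOPPED bit */
	int pending;		/* requests queued but not yet dispatched */
	int dispatched;		/* how many requests reached queue_rq() */
};

void toy_quiesce(struct toy_queue *q) { assert(q); q->quiesced = true; }
void toy_resume(struct toy_queue *q)  { assert(q); q->quiesced = false; }
void toy_stop_hw(struct toy_queue *q) { assert(q); q->hw_stopped = true; }

/* Dispatch pending requests unless the stopped bit is set. */
void toy_run(struct toy_queue *q)
{
	if (q->hw_stopped)
		return;
	q->dispatched += q->pending;
	q->pending = 0;
}

/* Clear the stopped bit and dispatch whatever piled up meanwhile. */
void toy_start_hw(struct toy_queue *q)
{
	q->hw_stopped = false;
	toy_run(q);
}

/*
 * Queue one request. Returns false while the queue is quiesced; the
 * real blk_queue_enter() would block instead of failing.
 */
bool toy_submit(struct toy_queue *q)
{
	if (q->quiesced)
		return false;
	q->pending++;
	toy_run(q);
	return true;
}

/* Same call order as the patched nvme_stop_queues() loop body. */
void toy_stop_queue(struct toy_queue *q)
{
	toy_quiesce(q);
	toy_stop_hw(q);
	toy_resume(q);
}
```

After toy_stop_queue(), toy_submit() accepts a request again because the queue has been resumed, but the request stays in pending and is not dispatched until toy_start_hw() clears the stopped flag.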
>
> Hello James and Steve,
>
> I will add a comment.
>
> Please note that the above patch does not change the behavior of
> nvme_stop_queues() except that it causes nvme_stop_queues() to wait
> until any ongoing nvme_queue_rq() calls have finished.
> blk_resume_queue() does not affect the value of the BLK_MQ_S_STOPPED bit
> that has been set by blk_mq_stop_hw_queues(). All it does is to resume
> pending blk_queue_enter() calls and to ensure that future
> blk_queue_enter() calls do not block. Even after blk_resume_queue() has
> been called if a new request is queued queue_rq() won't be invoked
> because the BLK_MQ_S_STOPPED bit is still set. Patch "dm: Fix a race
> condition related to stopping and starting queues" realizes a similar
> change in the dm driver and that change has been tested extensively.
>

Thanks for the detailed explanation! I think your code, then, is correct
as-is. And this series doesn't fix the issue I'm hitting, so I'll keep
digging. :)

Steve.
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 057f1fa..6e2bf6a 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -201,13 +201,9 @@ fail:
 
 void nvme_requeue_req(struct request *req)
 {
-	unsigned long flags;
-
 	blk_mq_requeue_request(req);
-	spin_lock_irqsave(req->q->queue_lock, flags);
-	if (!blk_mq_queue_stopped(req->q))
-		blk_mq_kick_requeue_list(req->q);
-	spin_unlock_irqrestore(req->q->queue_lock, flags);
+	WARN_ON_ONCE(blk_mq_queue_stopped(req->q));
+	blk_mq_kick_requeue_list(req->q);
 }
 EXPORT_SYMBOL_GPL(nvme_requeue_req);
 
@@ -2079,11 +2075,15 @@ EXPORT_SYMBOL_GPL(nvme_kill_queues);
 void nvme_stop_queues(struct nvme_ctrl *ctrl)
 {
 	struct nvme_ns *ns;
+	struct request_queue *q;
 
 	mutex_lock(&ctrl->namespaces_mutex);
 	list_for_each_entry(ns, &ctrl->namespaces, list) {
-		blk_mq_cancel_requeue_work(ns->queue);
-		blk_mq_stop_hw_queues(ns->queue);
+		q = ns->queue;
+		blk_quiesce_queue(q);
+		blk_mq_cancel_requeue_work(q);
+		blk_mq_stop_hw_queues(q);
+		blk_resume_queue(q);
 	}
 	mutex_unlock(&ctrl->namespaces_mutex);
 }
Avoid that nvme_queue_rq() is still running when nvme_stop_queues()
returns. Untested.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/core.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)