[2/5] nvme-rdma: Free the I/O tags when we delete the controller

Message ID 1469822242-3477-3-git-send-email-sagi@grimberg.me (mailing list archive)
State Not Applicable

Commit Message

Sagi Grimberg July 29, 2016, 7:57 p.m. UTC
If we wait until we free the controller (free_ctrl) we might
lose our rdma device without any notification while we still
have open resources (tags, MRs and DMA mappings).

Instead, destroy the tags with their rdma resources once we
delete the device and not when freeing it.

Note that we don't do that in nvme_rdma_shutdown_ctrl because
controller reset uses it as well and we want to give active I/O
a chance to complete successfully.

Reported-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/rdma.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)
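
The ordering problem described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct lifecycle_ctrl and the lifecycle_* helpers are hypothetical names that do not exist in the driver, and it assumes that some other holder still has a reference to the controller when it is deleted, so the free callback runs later than the delete.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the driver. */
struct lifecycle_ctrl {
	int refs;            /* outstanding references to the controller */
	bool tags_allocated; /* stands in for the tag set, MRs and DMA mappings */
	bool freed;          /* set once the free callback has run */
};

/* Release the I/O resources; harmless if they are already gone. */
static void lifecycle_free_tags(struct lifecycle_ctrl *c)
{
	c->tags_allocated = false;
}

/* Drop one reference; the free callback only runs on the last put. */
static void lifecycle_put(struct lifecycle_ctrl *c, bool free_tags_in_free)
{
	assert(c->refs > 0);
	if (--c->refs == 0) {
		if (free_tags_in_free)
			lifecycle_free_tags(c);
		c->freed = true;
	}
}

/* Delete path: optionally release the tags now, then drop the delete reference. */
static void lifecycle_delete(struct lifecycle_ctrl *c, bool free_tags_in_delete)
{
	if (free_tags_in_delete)
		lifecycle_free_tags(c);
	lifecycle_put(c, !free_tags_in_delete);
}
```

With two references outstanding, lifecycle_delete(&c, false) leaves tags_allocated set until the last put, which is the window the patch is trying to close. lifecycle_delete(&c, true) releases the tags immediately, even though the controller itself is freed later.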

Comments

Christoph Hellwig Aug. 1, 2016, 11:03 a.m. UTC | #1
On Fri, Jul 29, 2016 at 10:57:19PM +0300, Sagi Grimberg wrote:
> If we wait until we free the controller (free_ctrl) we might
> lose our rdma device without any notification while we still
> have open resources (tags, MRs and DMA mappings).
> 
> Instead, destroy the tags with their rdma resources once we
> delete the device and not when freeing it.
> 
> Note that we don't do that in nvme_rdma_shutdown_ctrl because
> controller reset uses it as well and we want to give active I/O
> a chance to complete successfully.
> 
> Reported-by: Steve Wise <swise@opengridcomputing.com>
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>

This looks fine to me, but can we please share the code instead of
duplicating it?  E.g.

static void __nvme_rdma_remove_ctrl(struct nvme_rdma_ctrl *ctrl, bool shutdown)
{
	nvme_remove_namespaces(&ctrl->ctrl);
	if (shutdown)
		nvme_rdma_shutdown_ctrl(ctrl);
	nvme_uninit_ctrl(&ctrl->ctrl);
	if (ctrl->ctrl.tagset) {
		blk_cleanup_queue(ctrl->ctrl.connect_q);
		blk_mq_free_tag_set(&ctrl->tag_set);
		nvme_rdma_dev_put(ctrl->device);
	}
	nvme_put_ctrl(&ctrl->ctrl);
}

or in a second step we should probably always call shutdown_ctrl
but skip the actual shutdown if the ctrl state doesn't require it.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg Aug. 1, 2016, 11:15 a.m. UTC | #2
>> If we wait until we free the controller (free_ctrl) we might
>> lose our rdma device without any notification while we still
>> have open resources (tags, MRs and DMA mappings).
>>
>> Instead, destroy the tags with their rdma resources once we
>> delete the device and not when freeing it.
>>
>> Note that we don't do that in nvme_rdma_shutdown_ctrl because
>> controller reset uses it as well and we want to give active I/O
>> a chance to complete successfully.
>>
>> Reported-by: Steve Wise <swise@opengridcomputing.com>
>> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
>
> This looks fine to me, but can we please share the code instead of
> duplicating it?  E.g.
>
> static void __nvme_rdma_remove_ctrl(struct nvme_rdma_ctrl *ctrl, bool shutdown)
> {
> 	nvme_remove_namespaces(&ctrl->ctrl);
> 	if (shutdown)
> 		nvme_rdma_shutdown_ctrl(ctrl);
> 	nvme_uninit_ctrl(&ctrl->ctrl);
> 	if (ctrl->ctrl.tagset) {
> 		blk_cleanup_queue(ctrl->ctrl.connect_q);
> 		blk_mq_free_tag_set(&ctrl->tag_set);
> 		nvme_rdma_dev_put(ctrl->device);
> 	}
> 	nvme_put_ctrl(&ctrl->ctrl);
> }

That sounds fine to me.

> or in a second step we should probably always call shutdown_ctrl
> but skip the actual shutdown if the ctrl state doesn't require it.

What do you mean "if the ctrl state doesn't require it"?
Up until today we managed to avoid checking the ctrl state
in queue_rq and I'd like to keep it that way. I'd be much
happier if we don't depend on queue_rq to fail early under
some assumptions, it might be a slippery slope...
Christoph Hellwig Aug. 1, 2016, 3:47 p.m. UTC | #3
On Mon, Aug 01, 2016 at 02:15:59PM +0300, Sagi Grimberg wrote:
>> or in a second step we should probably always call shutdown_ctrl
>> but skip the actual shutdown if the ctrl state doesn't require it.
>
> What do you mean "if the ctrl state doesn't require it"?
> Up until today we managed to avoid checking the ctrl state
> in queue_rq and I'd like to keep it that way. I'd be much
> happier if we don't depend on queue_rq to fail early under
> some assumptions, it might be a slippery slope...

In this case I actually thought about checking the state in
the new __nvme_rdma_remove_ctrl.  If called from
nvme_rdma_del_ctrl_work the state will be NVME_CTRL_DELETING,
so we can check that to see if we want to do a controlled
shutdown using nvme_rdma_shutdown_ctrl instead of having
to overwrite ->remove_work.
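
A rough userspace sketch of the state check described above, to show the shape of the idea rather than actual driver code. The enum, struct state_ctrl and the state_* helpers are hypothetical stand-ins. The real function would operate on struct nvme_rdma_ctrl and call nvme_rdma_shutdown_ctrl.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the controller states; not the kernel enum. */
enum state_model { STATE_LIVE, STATE_RECONNECTING, STATE_DELETING };

struct state_ctrl {
	enum state_model state;
	bool has_tagset;     /* stands in for ctrl->ctrl.tagset */
	int shutdown_calls;  /* counts calls to the shutdown stand-in */
	int refs;
};

/* Stand-in for the controlled shutdown; only records that it ran. */
static void state_shutdown_ctrl(struct state_ctrl *c)
{
	c->shutdown_calls++;
}

/* One shared removal path; the state replaces the bool shutdown argument. */
static void state_remove_ctrl(struct state_ctrl *c)
{
	assert(c->refs > 0);
	if (c->state == STATE_DELETING)
		state_shutdown_ctrl(c);
	if (c->has_tagset)
		c->has_tagset = false; /* tag set and device reference released here */
	c->refs--;
}
```

In this model a controller in STATE_DELETING gets exactly one shutdown call, while any other state skips the shutdown and only releases the tag set and the reference.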
Sagi Grimberg Aug. 2, 2016, 6:14 a.m. UTC | #4
On 01/08/16 18:47, Christoph Hellwig wrote:
> On Mon, Aug 01, 2016 at 02:15:59PM +0300, Sagi Grimberg wrote:
>>> or in a second step we should probably always call shutdown_ctrl
>>> but skip the actual shutdown if the ctrl state doesn't require it.
>>
>> What do you mean "if the ctrl state doesn't require it"?
>> Up until today we managed to avoid checking the ctrl state
>> in queue_rq and I'd like to keep it that way. I'd be much
>> happier if we don't depend on queue_rq to fail early under
>> some assumptions, it might be a slippery slope...
>
> In this case I actually thought about checking the state in
> the new __nvme_rdma_remove_ctrl.  If called from
> nvme_rdma_del_ctrl_work the state will be NVME_CTRL_DELETING,
> so we can check that to see if we want to do a controlled
> shutdown using nvme_rdma_shutdown_ctrl instead of having
> to overwrite ->remove_work.
>

I don't know if that is sufficient. In case the host is
attempting reconnects periodically we can still delete
the controller, but the admin queue is not connected and
we shouldn't attempt to send a shutdown command...
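
The case above can be modelled in userspace to show why the state alone may not be enough. This is a sketch only: struct conn_ctrl, the admin_connected flag and both predicates are hypothetical, and the thread does not settle on this (or any) fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the controller states; not the kernel enum. */
enum conn_state { CONN_LIVE, CONN_RECONNECTING, CONN_DELETING };

struct conn_ctrl {
	enum conn_state state;
	bool admin_connected; /* assumed flag: is the admin queue usable? */
};

/* Check based on the controller state alone. */
static bool conn_shutdown_by_state(const struct conn_ctrl *c)
{
	return c->state == CONN_DELETING;
}

/* Same check, but also requires a connected admin queue. */
static bool conn_shutdown_if_connected(const struct conn_ctrl *c)
{
	return c->state == CONN_DELETING && c->admin_connected;
}
```

For a controller deleted in the middle of a reconnect attempt (state CONN_DELETING, admin_connected false), the first predicate would still ask for a shutdown command while the second would not.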

Patch

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index f391d67e4ee0..a70eb3cbf656 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -686,11 +686,6 @@  static void nvme_rdma_free_ctrl(struct nvme_ctrl *nctrl)
 	list_del(&ctrl->list);
 	mutex_unlock(&nvme_rdma_ctrl_mutex);
 
-	if (ctrl->ctrl.tagset) {
-		blk_cleanup_queue(ctrl->ctrl.connect_q);
-		blk_mq_free_tag_set(&ctrl->tag_set);
-		nvme_rdma_dev_put(ctrl->device);
-	}
 	kfree(ctrl->queues);
 	nvmf_free_options(nctrl->opts);
 free_ctrl:
@@ -1666,6 +1661,11 @@  static void nvme_rdma_del_ctrl_work(struct work_struct *work)
 	nvme_remove_namespaces(&ctrl->ctrl);
 	nvme_rdma_shutdown_ctrl(ctrl);
 	nvme_uninit_ctrl(&ctrl->ctrl);
+	if (ctrl->ctrl.tagset) {
+		blk_cleanup_queue(ctrl->ctrl.connect_q);
+		blk_mq_free_tag_set(&ctrl->tag_set);
+		nvme_rdma_dev_put(ctrl->device);
+	}
 	nvme_put_ctrl(&ctrl->ctrl);
 }
 
@@ -1701,6 +1701,11 @@  static void nvme_rdma_remove_ctrl_work(struct work_struct *work)
 
 	nvme_remove_namespaces(&ctrl->ctrl);
 	nvme_uninit_ctrl(&ctrl->ctrl);
+	if (ctrl->ctrl.tagset) {
+		blk_cleanup_queue(ctrl->ctrl.connect_q);
+		blk_mq_free_tag_set(&ctrl->tag_set);
+		nvme_rdma_dev_put(ctrl->device);
+	}
 	nvme_put_ctrl(&ctrl->ctrl);
 }