Message ID | 1469967347-20466-1-git-send-email-sagi@grimberg.me (mailing list archive)
---|---
State | Not Applicable
This looks reasonable to me, but a little question below:

> @@ -1442,7 +1491,8 @@ static void nvmet_rdma_remove_port(struct nvmet_port *port)
> {
> 	struct rdma_cm_id *cm_id = port->priv;
>
> -	rdma_destroy_id(cm_id);
> +	if (cm_id)
> +		rdma_destroy_id(cm_id);
> }

How is ->remove_port synchronized vs the RDMA/CM event handler?
> This looks reasonable to me, but a little question below:
>
>> @@ -1442,7 +1491,8 @@ static void nvmet_rdma_remove_port(struct nvmet_port *port)
>> {
>> 	struct rdma_cm_id *cm_id = port->priv;
>>
>> -	rdma_destroy_id(cm_id);
>> +	if (cm_id)
>> +		rdma_destroy_id(cm_id);
>> }
>
> How is ->remove_port synchronized vs the RDMA/CM event handler?

Easy, it isn't :)

So we have three choices here:
1. Add a lock in nvmet_port that only rdma will use for now (don't like it)
or
2. Add nvmet_rdma_port as nvmet_port->priv with a lock (don't like it)
or
3. take the global nvmet_config_sem (hate it)

Any preferences?
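[For illustration, option (2) could look roughly like the sketch below. struct nvmet_rdma_port, its fields, and the locking are hypothetical, shown only to make the option concrete; none of it comes from the posted patch.]

/*
 * Hypothetical sketch of option (2): keep the lock in an
 * rdma-private wrapper behind nvmet_port->priv so the nvmet
 * core does not need to know about it.
 */
struct nvmet_rdma_port {
	struct rdma_cm_id	*cm_id;
	struct mutex		lock;	/* remove_port vs. CM event handler */
};

static void nvmet_rdma_remove_port(struct nvmet_port *nport)
{
	struct nvmet_rdma_port *port = nport->priv;
	struct rdma_cm_id *cm_id;

	/*
	 * Claim the cm_id under the lock; the DEVICE_REMOVAL handler
	 * would take the same lock before clearing it, so only one
	 * side ends up calling rdma_destroy_id().
	 */
	mutex_lock(&port->lock);
	cm_id = port->cm_id;
	port->cm_id = NULL;
	mutex_unlock(&port->lock);

	if (cm_id)
		rdma_destroy_id(cm_id);
	kfree(port);
}

[Note rdma_destroy_id() is deliberately called outside the lock, since it blocks until running CM callbacks complete.]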
On 08/01/16 04:15, Christoph Hellwig wrote:
> This looks reasonable to me, but a little question below:
>
>> @@ -1442,7 +1491,8 @@ static void nvmet_rdma_remove_port(struct nvmet_port *port)
>> {
>> 	struct rdma_cm_id *cm_id = port->priv;
>>
>> -	rdma_destroy_id(cm_id);
>> +	if (cm_id)
>> +		rdma_destroy_id(cm_id);
>> }
>
> How is ->remove_port synchronized vs the RDMA/CM event handler?

rdma_destroy_id() waits until active RDMA/CM callbacks have finished. Is that sufficient or is further synchronization needed?

Bart.
On 08/01/16 07:44, Bart Van Assche wrote:
> On 08/01/16 04:15, Christoph Hellwig wrote:
>> This looks reasonable to me, but a little question below:
>>
>>> @@ -1442,7 +1491,8 @@ static void nvmet_rdma_remove_port(struct
>>> nvmet_port *port)
>>> {
>>> 	struct rdma_cm_id *cm_id = port->priv;
>>>
>>> -	rdma_destroy_id(cm_id);
>>> +	if (cm_id)
>>> +		rdma_destroy_id(cm_id);
>>> }
>>
>> How is ->remove_port synchronized vs the RDMA/CM event handler?
>
> rdma_destroy_id() waits until active RDMA/CM callbacks have finished. Is
> that sufficient or is further synchronization needed?

(replying to my own e-mail)

Please ignore my e-mail. I just realized that Christoph's question was about synchronization of nvmet_rdma_remove_port() versus the RDMA/CM event handler instead of just rdma_destroy_id().

Bart.
On Mon, Aug 01, 2016 at 02:30:37PM +0300, Sagi Grimberg wrote:
>> How is ->remove_port synchronized vs the RDMA/CM event handler?
>
> Easy, it isn't :)
>
> So we have three choices here:
> 1. Add a lock in nvmet_port that only rdma will use for now (don't like
> it)
> or
> 2. Add nvmet_rdma_port as nvmet_port->priv with a lock (don't like it)
> or
> 3. take the global nvmet_config_sem (hate it)
>
> Any preferences?

(4) use cmpxchg?
>>> How is ->remove_port synchronized vs the RDMA/CM event handler?
>>
>> Easy, it isn't :)
>>
>> So we have three choices here:
>> 1. Add a lock in nvmet_port that only rdma will use for now (don't like
>> it)
>> or
>> 2. Add nvmet_rdma_port as nvmet_port->priv with a lock (don't like it)
>> or
>> 3. take the global nvmet_config_sem (hate it)
>>
>> Any preferences?
>
> (4) use cmpxchg?

I'm not exactly sure what you mean. Do you mean placing cmpxchg in nvmet_rdma_device_removal()? What do we compare against when we want to xchg?

Care to explain in a bit more detail?
On Tue, Aug 02, 2016 at 09:39:48AM +0300, Sagi Grimberg wrote:
> I'm not exactly sure what you mean. Do you mean placing
> cmpxchg in nvmet_rdma_device_removal()? What do we compare
> against when we want to xchg?
>
> Care to explain in a bit more detail?

Right, a plain xchg() should be enough. E.g. do an xchg both in the device removal handler and in ->remove_port, and only destroy the cm_id if the caller was the one that took it out of the private data.
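[Concretely, the xchg() scheme could look roughly like the sketch below. This illustrates the suggestion and is not code from the v3 patch that follows; in particular, passing the cm_id into nvmet_rdma_device_removal() is an assumed signature change so the handler has something to compare against.]

/*
 * Sketch of the xchg() idea: whichever path atomically swaps the
 * cm_id out of port->priv owns its destruction, so ->remove_port
 * and the DEVICE_REMOVAL handler cannot both destroy it.
 */
static void nvmet_rdma_remove_port(struct nvmet_port *port)
{
	struct rdma_cm_id *cm_id = xchg(&port->priv, NULL);

	if (cm_id)
		rdma_destroy_id(cm_id);
}

static int nvmet_rdma_device_removal(struct rdma_cm_id *cm_id,
		struct nvmet_rdma_queue *queue)
{
	struct nvmet_port *port;

	if (!queue) {
		port = cm_id->context;
		/*
		 * Listener cm_id: if remove_port already swapped the
		 * cm_id out, it owns the destroy; tell the CM core
		 * not to destroy it again.
		 */
		if (xchg(&port->priv, NULL) != cm_id)
			return 0;
	}
	/* ... queue teardown elided ... */
	return 1;	/* returning 1 makes the CM core destroy the cm_id */
}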
diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index 347cc6d37dad..220f3152328d 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -77,6 +77,7 @@ enum nvmet_rdma_queue_state {
 	NVMET_RDMA_Q_CONNECTING,
 	NVMET_RDMA_Q_LIVE,
 	NVMET_RDMA_Q_DISCONNECTING,
+	NVMET_RDMA_IN_DEVICE_REMOVAL,
 };
 
 struct nvmet_rdma_queue {
@@ -979,7 +980,10 @@ static void nvmet_rdma_release_queue_work(struct work_struct *w)
 	struct nvmet_rdma_device *dev = queue->dev;
 
 	nvmet_rdma_free_queue(queue);
-	rdma_destroy_id(cm_id);
+
+	if (queue->state != NVMET_RDMA_IN_DEVICE_REMOVAL)
+		rdma_destroy_id(cm_id);
+
 	kref_put(&dev->ref, nvmet_rdma_free_dev);
 }
 
@@ -1228,8 +1232,9 @@ static void __nvmet_rdma_queue_disconnect(struct nvmet_rdma_queue *queue)
 	switch (queue->state) {
 	case NVMET_RDMA_Q_CONNECTING:
 	case NVMET_RDMA_Q_LIVE:
-		disconnect = true;
 		queue->state = NVMET_RDMA_Q_DISCONNECTING;
+	case NVMET_RDMA_IN_DEVICE_REMOVAL:
+		disconnect = true;
 		break;
 	case NVMET_RDMA_Q_DISCONNECTING:
 		break;
@@ -1267,9 +1272,62 @@ static void nvmet_rdma_queue_connect_fail(struct rdma_cm_id *cm_id,
 	schedule_work(&queue->release_work);
 }
 
+/**
+ * nvme_rdma_device_removal() - Handle RDMA device removal
+ * @queue: nvmet rdma queue (cm id qp_context)
+ * @addr: nvmet address (cm_id context)
+ *
+ * DEVICE_REMOVAL event notifies us that the RDMA device is about
+ * to unplug so we should take care of destroying our RDMA resources.
+ * This event will be generated for each allocated cm_id.
+ *
+ * Note that this event can be generated on a normal queue cm_id
+ * and/or a device bound listener cm_id (where in this case
+ * queue will be null).
+ *
+ * we claim ownership on destroying the cm_id. For queues we move
+ * the queue state to NVMET_RDMA_IN_DEVICE_REMOVAL and for port
+ * we nullify the priv to prevent double cm_id destruction and destroying
+ * the cm_id implicitely by returning a non-zero rc to the callout.
+ */
+static int nvmet_rdma_device_removal(struct nvmet_port *port,
+		struct nvmet_rdma_queue *queue)
+{
+	unsigned long flags;
+
+	if (!queue) {
+		/*
+		 * This is a listener cm_id. Make sure that
+		 * future remove_port won't invoke a double
+		 * cm_id destroy.
+		 */
+		port->priv = NULL;
+	} else {
+		/*
+		 * This is a queue cm_id. Make sure that
+		 * release queue will not destroy the cm_id
+		 * and schedule all ctrl queues removal (only
+		 * if the queue is not disconnecting already).
+		 */
+		spin_lock_irqsave(&queue->state_lock, flags);
+		if (queue->state != NVMET_RDMA_Q_DISCONNECTING)
+			queue->state = NVMET_RDMA_IN_DEVICE_REMOVAL;
+		spin_unlock_irqrestore(&queue->state_lock, flags);
+		nvmet_rdma_queue_disconnect(queue);
+		flush_scheduled_work();
+	}
+
+	/*
+	 * We need to return 1 so that the core will destroy
+	 * it's own ID. What a great API design..
+	 */
+	return 1;
+}
+
 static int nvmet_rdma_cm_handler(struct rdma_cm_id *cm_id,
 		struct rdma_cm_event *event)
 {
+	struct nvmet_port *port = cm_id->context;
 	struct nvmet_rdma_queue *queue = NULL;
 	int ret = 0;
 
@@ -1289,20 +1347,11 @@ static int nvmet_rdma_cm_handler(struct rdma_cm_id *cm_id,
 		break;
 	case RDMA_CM_EVENT_ADDR_CHANGE:
 	case RDMA_CM_EVENT_DISCONNECTED:
-	case RDMA_CM_EVENT_DEVICE_REMOVAL:
 	case RDMA_CM_EVENT_TIMEWAIT_EXIT:
-		/*
-		 * We can get the device removal callback even for a
-		 * CM ID that we aren't actually using. In that case
-		 * the context pointer is NULL, so we shouldn't try
-		 * to disconnect a non-existing queue. But we also
-		 * need to return 1 so that the core will destroy
-		 * it's own ID. What a great API design..
-		 */
-		if (queue)
-			nvmet_rdma_queue_disconnect(queue);
-		else
-			ret = 1;
+		nvmet_rdma_queue_disconnect(queue);
+		break;
+	case RDMA_CM_EVENT_DEVICE_REMOVAL:
+		ret = nvmet_rdma_device_removal(port, queue);
 		break;
 	case RDMA_CM_EVENT_REJECTED:
 	case RDMA_CM_EVENT_UNREACHABLE:
@@ -1442,7 +1491,8 @@ static void nvmet_rdma_remove_port(struct nvmet_port *port)
 {
 	struct rdma_cm_id *cm_id = port->priv;
 
-	rdma_destroy_id(cm_id);
+	if (cm_id)
+		rdma_destroy_id(cm_id);
 }
 
 static struct nvmet_fabrics_ops nvmet_rdma_ops = {
When configuring a device attached listener, we may see device
removal events. In this case we return a non-zero return code
from the cm event handler which implicitly destroys the cm_id.
It is possible that in the future the user will remove this
listener and by that trigger a second call to rdma_destroy_id
on an already destroyed cm_id -> BUG.

In addition, when a queue bound (active session) cm_id generates
a DEVICE_REMOVAL event we must guarantee all resources are
cleaned up by the time we return from the event handler.

Introduce nvmet_rdma_device_removal which addresses (or at least
attempts to) both scenarios.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
Sorry for the write amp :)

Changes from v2:
- Fixed (an old) review comment from hch, remove redundant locking
  around the queue state.

Changes from v1:
- Fixed some typos resulting from uncommitted code conflicts of an
  old patch.

 drivers/nvme/target/rdma.c | 82 +++++++++++++++++++++++++++++++++++++---------
 1 file changed, 66 insertions(+), 16 deletions(-)