diff mbox

[1/3] iw_cm: free cm_id resources on the last deref

Message ID 93c3c47c16406ef00184011948424a9597e4c6b8.1468879135.git.swise@opengridcomputing.com (mailing list archive)
State Superseded
Headers show

Commit Message

Steve Wise July 18, 2016, 8:44 p.m. UTC
Remove the complicated logic to free the cm_id resources in iw_cm event
handlers vs when an application thread destroys the device.  I'm not sure
why this code was written, but simply allowing the last deref to free
the memory is cleaner.  It also prevents a deadlock when applications
try to destroy cm_id's in their cm event handler function.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
---
 drivers/infiniband/core/iwcm.c | 41 ++++++-----------------------------------
 1 file changed, 6 insertions(+), 35 deletions(-)

Comments

Sagi Grimberg July 20, 2016, 8:51 a.m. UTC | #1
> Remove the complicated logic to free the cm_id resources in iw_cm event
> handlers vs when an application thread destroys the device.  I'm not sure
> why this code was written, but simply allowing the last deref to free
> the memory is cleaner.  It also prevents a deadlock when applications
> try to destroy cm_id's in their cm event handler function.

The description here is misleading. we can never destroy the cm_id
inside the cm_id handler. Also, I don't think the deadlock was on cm_id
removal but rather on the qp referenced by the cm_id. I think the change
log can be improved.

The patch looks fine to me.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise July 20, 2016, 1:51 p.m. UTC | #2
> 
> > Remove the complicated logic to free the cm_id resources in iw_cm event
> > handlers vs when an application thread destroys the device.  I'm not sure
> > why this code was written, but simply allowing the last deref to free
> > the memory is cleaner.  It also prevents a deadlock when applications
> > try to destroy cm_id's in their cm event handler function.
> 
> The description here is misleading. we can never destroy the cm_id
> inside the cm_id handler. Also, I don't think the deadlock was on cm_id
> removal but rather on the qp referenced by the cm_id. I think the change
> log can be improved.
>

I'll reword it. 
 
> The patch looks fine to me.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise July 21, 2016, 2:17 p.m. UTC | #3
> > > Remove the complicated logic to free the cm_id resources in iw_cm event
> > > handlers vs when an application thread destroys the device.  I'm not sure
> > > why this code was written, but simply allowing the last deref to free
> > > the memory is cleaner.  It also prevents a deadlock when applications
> > > try to destroy cm_id's in their cm event handler function.
> >
> > The description here is misleading. we can never destroy the cm_id
> > inside the cm_id handler. Also, I don't think the deadlock was on cm_id
> > removal but rather on the qp referenced by the cm_id. I think the change
> > log can be improved.
> >
> 
> I'll reword it.

The nvme unplug handler does indeed destroy all the qps -and- cm_ids used for
the controllers for this device, with the exception of the cm_id handling the
event.  That is what causes this deadlock.  Once I fixed iw_cxgb4 (in patch 2)
to not block until the refcnt reaches 0 in c4iw_destroy_qp(), I then hit the
block in iw_destroy_cm_id() which deadlocks the process due to the iw_cm worker
thread already stuck trying to post an event to the rdma_cm for the cm_id
handling the event.  

Perhaps I should describe the deadlock in detail like I did in the email threads
leading up to this series?

While I'm rambling, there is still a condition that probably needs to be
addressed:  if the application event handler function disconnects the cm_id that
is handling the event, the iw_cm workq thread gets stuck posting a
IW_CM_EVENT_CLOSE to rdma_cm.  So the iw_cm workq thread is stuck in
cm_close_handler() calling cm_id_priv->id.cm_handler() which is cma_iw_handler()
which is blocked in cma_disable_callback() because the application is currently
running its event handler for this cm_id.  This block is released when the
application returns from its event handler function.  

But maybe cma_iw_handler() should queue the event if it cannot deliver it, vs
blocking the iw_cm workq thread?   

Steve.




--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise July 21, 2016, 3:45 p.m. UTC | #4
> While I'm rambling, there is still a condition that probably needs to be
> addressed:  if the application event handler function disconnects the
> cm_id that is handling the event, the iw_cm workq thread gets stuck
> posting a IW_CM_EVENT_CLOSE to rdma_cm.  So the iw_cm workq thread is
> stuck in cm_close_handler() calling cm_id_priv->id.cm_handler() which is
> cma_iw_handler() which is blocked in cma_disable_callback() because the
> application is currently running its event handler for this cm_id.  This
> block is released when the application returns from its event handler
> function.
> 
> But maybe cma_iw_handler() should queue the event if it cannot deliver it,
> vs blocking the iw_cm workq thread?
> 

Answering myself:  queueing the event seems to be a no-go because the rdma_cm
can return non-zero to the iw_cm to indicate it should destroy the iw_cm_id.
This cannot be done correctly if the event is queued/deferred.





--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/core/iwcm.c b/drivers/infiniband/core/iwcm.c
index f057204..20a94ed 100644
--- a/drivers/infiniband/core/iwcm.c
+++ b/drivers/infiniband/core/iwcm.c
@@ -183,15 +183,14 @@  static void free_cm_id(struct iwcm_id_private *cm_id_priv)
 
 /*
  * Release a reference on cm_id. If the last reference is being
- * released, enable the waiting thread (in iw_destroy_cm_id) to
- * get woken up, and return 1 if a thread is already waiting.
+ * released, free the cm_id and return 1.
  */
 static int iwcm_deref_id(struct iwcm_id_private *cm_id_priv)
 {
 	BUG_ON(atomic_read(&cm_id_priv->refcount)==0);
 	if (atomic_dec_and_test(&cm_id_priv->refcount)) {
 		BUG_ON(!list_empty(&cm_id_priv->work_list));
-		complete(&cm_id_priv->destroy_comp);
+		free_cm_id(cm_id_priv);
 		return 1;
 	}
 
@@ -208,19 +207,10 @@  static void add_ref(struct iw_cm_id *cm_id)
 static void rem_ref(struct iw_cm_id *cm_id)
 {
 	struct iwcm_id_private *cm_id_priv;
-	int cb_destroy;
 
 	cm_id_priv = container_of(cm_id, struct iwcm_id_private, id);
 
-	/*
-	 * Test bit before deref in case the cm_id gets freed on another
-	 * thread.
-	 */
-	cb_destroy = test_bit(IWCM_F_CALLBACK_DESTROY, &cm_id_priv->flags);
-	if (iwcm_deref_id(cm_id_priv) && cb_destroy) {
-		BUG_ON(!list_empty(&cm_id_priv->work_list));
-		free_cm_id(cm_id_priv);
-	}
+	(void)iwcm_deref_id(cm_id_priv);
 }
 
 static int cm_event_handler(struct iw_cm_id *cm_id, struct iw_cm_event *event);
@@ -433,13 +423,7 @@  void iw_destroy_cm_id(struct iw_cm_id *cm_id)
 	struct iwcm_id_private *cm_id_priv;
 
 	cm_id_priv = container_of(cm_id, struct iwcm_id_private, id);
-	BUG_ON(test_bit(IWCM_F_CALLBACK_DESTROY, &cm_id_priv->flags));
-
 	destroy_cm_id(cm_id);
-
-	wait_for_completion(&cm_id_priv->destroy_comp);
-
-	free_cm_id(cm_id_priv);
 }
 EXPORT_SYMBOL(iw_destroy_cm_id);
 
@@ -809,10 +793,7 @@  static void cm_conn_req_handler(struct iwcm_id_private *listen_id_priv,
 	ret = cm_id->cm_handler(cm_id, iw_event);
 	if (ret) {
 		iw_cm_reject(cm_id, NULL, 0);
-		set_bit(IWCM_F_CALLBACK_DESTROY, &cm_id_priv->flags);
-		destroy_cm_id(cm_id);
-		if (atomic_read(&cm_id_priv->refcount)==0)
-			free_cm_id(cm_id_priv);
+		iw_destroy_cm_id(cm_id);
 	}
 
 out:
@@ -1000,7 +981,6 @@  static void cm_work_handler(struct work_struct *_work)
 	unsigned long flags;
 	int empty;
 	int ret = 0;
-	int destroy_id;
 
 	spin_lock_irqsave(&cm_id_priv->lock, flags);
 	empty = list_empty(&cm_id_priv->work_list);
@@ -1014,19 +994,10 @@  static void cm_work_handler(struct work_struct *_work)
 		spin_unlock_irqrestore(&cm_id_priv->lock, flags);
 
 		ret = process_event(cm_id_priv, &levent);
-		if (ret) {
-			set_bit(IWCM_F_CALLBACK_DESTROY, &cm_id_priv->flags);
+		if (ret)
 			destroy_cm_id(&cm_id_priv->id);
-		}
-		BUG_ON(atomic_read(&cm_id_priv->refcount)==0);
-		destroy_id = test_bit(IWCM_F_CALLBACK_DESTROY, &cm_id_priv->flags);
-		if (iwcm_deref_id(cm_id_priv)) {
-			if (destroy_id) {
-				BUG_ON(!list_empty(&cm_id_priv->work_list));
-				free_cm_id(cm_id_priv);
-			}
+		if (iwcm_deref_id(cm_id_priv))
 			return;
-		}
 		if (empty)
 			return;
 		spin_lock_irqsave(&cm_id_priv->lock, flags);