diff mbox

[PATCH/RFC] IB/mad: Fix lock-lock-timer deadlock in RMPP code (was: [NEW PATCH] IB/mad: Fix possible lock-lock-timer deadlock)

Message ID adavdjrkfyq.fsf_-_@cisco.com (mailing list archive)
State Superseded, archived
Headers show

Commit Message

Roland Dreier Sept. 9, 2009, 8:42 p.m. UTC
Holding agent->lock across cancel_delayed_work() (which does
del_timer_sync()) in ib_cancel_rmpp_recvs() leads to lockdep reports of
possible lock-timer deadlocks if a consumer ever does something that
connects agent->lock to a lock taken in IRQ context (cf
http://marc.info/?l=linux-rdma&m=125243699026045).

However, it seems this locking is not necessary here, since the locking
did not prevent the rmpp_list from having an item added immediately
after the lock is dropped -- so there must be sufficient synchronization
protecting the rmpp_list without the locking here.  Therefore, we can
fix the lockdep issue by simply deleting the locking.


Hal/Sean, does this look right to you?
---
 drivers/infiniband/core/mad_rmpp.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Hefty, Sean Sept. 9, 2009, 9:22 p.m. UTC | #1
>Holding agent->lock across cancel_delayed_work() (which does
>del_timer_sync()) in ib_cancel_rmpp_recvs() leads to lockdep reports of
>possible lock-timer deadlocks if a consumer ever does something that
>connects agent->lock to a lock taken in IRQ context (cf
>http://marc.info/?l=linux-rdma&m=125243699026045).
>
>However, it seems this locking is not necessary here, since the locking
>did not prevent the rmpp_list from having an item added immediately
>after the lock is dropped -- so there must be sufficient synchronization
>protecting the rmpp_list without the locking here.  Therefore, we can
>fix the lockdep issue by simply deleting the locking.

The locking is needed to protect against items being removed from rmpp_list in
recv_timeout_handler() and recv_cleanup_handler().  No new items should be added
to the rmpp_list when ib_cancel_rmpp_recvs() is running (or there's a separate
bug).

- Sean

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roland Dreier Sept. 9, 2009, 9:34 p.m. UTC | #2
> The locking is needed to protect against items being removed from rmpp_list in
 > recv_timeout_handler() and recv_cleanup_handler().  No new items should be added
 > to the rmpp_list when ib_cancel_rmpp_recvs() is running (or there's a separate
 > bug).

Ah, I see.

That's trickier I guess... hmm...

 - R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/core/mad_rmpp.c b/drivers/infiniband/core/mad_rmpp.c
index 57a3c6f..865c109 100644
--- a/drivers/infiniband/core/mad_rmpp.c
+++ b/drivers/infiniband/core/mad_rmpp.c
@@ -85,12 +85,10 @@  void ib_cancel_rmpp_recvs(struct ib_mad_agent_private *agent)
 	struct mad_rmpp_recv *rmpp_recv, *temp_rmpp_recv;
 	unsigned long flags;
 
-	spin_lock_irqsave(&agent->lock, flags);
 	list_for_each_entry(rmpp_recv, &agent->rmpp_list, list) {
 		cancel_delayed_work(&rmpp_recv->timeout_work);
 		cancel_delayed_work(&rmpp_recv->cleanup_work);
 	}
-	spin_unlock_irqrestore(&agent->lock, flags);
 
 	flush_workqueue(agent->qp_info->port_priv->wq);