diff mbox

[09/15] IB/srpt: Fix srpt_close_session()

Message ID 568BD231.10205@sandisk.com (mailing list archive)
State Superseded
Headers show

Commit Message

Bart Van Assche Jan. 5, 2016, 2:24 p.m. UTC
Avoid that srpt_close_session() waits if it doesn't have to wait.
Additionally, increase the time during which srpt_close_session()
waits until closing a session has finished. This makes it easier
to detect session shutdown bugs.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

Comments

Christoph Hellwig Jan. 6, 2016, 5:08 a.m. UTC | #1
On Tue, Jan 05, 2016 at 03:24:49PM +0100, Bart Van Assche wrote:
> Avoid that srpt_close_session() waits if it doesn't have to wait.
> Additionally, increase the time during which srpt_close_session()
> waits until closing a session has finished. This makes it easier
> to detect session shutdown bugs.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg Jan. 6, 2016, 2:21 p.m. UTC | #2
> Avoid that srpt_close_session() waits if it doesn't have to wait.

Can you explain when it doesn't have to wait? is it possible that
srpt_release_channel_work() was already triggered? isn't that a problem?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bart Van Assche Jan. 6, 2016, 2:34 p.m. UTC | #3
On 01/06/2016 03:21 PM, Sagi Grimberg wrote:
>> Avoid that srpt_close_session() waits if it doesn't have to wait.
>
> Can you explain when it doesn't have to wait? is it possible that
> srpt_release_channel_work() was already triggered? isn't that a problem?

Hello Sagi,

The target core can decide to shut down an RDMA channel or a channel 
shutdown can be the result of the reception of a DREQ message. It is in 
the latter case that srpt_release_channel_work() can have finished 
before srpt_close_session() is called.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg Jan. 6, 2016, 2:46 p.m. UTC | #4
On 06/01/2016 16:34, Bart Van Assche wrote:
> On 01/06/2016 03:21 PM, Sagi Grimberg wrote:
>>> Avoid that srpt_close_session() waits if it doesn't have to wait.
>>
>> Can you explain when it doesn't have to wait? is it possible that
>> srpt_release_channel_work() was already triggered? isn't that a problem?
>
> Hello Sagi,
>
> The target core can decide to shut down an RDMA channel or a channel
> shutdown can be the result of the reception of a DREQ message. It is in
> the latter case that srpt_release_channel_work() can have finished
> before srpt_close_session() is called.

I understood that, and the reason I asked was because of the fact that
you dereference the ch while the channel release is ongoing...

Further reading tells me this is handled in patch 15 "IB/srpt: Fix a
rare crash in srpt_close_session()" correct?

If so, should it come before this patch?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 0ff4ed6..1857d17 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -1981,8 +1981,8 @@  static void srpt_release_channel_work(struct work_struct *w)
 	struct se_session *se_sess;
 
 	ch = container_of(w, struct srpt_rdma_ch, release_work);
-	pr_debug("ch = %p; ch->sess = %p; release_done = %p\n", ch, ch->sess,
-		 ch->release_done);
+	pr_debug("%s: %s-%d; release_done = %p\n", __func__, ch->sess_name,
+		 ch->qp->qp_num, ch->release_done);
 
 	sdev = ch->sport->sdev;
 	BUG_ON(!sdev);
@@ -2006,11 +2006,10 @@  static void srpt_release_channel_work(struct work_struct *w)
 			     ch->rsp_size, DMA_TO_DEVICE);
 
 	spin_lock_irq(&sdev->spinlock);
-	list_del(&ch->list);
-	spin_unlock_irq(&sdev->spinlock);
-
+	list_del_init(&ch->list);
 	if (ch->release_done)
 		complete(ch->release_done);
+	spin_unlock_irq(&sdev->spinlock);
 
 	wake_up(&sdev->ch_releaseQ);
 
@@ -3033,24 +3032,26 @@  static void srpt_release_cmd(struct se_cmd *se_cmd)
 static void srpt_close_session(struct se_session *se_sess)
 {
 	DECLARE_COMPLETION_ONSTACK(release_done);
-	struct srpt_rdma_ch *ch;
-	struct srpt_device *sdev;
-	unsigned long res;
-
-	ch = se_sess->fabric_sess_ptr;
-	WARN_ON(ch->sess != se_sess);
+	struct srpt_rdma_ch *ch = se_sess->fabric_sess_ptr;
+	struct srpt_device *sdev = ch->sport->sdev;
+	bool wait;
 
-	pr_debug("ch %p state %d\n", ch, ch->state);
+	pr_debug("ch %s-%d state %d\n", ch->sess_name, ch->qp->qp_num,
+		 ch->state);
 
-	sdev = ch->sport->sdev;
 	spin_lock_irq(&sdev->spinlock);
 	BUG_ON(ch->release_done);
 	ch->release_done = &release_done;
+	wait = !list_empty(&ch->list);
 	__srpt_close_ch(ch);
 	spin_unlock_irq(&sdev->spinlock);
 
-	res = wait_for_completion_timeout(&release_done, 60 * HZ);
-	WARN_ON(res == 0);
+	if (!wait)
+		return;
+
+	while (wait_for_completion_timeout(&release_done, 180 * HZ) == 0)
+		pr_info("%s(%s-%d state %d): still waiting ...\n", __func__,
+			ch->sess_name, ch->qp->qp_num, ch->state);
 }
 
 /**