diff mbox

[for,kernel,v4.6] IB/srpt: Revert "Convert to percpu_ida tag allocation"

Message ID 56FDBA63.7010804@sandisk.com (mailing list archive)
State Not Applicable
Headers show

Commit Message

Bart Van Assche April 1, 2016, 12:01 a.m. UTC
That patch is wrong because it makes the ib_srpt driver use I/O
contexts allocated by transport_alloc_session_tags() but it does
not initialize these I/O contexts properly. All the initializations
performed by srpt_alloc_ioctx() are skipped. Revert commit
0fd10721fe36 and thereby fix the following kernel crash:

kernel BUG at drivers/infiniband/ulp/srpt/ib_srpt.c:1439!
invalid opcode: 0000 [#1] SMP
Workqueue: target_completion target_complete_ok_work [target_core_mod]
RIP: 0010:[<ffffffffa052ef37>]  [<ffffffffa052ef37>] srpt_queue_response+0x437/0x4a0 [ib_srpt]
Call Trace:
 [<ffffffffa052f009>] srpt_queue_data_in+0x9/0x10 [ib_srpt]
 [<ffffffffa04f1ee2>] target_complete_ok_work+0x152/0x2b0 [target_core_mod]
 [<ffffffff81071ea7>] process_one_work+0x197/0x480
 [<ffffffff810721d9>] worker_thread+0x49/0x490
 [<ffffffff8107878a>] kthread+0xea/0x100
 [<ffffffff8159b172>] ret_from_fork+0x22/0x40

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
---
 drivers/infiniband/ulp/srpt/ib_srpt.c | 55 ++++++++++++++++++++++++-----------
 drivers/infiniband/ulp/srpt/ib_srpt.h |  2 ++
 2 files changed, 40 insertions(+), 17 deletions(-)

Comments

Leon Romanovsky April 1, 2016, 3:03 a.m. UTC | #1
On Thu, Mar 31, 2016 at 05:01:39PM -0700, Bart Van Assche wrote:
> That patch is wrong because it makes the ib_srpt driver use I/O
> contexts allocated by transport_alloc_session_tags() but it does
> not initialize these I/O contexts properly.

Did you have a chance to see which initializations are missing in this
case? What is needed to do if we decide to fix original patch?

Except these questions, the revert is fine :)
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bart Van Assche April 1, 2016, 3:12 a.m. UTC | #2
On 03/31/16 20:05, Leon Romanovsky wrote:
> On Thu, Mar 31, 2016 at 05:01:39PM -0700, Bart Van Assche wrote:
>> That patch is wrong because it makes the ib_srpt driver use I/O
>> contexts allocated by transport_alloc_session_tags() but it does
>> not initialize these I/O contexts properly.
>
> Did you have a chance to see which initializations are missing in this
> case? What is needed to do if we decide to fix original patch?
>
> Except these questions, the revert is fine :)
> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>

Hello Leon,

Thanks for the review. The initializations that are missing from that 
patch are the 'buf' pointer in the srpt_ioctx structure and mapping that 
buffer for DMA. Another bug introduced by that patch is that it doubles 
the amount of memory that is allocated for I/O contexts. New I/O context 
allocations were added by that patch but the existing I/O context 
allocation code was not removed.

Regarding reconsidering the original patch: before we do that it has to 
be shown with numbers that the percpu_ida conversion does not decrease 
performance. This is something I had already asked two months ago. See 
also 
http://thread.gmane.org/gmane.linux.scsi.target.devel/11253/focus=110559.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Leon Romanovsky April 1, 2016, 3:45 a.m. UTC | #3
On Thu, Mar 31, 2016 at 08:12:57PM -0700, Bart Van Assche wrote:
> On 03/31/16 20:05, Leon Romanovsky wrote:
> >On Thu, Mar 31, 2016 at 05:01:39PM -0700, Bart Van Assche wrote:
> >>That patch is wrong because it makes the ib_srpt driver use I/O
> >>contexts allocated by transport_alloc_session_tags() but it does
> >>not initialize these I/O contexts properly.
> >
> >Did you have a chance to see which initializations are missing in this
> >case? What is needed to do if we decide to fix original patch?
> >
> >Except these questions, the revert is fine :)
> >Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
> 
> Hello Leon,
> 
> Thanks for the review. The initializations that are missing from that patch
> are the 'buf' pointer in the srpt_ioctx structure and mapping that buffer
> for DMA.

Generally speaking you can call to srpt_alloc_ioctx or something
similar to perform mapping, right after that line
+       ioctx = &((struct srpt_send_ioctx *)se_sess->sess_cmd_map)[tag];

> Another bug introduced by that patch is that it doubles the amount
> of memory that is allocated for I/O contexts. New I/O context allocations
> were added by that patch but the existing I/O context allocation code was
> not removed.

It should be easy to fix, just delete it.

> 
> Regarding reconsidering the original patch: before we do that it has to be
> shown with numbers that the percpu_ida conversion does not decrease
> performance. This is something I had already asked two months ago. See also
> http://thread.gmane.org/gmane.linux.scsi.target.devel/11253/focus=110559.

I don't think that performance was the main goal of that patch set. In
addition, percpu_ida is supposed allocate tags on the percpu list which
will give the same performance as initial code.

> 
> Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicholas A. Bellinger April 3, 2016, 3:56 a.m. UTC | #4
(Adding missing target-devel CC')

On Thu, 2016-03-31 at 20:12 -0700, Bart Van Assche wrote:
> On 03/31/16 20:05, Leon Romanovsky wrote:
> > On Thu, Mar 31, 2016 at 05:01:39PM -0700, Bart Van Assche wrote:
> >> That patch is wrong because it makes the ib_srpt driver use I/O
> >> contexts allocated by transport_alloc_session_tags() but it does
> >> not initialize these I/O contexts properly.
> >
> > Did you have a chance to see which initializations are missing in this
> > case? What is needed to do if we decide to fix original patch?
> >
> > Except these questions, the revert is fine :)
> > Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
> 
> Hello Leon,
> 
> Thanks for the review. The initializations that are missing from that 
> patch are the 'buf' pointer in the srpt_ioctx structure and mapping that 
> buffer for DMA. Another bug introduced by that patch is that it doubles 
> the amount of memory that is allocated for I/O contexts. New I/O context 
> allocations were added by that patch but the existing I/O context 
> allocation code was not removed.
> 
> Regarding reconsidering the original patch: before we do that it has to 
> be shown with numbers that the percpu_ida conversion does not decrease 
> performance. This is something I had already asked two months ago. See 
> also 
> http://thread.gmane.org/gmane.linux.scsi.target.devel/11253/focus=110559.
> 

Like I said two months ago, there is no reason why ib_srpt needs special
pre-allocation for descriptors containing se_cmd, when every other
single driver in the tree already uses common percpu_ida code for it.

Also, I don't know why none of your ib_srpt patches ever make it to
target-devel, but can you please stop trying to push target driver
changes upstream without first notifying target-devel..?

Beyond that, are you going to send an bug-fix to address this regression
in v4.6 code..?  If not, I'll add this to the queue and just do it
myself.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg April 3, 2016, 1:13 p.m. UTC | #5
> Like I said two months ago, there is no reason why ib_srpt needs special
> pre-allocation for descriptors containing se_cmd, when every other
> single driver in the tree already uses common percpu_ida code for it.

It kinda makes sense that ib_srpt would align to what all the rest are
doing. I think that fixing the regression with specific assignments
would've made more sense than a complete revert...
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bart Van Assche April 3, 2016, 2:26 p.m. UTC | #6
On 04/02/16 20:56, Nicholas A. Bellinger wrote:
> Also, I don't know why none of your ib_srpt patches ever make it to
> target-devel, but can you please stop trying to push target driver
> changes upstream without first notifying target-devel..?
>
> Beyond that, are you going to send an bug-fix to address this regression
> in v4.6 code..?  If not, I'll add this to the queue and just do it
> myself.

Hello Nic,

Everyone who is interested in Linux and RDMA and also in the ib_srpt 
target driver is already reading the linux-rdma mailing list as far as I 
know. But I can CC target-devel for future ib_srpt patch submissions if 
you prefer this.

The kernel crash introduced by "Convert to percpu_ida tag allocation" 
occurs as soon as the first SCSI command is received by the ib_srpt 
target driver. This means that that patch had not been tested at all. It 
should have been mentioned in the description of that patch that that 
patch was untested. And untested patches should not be sent to Linus.

Regarding when and how to fix this regression: aligning the ib_srpt 
target driver with what other target drivers are doing would introduce 
new code. New code should be introduced during the merge window. Since 
the v4.6 merge window has been closed I propose to send in the revert 
for v4.6-rc<n> and to send in the new version of this patch for the v4.7 
merge window.

Bart.


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicholas A. Bellinger April 4, 2016, 6:34 p.m. UTC | #7
On Sun, 2016-04-03 at 16:13 +0300, sagig wrote:
> > Like I said two months ago, there is no reason why ib_srpt needs special
> > pre-allocation for descriptors containing se_cmd, when every other
> > single driver in the tree already uses common percpu_ida code for it.
> 
> It kinda makes sense that ib_srpt would align to what all the rest are
> doing. I think that fixing the regression with specific assignments
> would've made more sense than a complete revert...

The bug-fix is here:

http://marc.info/?l=linux-rdma&m=145979454519746&w=2

I need to dig up some ConnectX-3 ports that aren't running in Ethernet
mode to verify with over the next days, so any testing of this patch for
those folks who have IB handy would be appreciated.


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 13594db..3b425af 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -1264,26 +1264,40 @@  free_mem:
  */
 static struct srpt_send_ioctx *srpt_get_send_ioctx(struct srpt_rdma_ch *ch)
 {
-	struct se_session *se_sess;
 	struct srpt_send_ioctx *ioctx;
-	int tag;
+	unsigned long flags;
 
 	BUG_ON(!ch);
-	se_sess = ch->sess;
 
-	tag = percpu_ida_alloc(&se_sess->sess_tag_pool, TASK_RUNNING);
-	if (tag < 0) {
-		pr_err("Unable to obtain tag for srpt_send_ioctx\n");
-		return NULL;
+	ioctx = NULL;
+	spin_lock_irqsave(&ch->spinlock, flags);
+	if (!list_empty(&ch->free_list)) {
+		ioctx = list_first_entry(&ch->free_list,
+					 struct srpt_send_ioctx, free_list);
+		list_del(&ioctx->free_list);
 	}
-	ioctx = &((struct srpt_send_ioctx *)se_sess->sess_cmd_map)[tag];
-	memset(ioctx, 0, sizeof(struct srpt_send_ioctx));
-	ioctx->ch = ch;
+	spin_unlock_irqrestore(&ch->spinlock, flags);
+
+	if (!ioctx)
+		return ioctx;
+
+	BUG_ON(ioctx->ch != ch);
 	spin_lock_init(&ioctx->spinlock);
 	ioctx->state = SRPT_STATE_NEW;
+	ioctx->n_rbuf = 0;
+	ioctx->rbufs = NULL;
+	ioctx->n_rdma = 0;
+	ioctx->n_rdma_wrs = 0;
+	ioctx->rdma_wrs = NULL;
+	ioctx->mapped_sg_count = 0;
 	init_completion(&ioctx->tx_done);
-
-	ioctx->cmd.map_tag = tag;
+	ioctx->queue_status_only = false;
+	/*
+	 * transport_init_se_cmd() does not initialize all fields, so do it
+	 * here.
+	 */
+	memset(&ioctx->cmd, 0, sizeof(ioctx->cmd));
+	memset(&ioctx->sense_data, 0, sizeof(ioctx->sense_data));
 
 	return ioctx;
 }
@@ -2013,7 +2027,7 @@  static int srpt_cm_req_recv(struct ib_cm_id *cm_id,
 	struct ib_cm_rep_param *rep_param;
 	struct srpt_rdma_ch *ch, *tmp_ch;
 	u32 it_iu_len;
-	int ret = 0;
+	int i, ret = 0;
 	unsigned char *p;
 
 	WARN_ON_ONCE(irqs_disabled());
@@ -2135,6 +2149,12 @@  static int srpt_cm_req_recv(struct ib_cm_id *cm_id,
 	if (!ch->ioctx_ring)
 		goto free_ch;
 
+	INIT_LIST_HEAD(&ch->free_list);
+	for (i = 0; i < ch->rq_size; i++) {
+		ch->ioctx_ring[i]->ch = ch;
+		list_add_tail(&ch->ioctx_ring[i]->free_list, &ch->free_list);
+	}
+
 	ret = srpt_create_ch_ib(ch);
 	if (ret) {
 		rej->reason = cpu_to_be32(
@@ -2165,8 +2185,7 @@  static int srpt_cm_req_recv(struct ib_cm_id *cm_id,
 	p = &ch->sess_name[0];
 
 try_again:
-	ch->sess = target_alloc_session(&sport->port_tpg_1, ch->rq_size,
-					sizeof(struct srpt_send_ioctx),
+	ch->sess = target_alloc_session(&sport->port_tpg_1, 0, 0,
 					TARGET_PROT_NORMAL, p, ch, NULL);
 	if (IS_ERR(ch->sess)) {
 		pr_info("Rejected login because no ACL has been"
@@ -2873,7 +2892,7 @@  static void srpt_release_cmd(struct se_cmd *se_cmd)
 	struct srpt_send_ioctx *ioctx = container_of(se_cmd,
 				struct srpt_send_ioctx, cmd);
 	struct srpt_rdma_ch *ch = ioctx->ch;
-	struct se_session *se_sess = ch->sess;
+	unsigned long flags;
 
 	WARN_ON(ioctx->state != SRPT_STATE_DONE);
 	WARN_ON(ioctx->mapped_sg_count != 0);
@@ -2884,7 +2903,9 @@  static void srpt_release_cmd(struct se_cmd *se_cmd)
 		ioctx->n_rbuf = 0;
 	}
 
-	percpu_ida_free(&se_sess->sess_tag_pool, se_cmd->map_tag);
+	spin_lock_irqsave(&ch->spinlock, flags);
+	list_add(&ioctx->free_list, &ch->free_list);
+	spin_unlock_irqrestore(&ch->spinlock, flags);
 }
 
 /**
diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.h b/drivers/infiniband/ulp/srpt/ib_srpt.h
index ca288f0..af9b8b5 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.h
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.h
@@ -179,6 +179,7 @@  struct srpt_recv_ioctx {
  * struct srpt_send_ioctx - SRPT send I/O context.
  * @ioctx:       See above.
  * @ch:          Channel pointer.
+ * @free_list:   Node in srpt_rdma_ch.free_list.
  * @n_rbuf:      Number of data buffers in the received SRP command.
  * @rbufs:       Pointer to SRP data buffer array.
  * @single_rbuf: SRP data buffer if the command has only a single buffer.
@@ -201,6 +202,7 @@  struct srpt_send_ioctx {
 	struct srp_direct_buf	*rbufs;
 	struct srp_direct_buf	single_rbuf;
 	struct scatterlist	*sg;
+	struct list_head	free_list;
 	spinlock_t		spinlock;
 	enum srpt_command_state	state;
 	struct se_cmd		cmd;