[v4,7/9] IB/core: generic RDMA READ/WRITE API

Message ID: 56E60E0F.7060900@sandisk.com
State: Superseded

Commit Message

Bart Van Assche March 14, 2016, 1:04 a.m. UTC
On 03/11/16 22:12, Christoph Hellwig wrote:
> On Fri, Mar 11, 2016 at 02:39:16PM -0800, Bart Van Assche wrote:
>> The above is fine with me. But when I ran a test with rdma_rw_use_mr()
>> changed into "return true" the following error messages appeared in the
>> kernel log:
>>
>> [  364.460709] ib_srpt 0x1: parsing SRP descriptor table failed.
>> [  383.604809] ib_srpt 0x0: parsing SRP descriptor table failed.
>> [  383.605627] ib_srpt 0x2: parsing SRP descriptor table failed.
>> [  386.702905] ib_srpt 0x3: parsing SRP descriptor table failed.
>> [  386.703092] ib_srpt 0x4: parsing SRP descriptor table failed.
>> [  386.703242] ib_srpt 0x5: parsing SRP descriptor table failed.
>> [  386.703411] ib_srpt 0x6: parsing SRP descriptor table failed.
>>
>> Is this expected? I ran this test on a server equipped with two mlx4 HCAs
>> with latest firmware (2.36.5000). I installed git commit
>> c4c65482b56a433a82bc5b63db8ba125727e9f80 of the rdma-rw-api merged with
>> v4.5-rc7. Initiator and target drivers were running on the same server and
>> were communicating with each other via loopback. Before I modified
>> rdma_rw_use_mr() the same test passed on the same setup.
> 
> I think this might be the case when SRP gets multiple SGL entries.
> In this case the number of MRs allocated is limited and srpt should
> handle rdma_rw_ctx_init failures due to the lack of MRs.  If you add
> the ib_mr_pool_get failure printk back that you asked me to remove
> I bet it's going to trigger.

Hello Christoph,

After having applied the debugging patch shown in the "Patch" section below
and after having run:

echo 'module ib_core +pmf' > /sys/kernel/debug/dynamic_debug/control

the following output appeared:

[ 1104.391493] ib_core:rdma_rw_init_mr_wrs: failed to allocate MR 0/1 from pool (in use: 2048)
[ 1104.391621] ib_srpt 0x0: parsing SRP descriptor table failed.
[ 1104.391762] ib_core:rdma_rw_init_mr_wrs: failed to allocate MR 0/1 from pool (in use: 2048)
[ 1104.391864] ib_srpt 0x1: parsing SRP descriptor table failed.
[ 1104.391987] ib_core:rdma_rw_init_mr_wrs: failed to allocate MR 0/1 from pool (in use: 2048)
[ 1104.392085] ib_srpt 0x2: parsing SRP descriptor table failed.
[ 1104.392208] ib_core:rdma_rw_init_mr_wrs: failed to allocate MR 0/1 from pool (in use: 2048)
[ 1104.392306] ib_srpt 0x3: parsing SRP descriptor table failed.
[ 1104.392427] ib_core:rdma_rw_init_mr_wrs: failed to allocate MR 0/1 from pool (in use: 2048)
[ 1104.392525] ib_srpt 0x4: parsing SRP descriptor table failed.
[ 1104.392647] ib_core:rdma_rw_init_mr_wrs: failed to allocate MR 0/1 from pool (in use: 2048)
[ 1104.392745] ib_srpt 0x5: parsing SRP descriptor table failed.
[ 1104.392867] ib_core:rdma_rw_init_mr_wrs: failed to allocate MR 0/1 from pool (in use: 2048)
[ 1104.392965] ib_srpt 0x6: parsing SRP descriptor table failed.
[ 1104.393089] ib_core:rdma_rw_init_mr_wrs: failed to allocate MR 0/1 from pool (in use: 2048)
[ 1104.393189] ib_srpt 0x7: parsing SRP descriptor table failed.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Christoph Hellwig March 15, 2016, 8:45 a.m. UTC | #1
Yes, that's exactly what I expected. So if rdma_rw_init_mr_wrs
fails with -EAGAIN we'll need to propagate this all the way
to srpt_handle_new_iu and then add the command to the wait list.

I'll prepare a patch for that.
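
To illustrate the idea above, here is a minimal user-space sketch of the
defer-and-retry pattern. It is not the actual ib_srpt patch and not kernel
code. All names (struct channel, ctx_init(), handle_new_cmd(),
complete_cmd(), wait_list_len()) are hypothetical stand-ins, and the MR pool
is modelled as a plain counter. It only shows one plausible way to queue a
command on -EAGAIN and retry it once an MR is returned to the pool.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command; "next" links it into the channel's wait list. */
struct cmd {
	int id;
	bool has_mr;
	struct cmd *next;
};

/* Hypothetical channel: a counter stands in for the per-QP MR pool. */
struct channel {
	int free_mrs;
	struct cmd *wait_head;
	struct cmd *wait_tail;
};

/* Stand-in for the context init step: -EAGAIN when the pool is empty. */
static int ctx_init(struct channel *ch, struct cmd *c)
{
	if (ch->free_mrs == 0)
		return -EAGAIN;
	ch->free_mrs--;
	c->has_mr = true;
	return 0;
}

/* Append a command to the tail of the FIFO wait list. */
static void wait_list_add(struct channel *ch, struct cmd *c)
{
	c->next = NULL;
	if (ch->wait_tail)
		ch->wait_tail->next = c;
	else
		ch->wait_head = c;
	ch->wait_tail = c;
}

/* Count the commands that are currently deferred. */
static int wait_list_len(const struct channel *ch)
{
	int n = 0;

	for (const struct cmd *c = ch->wait_head; c; c = c->next)
		n++;
	return n;
}

/*
 * Stand-in for the new-IU handler: on -EAGAIN the command is deferred
 * instead of being failed with a parse error.
 */
static int handle_new_cmd(struct channel *ch, struct cmd *c)
{
	int ret = ctx_init(ch, c);

	if (ret == -EAGAIN)
		wait_list_add(ch, c);
	return ret;
}

/* On completion, return the MR and retry deferred commands in order. */
static void complete_cmd(struct channel *ch, struct cmd *c)
{
	if (c->has_mr) {
		c->has_mr = false;
		ch->free_mrs++;
	}
	while (ch->wait_head) {
		struct cmd *w = ch->wait_head;

		if (ctx_init(ch, w) != 0)
			break;	/* pool is empty again, keep waiting */
		ch->wait_head = w->next;
		if (!ch->wait_head)
			ch->wait_tail = NULL;
		w->next = NULL;
	}
}
```

With a pool of two MRs, a third command gets -EAGAIN from handle_new_cmd()
and sits on the wait list until complete_cmd() runs for one of the first
two. Retrying in FIFO order is a design choice of this sketch; it keeps a
deferred command from being starved by later arrivals.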

Patch

diff --git a/drivers/infiniband/core/rw.c b/drivers/infiniband/core/rw.c
index c6e8483..940dee8 100644
--- a/drivers/infiniband/core/rw.c
+++ b/drivers/infiniband/core/rw.c
@@ -64,6 +64,8 @@  static int rdma_rw_init_mr_wrs(struct rdma_rw_ctx *ctx, struct ib_qp *qp,
 
 		reg->mr = ib_mr_pool_get(qp, &qp->rdma_mrs);
 		if (!reg->mr) {
+			pr_debug("failed to allocate MR %d/%d from pool (in use: %d)\n",
+				 i, ctx->nr_ops, qp->mrs_used);
 			ret = -EAGAIN;
 			goto out_free;
 		}