diff mbox

[V6,6/9] isert: Rename IO functions to more descriptive names

Message ID 20150724161848.25617.26092.stgit@build2.ogc.int (mailing list archive)
State New, archived
Headers show

Commit Message

Steve Wise July 24, 2015, 4:18 p.m. UTC
This is in preparation for adding new FRMR-only IO handlers
for devices that support FRMR and not PI.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
---

 drivers/infiniband/ulp/isert/ib_isert.c |   28 ++++++++++++++--------------
 1 files changed, 14 insertions(+), 14 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Sagi Grimberg July 26, 2015, 10:08 a.m. UTC | #1
On 7/24/2015 7:18 PM, Steve Wise wrote:
> This is in preparation for adding new FRMR-only IO handlers
> for devices that support FRMR and not PI.

Steve,

I've given this some thought and I think we should avoid splitting
logic from PI and iWARP. The reason (other than code duplication) is
that currently the iser target support only up to 1MB IOs. I have some
code (not done yet) to support larger IOs by using multiple
registrations  per IO (with or without PI).
With a little tweaking I think we can get iwarp to fit in too...

So, do you mind if I take a crack at it?
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig July 26, 2015, 10:43 a.m. UTC | #2
On Sun, Jul 26, 2015 at 01:08:16PM +0300, Sagi Grimberg wrote:
> I've given this some thought and I think we should avoid splitting
> logic from PI and iWARP. The reason (other than code duplication) is
> that currently the iser target support only up to 1MB IOs. I have some
> code (not done yet) to support larger IOs by using multiple
> registrations  per IO (with or without PI).

Just curious: How is this going to work with iSER only having a single
rkey/offset/len field?
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg July 26, 2015, 11 a.m. UTC | #3
On 7/26/2015 1:43 PM, Christoph Hellwig wrote:
> On Sun, Jul 26, 2015 at 01:08:16PM +0300, Sagi Grimberg wrote:
>> I've given this some thought and I think we should avoid splitting
>> logic from PI and iWARP. The reason (other than code duplication) is
>> that currently the iser target support only up to 1MB IOs. I have some
>> code (not done yet) to support larger IOs by using multiple
>> registrations  per IO (with or without PI).
>
> Just curious: How is this going to work with iSER only having a single
> rkey/offset/len field?
>

Good question,

On the wire iser sends a single rkey, but the target is allowed to
transfer the data however it wants to.

Say that the local target HCA supports only 32 pages (128K bytes for 4K
pages) registration and the initiator sent:
rkey=0x1234
address=0xffffaaaa
length=512K

The target would allocate a 512K buffer and:
register offset 0-128K to lkey=0x1
register offset 128K-256K to lkey=0x2
register offset 256K-384K to lkey=0x3
register offset 384K-512K to lkey=0x4

then constructs sg_list as:
sg_list[0] = {addr=buf, length=128K, lkey=0x1}
sg_list[1] = {addr=buf+128K, length=128K, lkey=0x2}
sg_list[2] = {addr=buf+256K, length=128K, lkey=0x3}
sg_list[3] = {addr=buf+384K, length=128K, lkey=0x4}

Then set rdma_read wr with:
rdma_r_wr.sg_list=&sg_list
rdma_r_wr.rdma.addr=0xffffaaaa
rdma_r_wr.rdma.rkey=0x1234

post_send(rdma_r_wr);

Ideally, the post contains a chain of all 4 registrations and the
rdma_read (and an opportunistic good scsi response).
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig July 26, 2015, 3:53 p.m. UTC | #4
On Sun, Jul 26, 2015 at 02:00:51PM +0300, Sagi Grimberg wrote:
> On the wire iser sends a single rkey, but the target is allowed to
> transfer the data however it wants to.

So you're trying to get above the limit of a single RDMA READ, not
above the limit for memory registration in the initiator?  In that
case your explanation makes sense, that's just not what I expected
to be the limiting factor.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg July 26, 2015, 4:44 p.m. UTC | #5
On 7/26/2015 6:53 PM, Christoph Hellwig wrote:
> On Sun, Jul 26, 2015 at 02:00:51PM +0300, Sagi Grimberg wrote:
>> On the wire iser sends a single rkey, but the target is allowed to
>> transfer the data however it wants to.
>
> So you're trying to get above the limit of a single RDMA READ, not
> above the limit for memory registration in the initiator?

Correct.

>  In that case your explanation makes sense, that's just not what I expected
> to be the limiting factor.
>

In the initiator case, there is no way to support transfer size that
exceeds the device registration length capabilities (unless we start
using higher-order atomic allocations which we won't).
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise July 26, 2015, 5:32 p.m. UTC | #6
On 7/26/2015 6:00 AM, Sagi Grimberg wrote:
> On 7/26/2015 1:43 PM, Christoph Hellwig wrote:
>> On Sun, Jul 26, 2015 at 01:08:16PM +0300, Sagi Grimberg wrote:
>>> I've given this some thought and I think we should avoid splitting
>>> logic from PI and iWARP. The reason (other than code duplication) is
>>> that currently the iser target support only up to 1MB IOs. I have some
>>> code (not done yet) to support larger IOs by using multiple
>>> registrations  per IO (with or without PI).
>>
>> Just curious: How is this going to work with iSER only having a single
>> rkey/offset/len field?
>>
>
> Good question,
>
> On the wire iser sends a single rkey, but the target is allowed to
> transfer the data however it wants to.
>
> Say that the local target HCA supports only 32 pages (128K bytes for 4K
> pages) registration and the initiator sent:
> rkey=0x1234
> address=0xffffaaaa
> length=512K
>
> The target would allocate a 512K buffer and:
> register offset 0-128K to lkey=0x1
> register offset 128K-256K to lkey=0x2
> register offset 256K-384K to lkey=0x3
> register offset 384K-512K to lkey=0x4
>
> then constructs sg_list as:
> sg_list[0] = {addr=buf, length=128K, lkey=0x1}
> sg_list[1] = {addr=buf+128K, length=128K, lkey=0x2}
> sg_list[2] = {addr=buf+256K, length=128K, lkey=0x3}
> sg_list[3] = {addr=buf+384K, length=128K, lkey=0x4}
>
> Then set rdma_read wr with:
> rdma_r_wr.sg_list=&sg_list
> rdma_r_wr.rdma.addr=0xffffaaaa
> rdma_r_wr.rdma.rkey=0x1234
>
> post_send(rdma_r_wr);
>
> Ideally, the post contains a chain of all 4 registrations and the
> rdma_read (and an opportunistic good scsi response).

Just to be clear: This example is for IB only, correct?  IW would 
require rkeys with REMOTE_WRITE and 4 read wrs.  And you're ignoring 
invalidation wrs (or read-with-inv) in the example...

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg July 26, 2015, 5:40 p.m. UTC | #7
>> Ideally, the post contains a chain of all 4 registrations and the
>> rdma_read (and an opportunistic good scsi response).
>
> Just to be clear: This example is for IB only, correct?  IW would
> require rkeys with REMOTE_WRITE and 4 read wrs.

My assumption is that it would depend on max_sge_rd.

IB only? iWARP by definition isn't capable of doing rdma_read to
more than one scatter? Anyway, we'll need to calculate the number
of RDMA_READs.

> And you're ignoring invalidation wrs (or read-with-inv) in the example...

Yes, didn't want to inflate the example too much...
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise July 26, 2015, 8:15 p.m. UTC | #8
On 7/26/2015 12:40 PM, Sagi Grimberg wrote:
>
>>> Ideally, the post contains a chain of all 4 registrations and the
>>> rdma_read (and an opportunistic good scsi response).
>>
>> Just to be clear: This example is for IB only, correct?  IW would
>> require rkeys with REMOTE_WRITE and 4 read wrs.
>
> My assumption is that it would depend on max_sge_rd.
>

yea.

> IB only? iWARP by definition isn't capable of doing rdma_read to
> more than one scatter? Anyway, we'll need to calculate the number
> of RDMA_READs.
>

The wire protocol limits the destination to a single stg/to/len (aka 
rkey/addr/len).  Devices/fw/sw could implement some magic to support a 
single stg/to/len that maps to a scatter gather list of stags/tos/lens.

>> And you're ignoring invalidation wrs (or read-with-inv) in the 
>> example...
>
> Yes, didn't want to inflate the example too much...

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise July 26, 2015, 8:17 p.m. UTC | #9
On 7/26/2015 5:08 AM, Sagi Grimberg wrote:
> On 7/24/2015 7:18 PM, Steve Wise wrote:
>> This is in preparation for adding new FRMR-only IO handlers
>> for devices that support FRMR and not PI.
>
> Steve,
>
> I've given this some thought and I think we should avoid splitting
> logic from PI and iWARP. The reason (other than code duplication) is
> that currently the iser target support only up to 1MB IOs. I have some
> code (not done yet) to support larger IOs by using multiple
> registrations  per IO (with or without PI).
> With a little tweaking I think we can get iwarp to fit in too...
>
> So, do you mind if I take a crack at it?

Sure, go ahead.  Let me know how I can help.  Certainly I can test it 
for you.  I'm very keen to get this in for 4.3 if possible...


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise July 27, 2015, 9:45 p.m. UTC | #10
> -----Original Message-----
> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-owner@vger.kernel.org] On Behalf Of Steve Wise
> Sent: Sunday, July 26, 2015 3:17 PM
> To: Sagi Grimberg; dledford@redhat.com
> Cc: infinipath@intel.com; sagig@mellanox.com; ogerlitz@mellanox.com; roid@mellanox.com; linux-rdma@vger.kernel.org;
> eli@mellanox.com; target-devel@vger.kernel.org; linux-nfs@vger.kernel.org; bfields@fieldses.org
> Subject: Re: [PATCH V6 6/9] isert: Rename IO functions to more descriptive names
> 
> On 7/26/2015 5:08 AM, Sagi Grimberg wrote:
> > On 7/24/2015 7:18 PM, Steve Wise wrote:
> >> This is in preparation for adding new FRMR-only IO handlers
> >> for devices that support FRMR and not PI.
> >
> > Steve,
> >
> > I've given this some thought and I think we should avoid splitting
> > logic from PI and iWARP. The reason (other than code duplication) is
> > that currently the iser target support only up to 1MB IOs. I have some
> > code (not done yet) to support larger IOs by using multiple
> > registrations  per IO (with or without PI).
> > With a little tweaking I think we can get iwarp to fit in too...
> >
> > So, do you mind if I take a crack at it?
> 
> Sure, go ahead.  Let me know how I can help.  Certainly I can test it
> for you.  I'm very keen to get this in for 4.3 if possible...
> 
> 

Since Sagi is going to work on isert to support iwarp as part of his current isert large work, I'll drop the isert patches.  I also want to split up the max_sge_rd patches to their own submission.  So I will send out 2 new series for a final submission:

1) iser support for iwarp

2) use max_sge_rd and remove rdma_cap_read_multi_sge().

Steve.


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise Aug. 3, 2015, 7:32 p.m. UTC | #11
> > Steve,
> >
> > I've given this some thought and I think we should avoid splitting
> > logic from PI and iWARP. The reason (other than code duplication) is
> > that currently the iser target support only up to 1MB IOs. I have some
> > code (not done yet) to support larger IOs by using multiple
> > registrations  per IO (with or without PI).
> > With a little tweaking I think we can get iwarp to fit in too...
> >
> > So, do you mind if I take a crack at it?
> 
> Sure, go ahead.  Let me know how I can help.  Certainly I can test it
> for you.  I'm very keen to get this in for 4.3 if possible...
> 

Hey Sagi, how is this coming along?  How can I help?

Steve.

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg Aug. 4, 2015, 5:26 p.m. UTC | #12
> Hey Sagi, how is this coming along?  How can I help?
>

Hi Steve,

This is taking longer than I expected, the changes needed seem
pretty extensive throughout the IO path. I don't think it will be ready
for 4.3

I'll try to send you soon a preliminary version to play with.
Acceptable?


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise Aug. 4, 2015, 5:44 p.m. UTC | #13
> -----Original Message-----
> From: Sagi Grimberg [mailto:sagig@dev.mellanox.co.il]
> Sent: Tuesday, August 04, 2015 12:26 PM
> To: Steve Wise; dledford@redhat.com
> Cc: infinipath@intel.com; sagig@mellanox.com; ogerlitz@mellanox.com; roid@mellanox.com; linux-rdma@vger.kernel.org;
> eli@mellanox.com; target-devel@vger.kernel.org; linux-nfs@vger.kernel.org; bfields@fieldses.org
> Subject: Re: [PATCH V6 6/9] isert: Rename IO functions to more descriptive names
> 
> > Hey Sagi, how is this coming along?  How can I help?
> >
> 
> Hi Steve,
> 
> This is taking longer than I expected, the changes needed seem
> pretty extensive throughout the IO path. I don't think it will be ready
> for 4.3
>

Perhaps then we should go with my version that adds iwarp-only FRMR IO handlers for 4.3.  Then they can be phased out as the rework matures.  Thoughts?
 
> I'll try to send you soon a preliminary version to play with.
> Acceptable?

I can test the iwarp parts once you think the code is ready to try.

Steve.

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise Aug. 5, 2015, 9:23 p.m. UTC | #14
> >
> > > Hey Sagi, how is this coming along?  How can I help?
> > >
> >
> > Hi Steve,
> >
> > This is taking longer than I expected, the changes needed seem
> > pretty extensive throughout the IO path. I don't think it will be ready
> > for 4.3
> >
> 
> Perhaps then we should go with my version that adds iwarp-only FRMR IO handlers for 4.3.  Then they can be phased out as the rework
> matures.  Thoughts?
> 

Shall I send out the series again for merging into 4.3?

Thanks,

Steve.

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg Aug. 6, 2015, 3:37 p.m. UTC | #15
On 8/6/2015 12:23 AM, Steve Wise wrote:
>
>>>
>>>> Hey Sagi, how is this coming along?  How can I help?
>>>>
>>>
>>> Hi Steve,
>>>
>>> This is taking longer than I expected, the changes needed seem
>>> pretty extensive throughout the IO path. I don't think it will be ready
>>> for 4.3
>>>
>>
>> Perhaps then we should go with my version that adds iwarp-only FRMR IO handlers for 4.3.  Then they can be phased out as the rework
>> matures.  Thoughts?
>>
>
> Shall I send out the series again for merging into 4.3?

Hi Steve,

Not sure about that.. I'm a bit reluctant in adding a code that
branches the isert code even more than it already is.

Nic, WDYT?
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise Aug. 12, 2015, 10:15 p.m. UTC | #16
> >>>> Hey Sagi, how is this coming along?  How can I help?
> >>>>
> >>>
> >>> Hi Steve,
> >>>
> >>> This is taking longer than I expected, the changes needed seem
> >>> pretty extensive throughout the IO path. I don't think it will be ready
> >>> for 4.3
> >>>
> >>
> >> Perhaps then we should go with my version that adds iwarp-only FRMR IO handlers for 4.3.  Then they can be phased out as the rework
> >> matures.  Thoughts?
> >>
> >
> > Shall I send out the series again for merging into 4.3?
> 
> Hi Steve,
> 
> Not sure about that.. I'm a bit reluctant in adding a code that
> branches the isert code even more than it already is.
> 
> Nic, WDYT?

Nic is silent... 

Sagi, do you have an ETA on when you can have the recode ready for detailed review and test? If we can't make linux-4.3, can we be early in staging it for linux-4.4?

Thanks,

Steve.

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg Aug. 13, 2015, 1:09 p.m. UTC | #17
>
> Nic is silent...
>
> Sagi, do you have an ETA on when you can have the recode ready for detailed review and test? If we can't make linux-4.3, can we be early in staging it for linux-4.4?

Hi Steve,

I have something, but its not remotely close to be submission ready.
This ended up being a rewrite of the registration path which is pretty
convoluted at the moment. My aim is mostly simplifying it in a way that
iWARP support would be (almost) straight-forward...

I can send you my WIP to test.

Sagi.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 6af3dd4..dcd3c55 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -48,15 +48,15 @@  static struct workqueue_struct *isert_comp_wq;
 static struct workqueue_struct *isert_release_wq;
 
 static void
-isert_unmap_cmd(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn);
+isert_unmap_lkey(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn);
 static int
-isert_map_rdma(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
+isert_map_lkey(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
 	       struct isert_rdma_wr *wr);
 static void
-isert_unreg_rdma(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn);
+isert_unreg_frmr_pi(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn);
 static int
-isert_reg_rdma(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
-	       struct isert_rdma_wr *wr);
+isert_reg_frmr_pi(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
+		  struct isert_rdma_wr *wr);
 static int
 isert_put_response(struct iscsi_conn *conn, struct iscsi_cmd *cmd);
 static int
@@ -367,12 +367,12 @@  isert_create_device_ib_res(struct isert_device *device)
 	if (dev_attr->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS &&
 	    dev_attr->device_cap_flags & IB_DEVICE_SIGNATURE_HANDOVER) {
 		device->use_fastreg = 1;
-		device->reg_rdma_mem = isert_reg_rdma;
-		device->unreg_rdma_mem = isert_unreg_rdma;
+		device->reg_rdma_mem = isert_reg_frmr_pi;
+		device->unreg_rdma_mem = isert_unreg_frmr_pi;
 	} else {
 		device->use_fastreg = 0;
-		device->reg_rdma_mem = isert_map_rdma;
-		device->unreg_rdma_mem = isert_unmap_cmd;
+		device->reg_rdma_mem = isert_map_lkey;
+		device->unreg_rdma_mem = isert_unmap_lkey;
 	}
 
 	ret = isert_alloc_comps(device, dev_attr);
@@ -1701,7 +1701,7 @@  isert_unmap_data_buf(struct isert_conn *isert_conn, struct isert_data_buf *data)
 
 
 static void
-isert_unmap_cmd(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn)
+isert_unmap_lkey(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn)
 {
 	struct isert_rdma_wr *wr = &isert_cmd->rdma_wr;
 
@@ -1726,7 +1726,7 @@  isert_unmap_cmd(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn)
 }
 
 static void
-isert_unreg_rdma(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn)
+isert_unreg_frmr_pi(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn)
 {
 	struct isert_rdma_wr *wr = &isert_cmd->rdma_wr;
 
@@ -2442,7 +2442,7 @@  isert_build_rdma_wr(struct isert_conn *isert_conn, struct isert_cmd *isert_cmd,
 }
 
 static int
-isert_map_rdma(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
+isert_map_lkey(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
 	       struct isert_rdma_wr *wr)
 {
 	struct se_cmd *se_cmd = &cmd->se_cmd;
@@ -2848,8 +2848,8 @@  unmap_prot_cmd:
 }
 
 static int
-isert_reg_rdma(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
-	       struct isert_rdma_wr *wr)
+isert_reg_frmr_pi(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
+		  struct isert_rdma_wr *wr)
 {
 	struct se_cmd *se_cmd = &cmd->se_cmd;
 	struct isert_cmd *isert_cmd = iscsit_priv_cmd(cmd);