diff mbox

Can someone help me understand the reason for this code in ib_isert.c?

Message ID 544772AA.7060003@mellanox.com (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

Or Gerlitz Oct. 22, 2014, 9:02 a.m. UTC
On 10/22/2014 8:06 AM, Nicholas A. Bellinger wrote:
> On Tue, 2014-10-21 at 00:13 +0300, Or Gerlitz wrote:
>> On Mon, Oct 20, 2014 at 6:29 PM, Chris Moore <Chris.Moore@emulex.com> wrote:
>>> The following code is in isert_conn_setup_qp() in ib_isert.c:
>>>
>>>           /*
>>>             * FIXME: Use devattr.max_sge - 2 for max_send_sge as
>>>             * work-around for RDMA_READ..
>>>             */
>>>           attr.cap.max_send_sge = device->dev_attr.max_sge - 2;
>>>
>>> It's not clear from the comment what this is a work-around for, and I wasn't able
>>> to figure it out from looking at logs.
>> I believe this refers to some IBTA spec corner case which comes into
>> play with the max_sges advertized by mlx4, Eli, can you shed some
>> light (IBTA pointer) on that? is this the case (i.e dev_attr.max_sge
>> isn't always achievable) with mlx5 too?
>>
> It's for ConnectX-2 reporting max_sge=31 during development, while the
> largest max_send_sge for stable large block RDMA reads was (is..?) 29
> SGEs.
>
>>> In trying to get isert working with ocrdma I ran into a problem where the code
>>> thought there was only 1 send SGE available when in fact the device has 3.
>>> Then I found the above work-around, which explained the problem.
>> Nic, any reason for attr.cap.max_send_sge = 1 not to work OK?
>>
> IIRC, pre fastreg code supported max_send_sge=1 by limiting the transfer
> size per RDMA read, and would issue subsequent RDMA read + offset from
> completion up to the total se_cmd->data_length.
>
> Not sure how this works with fastreg today.  Sagi..?
>

While on fastreg mode, for RDMA reads  we use only one SGE, talking to 
Sagi he explained me that the problem with creating the QP with 
max_send_sge = 1 has to do with other flows where two or even three SGEs 
are needed, e.g when the iscsi/iser header (doesn't exist in RDMA ops) 
is taken from one spot of the memory and the data (sense) taken from 
another one?

Nic, we need to see what is the minimum needed by the code (two?) and 
tweak that line to make sure it gets there as long as the device 
supports it, SB, makes sense?


         attr.cap.max_recv_sge = 1;


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Nicholas A. Bellinger Oct. 22, 2014, 9:50 p.m. UTC | #1
On Wed, 2014-10-22 at 12:02 +0300, Or Gerlitz wrote:
> On 10/22/2014 8:06 AM, Nicholas A. Bellinger wrote:
> > On Tue, 2014-10-21 at 00:13 +0300, Or Gerlitz wrote:
> >> On Mon, Oct 20, 2014 at 6:29 PM, Chris Moore <Chris.Moore@emulex.com> wrote:
> >>> The following code is in isert_conn_setup_qp() in ib_isert.c:
> >>>
> >>>           /*
> >>>             * FIXME: Use devattr.max_sge - 2 for max_send_sge as
> >>>             * work-around for RDMA_READ..
> >>>             */
> >>>           attr.cap.max_send_sge = device->dev_attr.max_sge - 2;
> >>>
> >>> It's not clear from the comment what this is a work-around for, and I wasn't able
> >>> to figure it out from looking at logs.
> >> I believe this refers to some IBTA spec corner case which comes into
> >> play with the max_sges advertized by mlx4, Eli, can you shed some
> >> light (IBTA pointer) on that? is this the case (i.e dev_attr.max_sge
> >> isn't always achievable) with mlx5 too?
> >>
> > It's for ConnectX-2 reporting max_sge=31 during development, while the
> > largest max_send_sge for stable large block RDMA reads was (is..?) 29
> > SGEs.
> >
> >>> In trying to get isert working with ocrdma I ran into a problem where the code
> >>> thought there was only 1 send SGE available when in fact the device has 3.
> >>> Then I found the above work-around, which explained the problem.
> >> Nic, any reason for attr.cap.max_send_sge = 1 not to work OK?
> >>
> > IIRC, pre fastreg code supported max_send_sge=1 by limiting the transfer
> > size per RDMA read, and would issue subsequent RDMA read + offset from
> > completion up to the total se_cmd->data_length.
> >
> > Not sure how this works with fastreg today.  Sagi..?
> >
> 
> While on fastreg mode, for RDMA reads  we use only one SGE, talking to 
> Sagi he explained me that the problem with creating the QP with 
> max_send_sge = 1 has to do with other flows where two or even three SGEs 
> are needed, e.g when the iscsi/iser header (doesn't exist in RDMA ops) 
> is taken from one spot of the memory and the data (sense) taken from 
> another one?
> 
> Nic, we need to see what is the minimum needed by the code (two?) and 
> tweak that line to make sure it gets there as long as the device 
> supports it, SB, makes sense?
> 
>
> diff --git a/drivers/infiniband/ulp/isert/ib_isert.c 
> b/drivers/infiniband/ulp/isert/ib_isert.c
> index 0bea577..24abbb3 100644
> --- a/drivers/infiniband/ulp/isert/ib_isert.c
> +++ b/drivers/infiniband/ulp/isert/ib_isert.c
> @@ -115,9 +115,10 @@ isert_conn_setup_qp(struct isert_conn *isert_conn, 
> struct rdma_cm_id *cma_id,
>          attr.cap.max_recv_wr = ISERT_QP_MAX_RECV_DTOS;
>          /*
>           * FIXME: Use devattr.max_sge - 2 for max_send_sge as
> -        * work-around for RDMA_READ..
> +        * work-around for RDMA_READ.. still need to make sure to
> +        * have at least two SGEs for SCSI responses.
>           */
> -       attr.cap.max_send_sge = device->dev_attr.max_sge - 2;
> +       attr.cap.max_send_sge = max(2, device->dev_attr.max_sge - 2);
>          isert_conn->max_sge = attr.cap.max_send_sge;
> 
>          attr.cap.max_recv_sge = 1;
> 
> 

After a quick audit, the minimum max_send_sge=2 patch looks correct for
the hardcoded tx_desc->num_sge cases in isert_put_login_tx(),
isert_put_response(), isert_put_reject() and isert_put_text_rsp().

For the RDMA read path with non fast-reg, isert_build_rdma_wr() is still
correctly limiting the SGEs per operation based upon the negotiated
isert_conn->max_sge set above in isert_conn_setup_qp().

Chris, please confirm on ocrdma with Or's patch, and I'll include his
patch into target-pending/queue for now, and move into master once you
give a proper Tested-By.

Thanks!

--nab

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chris Moore Oct. 29, 2014, 6:05 p.m. UTC | #2
PiBPbiBXZWQsIDIwMTQtMTAtMjIgYXQgMTI6MDIgKzAzMDAsIE9yIEdlcmxpdHogd3JvdGU6DQo+
ID4gT24gMTAvMjIvMjAxNCA4OjA2IEFNLCBOaWNob2xhcyBBLiBCZWxsaW5nZXIgd3JvdGU6DQo+
ID4gPiBPbiBUdWUsIDIwMTQtMTAtMjEgYXQgMDA6MTMgKzAzMDAsIE9yIEdlcmxpdHogd3JvdGU6
DQo+ID4gPj4gT24gTW9uLCBPY3QgMjAsIDIwMTQgYXQgNjoyOSBQTSwgQ2hyaXMgTW9vcmUNCj4g
PENocmlzLk1vb3JlQGVtdWxleC5jb20+IHdyb3RlOg0KPiA+ID4+PiBUaGUgZm9sbG93aW5nIGNv
ZGUgaXMgaW4gaXNlcnRfY29ubl9zZXR1cF9xcCgpIGluIGliX2lzZXJ0LmM6DQo+ID4gPj4+DQo+
ID4gPj4+ICAgICAgICAgICAvKg0KPiA+ID4+PiAgICAgICAgICAgICAqIEZJWE1FOiBVc2UgZGV2
YXR0ci5tYXhfc2dlIC0gMiBmb3IgbWF4X3NlbmRfc2dlIGFzDQo+ID4gPj4+ICAgICAgICAgICAg
ICogd29yay1hcm91bmQgZm9yIFJETUFfUkVBRC4uDQo+ID4gPj4+ICAgICAgICAgICAgICovDQo+
ID4gPj4+ICAgICAgICAgICBhdHRyLmNhcC5tYXhfc2VuZF9zZ2UgPSBkZXZpY2UtPmRldl9hdHRy
Lm1heF9zZ2UgLSAyOw0KPiA+ID4+Pg0KPiA+ID4+PiBJdCdzIG5vdCBjbGVhciBmcm9tIHRoZSBj
b21tZW50IHdoYXQgdGhpcyBpcyBhIHdvcmstYXJvdW5kIGZvciwNCj4gPiA+Pj4gYW5kIEkgd2Fz
bid0IGFibGUgdG8gZmlndXJlIGl0IG91dCBmcm9tIGxvb2tpbmcgYXQgbG9ncy4NCj4gPiA+PiBJ
IGJlbGlldmUgdGhpcyByZWZlcnMgdG8gc29tZSBJQlRBIHNwZWMgY29ybmVyIGNhc2Ugd2hpY2gg
Y29tZXMNCj4gPiA+PiBpbnRvIHBsYXkgd2l0aCB0aGUgbWF4X3NnZXMgYWR2ZXJ0aXplZCBieSBt
bHg0LCBFbGksIGNhbiB5b3Ugc2hlZA0KPiA+ID4+IHNvbWUgbGlnaHQgKElCVEEgcG9pbnRlcikg
b24gdGhhdD8gaXMgdGhpcyB0aGUgY2FzZSAoaS5lDQo+ID4gPj4gZGV2X2F0dHIubWF4X3NnZSBp
c24ndCBhbHdheXMgYWNoaWV2YWJsZSkgd2l0aCBtbHg1IHRvbz8NCj4gPiA+Pg0KPiA+ID4gSXQn
cyBmb3IgQ29ubmVjdFgtMiByZXBvcnRpbmcgbWF4X3NnZT0zMSBkdXJpbmcgZGV2ZWxvcG1lbnQs
IHdoaWxlDQo+ID4gPiB0aGUgbGFyZ2VzdCBtYXhfc2VuZF9zZ2UgZm9yIHN0YWJsZSBsYXJnZSBi
bG9jayBSRE1BIHJlYWRzIHdhcw0KPiA+ID4gKGlzLi4/KSAyOSBTR0VzLg0KPiA+ID4NCj4gPiA+
Pj4gSW4gdHJ5aW5nIHRvIGdldCBpc2VydCB3b3JraW5nIHdpdGggb2NyZG1hIEkgcmFuIGludG8g
YSBwcm9ibGVtDQo+ID4gPj4+IHdoZXJlIHRoZSBjb2RlIHRob3VnaHQgdGhlcmUgd2FzIG9ubHkg
MSBzZW5kIFNHRSBhdmFpbGFibGUgd2hlbiBpbg0KPiBmYWN0IHRoZSBkZXZpY2UgaGFzIDMuDQo+
ID4gPj4+IFRoZW4gSSBmb3VuZCB0aGUgYWJvdmUgd29yay1hcm91bmQsIHdoaWNoIGV4cGxhaW5l
ZCB0aGUgcHJvYmxlbS4NCj4gPiA+PiBOaWMsIGFueSByZWFzb24gZm9yIGF0dHIuY2FwLm1heF9z
ZW5kX3NnZSA9IDEgbm90IHRvIHdvcmsgT0s/DQo+ID4gPj4NCj4gPiA+IElJUkMsIHByZSBmYXN0
cmVnIGNvZGUgc3VwcG9ydGVkIG1heF9zZW5kX3NnZT0xIGJ5IGxpbWl0aW5nIHRoZQ0KPiA+ID4g
dHJhbnNmZXIgc2l6ZSBwZXIgUkRNQSByZWFkLCBhbmQgd291bGQgaXNzdWUgc3Vic2VxdWVudCBS
RE1BIHJlYWQgKw0KPiA+ID4gb2Zmc2V0IGZyb20gY29tcGxldGlvbiB1cCB0byB0aGUgdG90YWwg
c2VfY21kLT5kYXRhX2xlbmd0aC4NCj4gPiA+DQo+ID4gPiBOb3Qgc3VyZSBob3cgdGhpcyB3b3Jr
cyB3aXRoIGZhc3RyZWcgdG9kYXkuICBTYWdpLi4/DQo+ID4gPg0KPiA+DQo+ID4gV2hpbGUgb24g
ZmFzdHJlZyBtb2RlLCBmb3IgUkRNQSByZWFkcyAgd2UgdXNlIG9ubHkgb25lIFNHRSwgdGFsa2lu
ZyB0bw0KPiA+IFNhZ2kgaGUgZXhwbGFpbmVkIG1lIHRoYXQgdGhlIHByb2JsZW0gd2l0aCBjcmVh
dGluZyB0aGUgUVAgd2l0aA0KPiA+IG1heF9zZW5kX3NnZSA9IDEgaGFzIHRvIGRvIHdpdGggb3Ro
ZXIgZmxvd3Mgd2hlcmUgdHdvIG9yIGV2ZW4gdGhyZWUNCj4gPiBTR0VzIGFyZSBuZWVkZWQsIGUu
ZyB3aGVuIHRoZSBpc2NzaS9pc2VyIGhlYWRlciAoZG9lc24ndCBleGlzdCBpbiBSRE1BDQo+ID4g
b3BzKSBpcyB0YWtlbiBmcm9tIG9uZSBzcG90IG9mIHRoZSBtZW1vcnkgYW5kIHRoZSBkYXRhIChz
ZW5zZSkgdGFrZW4NCj4gPiBmcm9tIGFub3RoZXIgb25lPw0KPiA+DQo+ID4gTmljLCB3ZSBuZWVk
IHRvIHNlZSB3aGF0IGlzIHRoZSBtaW5pbXVtIG5lZWRlZCBieSB0aGUgY29kZSAodHdvPykgYW5k
DQo+ID4gdHdlYWsgdGhhdCBsaW5lIHRvIG1ha2Ugc3VyZSBpdCBnZXRzIHRoZXJlIGFzIGxvbmcg
YXMgdGhlIGRldmljZQ0KPiA+IHN1cHBvcnRzIGl0LCBTQiwgbWFrZXMgc2Vuc2U/DQo+ID4NCj4g
Pg0KPiA+IGRpZmYgLS1naXQgYS9kcml2ZXJzL2luZmluaWJhbmQvdWxwL2lzZXJ0L2liX2lzZXJ0
LmMNCj4gPiBiL2RyaXZlcnMvaW5maW5pYmFuZC91bHAvaXNlcnQvaWJfaXNlcnQuYw0KPiA+IGlu
ZGV4IDBiZWE1NzcuLjI0YWJiYjMgMTAwNjQ0DQo+ID4gLS0tIGEvZHJpdmVycy9pbmZpbmliYW5k
L3VscC9pc2VydC9pYl9pc2VydC5jDQo+ID4gKysrIGIvZHJpdmVycy9pbmZpbmliYW5kL3VscC9p
c2VydC9pYl9pc2VydC5jDQo+ID4gQEAgLTExNSw5ICsxMTUsMTAgQEAgaXNlcnRfY29ubl9zZXR1
cF9xcChzdHJ1Y3QgaXNlcnRfY29ubg0KPiA+ICppc2VydF9jb25uLCBzdHJ1Y3QgcmRtYV9jbV9p
ZCAqY21hX2lkLA0KPiA+ICAgICAgICAgIGF0dHIuY2FwLm1heF9yZWN2X3dyID0gSVNFUlRfUVBf
TUFYX1JFQ1ZfRFRPUzsNCj4gPiAgICAgICAgICAvKg0KPiA+ICAgICAgICAgICAqIEZJWE1FOiBV
c2UgZGV2YXR0ci5tYXhfc2dlIC0gMiBmb3IgbWF4X3NlbmRfc2dlIGFzDQo+ID4gLSAgICAgICAg
KiB3b3JrLWFyb3VuZCBmb3IgUkRNQV9SRUFELi4NCj4gPiArICAgICAgICAqIHdvcmstYXJvdW5k
IGZvciBSRE1BX1JFQUQuLiBzdGlsbCBuZWVkIHRvIG1ha2Ugc3VyZSB0bw0KPiA+ICsgICAgICAg
ICogaGF2ZSBhdCBsZWFzdCB0d28gU0dFcyBmb3IgU0NTSSByZXNwb25zZXMuDQo+ID4gICAgICAg
ICAgICovDQo+ID4gLSAgICAgICBhdHRyLmNhcC5tYXhfc2VuZF9zZ2UgPSBkZXZpY2UtPmRldl9h
dHRyLm1heF9zZ2UgLSAyOw0KPiA+ICsgICAgICAgYXR0ci5jYXAubWF4X3NlbmRfc2dlID0gbWF4
KDIsIGRldmljZS0+ZGV2X2F0dHIubWF4X3NnZSAtIDIpOw0KPiA+ICAgICAgICAgIGlzZXJ0X2Nv
bm4tPm1heF9zZ2UgPSBhdHRyLmNhcC5tYXhfc2VuZF9zZ2U7DQo+ID4NCj4gPiAgICAgICAgICBh
dHRyLmNhcC5tYXhfcmVjdl9zZ2UgPSAxOw0KPiA+DQo+ID4NCj4gDQo+IEFmdGVyIGEgcXVpY2sg
YXVkaXQsIHRoZSBtaW5pbXVtIG1heF9zZW5kX3NnZT0yIHBhdGNoIGxvb2tzIGNvcnJlY3QgZm9y
DQo+IHRoZSBoYXJkY29kZWQgdHhfZGVzYy0+bnVtX3NnZSBjYXNlcyBpbiBpc2VydF9wdXRfbG9n
aW5fdHgoKSwNCj4gaXNlcnRfcHV0X3Jlc3BvbnNlKCksIGlzZXJ0X3B1dF9yZWplY3QoKSBhbmQg
aXNlcnRfcHV0X3RleHRfcnNwKCkuDQo+IA0KPiBGb3IgdGhlIFJETUEgcmVhZCBwYXRoIHdpdGgg
bm9uIGZhc3QtcmVnLCBpc2VydF9idWlsZF9yZG1hX3dyKCkgaXMgc3RpbGwNCj4gY29ycmVjdGx5
IGxpbWl0aW5nIHRoZSBTR0VzIHBlciBvcGVyYXRpb24gYmFzZWQgdXBvbiB0aGUgbmVnb3RpYXRl
ZA0KPiBpc2VydF9jb25uLT5tYXhfc2dlIHNldCBhYm92ZSBpbiBpc2VydF9jb25uX3NldHVwX3Fw
KCkuDQo+IA0KPiBDaHJpcywgcGxlYXNlIGNvbmZpcm0gb24gb2NyZG1hIHdpdGggT3IncyBwYXRj
aCwgYW5kIEknbGwgaW5jbHVkZSBoaXMgcGF0Y2ggaW50bw0KPiB0YXJnZXQtcGVuZGluZy9xdWV1
ZSBmb3Igbm93LCBhbmQgbW92ZSBpbnRvIG1hc3RlciBvbmNlIHlvdSBnaXZlIGENCj4gcHJvcGVy
IFRlc3RlZC1CeS4NCj4gDQo+IFRoYW5rcyENCj4gDQo+IC0tbmFiDQoNClNvcnJ5IGZvciB0aGUg
bG9uZyBkZWxheSwgSSB3YXMgZGVhbGluZyB3aXRoIHNvbWUgaGFyZHdhcmUgaXNzdWVzLg0KDQpJ
J3ZlIHRlc3RlZCB0aGUgcHJvcG9zZWQgcGF0Y2ggYW5kIGl0IGZpeGVzIHRoZSBvY3JkbWEgaXNz
dWUuDQoNClRlc3RlZC1ieTogQ2hyaXMgTW9vcmUgPGNocmlzLm1vb3JlQGVtdWxleC5jb20+DQoN
ClRoYW5rcywNCg0KQ2hyaXMNCg0K
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg Oct. 30, 2014, 8:24 a.m. UTC | #3
On 10/29/2014 8:05 PM, Chris Moore wrote:
>> On Wed, 2014-10-22 at 12:02 +0300, Or Gerlitz wrote:
>>> On 10/22/2014 8:06 AM, Nicholas A. Bellinger wrote:
>>>> On Tue, 2014-10-21 at 00:13 +0300, Or Gerlitz wrote:
>>>>> On Mon, Oct 20, 2014 at 6:29 PM, Chris Moore
>> <Chris.Moore@emulex.com> wrote:
>>>>>> The following code is in isert_conn_setup_qp() in ib_isert.c:
>>>>>>
>>>>>>            /*
>>>>>>              * FIXME: Use devattr.max_sge - 2 for max_send_sge as
>>>>>>              * work-around for RDMA_READ..
>>>>>>              */
>>>>>>            attr.cap.max_send_sge = device->dev_attr.max_sge - 2;
>>>>>>
>>>>>> It's not clear from the comment what this is a work-around for,
>>>>>> and I wasn't able to figure it out from looking at logs.
>>>>> I believe this refers to some IBTA spec corner case which comes
>>>>> into play with the max_sges advertized by mlx4, Eli, can you shed
>>>>> some light (IBTA pointer) on that? is this the case (i.e
>>>>> dev_attr.max_sge isn't always achievable) with mlx5 too?
>>>>>
>>>> It's for ConnectX-2 reporting max_sge=31 during development, while
>>>> the largest max_send_sge for stable large block RDMA reads was
>>>> (is..?) 29 SGEs.
>>>>
>>>>>> In trying to get isert working with ocrdma I ran into a problem
>>>>>> where the code thought there was only 1 send SGE available when in
>> fact the device has 3.
>>>>>> Then I found the above work-around, which explained the problem.
>>>>> Nic, any reason for attr.cap.max_send_sge = 1 not to work OK?
>>>>>
>>>> IIRC, pre fastreg code supported max_send_sge=1 by limiting the
>>>> transfer size per RDMA read, and would issue subsequent RDMA read +
>>>> offset from completion up to the total se_cmd->data_length.
>>>>
>>>> Not sure how this works with fastreg today.  Sagi..?
>>>>
>>>
>>> While on fastreg mode, for RDMA reads  we use only one SGE, talking to
>>> Sagi he explained me that the problem with creating the QP with
>>> max_send_sge = 1 has to do with other flows where two or even three
>>> SGEs are needed, e.g when the iscsi/iser header (doesn't exist in RDMA
>>> ops) is taken from one spot of the memory and the data (sense) taken
>>> from another one?
>>>
>>> Nic, we need to see what is the minimum needed by the code (two?) and
>>> tweak that line to make sure it gets there as long as the device
>>> supports it, SB, makes sense?
>>>
>>>
>>> diff --git a/drivers/infiniband/ulp/isert/ib_isert.c
>>> b/drivers/infiniband/ulp/isert/ib_isert.c
>>> index 0bea577..24abbb3 100644
>>> --- a/drivers/infiniband/ulp/isert/ib_isert.c
>>> +++ b/drivers/infiniband/ulp/isert/ib_isert.c
>>> @@ -115,9 +115,10 @@ isert_conn_setup_qp(struct isert_conn
>>> *isert_conn, struct rdma_cm_id *cma_id,
>>>           attr.cap.max_recv_wr = ISERT_QP_MAX_RECV_DTOS;
>>>           /*
>>>            * FIXME: Use devattr.max_sge - 2 for max_send_sge as
>>> -        * work-around for RDMA_READ..
>>> +        * work-around for RDMA_READ.. still need to make sure to
>>> +        * have at least two SGEs for SCSI responses.
>>>            */
>>> -       attr.cap.max_send_sge = device->dev_attr.max_sge - 2;
>>> +       attr.cap.max_send_sge = max(2, device->dev_attr.max_sge - 2);

I think we need to fail if (dev_attr.max_sge - 2 <= 0)

>>>           isert_conn->max_sge = attr.cap.max_send_sge;
>>>
>>>           attr.cap.max_recv_sge = 1;
>>>
>>>
>>
>> After a quick audit, the minimum max_send_sge=2 patch looks correct for
>> the hardcoded tx_desc->num_sge cases in isert_put_login_tx(),
>> isert_put_response(), isert_put_reject() and isert_put_text_rsp().
>>
>> For the RDMA read path with non fast-reg, isert_build_rdma_wr() is still
>> correctly limiting the SGEs per operation based upon the negotiated
>> isert_conn->max_sge set above in isert_conn_setup_qp().
>>
>> Chris, please confirm on ocrdma with Or's patch, and I'll include his patch into
>> target-pending/queue for now, and move into master once you give a
>> proper Tested-By.
>>
>> Thanks!
>>
>> --nab
>
> Sorry for the long delay, I was dealing with some hardware issues.
>
> I've tested the proposed patch and it fixes the ocrdma issue.

This is a good patch for a work-around. But we should fix
this one from the root.

Sagi.

>
> Tested-by: Chris Moore <chris.moore@emulex.com>
>
> Thanks,
>
> Chris

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chris Moore Oct. 30, 2014, 3:06 p.m. UTC | #4
> On 10/29/2014 8:05 PM, Chris Moore wrote:

> >> On Wed, 2014-10-22 at 12:02 +0300, Or Gerlitz wrote:

> >>> On 10/22/2014 8:06 AM, Nicholas A. Bellinger wrote:

> >>>> On Tue, 2014-10-21 at 00:13 +0300, Or Gerlitz wrote:

> >>>>> On Mon, Oct 20, 2014 at 6:29 PM, Chris Moore

> >> <Chris.Moore@emulex.com> wrote:

> >>>>>> The following code is in isert_conn_setup_qp() in ib_isert.c:

> >>>>>>

> >>>>>>            /*

> >>>>>>              * FIXME: Use devattr.max_sge - 2 for max_send_sge as

> >>>>>>              * work-around for RDMA_READ..

> >>>>>>              */

> >>>>>>            attr.cap.max_send_sge = device->dev_attr.max_sge - 2;

> >>>>>>

> >>>>>> It's not clear from the comment what this is a work-around for,

> >>>>>> and I wasn't able to figure it out from looking at logs.

> >>>>> I believe this refers to some IBTA spec corner case which comes

> >>>>> into play with the max_sges advertized by mlx4, Eli, can you shed

> >>>>> some light (IBTA pointer) on that? is this the case (i.e

> >>>>> dev_attr.max_sge isn't always achievable) with mlx5 too?

> >>>>>

> >>>> It's for ConnectX-2 reporting max_sge=31 during development, while

> >>>> the largest max_send_sge for stable large block RDMA reads was

> >>>> (is..?) 29 SGEs.

> >>>>

> >>>>>> In trying to get isert working with ocrdma I ran into a problem

> >>>>>> where the code thought there was only 1 send SGE available when

> >>>>>> in

> >> fact the device has 3.

> >>>>>> Then I found the above work-around, which explained the problem.

> >>>>> Nic, any reason for attr.cap.max_send_sge = 1 not to work OK?

> >>>>>

> >>>> IIRC, pre fastreg code supported max_send_sge=1 by limiting the

> >>>> transfer size per RDMA read, and would issue subsequent RDMA read

> +

> >>>> offset from completion up to the total se_cmd->data_length.

> >>>>

> >>>> Not sure how this works with fastreg today.  Sagi..?

> >>>>

> >>>

> >>> While on fastreg mode, for RDMA reads  we use only one SGE, talking

> >>> to Sagi he explained me that the problem with creating the QP with

> >>> max_send_sge = 1 has to do with other flows where two or even three

> >>> SGEs are needed, e.g when the iscsi/iser header (doesn't exist in

> >>> RDMA

> >>> ops) is taken from one spot of the memory and the data (sense) taken

> >>> from another one?

> >>>

> >>> Nic, we need to see what is the minimum needed by the code (two?)

> >>> and tweak that line to make sure it gets there as long as the device

> >>> supports it, SB, makes sense?

> >>>

> >>>

> >>> diff --git a/drivers/infiniband/ulp/isert/ib_isert.c

> >>> b/drivers/infiniband/ulp/isert/ib_isert.c

> >>> index 0bea577..24abbb3 100644

> >>> --- a/drivers/infiniband/ulp/isert/ib_isert.c

> >>> +++ b/drivers/infiniband/ulp/isert/ib_isert.c

> >>> @@ -115,9 +115,10 @@ isert_conn_setup_qp(struct isert_conn

> >>> *isert_conn, struct rdma_cm_id *cma_id,

> >>>           attr.cap.max_recv_wr = ISERT_QP_MAX_RECV_DTOS;

> >>>           /*

> >>>            * FIXME: Use devattr.max_sge - 2 for max_send_sge as

> >>> -        * work-around for RDMA_READ..

> >>> +        * work-around for RDMA_READ.. still need to make sure to

> >>> +        * have at least two SGEs for SCSI responses.

> >>>            */

> >>> -       attr.cap.max_send_sge = device->dev_attr.max_sge - 2;

> >>> +       attr.cap.max_send_sge = max(2, device->dev_attr.max_sge -

> >>> + 2);

> 

> I think we need to fail if (dev_attr.max_sge - 2 <= 0)

> 


We may need to fail if (dev_attr.max_sge - 2 <= 1).   I think LOGIN RESPONSE requires
2 SGEs.

> >>>           isert_conn->max_sge = attr.cap.max_send_sge;

> >>>

> >>>           attr.cap.max_recv_sge = 1;

> >>>

> >>>

> >>

> >> After a quick audit, the minimum max_send_sge=2 patch looks correct

> >> for the hardcoded tx_desc->num_sge cases in isert_put_login_tx(),

> >> isert_put_response(), isert_put_reject() and isert_put_text_rsp().

> >>

> >> For the RDMA read path with non fast-reg, isert_build_rdma_wr() is

> >> still correctly limiting the SGEs per operation based upon the

> >> negotiated isert_conn->max_sge set above in isert_conn_setup_qp().

> >>

> >> Chris, please confirm on ocrdma with Or's patch, and I'll include his

> >> patch into target-pending/queue for now, and move into master once

> >> you give a proper Tested-By.

> >>

> >> Thanks!

> >>

> >> --nab

> >

> > Sorry for the long delay, I was dealing with some hardware issues.

> >

> > I've tested the proposed patch and it fixes the ocrdma issue.

> 

> This is a good patch for a work-around. But we should fix this one from the

> root.

> 

> Sagi.

> 


I agree, it's a work-around, but I think it's better than the work-around that
it's replacing.   If we add the check to fail if we don't have enough SGEs 
could we go with that for now?

In the long term maybe we can consider re-writing the LOGIN RESPONSE code to
only use a single SGE (Assuming I'm correct about the LOGIN RESPONSE taking 2 SGEs)

Another thing to consider is rather than subtracting 2 from the reported
number of SGEs maybe we just need to put a max on it.

Chris
> >

> > Tested-by: Chris Moore <chris.moore@emulex.com>

> >

> > Thanks,

> >

> > Chris
diff mbox

Patch

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c 
b/drivers/infiniband/ulp/isert/ib_isert.c
index 0bea577..24abbb3 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -115,9 +115,10 @@  isert_conn_setup_qp(struct isert_conn *isert_conn, 
struct rdma_cm_id *cma_id,
         attr.cap.max_recv_wr = ISERT_QP_MAX_RECV_DTOS;
         /*
          * FIXME: Use devattr.max_sge - 2 for max_send_sge as
-        * work-around for RDMA_READ..
+        * work-around for RDMA_READ.. still need to make sure to
+        * have at least two SGEs for SCSI responses.
          */
-       attr.cap.max_send_sge = device->dev_attr.max_sge - 2;
+       attr.cap.max_send_sge = max(2, device->dev_attr.max_sge - 2);
         isert_conn->max_sge = attr.cap.max_send_sge;