
[1/1] IB/iser: Remove hard coded values for cqe and send_wr

Message ID ecfd3441-253c-47bf-b2cb-030b2a00f689@CMEXHTCAS1.ad.emulex.com (mailing list archive)
State Superseded, archived

Commit Message

Minh Duc Tran Oct. 20, 2014, 5:36 a.m. UTC
Hi Sagi,

I've created a new patch on top of the 21 iser patches you mentioned earlier in this thread.  It is pasted at the end of this email.
If I understand correctly, this patch will be applied along with Or's TODO list.


>> ISER_MAX_RX_CQ_LEN             4096               /* This number should be calculated during create_session */

>So in iSER, CQs are shared across the device - so this number should satisfy the maximum number of connections per CQ (which is currently 8).

Again, there should be no rule enforcing a CQ to support 8 connections.  The underlying hw should be able to support more or fewer as it is configured to do.  Specific to the ocrdma hw, it has a lower number of CQEs per CQ, but it has 32 CQs.

>> ISER_QP_MAX_RECV_DTOS    512                 /* Why can't we use ISCSI_DEF_XMIT_CMDS_MAX here ? */

>iSER creates the connection QP before the session is created - so it doesn't know what the user will set at cmds_max (which is potentially larger than ISCSI_DEF_XMIT_CMDS_MAX). So we allow 512 at the moment and adjust the session cmds_max accordingly. I agree that this is a work-around for the moment, as we don't know at QP creation time what the user setting of cmds_max is.

Like any other driver, we should limit this number to the range the hw supports.  If the user setting is within the hw range, then the number will be adjusted accordingly.  If the user setting is not within the hw range, set it to the default values.
What about something like this:
#define ISER_GET_MAX_XMIT_CMDS(send_wr) (((send_wr) -			\
					  ISER_MAX_TX_MISC_PDUS -	\
					  ISER_MAX_RX_MISC_PDUS) /	\
					 (1 + ISER_INFLIGHT_DATAOUTS))
cmds_max_supported = ISER_GET_MAX_XMIT_CMDS(dev_attr->max_qp_wr);


>> ISER_MAX_TX_CQ_LEN             36944           /* the mlx4 hw supports up to 3 CQs, but the ocrdma hw supports up to 32 CQs with a lower number of CQEs per CQ */

>What led you to conclude that: "the mlx4 hw supports up to 3 CQ"?
>TX CQ length should be

I was debugging some iser target problems some time back.  I could be wrong about 3 CQs, but it's not far from the hard limit of 4 CQs set by the current ib/iser code.

>> ISER_QP_MAX_REQ_DTOS       4618
>> ISCSI_ISER_MAX_CONN            8                  /* I am not sure what this 8 connections per CQ is.  Open-iscsi supports 1 connection per session, so this implies one of two things:
>>                                                       1- mlx4 is limited to 8 sessions per CQ
>>                                                       2- mlx4 is doing something proprietary on the hw to have multiple QPs per session */

>As I said, CQs are per device and shared across iscsi connections. So each CQ will support up to 8 connections. I agree we should allocate more CQs in case more connections are opened - but we never got any CQ to overrun (even under high stress), so this is still on my todo list...

>> ISER_MAX_CQ                               4                  /* Should this number be much higher, or based on the number of cpu cores on the system to distribute CQ processing per core? */

>I completely agree. This is a legacy MAX ceiling. I have a patch for that pending at Or's table. I am all for getting it into 3.18/3.19.

>>
>> We are open for suggestions.

>OK,

>I'll respond to Or's reply on the TODOs.

>Sagi.


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Or Gerlitz Oct. 20, 2014, 8:01 a.m. UTC | #1
On Mon, Oct 20, 2014 at 8:36 AM, Minh Duc Tran
<MinhDuc.Tran@emulex.com> wrote:
> --- a/drivers/infiniband/ulp/iser/iser_verbs.c
> +++ b/drivers/infiniband/ulp/iser/iser_verbs.c
> @@ -114,6 +114,9 @@ static int iser_create_device_ib_res(struct iser_device *device)
>         if (IS_ERR(device->pd))
>                 goto pd_err;
>
> +       max_cqe = (dev_attr->max_cqe < ISER_MAX_CQ_LEN) ?
> +                  dev_attr->max_cqe : ISER_MAX_CQ_LEN;
> +

If I was the ocrdma maintainer I would say loud and clear: NO, please.

Your current offering supports 32 CQs per device, and this means that
on a 32-core node you will be able to run only iSER, no other ULP.

Generally, this approach is wrong, causing bad user experience, and we
will not do that.




>         for (i = 0; i < device->comps_used; i++) {
>                 struct iser_comp *comp = &device->comps[i];
>
> @@ -122,7 +125,7 @@ static int iser_create_device_ib_res(struct iser_device *device)
>                                         iser_cq_callback,
>                                         iser_cq_event_callback,
>                                         (void *)comp,
> -                                       ISER_MAX_CQ_LEN, i);
> +                                       max_cqe, i);
>                 if (IS_ERR(comp->cq)) {
>                         comp->cq = NULL;
>                         goto cq_err;
> @@ -426,6 +429,7 @@ void iser_free_fastreg_pool(struct ib_conn *ib_conn)
>  static int iser_create_ib_conn_res(struct ib_conn *ib_conn)
>  {
>         struct iser_device      *device;
> +       struct ib_device_attr *dev_attr;
>         struct ib_qp_init_attr  init_attr;
>         int                     ret = -ENOMEM;
>         int index, min_index = 0;
> @@ -433,6 +437,7 @@ static int iser_create_ib_conn_res(struct ib_conn *ib_conn)
>         BUG_ON(ib_conn->device == NULL);
>
>         device = ib_conn->device;
> +       dev_attr = &device->dev_attr;
>
>         memset(&init_attr, 0, sizeof init_attr);
>
> @@ -461,7 +466,9 @@ static int iser_create_ib_conn_res(struct ib_conn *ib_conn)
>                 init_attr.cap.max_send_wr = ISER_QP_SIG_MAX_REQ_DTOS + 1;
>                 init_attr.create_flags |= IB_QP_CREATE_SIGNATURE_EN;
>         } else {
> -               init_attr.cap.max_send_wr  = ISER_QP_MAX_REQ_DTOS + 1;
> +               init_attr.cap.max_send_wr  =
> +                       (dev_attr->max_qp_wr < ISER_QP_MAX_REQ_DTOS) ?
> +                        dev_attr->max_qp_wr : ISER_QP_MAX_REQ_DTOS;
>         }
>
>         ret = rdma_create_qp(ib_conn->cma_id, device->pd, &init_attr);
Sagi Grimberg Oct. 20, 2014, 4:14 p.m. UTC | #2
On 10/20/2014 11:01 AM, Or Gerlitz wrote:
> On Mon, Oct 20, 2014 at 8:36 AM, Minh Duc Tran
> <MinhDuc.Tran@emulex.com> wrote:
>> --- a/drivers/infiniband/ulp/iser/iser_verbs.c
>> +++ b/drivers/infiniband/ulp/iser/iser_verbs.c
>> @@ -114,6 +114,9 @@ static int iser_create_device_ib_res(struct iser_device *device)
>>          if (IS_ERR(device->pd))
>>                  goto pd_err;
>>
>> +       max_cqe = (dev_attr->max_cqe < ISER_MAX_CQ_LEN) ?
>> +                  dev_attr->max_cqe : ISER_MAX_CQ_LEN;
>> +
>
> If I was the ocrdma maintainer I would say loud and clear: NO, please.
>
> Your current offering supports 32 CQs per device, and this means that
> on a 32-core node you will be able to run only iSER, no other ULP.

This is the max CQ entries; are you referring to max_cq, or am I missing
something?

I tend to agree that if the device supports smaller CQs we should not
just fail everything, but adjust accordingly.

Sagi.
Minh Duc Tran Oct. 20, 2014, 6:11 p.m. UTC | #3
>> +++ b/drivers/infiniband/ulp/iser/iser_verbs.c
>> @@ -114,6 +114,9 @@ static int iser_create_device_ib_res(struct iser_device *device)
>>         if (IS_ERR(device->pd))
>>                 goto pd_err;
>>
>> +       max_cqe = (dev_attr->max_cqe < ISER_MAX_CQ_LEN) ?
>> +                  dev_attr->max_cqe : ISER_MAX_CQ_LEN;
>> +

>If I was the ocrdma maintainer I would say loud and clear: NO, please.
>Your current offering supports 32 CQs per device, and this means that on a 32-core node you will be able to run only iSER, no other ULP.
>Generally, this approach is wrong, causing bad user experience, and we will not do that.

Hi Or,

All 32 CQs will never get used.  Here are the reasons:
1) current #define ISER_MAX_CQ 4
2) Your proposed TODO "#1. change the number of CQs to be min(num_cpus, 1/2 of what the device can support)"
      This would limit the max CQs the device supports to 1/2 anyway.

If you don't agree, I'd like to get your suggestions on what I should do in this case for CQs that have 8k entries each.

Thanks.
-Minh

>>         for (i = 0; i < device->comps_used; i++) {
>>                 struct iser_comp *comp = &device->comps[i];
>>
>> @@ -122,7 +125,7 @@ static int iser_create_device_ib_res(struct iser_device *device)
>>                                         iser_cq_callback,
>>                                         iser_cq_event_callback,
>>                                         (void *)comp,
>> -                                       ISER_MAX_CQ_LEN, i);
>> +                                       max_cqe, i);
Minh Duc Tran Oct. 20, 2014, 8:56 p.m. UTC | #4
>On 10/20/2014 11:01 AM, Or Gerlitz wrote:
>> On Mon, Oct 20, 2014 at 8:36 AM, Minh Duc Tran
>> <MinhDuc.Tran@emulex.com> wrote:
>>> --- a/drivers/infiniband/ulp/iser/iser_verbs.c
>>> +++ b/drivers/infiniband/ulp/iser/iser_verbs.c
>>> @@ -114,6 +114,9 @@ static int iser_create_device_ib_res(struct iser_device *device)
>>>         if (IS_ERR(device->pd))
>>>                 goto pd_err;
>>>
>>> +       max_cqe = (dev_attr->max_cqe < ISER_MAX_CQ_LEN) ?
>>> +                  dev_attr->max_cqe : ISER_MAX_CQ_LEN;
>>> +
>>
>> If I was the ocrdma maintainer I would say loud and clear: NO, please.
>>
>> Your current offering supports 32 CQs per device, and this means that
>> on a 32-core node you will be able to run only iSER, no other ULP.

>This is the max CQ entries; are you referring to max_cq, or am I missing something?

>I tend to agree that if the device supports smaller CQs we should not just fail everything, but adjust accordingly.

>Sagi.

Yes, I think Or's response is a bit confusing when posted under the block of the patch which relates to max CQ entries per CQ, not max CQs.
Though, we did mention supporting more CQs as part of the TODO list at some point in this discussion.
The patch we have posted will work with the current IB/iser supported CQs as well as max_cq/2 in the TODO list.

-Minh
Or Gerlitz Oct. 20, 2014, 9:06 p.m. UTC | #5
On Mon, Oct 20, 2014 at 9:11 PM, Minh Duc Tran <MinhDuc.Tran@emulex.com> wrote:
>>> +++ b/drivers/infiniband/ulp/iser/iser_verbs.c
>>> @@ -114,6 +114,9 @@ static int iser_create_device_ib_res(struct iser_device *device)
>>>         if (IS_ERR(device->pd))
>>>                 goto pd_err;
>>>
>>> +       max_cqe = (dev_attr->max_cqe < ISER_MAX_CQ_LEN) ?
>>> +                  dev_attr->max_cqe : ISER_MAX_CQ_LEN;
>>> +
>
>>If I was the ocrdma maintainer I would say loud and clear: NO, please.
>>Your current offering supports 32 CQs per device, and this means that on a 32-core node you will be able to run only iSER, no other ULP.
>>Generally, this approach is wrong, causing bad user experience, and we will not do that.
>
> Hi Or,
>
> All 32 CQs will never get used.  Here are the reasons:
> 1) current #define ISER_MAX_CQ 4
> 2) Your proposed TODO "#1. change the number of CQs to be min(num_cpus, 1/2 of what the device can support)"
>       This would limit the max CQs the device supports to 1/2 anyway.
>
> If you don't agree, I'd like to get your suggestions on what I should do in this case for CQs that have 8k entries each.


Oh, I am still OK with my suggestions... min(num_cpus, 1/2 dev max
CQs)" is fine, sure, but this isn't what your patch was doing, right?
Or Gerlitz Oct. 20, 2014, 9:09 p.m. UTC | #6
On Tue, Oct 21, 2014 at 12:06 AM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
> On Mon, Oct 20, 2014 at 9:11 PM, Minh Duc Tran <MinhDuc.Tran@emulex.com> wrote:
>>>> +++ b/drivers/infiniband/ulp/iser/iser_verbs.c
>>>> @@ -114,6 +114,9 @@ static int iser_create_device_ib_res(struct iser_device *device)
>>>>         if (IS_ERR(device->pd))
>>>>                 goto pd_err;
>>>>
>>>> +       max_cqe = (dev_attr->max_cqe < ISER_MAX_CQ_LEN) ?
>>>> +                  dev_attr->max_cqe : ISER_MAX_CQ_LEN;
>>>> +
>>
>>>If I was the ocrdma maintainer I would say loud and clear: NO, please.
>>>Your current offering supports 32 CQs per device, and this means that on a 32-core node you will be able to run only iSER, no other ULP.
>>>Generally, this approach is wrong, causing bad user experience, and we will not do that.
>>
>> Hi Or,
>>
>> All 32 CQs will never get used.  Here are the reasons:
>> 1) current #define ISER_MAX_CQ 4
>> 2) Your proposed TODO "#1. change the number of CQs to be min(num_cpus, 1/2 of what the device can support)"
>>       This would limit the max CQs the device supports to 1/2 anyway.
>>
>> If you don't agree, I'd like to get your suggestions on what I should do in this case for CQs that have 8k entries each.


So I was wrong, sorry... you were talking about CQEs, not CQs. I think
Sagi has some proposal on how to harden the code that deals with
setting the size of CQs; let me talk to him.
Or Gerlitz Oct. 21, 2014, 2:22 p.m. UTC | #7
On 10/21/2014 12:09 AM, Or Gerlitz wrote:
> So I was wrong, sorry... you were talking about CQEs, not CQs. I think
> Sagi has some proposal on how to harden the code that deals with
> setting the size of CQs; let me talk to him.

So just to recap here.

With your driver's maximal attributes of 8K CQEs per CQ and 32 CQs per
device, what do we need to change such that the iser initiator, with its
current code, is functional on 3.18-rc1?

Are you hitting an ability to create a CQ?

I'd first and foremost want to make sure we are operative and later do
further hardening.

Or.
Or Gerlitz Oct. 21, 2014, 2:26 p.m. UTC | #8
On 10/21/2014 5:22 PM, Or Gerlitz wrote:
> Are you hitting an ability to create a CQ? 
inability
Sagi Grimberg Oct. 21, 2014, 2:49 p.m. UTC | #9
On 10/20/2014 8:36 AM, Minh Duc Tran wrote:
> Hi Sagi,
>
> I've created a new patch on top of the 21 iser patches you mentioned earlier in this thread.  It is pasted at the end of this email.
> If I understand correctly, this patch will be applied along with Or's TODO list.
>
>
>>> ISER_MAX_RX_CQ_LEN             4096               /* This number should be calculated during create_session */
>
>> So in iSER, CQs are shared across the device - so this number should satisfy the maximum number of connections per CQ (which is currently 8).
>
> Again, there should be no rule enforcing a CQ to support 8 connections.  The underlying hw should be able to support more or fewer as it is configured to do.  Specific to the ocrdma hw, it has a lower number of CQEs per CQ, but it has 32 CQs.

We can take the minimum of the supported and calculated values.

>
>>> ISER_QP_MAX_RECV_DTOS    512                 /* Why can't we use ISCSI_DEF_XMIT_CMDS_MAX here ? */
>
>> iSER creates the connection QP before the session is created - so it doesn't know what the user will set at cmds_max (which is potentially larger than ISCSI_DEF_XMIT_CMDS_MAX). So we allow 512 at the moment and adjust the session cmds_max accordingly. I agree that this is a work-around for the moment, as we don't know at QP creation time what the user setting of cmds_max is.
>
> Like any other driver, we should limit this number to the range the hw supports.  If the user setting is within the hw range, then the number will be adjusted accordingly.  If the user setting is not within the hw range, set it to the default values.
> What about something like this:
> #define ISER_GET_MAX_XMIT_CMDS(send_wr) (((send_wr) -			\
> 					  ISER_MAX_TX_MISC_PDUS -	\
> 					  ISER_MAX_RX_MISC_PDUS) /	\
> 					 (1 + ISER_INFLIGHT_DATAOUTS))
> cmds_max_supported = ISER_GET_MAX_XMIT_CMDS(dev_attr->max_qp_wr);
>

I agree, but we should not let the user set too many commands if we do.

>
>>> ISER_MAX_TX_CQ_LEN             36944           /* the mlx4 hw supports up to 3 CQs, but the ocrdma hw supports up to 32 CQs with a lower number of CQEs per CQ */
>
>> What led you to conclude that: "the mlx4 hw supports up to 3 CQ"?
>> TX CQ length should be
>
> I was debugging some iser target problems some time back.  I could be wrong about 3 CQs, but it's not far from the hard limit of 4 CQs set by the current ib/iser code.

This specifically originates in an mlx4_core bug in RoCE mode (it supports
only 3 EQs). CCing Matan, who promised me he would fix it...

Sagi.
Minh Duc Tran Oct. 21, 2014, 9:11 p.m. UTC | #10
Hi Or,

>So just to recap here.

>With your driver's maximal attributes of 8K CQEs per CQ and 32 CQs per device, what do we need to change such that the iser initiator, with its current code, is functional on 3.18-rc1?


Yes, I've just verified the patch with 3.18-rc1 on both ocrdma and mlx4 drivers.  They are working as expected.

>Are you hitting an (inability) to create a CQ?


Yes

>I'd first and foremost want to make sure we are operative and later do further hardening.


This patch should not adversely impact the implementation of the proposed TODO list.

>Or.



Since the patch is fine, it would be great if you could accept it.

Thanks.
-minh
Or Gerlitz Oct. 22, 2014, 4:01 a.m. UTC | #11
On 10/22/2014 12:11 AM, Minh Duc Tran wrote:
> Hi Or,
>
>> So just to recap here.
>> With your driver's maximal attributes of 8K CQEs per CQ and 32 CQs per device, what do we need to change such that the iser initiator, with its current code, is functional on 3.18-rc1?
> Yes, I've just verified the patch with 3.18-rc1 on both ocrdma and mlx4 drivers.  They are working as expected.
>
>> Are you hitting an (inability) to create a CQ?
> Yes

So for 3.18 we should be OK with a one-liner patch that makes sure the
size of the created CQ is capped against the quantity advertised in the
device attributes, agree?

Minh Duc Tran Oct. 22, 2014, 4:29 a.m. UTC | #12
>On 10/22/2014 12:11 AM, Minh Duc Tran wrote:
>> Hi Or,
>>
>>> So just to recap here.
>>> With your driver's maximal attributes of 8K CQEs per CQ and 32 CQs per device, what do we need to change such that the iser initiator, with its current code, is functional on 3.18-rc1?
>> Yes, I've just verified the patch with 3.18-rc1 on both ocrdma and mlx4 drivers.  They are working as expected.
>>
>>> Are you hitting an (inability) to create a CQ?
>> Yes

>So for 3.18 we should be OK with a one-liner patch that makes sure the size of the created CQ is capped against the quantity advertised in the device attributes, agree?

Not just that one-liner change.  The change is about 10 lines and very simple.  It should take you about 30 seconds to read it.  We need the whole patch to fix two problems:
1) the create CQ failure due to the requested CQ entries being too large.
2) the create QP failure due to the requested WR entries being too large.  Our hw has a limit of around 2k WRs per QP in this profile.  Iser calls create QP with a hardcoded value of around 4k.

If you have other ideas how to solve our problem, please let us know.

Thanks.
-Minh
Or Gerlitz Oct. 22, 2014, 4:54 a.m. UTC | #13
On 10/22/2014 7:29 AM, Minh Duc Tran wrote:
> Not just that one-liner change.  The change is about 10 lines and very simple.  It should take you about 30 seconds to read it.  We need the whole patch to fix two problems:
> 1) the create CQ failure due to the requested CQ entries being too large.
> 2) the create QP failure due to the requested WR entries being too large.  Our hw has a limit of around 2k WRs per QP in this profile.  Iser calls create QP with a hardcoded value of around 4k.
OK, got it.

>
> If you have other ideas how to solve our problem, please let us know.

So let's cap the QP size against the device limits too. The large QP size
originates from what it takes to support unsolicited data-outs, so
in your case you can document some sort of limitation in that respect
for 3.18 and we'll do a fully featured, proofed solution for 3.19. Please
send the minimal patch for these two fixes in a new thread and we'll
pick it up.

Or.
Sagi Grimberg Oct. 22, 2014, 11:08 a.m. UTC | #14
On 10/20/2014 8:36 AM, Minh Duc Tran wrote:
<SNIP>
> ================
> From: Minh Tran <minhduc.tran@emulex.com>
>
>          This patch allows the underlying hardware to choose
> values other than hard-coded max values for cqe and send_wr
> while preventing them from exceeding the max supported values.

Hi Minh,

In order to take it, I would like to address a couple of bits
below. (Sorry for the late response; I'm juggling across projects...)

>
> Signed-off-by: Minh Tran <minhduc.tran@emulex.com>
> Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com>
> ---
>   drivers/infiniband/ulp/iser/iser_verbs.c | 13 ++++++++++---
>   1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/iser/iser_verbs.c
> index 67225bb..73955c1 100644
> --- a/drivers/infiniband/ulp/iser/iser_verbs.c
> +++ b/drivers/infiniband/ulp/iser/iser_verbs.c
> @@ -76,7 +76,7 @@ static void iser_event_handler(struct ib_event_handler *handler,
>   static int iser_create_device_ib_res(struct iser_device *device)
>   {
>          struct ib_device_attr *dev_attr = &device->dev_attr;
> -       int ret, i;
> +       int ret, i, max_cqe;
>
>          ret = ib_query_device(device->ib_device, dev_attr);
>          if (ret) {
> @@ -114,6 +114,9 @@ static int iser_create_device_ib_res(struct iser_device *device)
>          if (IS_ERR(device->pd))
>                  goto pd_err;
>
> +       max_cqe = (dev_attr->max_cqe < ISER_MAX_CQ_LEN) ?
> +                  dev_attr->max_cqe : ISER_MAX_CQ_LEN;
> +

Why not do:
min_t(int, dev_attr->max_cqe, ISER_MAX_CQ_LEN)?

Please move it up before the CQs print, and add this information to the
print.

>          for (i = 0; i < device->comps_used; i++) {
>                  struct iser_comp *comp = &device->comps[i];
>
> @@ -122,7 +125,7 @@ static int iser_create_device_ib_res(struct iser_device *device)
>                                          iser_cq_callback,
>                                          iser_cq_event_callback,
>                                          (void *)comp,
> -                                       ISER_MAX_CQ_LEN, i);
> +                                       max_cqe, i);
>                  if (IS_ERR(comp->cq)) {
>                          comp->cq = NULL;
>                          goto cq_err;
> @@ -426,6 +429,7 @@ void iser_free_fastreg_pool(struct ib_conn *ib_conn)
>   static int iser_create_ib_conn_res(struct ib_conn *ib_conn)
>   {
>          struct iser_device      *device;
> +       struct ib_device_attr *dev_attr;
>          struct ib_qp_init_attr  init_attr;
>          int                     ret = -ENOMEM;
>          int index, min_index = 0;
> @@ -433,6 +437,7 @@ static int iser_create_ib_conn_res(struct ib_conn *ib_conn)
>          BUG_ON(ib_conn->device == NULL);
>
>          device = ib_conn->device;
> +       dev_attr = &device->dev_attr;
>
>          memset(&init_attr, 0, sizeof init_attr);
>
> @@ -461,7 +466,9 @@ static int iser_create_ib_conn_res(struct ib_conn *ib_conn)
>                  init_attr.cap.max_send_wr = ISER_QP_SIG_MAX_REQ_DTOS + 1;
>                  init_attr.create_flags |= IB_QP_CREATE_SIGNATURE_EN;
>          } else {
> -               init_attr.cap.max_send_wr  = ISER_QP_MAX_REQ_DTOS + 1;
> +               init_attr.cap.max_send_wr  =
> +                       (dev_attr->max_qp_wr < ISER_QP_MAX_REQ_DTOS) ?
> +                        dev_attr->max_qp_wr : ISER_QP_MAX_REQ_DTOS;

Again, why not use min_t()?

Sagi.

Patch

================
From: Minh Tran <minhduc.tran@emulex.com>

        This patch allows the underlying hardware to choose
values other than hard-coded max values for cqe and send_wr
while preventing them from exceeding the max supported values.

Signed-off-by: Minh Tran <minhduc.tran@emulex.com>
Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com>
---
 drivers/infiniband/ulp/iser/iser_verbs.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/iser/iser_verbs.c
index 67225bb..73955c1 100644
--- a/drivers/infiniband/ulp/iser/iser_verbs.c
+++ b/drivers/infiniband/ulp/iser/iser_verbs.c
@@ -76,7 +76,7 @@  static void iser_event_handler(struct ib_event_handler *handler,
 static int iser_create_device_ib_res(struct iser_device *device)
 {
        struct ib_device_attr *dev_attr = &device->dev_attr;
-       int ret, i;
+       int ret, i, max_cqe;

        ret = ib_query_device(device->ib_device, dev_attr);
        if (ret) {
@@ -114,6 +114,9 @@  static int iser_create_device_ib_res(struct iser_device *device)
        if (IS_ERR(device->pd))
                goto pd_err;

+       max_cqe = (dev_attr->max_cqe < ISER_MAX_CQ_LEN) ?
+                  dev_attr->max_cqe : ISER_MAX_CQ_LEN;
+
        for (i = 0; i < device->comps_used; i++) {
                struct iser_comp *comp = &device->comps[i];

@@ -122,7 +125,7 @@  static int iser_create_device_ib_res(struct iser_device *device)
                                        iser_cq_callback,
                                        iser_cq_event_callback,
                                        (void *)comp,
-                                       ISER_MAX_CQ_LEN, i);
+                                       max_cqe, i);
                if (IS_ERR(comp->cq)) {
                        comp->cq = NULL;
                        goto cq_err;
@@ -426,6 +429,7 @@  void iser_free_fastreg_pool(struct ib_conn *ib_conn)
 static int iser_create_ib_conn_res(struct ib_conn *ib_conn)
 {
        struct iser_device      *device;
+       struct ib_device_attr *dev_attr;
        struct ib_qp_init_attr  init_attr;
        int                     ret = -ENOMEM;
        int index, min_index = 0;
@@ -433,6 +437,7 @@  static int iser_create_ib_conn_res(struct ib_conn *ib_conn)
        BUG_ON(ib_conn->device == NULL);

        device = ib_conn->device;
+       dev_attr = &device->dev_attr;

        memset(&init_attr, 0, sizeof init_attr);

@@ -461,7 +466,9 @@  static int iser_create_ib_conn_res(struct ib_conn *ib_conn)
                init_attr.cap.max_send_wr = ISER_QP_SIG_MAX_REQ_DTOS + 1;
                init_attr.create_flags |= IB_QP_CREATE_SIGNATURE_EN;
        } else {
-               init_attr.cap.max_send_wr  = ISER_QP_MAX_REQ_DTOS + 1;
+               init_attr.cap.max_send_wr  =
+                       (dev_attr->max_qp_wr < ISER_QP_MAX_REQ_DTOS) ?
+                        dev_attr->max_qp_wr : ISER_QP_MAX_REQ_DTOS;
        }

        ret = rdma_create_qp(ib_conn->cma_id, device->pd, &init_attr);