Message ID | 1415908465-24392-4-git-send-email-ira.weiny@intel.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
On 11/13/2014 9:54 PM, ira.weiny@intel.com wrote: > Add a device capability flag for OPA devices to signal their support of "jumbo" > MADs. > > Check for IB_DEVICE_JUMBO_MAD_SUPPORT in the device capabilities and if > supported mark the special QPs created. Ira, Few comments here, the device capability is OK, specifically if it helps for the mad layer logic to be simpler and/or more robust. You should add device attribute telling what is the size of MADs supported by the device, I think the IBTA mandated 256Band in the OPA case, the driver will fill there 2k. You can't just state that (jumbo == 2k) and go to complete design/changes which use this hard-coded assumption. You need to see how to re-spin this series w.o this assumption. I find it very annoying that upper level drivers replicate in different ways elements from the IB device attributes returned by ib_query_device. I met that in multiple drivers and upcoming designs for which I do code review. Are you up to come up with a patch that caches the device attributes on the device structure? if not, I can do that.. and have your code to see it. Also, I would go and unify (squash) this patch and the preceding one. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 11/27/2014 1:47 PM, Or Gerlitz wrote: > On 11/13/2014 9:54 PM, ira.weiny@intel.com wrote: > >> Add a device capability flag for OPA devices to signal their support >> of "jumbo" >> MADs. >> >> Check for IB_DEVICE_JUMBO_MAD_SUPPORT in the device capabilities and if >> supported mark the special QPs created. > > Ira, > > Few comments here, the device capability is OK, specifically if it helps > for the mad > layer logic to be simpler and/or more robust. > > You should add device attribute telling what is the size of MADs > supported by the > device, I think the IBTA mandated 256Band in the OPA case, the driver > will fill there 2k. > > You can't just state that (jumbo == 2k) and go to complete design/changes > which use this hard-coded assumption. You need to see how to re-spin > this series > w.o this assumption. Why do we need both? can't we just rely on just the size? We're already short on bits in device_cap_flags so maybe we can do without? Sagi. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 11/27/2014 3:51 PM, Sagi Grimberg wrote:
> We're already short on bits in device_cap_flags
no shortage @ the kernel... we can add more 32 bits if/when we need it
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
> > On 11/13/2014 9:54 PM, ira.weiny@intel.com wrote: > > > Add a device capability flag for OPA devices to signal their support of "jumbo" > > MADs. > > > > Check for IB_DEVICE_JUMBO_MAD_SUPPORT in the device capabilities and > > if supported mark the special QPs created. > > Ira, > > Few comments here, the device capability is OK, specifically if it helps for the > mad layer logic to be simpler and/or more robust. > > You should add device attribute telling what is the size of MADs supported by > the device, I think the IBTA mandated 256Band in the OPA case, the driver will > fill there 2k. I don't think this is a bad idea, however, the complication is determining the kmem_cache object size in that case. How about we limit this value to < 2K until such time as there is a need for larger support? > > You can't just state that (jumbo == 2k) and go to complete design/changes > which use this hard-coded assumption. You need to see how to re-spin this > series w.o this assumption. I'm looking into it. I believe I will still need to have a capability bit, but it may end up having a different meaning. > > I find it very annoying that upper level drivers replicate in different ways > elements from the IB device attributes returned by ib_query_device. I met that > in multiple drivers and upcoming designs for which I do code review. Are you > up to come up with a patch that caches the device attributes on the device > structure? I don't follow what you are asking for. Could you give more details? > if not, > I can do that.. and have your code to see it. > > Also, I would go and unify (squash) this patch and the preceding one. > I'll keep this in mind as I rework the patches. Ira -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Dec 7, 2014 at 4:23 PM, Weiny, Ira <ira.weiny@intel.com> wrote:
> I don't think this is a bad idea, however, the complication is determining the kmem_cache object size in that case. How about we limit this value to < 2K until such time as there is a need for larger support?
Can we just get rid of all the kmem_caches used in the MAD code? I
don't think any of this is so performance-critical that we couldn't
just use kmalloc...
- R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Dec 8, 2014 at 2:23 AM, Weiny, Ira <ira.weiny@intel.com> wrote: >> I find it very annoying that upper level drivers replicate in different ways >> elements from the IB device attributes returned by ib_query_device. I met that >> in multiple drivers and upcoming designs for which I do code review. Are you >> up to come up with a patch that caches the device attributes on the device >> structure? > I don't follow what you are asking for. Could you give more details? 1. add a struct ib_device_attr field to struct ib_device 2. when the device registers itself with the IB core, go and run the query_device verb with the param being pointer to that field -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
PiANCj4gT24gU3VuLCBEZWMgNywgMjAxNCBhdCA0OjIzIFBNLCBXZWlueSwgSXJhIDxpcmEud2Vp bnlAaW50ZWwuY29tPiB3cm90ZToNCj4gPiBJIGRvbid0IHRoaW5rIHRoaXMgaXMgYSBiYWQgaWRl YSwgaG93ZXZlciwgdGhlIGNvbXBsaWNhdGlvbiBpcyBkZXRlcm1pbmluZyB0aGUNCj4ga21lbV9j YWNoZSBvYmplY3Qgc2l6ZSBpbiB0aGF0IGNhc2UuICBIb3cgYWJvdXQgd2UgbGltaXQgdGhpcyB2 YWx1ZSB0byA8IDJLDQo+IHVudGlsIHN1Y2ggdGltZSBhcyB0aGVyZSBpcyBhIG5lZWQgZm9yIGxh cmdlciBzdXBwb3J0Pw0KPiANCj4gQ2FuIHdlIGp1c3QgZ2V0IHJpZCBvZiBhbGwgdGhlIGttZW1f Y2FjaGVzIHVzZWQgaW4gdGhlIE1BRCBjb2RlPyAgSSBkb24ndA0KPiB0aGluayBhbnkgb2YgdGhp cyBpcyBzbyBwZXJmb3JtYW5jZS1jcml0aWNhbCB0aGF0IHdlIGNvdWxkbid0IGp1c3QgdXNlIGtt YWxsb2MuLi4NCj4gDQoNCk9uIGFuIFNNIG5vZGUgdGhlIE1BRCBjb2RlIGlzIF92ZXJ5XyBwZXJm b3JtYW5jZS1jcml0aWNhbC4gIEkgZG9u4oCZdCB0aGluayB3ZSBzaG91bGQgZ2l2ZSB0aGF0IHVw Lg0KDQpBbm90aGVyIGFsdGVybmF0aXZlIGlzIHRvIGhhdmUgYSBjYWNoZSBwZXIgZGV2aWNlIGJh c2VkIG9uIHRoZSBzaXplIHJlcG9ydGVkLg0KDQotLSBJcmENCg0K -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> > On 11/27/2014 3:51 PM, Sagi Grimberg wrote: > > We're already short on bits in device_cap_flags > > no shortage @ the kernel... we can add more 32 bits if/when we need it Why is there a gap in the bits? enum ib_device_cap_flags { ... IB_DEVICE_MEM_WINDOW_TYPE_2B = (1<<24), IB_DEVICE_MANAGED_FLOW_STEERING = (1<<29), ... }; 25-28 are not used? Is there some legacy software out there which uses these? Ira -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 1/7/2015 6:32 PM, Weiny, Ira wrote: >> >On 11/27/2014 3:51 PM, Sagi Grimberg wrote: >>> > >We're already short on bits in device_cap_flags >> > >> >no shortage @ the kernel... we can add more 32 bits if/when we need it > Why is there a gap in the bits? > > enum ib_device_cap_flags { > ... > IB_DEVICE_MEM_WINDOW_TYPE_2B = (1<<24), > IB_DEVICE_MANAGED_FLOW_STEERING = (1<<29), > ... > }; > > 25-28 are not used? Is there some legacy software out there which uses these? sort of... the verbs RSS design which wasn't accepted upstream uses bits 25/26/27 and Mellanox Core-Direct (whose verbs model called "Cross-Channel") which wasn't submitted upstream yet uses bit 28. Since in the kernel we can trivially use 64 bits, I would opt to keep these entries free for now. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c index 4673262..9f5641d 100644 --- a/drivers/infiniband/core/mad.c +++ b/drivers/infiniband/core/mad.c @@ -2855,7 +2855,8 @@ static void init_mad_queue(struct ib_mad_qp_info *qp_info, } static void init_mad_qp(struct ib_mad_port_private *port_priv, - struct ib_mad_qp_info *qp_info) + struct ib_mad_qp_info *qp_info, + int supports_jumbo_mads) { qp_info->port_priv = port_priv; init_mad_queue(qp_info, &qp_info->send_queue); @@ -2865,6 +2866,7 @@ static void init_mad_qp(struct ib_mad_port_private *port_priv, qp_info->snoop_table = NULL; qp_info->snoop_table_size = 0; atomic_set(&qp_info->snoop_count, 0); + qp_info->supports_jumbo_mads = supports_jumbo_mads; } static int create_mad_qp(struct ib_mad_qp_info *qp_info, @@ -2911,6 +2913,17 @@ static void destroy_mad_qp(struct ib_mad_qp_info *qp_info) kfree(qp_info->snoop_table); } +static int +mad_device_supports_jumbo_mads(struct ib_device *device) +{ + struct ib_device_attr attr; + + if (!ib_query_device(device, &attr)) + return (attr.device_cap_flags + & IB_DEVICE_JUMBO_MAD_SUPPORT); + return 0; +} + /* * Open the port * Create the QP, PD, MR, and CQ if needed @@ -2923,6 +2936,7 @@ static int ib_mad_port_open(struct ib_device *device, unsigned long flags; char name[sizeof "ib_mad123"]; int has_smi; + int supports_jumbo_mads; /* Create new device info */ port_priv = kzalloc(sizeof *port_priv, GFP_KERNEL); @@ -2935,8 +2949,14 @@ static int ib_mad_port_open(struct ib_device *device, port_priv->port_num = port_num; spin_lock_init(&port_priv->reg_lock); INIT_LIST_HEAD(&port_priv->agent_list); - init_mad_qp(port_priv, &port_priv->qp_info[0]); - init_mad_qp(port_priv, &port_priv->qp_info[1]); + + supports_jumbo_mads = mad_device_supports_jumbo_mads(device); + if (supports_jumbo_mads) + pr_info("Jumbo MAD support enabled for %s:%d\n", + device->name, port_num); + + init_mad_qp(port_priv, &port_priv->qp_info[0], supports_jumbo_mads); + init_mad_qp(port_priv, &port_priv->qp_info[1], supports_jumbo_mads); cq_size = mad_sendq_size + mad_recvq_size; has_smi = rdma_port_get_link_layer(device, port_num) == IB_LINK_LAYER_INFINIBAND; diff --git a/drivers/infiniband/core/mad_priv.h b/drivers/infiniband/core/mad_priv.h index d1a0b0e..4b4110d 100644 --- a/drivers/infiniband/core/mad_priv.h +++ b/drivers/infiniband/core/mad_priv.h @@ -192,6 +192,7 @@ struct ib_mad_qp_info { struct ib_mad_snoop_private **snoop_table; int snoop_table_size; atomic_t snoop_count; + int supports_jumbo_mads; }; struct ib_mad_port_private {