Message ID | 20150625212917.14869.66238.stgit@build.ogc.int (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
On Thu, Jun 25, 2015 at 04:29:17PM -0500, Steve Wise wrote: > - * ib_dma_mapping_error - check a DMA addr for error > + * rdma_mr_roles - possible roles a MR will be used for > + * > + * This allows a transport independent RDMA application to > + * create MRs that are usable for all the desired roles w/o > + * having to understand which access rights are needed. > + */ > +enum rdma_mr_roles { > + RDMA_MRR_RECV = 1, > + RDMA_MRR_SEND = (1<<1), > + RDMA_MRR_READ_SOURCE = (1<<2), > + RDMA_MRR_READ_SINK = (1<<3), > + RDMA_MRR_WRITE_SOURCE = (1<<4), > + RDMA_MRR_WRITE_SINK = (1<<5), > + RDMA_MRR_ATOMIC = (1<<6), > + RDMA_MRR_MW_BIND = (1<<7), > + RDMA_MRR_ZERO_BASED = (1<<8), > + RDMA_MRR_ACCESS_ON_DEMAND = (1<<9), So we don't have this problem again, it would be best to explicitly list exactly which ib_wc_opcode members what use a RKEY are matched to which MMR flags RDMA_MRR_RECV LOCAL: post to receive work queue RDMA_MRR_READ_SOURCE LOCAL: post IB_WC_RDMA_READ to send work queue RDMA_MRR_READ_SINK REMOTE: respond to IB_WC_RDMA_READ [..] Seems reasonable to me otherwise Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message----- > From: Jason Gunthorpe [mailto:jgunthorpe@obsidianresearch.com] > Sent: Thursday, June 25, 2015 4:40 PM > To: Steve Wise > Cc: sagig@mellanox.com; roid@mellanox.com; ogerlitz@mellanox.com; sean.hefty@intel.com; linux-rdma@vger.kernel.org > Subject: Re: [PATCH RFC] RDMA/core: add rdma_get_dma_mr() > > On Thu, Jun 25, 2015 at 04:29:17PM -0500, Steve Wise wrote: > > > - * ib_dma_mapping_error - check a DMA addr for error > > + * rdma_mr_roles - possible roles a MR will be used for > > + * > > + * This allows a transport independent RDMA application to > > + * create MRs that are usable for all the desired roles w/o > > + * having to understand which access rights are needed. > > + */ > > +enum rdma_mr_roles { > > + RDMA_MRR_RECV = 1, > > + RDMA_MRR_SEND = (1<<1), > > + RDMA_MRR_READ_SOURCE = (1<<2), > > + RDMA_MRR_READ_SINK = (1<<3), > > + RDMA_MRR_WRITE_SOURCE = (1<<4), > > + RDMA_MRR_WRITE_SINK = (1<<5), > > + RDMA_MRR_ATOMIC = (1<<6), > > + RDMA_MRR_MW_BIND = (1<<7), > > + RDMA_MRR_ZERO_BASED = (1<<8), > > + RDMA_MRR_ACCESS_ON_DEMAND = (1<<9), > > So we don't have this problem again, it would be best to explicitly > list exactly which ib_wc_opcode members what use a RKEY are matched to > which MMR flags > > RDMA_MRR_RECV > LOCAL: post to receive work queue > RDMA_MRR_READ_SOURCE > LOCAL: post IB_WC_RDMA_READ to send work queue > RDMA_MRR_READ_SINK > REMOTE: respond to IB_WC_RDMA_READ > [..] > > Seems reasonable to me otherwise > Good idea. Thanks for the reviews! Steve. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> +enum rdma_mr_roles { I would drop naming the enum - it shouldn't be used, as the values are bit flags. > + RDMA_MRR_RECV = 1, > + RDMA_MRR_SEND = (1<<1), > + RDMA_MRR_READ_SOURCE = (1<<2), > + RDMA_MRR_READ_SINK = (1<<3), > + RDMA_MRR_WRITE_SOURCE = (1<<4), > + RDMA_MRR_WRITE_SINK = (1<<5), > + RDMA_MRR_ATOMIC = (1<<6), > + RDMA_MRR_MW_BIND = (1<<7), > + RDMA_MRR_ZERO_BASED = (1<<8), There's 'something' different about this role that cause me hesitation. Maybe that it's dependent on other roles being set to be useful? I'm not sure. Maybe we need both roles and registration flags, with this declared as a flag? > + RDMA_MRR_ACCESS_ON_DEMAND = (1<<9), This one is even more different, as it doesn't impact how the MR interacts with the interfaces, or change how the application uses the MR. This is really a hint to the provider regarding the selection of different implementation flows.
On 6/25/2015 5:37 PM, Hefty, Sean wrote: >> +enum rdma_mr_roles { > I would drop naming the enum - it shouldn't be used, as the values are bit flags. ok. >> + RDMA_MRR_RECV = 1, >> + RDMA_MRR_SEND = (1<<1), >> + RDMA_MRR_READ_SOURCE = (1<<2), >> + RDMA_MRR_READ_SINK = (1<<3), >> + RDMA_MRR_WRITE_SOURCE = (1<<4), >> + RDMA_MRR_WRITE_SINK = (1<<5), >> + RDMA_MRR_ATOMIC = (1<<6), >> + RDMA_MRR_MW_BIND = (1<<7), >> + RDMA_MRR_ZERO_BASED = (1<<8), > There's 'something' different about this role that cause me hesitation. Maybe that it's dependent on other roles being set to be useful? I'm not sure. > > Maybe we need both roles and registration flags, with this declared as a flag? > >> + RDMA_MRR_ACCESS_ON_DEMAND = (1<<9), > This one is even more different, as it doesn't impact how the MR interacts with the interfaces, or change how the application uses the MR. This is really a hint to the provider regarding the selection of different implementation flows. How about roles and attributes? ZERO_BASED and ACCESS_ON_DEMAND would be attributes and the rest roles. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 6/26/2015 9:02 AM, Steve Wise wrote: > On 6/25/2015 5:37 PM, Hefty, Sean wrote: >>> +enum rdma_mr_roles { >> I would drop naming the enum - it shouldn't be used, as the values >> are bit flags. > > ok. > >>> + RDMA_MRR_RECV = 1, >>> + RDMA_MRR_SEND = (1<<1), >>> + RDMA_MRR_READ_SOURCE = (1<<2), >>> + RDMA_MRR_READ_SINK = (1<<3), >>> + RDMA_MRR_WRITE_SOURCE = (1<<4), >>> + RDMA_MRR_WRITE_SINK = (1<<5), >>> + RDMA_MRR_ATOMIC = (1<<6), >>> + RDMA_MRR_MW_BIND = (1<<7), >>> + RDMA_MRR_ZERO_BASED = (1<<8), >> There's 'something' different about this role that cause me >> hesitation. Maybe that it's dependent on other roles being set to be >> useful? I'm not sure. >> >> Maybe we need both roles and registration flags, with this declared >> as a flag? >> >>> + RDMA_MRR_ACCESS_ON_DEMAND = (1<<9), >> This one is even more different, as it doesn't impact how the MR >> interacts with the interfaces, or change how the application uses the >> MR. This is really a hint to the provider regarding the selection of >> different implementation flows. > > How about roles and attributes? ZERO_BASED and ACCESS_ON_DEMAND would > be attributes and the rest roles. > I'm thinking now about the access flags specified in a IB_WR_FAST_REG_MR work request. I think we need a similar function for this, but it will return the ib_access_flags bits to be stored in ib_send_wr->wr.fast_reg.access_flags field. With this function defined, then rdma_get_dma_mr() can use it as well. So perhaps an internal static function in verbs.c that does the bulk of what I put in rdma_get_dma_mr() originally called: device_mr_access_flags(device, roles, attrs) used by exported functions (or inlines in ib_verbs.h): rdma_get_dma_mr(device, roles, attrs) rdma_fast_reg_access_flags(device, roles, attrs) I'll code this up and send for another round of review. Eventually i'll get this included in the series with the iSER/iWARP patches for testing/further review. Steve. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 26/06/2015 01:37, Hefty, Sean wrote: >> > + RDMA_MRR_ACCESS_ON_DEMAND = (1<<9), > This one is even more different, as it doesn't impact how the MR interacts with the interfaces, or change how the application uses the MR. This is really a hint to the provider regarding the selection of different implementation flows. I don't think this bit is relevant at all for DMA MR access flags. It is only used to allow page faults on an MR, so it is only relevant when creating a user-space MR. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 26/06/2015 00:29, Steve Wise wrote: > +enum rdma_mr_roles { > + RDMA_MRR_RECV = 1, > + RDMA_MRR_SEND = (1<<1), > + RDMA_MRR_READ_SOURCE = (1<<2), > + RDMA_MRR_READ_SINK = (1<<3), Maybe it's just me, but it took me a second to figure out which was the source and which was the sink in RDMA reads. Do you think calling them initiator and responder/target would be better? > + RDMA_MRR_WRITE_SOURCE = (1<<4), > + RDMA_MRR_WRITE_SINK = (1<<5), > + RDMA_MRR_ATOMIC = (1<<6), > + RDMA_MRR_MW_BIND = (1<<7), > + RDMA_MRR_ZERO_BASED = (1<<8), > + RDMA_MRR_ACCESS_ON_DEMAND = (1<<9), > +}; -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 26/06/2015 00:29, Steve Wise wrote: > The semantics for MR access rights are not consistent across RDMA > protocols. So rather than have applications try and glean what they need, > have them pass in the intended roles for the MR to be allocated and let > the RDMA core select the appropriate access rights given the roles and > device capabilities. I wanted to point out that with this scheme the ULP may sometimes get an MR with a wider set of permissions that it asked for, and I'm not sure that's always safe. Perhaps the ULP wants to guarantee that the MR doesn't allow certain kinds of accesses and doesn't expect the verbs layer to change that. This patch looks fine since it is just a helper function, but if you replace the MR creation with these flags, you might confuse users to think that they can control e.g. write sink and read sink separately. Haggai -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
PiBJIHdhbnRlZCB0byBwb2ludCBvdXQgdGhhdCB3aXRoIHRoaXMgc2NoZW1lIHRoZSBVTFAgbWF5 IHNvbWV0aW1lcyBnZXQgYW4NCj4gTVIgd2l0aCBhIHdpZGVyIHNldCBvZiBwZXJtaXNzaW9ucyB0 aGF0IGl0IGFza2VkIGZvciwgYW5kIEknbSBub3Qgc3VyZQ0KPiB0aGF0J3MgYWx3YXlzIHNhZmUu IFBlcmhhcHMgdGhlIFVMUCB3YW50cyB0byBndWFyYW50ZWUgdGhhdCB0aGUgTVINCj4gZG9lc24n dCBhbGxvdyBjZXJ0YWluIGtpbmRzIG9mIGFjY2Vzc2VzIGFuZCBkb2Vzbid0IGV4cGVjdCB0aGUg dmVyYnMNCj4gbGF5ZXIgdG8gY2hhbmdlIHRoYXQuDQoNClRoZSBwcm92aWRlcnMgY2FuIHJldHVy biB0aGUgYWN0dWFsIHBlcm1pc3Npb25zIHRoYXQgd2VyZSBncmFudGVkLiAgSWYgdGhlIGFwcCBj YXJlcywgaXQgY2FuIGNoZWNrLg0K -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message----- > From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-owner@vger.kernel.org] On Behalf Of Haggai Eran > Sent: Sunday, June 28, 2015 10:47 AM > To: Steve Wise; jgunthorpe@obsidianresearch.com > Cc: sagig@mellanox.com; roid@mellanox.com; ogerlitz@mellanox.com; sean.hefty@intel.com; linux-rdma@vger.kernel.org > Subject: Re: [PATCH RFC] RDMA/core: add rdma_get_dma_mr() > > On 26/06/2015 00:29, Steve Wise wrote: > > +enum rdma_mr_roles { > > + RDMA_MRR_RECV = 1, > > + RDMA_MRR_SEND = (1<<1), > > + RDMA_MRR_READ_SOURCE = (1<<2), > > + RDMA_MRR_READ_SINK = (1<<3), > > Maybe it's just me, but it took me a second to figure out which was the > source and which was the sink in RDMA reads. Do you think calling them > initiator and responder/target would be better? Not to me. For an RDMA operation, the "initiator" is the app that issues the read request WR. That app doesn't create what I call the READ_SOURCE MR. Its peer application does. So calling READ_SOURCE something like READ_INITIATOR doesn't make sense to me. That's why I thought SOURCE and SINK were clearer. Perhaps not... I have a new version I'll send out soon that will comment all of these in the enum declaration. Perhaps that will make it clear. > > > + RDMA_MRR_WRITE_SOURCE = (1<<4), > > + RDMA_MRR_WRITE_SINK = (1<<5), > > + RDMA_MRR_ATOMIC = (1<<6), > > + RDMA_MRR_MW_BIND = (1<<7), > > + RDMA_MRR_ZERO_BASED = (1<<8), > > + RDMA_MRR_ACCESS_ON_DEMAND = (1<<9), > > +}; > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jun 29, 2015 at 08:47:55AM -0500, Steve Wise wrote: > > On 26/06/2015 00:29, Steve Wise wrote: > > > +enum rdma_mr_roles { > > > + RDMA_MRR_RECV = 1, > > > + RDMA_MRR_SEND = (1<<1), > > > + RDMA_MRR_READ_SOURCE = (1<<2), > > > + RDMA_MRR_READ_SINK = (1<<3), > > > > Maybe it's just me, but it took me a second to figure out which was the > > source and which was the sink in RDMA reads. Do you think calling them > > initiator and responder/target would be better? > > Not to me. For an RDMA operation, the "initiator" is the app that > issues the read request WR. That app doesn't create what I call the > READ_SOURCE MR. Its peer application does. So calling READ_SOURCE > something like READ_INITIATOR doesn't make sense to me. That's why > I thought SOURCE and SINK were clearer. Perhaps not... I would call SOURCE the RESPONDER.. Initiator/Poster and Responder is closest to the language used in other places in the API for READ. > I have a new version I'll send out soon that will comment all of > these in the enum declaration. Perhaps that will make it clear. I think once you have a comment with the table mapping to allowed local/remote verbs then things will be fine with whatever names you pick. Honestly, the names that IBA picked were fine and descriptive too, the only problem is that the iwarp hardware can't implement the IBA names :( Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > > +enum rdma_mr_roles { > > > > + RDMA_MRR_RECV = 1, > > > > + RDMA_MRR_SEND = (1<<1), > > > > + RDMA_MRR_READ_SOURCE = (1<<2), > > > > + RDMA_MRR_READ_SINK = (1<<3), > > > > > > Maybe it's just me, but it took me a second to figure out which was > the > > > source and which was the sink in RDMA reads. Do you think calling them > > > initiator and responder/target would be better? > > > > Not to me. For an RDMA operation, the "initiator" is the app that > > issues the read request WR. That app doesn't create what I call the > > READ_SOURCE MR. Its peer application does. So calling READ_SOURCE > > something like READ_INITIATOR doesn't make sense to me. That's why > > I thought SOURCE and SINK were clearer. Perhaps not... > > I would call SOURCE the RESPONDER.. > > Initiator/Poster and Responder is closest to the language used in > other places in the API for READ. I agree with Steve. I think of this in terms of copy and like source/dest myself, similar to source/sink. A memory buffer is not a responder or initiator. In hindsight, we probably never should have exposed initiator_depth and responder_resources. Those values have been an endless source of confusion and bugs. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c index bac3fb4..8a365d5 100644 --- a/drivers/infiniband/core/verbs.c +++ b/drivers/infiniband/core/verbs.c @@ -1144,6 +1144,40 @@ struct ib_mr *ib_get_dma_mr(struct ib_pd *pd, int mr_access_flags) } EXPORT_SYMBOL(ib_get_dma_mr); +struct ib_mr *rdma_get_dma_mr(struct ib_pd *pd, int mr_roles) +{ + int access_flags = 0; + struct ib_mr *mr; + int err; + + if (mr_roles & RDMA_MRR_RECV) + access_flags |= IB_ACCESS_LOCAL_WRITE; + + if (mr_roles & RDMA_MRR_WRITE_SINK) + access_flags |= IB_ACCESS_LOCAL_WRITE | IB_ACCESS_REMOTE_WRITE; + + if (mr_roles & RDMA_MRR_READ_SINK) { + access_flags |= IB_ACCESS_LOCAL_WRITE; + if (rdma_protocol_iwarp(pd->device, rdma_start_port(pd->device)) + access_flags |= IB_ACCESS_REMOTE_WRITE; + } + + if (mr_roles & RDMA_MRR_ATOMIC) + access_flags |= IB_ACCESS_LOCAL_WRITE | IB_ACCESS_REMOTE_ATOMIC; + + if (mr_roles & RDMA_MRR_MW_BIND) + access_flags |= IB_ACCESS_MW_BIND: + + if (mr_roles & RDMA_MRR_ZERO_BASED) + access_flags |= IB_ACCESS_ZERO_BASED: + + if (mr_roles & RDMA_MRR_ON_DEMAND) + access_flags |= IB_ACCESS_ON_DEMAND: + + return ib_get_dma_mr(pd, access_flags); +} +EXPORT_SYMBOL(rdma_get_dma_mr); + struct ib_mr *ib_reg_phys_mr(struct ib_pd *pd, struct ib_phys_buf *phys_buf_array, int num_phys_buf, diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 986fddb..feee869 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -2494,7 +2494,43 @@ static inline int ib_req_ncomp_notif(struct ib_cq *cq, int wc_cnt) struct ib_mr *ib_get_dma_mr(struct ib_pd *pd, int mr_access_flags); /** - * ib_dma_mapping_error - check a DMA addr for error + * rdma_mr_roles - possible roles a MR will be used for + * + * This allows a transport independent RDMA application to + * create MRs that are usable for all the desired roles w/o + * having to understand which access rights are needed. + */ +enum rdma_mr_roles { + RDMA_MRR_RECV = 1, + RDMA_MRR_SEND = (1<<1), + RDMA_MRR_READ_SOURCE = (1<<2), + RDMA_MRR_READ_SINK = (1<<3), + RDMA_MRR_WRITE_SOURCE = (1<<4), + RDMA_MRR_WRITE_SINK = (1<<5), + RDMA_MRR_ATOMIC = (1<<6), + RDMA_MRR_MW_BIND = (1<<7), + RDMA_MRR_ZERO_BASED = (1<<8), + RDMA_MRR_ACCESS_ON_DEMAND = (1<<9), +}; + +/** + * rdma_get_dma_mr - Returns a memory region for system memory that is + * usable for DMA. + * @pd: The protection domain associated with the memory region. + * @mr_roles: Specifies the intended roles of the MR + * + * Use the intended roles from @mr_roles along with the device + * capabilities to define the needed access rights, and call + * ib_get_dma_mr() to allocate the MR. + * + * Note that the ib_dma_*() functions defined below must be used + * to create/destroy addresses used with the Lkey or Rkey returned + * by ib_get_dma_mr(). + */ +struct ib_mr *rdma_get_dma_mr(struct ib_pd *pd, int mr_roles); + +/** + * rdma_dma_mapping_error - check a DMA addr for error * @dev: The device for which the dma_addr was created * @dma_addr: The DMA address to check */