Message ID | 1434042225-4975-2-git-send-email-kaike.wan@intel.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
On Thu, Jun 11, 2015 at 01:03:42PM -0400, kaike.wan@intel.com wrote: > From: Kaike Wan <kaike.wan@intel.com> > > This patch adds netlink defines for SA client, local service group, local > service operations, and related attributes. Why are you sending this again without responding the comments I provided on this patch? Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> From: Jason Gunthorpe [mailto:jgunthorpe@obsidianresearch.com] > Sent: Thursday, June 11, 2015 1:58 PM > To: Wan, Kaike > Cc: linux-rdma@vger.kernel.org; Fleck, John; Weiny, Ira > Subject: Re: [PATCH v5 1/4] IB/netlink: Add defines for local service requests > through netlink > > On Thu, Jun 11, 2015 at 01:03:42PM -0400, kaike.wan@intel.com wrote: > > From: Kaike Wan <kaike.wan@intel.com> > > > > This patch adds netlink defines for SA client, local service group, > > local service operations, and related attributes. > > Why are you sending this again without responding the comments I provided > on this patch? Are you referring to the mandatory/option attributes comments? Kaike -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jun 11, 2015 at 06:32:44PM +0000, Wan, Kaike wrote: > > Why are you sending this again without responding the comments I provided > > on this patch? > > Are you referring to the mandatory/option attributes comments? Hum weird, I typed in a message yesterday but it seems to have evaporated. I will try again. Sorry. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jun 11, 2015 at 06:32:44PM +0000, Wan, Kaike wrote: > > From: Jason Gunthorpe [mailto:jgunthorpe@obsidianresearch.com] > > Sent: Thursday, June 11, 2015 1:58 PM > > To: Wan, Kaike > > Cc: linux-rdma@vger.kernel.org; Fleck, John; Weiny, Ira > > Subject: Re: [PATCH v5 1/4] IB/netlink: Add defines for local service requests > > through netlink > > > > On Thu, Jun 11, 2015 at 01:03:42PM -0400, kaike.wan@intel.com wrote: > > > From: Kaike Wan <kaike.wan@intel.com> > > > > > > This patch adds netlink defines for SA client, local service group, > > > local service operations, and related attributes. > + LS_NLA_TYPE_REVERSIBLE, > + LS_NLA_TYPE_NUMB_PATH, These need to be combined. It does not make sense to request numb_paths when the API we have is designed to return APM data. Recommend a 'QP Type' with options of: - RC QP: Return up to the full 6 path tuple with full APM data - UD QP: Return a single non-reversible path - GMP QP: Return up to two reversible GMP paths. If it is hard to get accurate information then you can sketch this in and infer RC QP and GMP QP by the request having the reversible set bit. But ideally RDMA CM and SRP would request RC QP, IPoIB would request UD QP. > +/* Local Service ServiceID attribute */ > +struct rdma_nla_ls_service_id { > + __be64 service_id; > +}; Do not expose BE to userspace, everything should be in cpu order. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> From: Jason Gunthorpe [mailto:jgunthorpe@obsidianresearch.com] > Sent: Thursday, June 11, 2015 2:52 PM > To: Wan, Kaike > Cc: linux-rdma@vger.kernel.org; Fleck, John; Weiny, Ira > Subject: Re: [PATCH v5 1/4] IB/netlink: Add defines for local service requests > through netlink > > On Thu, Jun 11, 2015 at 06:32:44PM +0000, Wan, Kaike wrote: > > > From: Jason Gunthorpe [mailto:jgunthorpe@obsidianresearch.com] > > > Sent: Thursday, June 11, 2015 1:58 PM > > > To: Wan, Kaike > > > Cc: linux-rdma@vger.kernel.org; Fleck, John; Weiny, Ira > > > Subject: Re: [PATCH v5 1/4] IB/netlink: Add defines for local > > > service requests through netlink > > > > > > On Thu, Jun 11, 2015 at 01:03:42PM -0400, kaike.wan@intel.com wrote: > > > > From: Kaike Wan <kaike.wan@intel.com> > > > > > > > > This patch adds netlink defines for SA client, local service > > > > group, local service operations, and related attributes. > > > + LS_NLA_TYPE_REVERSIBLE, > > + LS_NLA_TYPE_NUMB_PATH, > > These need to be combined. It does not make sense to request numb_paths > when the API we have is designed to return APM data. > > Recommend a 'QP Type' with options of: > - RC QP: Return up to the full 6 path tuple with full APM data > - UD QP: Return a single non-reversible path > - GMP QP: Return up to two reversible GMP paths. > > If it is hard to get accurate information then you can sketch this in and infer > RC QP and GMP QP by the request having the reversible set bit. But ideally > RDMA CM and SRP would request RC QP, IPoIB would request UD QP. It's quite indirect here for this implementation. (ib_sa_path_rec_get()) Caller reversible numb_path QP Type -------------------------------------------------------------------------- cma 1 1 GMP ipoib Not set 1 (?) UD srp Not set 1 UD In addition, on the user side, we need to convert QP type back into reversible_numpath again, which may not interpret the original request correctly. Why should we do all the translation? Is QP type definition already exported to user space? I see IBV_QPT_RC/UC/UD/RAW_PACKET/XRC_SEND/XRC_RECV, but not GMP in verbs.h. > > > +/* Local Service ServiceID attribute */ struct rdma_nla_ls_service_id > > +{ > > + __be64 service_id; > > +}; > > Do not expose BE to userspace, everything should be in cpu order. If we use cpu order, we need to do two conversions: from BE to cpu order in kernel and from cpu order to BE in user space. Struct ib_user_path_rec contains many __be32 fields. > > Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jun 11, 2015 at 07:21:11PM +0000, Wan, Kaike wrote: > > Recommend a 'QP Type' with options of: > > - RC QP: Return up to the full 6 path tuple with full APM data > > - UD QP: Return a single non-reversible path > > - GMP QP: Return up to two reversible GMP paths. > > > > If it is hard to get accurate information then you can sketch this in and infer > > RC QP and GMP QP by the request having the reversible set bit. But ideally > > RDMA CM and SRP would request RC QP, IPoIB would request UD QP. > > It's quite indirect here for this implementation. (ib_sa_path_rec_get()) > > Caller reversible numb_path QP Type > cma 1 1 GMP > ipoib Not set 1 (?) UD > srp Not set 1 UD CMA (mostly) and SRP use RC paths. IPoIB uses UD and RC. The IPoIB path record query looks wrong for RC mode because it does not request reversible, that should be fixed independently. The indirection is because you don't want to modify the callers to provide their additional information, which is probably a mistake. Add a netlink fill callback and do it there? > In addition, on the user side, we need to convert QP type back into > reversible_numpath again, which may not interpret the original > request correctly. Why should we do all the translation? The goal is to carry enough context information so that user space can do something smart. We have alot more information available to drive policy than is present in the simple IBA Path Record MAD. In this case, the kernel does not actually care about numb paths or reversible, except in the context of how it intends to use the path(s). For instance, having CM ask for reversible makes no sense, since in the general case it wants the 3 path tuple of GMP(reversible=1),FORWARD(reversible=0),RETURN(reversible=0). Similarly, it doesn't make sense to ask for numbPath=1 because in the general case it wants the 6 path APM set. Being more specific here allows userspace to implement more detailed policy beyond what the IB Path Record scheme allows. And for that we need the context information from the requestor. acm won't use it, it will always request numbPaths=1 and derive the reversible bit from this netlink field. (or always set reversible=1) > Is QP type definition already exported to user space? QP Type could be a bad name, open to ideas > I see IBV_QPT_RC/UC/UD/RAW_PACKET/XRC_SEND/XRC_RECV, but not GMP in verbs.h. GMP is a special case of UD. > > > +/* Local Service ServiceID attribute */ struct rdma_nla_ls_service_id > > > +{ > > > + __be64 service_id; > > > +}; > > > > Do not expose BE to userspace, everything should be in cpu order. > > If we use cpu order, we need to do two conversions: from BE to cpu > order in kernel and from cpu order to BE in user space. Struct > ib_user_path_rec contains many __be32 fields. I don't see a problem with the extra conversion. IIRC ib_user_path_rec is an image of the SA Path Record Reply MAD, so it should be byte for byte identical, which is why it has BE elements, that is not to be encouraged generally.. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> From: Jason Gunthorpe [mailto:jgunthorpe@obsidianresearch.com] > Sent: Thursday, June 11, 2015 3:52 PM > To: Wan, Kaike > Cc: linux-rdma@vger.kernel.org; Fleck, John; Weiny, Ira > Subject: Re: [PATCH v5 1/4] IB/netlink: Add defines for local service requests > through netlink > > On Thu, Jun 11, 2015 at 07:21:11PM +0000, Wan, Kaike wrote: > > > Recommend a 'QP Type' with options of: > > > - RC QP: Return up to the full 6 path tuple with full APM data > > > - UD QP: Return a single non-reversible path > > > - GMP QP: Return up to two reversible GMP paths. > > > > > > If it is hard to get accurate information then you can sketch this > > > in and infer RC QP and GMP QP by the request having the reversible > > > set bit. But ideally RDMA CM and SRP would request RC QP, IPoIB would > request UD QP. > > > > It's quite indirect here for this implementation. > > (ib_sa_path_rec_get()) > > > > Caller reversible numb_path QP Type > > cma 1 1 GMP > > ipoib Not set 1 (?) UD > > srp Not set 1 UD > > CMA (mostly) and SRP use RC paths. IPoIB uses UD and RC. The IPoIB path > record query looks wrong for RC mode because it does not request reversible, > that should be fixed independently. > > The indirection is because you don't want to modify the callers to provide > their additional information, which is probably a mistake. Add a netlink fill > callback and do it there? Apparently, to achieve this goal, we need the following changes: 1. Caller (cma, ipoib, srp): - Add a new callback to handle multiple pathrecords returned in the response. - Change the calling function to supply additional context info. The problem is that currently only one copy of pathrecord is required for these callers and we need to find the use of other pathrecords in these callers. Otherwise, we simply are overdesigning. 2. ib_sa: - Add a function to take additional context info; - Code this addition context into new attributes. - New callback function to return multiple pathrecords to the caller. 3. User space application: Apparently, ibacm can only return one copy of pathrecord and we need to find something different to test the code. And the current implementation of ibacm can't make use of the supplied context info (QP type or whatever). I think that these changes run well beyond the original goal of this patch series to speed up the pathrecord query performance and may warranty a subsequent patch series. For this patch series, how about limiting the change to combining reversible and numb_path into reversible_numb_path attribute? > > > In addition, on the user side, we need to convert QP type back into > > reversible_numpath again, which may not interpret the original request > > correctly. Why should we do all the translation? > > The goal is to carry enough context information so that user space can do > something smart. We have alot more information available to drive policy > than is present in the simple IBA Path Record MAD. > > In this case, the kernel does not actually care about numb paths or reversible, > except in the context of how it intends to use the path(s). > > For instance, having CM ask for reversible makes no sense, since in the > general case it wants the 3 path tuple of > GMP(reversible=1),FORWARD(reversible=0),RETURN(reversible=0). Similarly, > it doesn't make sense to ask for numbPath=1 because in the general case it > wants the 6 path APM set. > > Being more specific here allows userspace to implement more detailed policy > beyond what the IB Path Record scheme allows. And for that we need the > context information from the requestor. > > acm won't use it, it will always request numbPaths=1 and derive the > reversible bit from this netlink field. (or always set reversible=1) > > > Is QP type definition already exported to user space? > > QP Type could be a bad name, open to ideas > > > I see IBV_QPT_RC/UC/UD/RAW_PACKET/XRC_SEND/XRC_RECV, but not > GMP in verbs.h. > > GMP is a special case of UD. > > > > > +/* Local Service ServiceID attribute */ struct > > > > +rdma_nla_ls_service_id { > > > > + __be64 service_id; > > > > +}; > > > > > > Do not expose BE to userspace, everything should be in cpu order. > > > > If we use cpu order, we need to do two conversions: from BE to cpu > > order in kernel and from cpu order to BE in user space. Struct > > ib_user_path_rec contains many __be32 fields. > > I don't see a problem with the extra conversion. That's fine. I can do the conversion on both sides. In the future, if the callers do not do the conversion, there will be only one conversion on the user space side: caller (cpu byte order) ->ib_sa (cpu byte order) ->user space (cpu byte order ->network byte order). Kaike -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > Is QP type definition already exported to user space? > > QP Type could be a bad name, open to ideas I think the name "QP Type" is bad... I want to say "transport type" but I don't think that really communicates what we want here. > > > I see IBV_QPT_RC/UC/UD/RAW_PACKET/XRC_SEND/XRC_RECV, but not > GMP in verbs.h. > > GMP is a special case of UD. I think this is where QP Type (or "Transport type") are not equivalent to if a path needs to be reversible or not. To properly derive the reversibility requires more than the transport. There is nothing which precludes me from requesting a reversible path for any QP Type. This is really a "protocol" attribute and it seems that each protocol could potentially want a reversible Path Record independent of the QP or "Transport Type". > > > > > +/* Local Service ServiceID attribute */ struct > > > > +rdma_nla_ls_service_id { > > > > + __be64 service_id; > > > > +}; > > > > > > Do not expose BE to userspace, everything should be in cpu order. > > > > If we use cpu order, we need to do two conversions: from BE to cpu > > order in kernel and from cpu order to BE in user space. Struct > > ib_user_path_rec contains many __be32 fields. > > I don't see a problem with the extra conversion. Neither do I. All user space values should be in host order. Ira -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jun 11, 2015 at 11:45:46PM +0000, Wan, Kaike wrote:
> Apparently, to achieve this goal, we need the following changes:
I don't expect anything more than a proper hole in the UAPI to make
this work.
I already have patches that implement full kernel support for the
6-path format, and globally enable APM. With this netlink interface it
becomes feasible to mainline them. That will require some reworking of
what you've done, but I don't expect you to do that.
The only remaining thing we need here is that tiny bit of policy
information for the kernel to specify what the intended use of the
path is:
- For connected CM operation (6 path records)
- For unidirectional UD (1 path record)
- For 'misc' GMP like operation (at least 1 reversible path record)
I'm happy if you call it PATH_USE_XX
Testing the reversible bit should be enough to rough this in, if
not reversible it is PATH_USE_UNIDIRECTIONAL_UD, otherwise it is
PATH_USE_GMP.
Recommend that userspace treat any unknown usage type as GMP (it is
the most general).
Ira: Do you like 'path use' better as a name? I think I do..
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
> From: Jason Gunthorpe [mailto:jgunthorpe@obsidianresearch.com] > Sent: Thursday, June 11, 2015 8:22 PM > To: Wan, Kaike > Cc: linux-rdma@vger.kernel.org; Fleck, John; Weiny, Ira > Subject: Re: [PATCH v5 1/4] IB/netlink: Add defines for local service requests > through netlink > > On Thu, Jun 11, 2015 at 11:45:46PM +0000, Wan, Kaike wrote: > > Apparently, to achieve this goal, we need the following changes: > > I don't expect anything more than a proper hole in the UAPI to make this > work. > > I already have patches that implement full kernel support for the 6-path > format, and globally enable APM. With this netlink interface it becomes > feasible to mainline them. That will require some reworking of what you've > done, but I don't expect you to do that. > > The only remaining thing we need here is that tiny bit of policy information > for the kernel to specify what the intended use of the path is: > - For connected CM operation (6 path records) > - For unidirectional UD (1 path record) > - For 'misc' GMP like operation (at least 1 reversible path record) > > I'm happy if you call it PATH_USE_XX > > Testing the reversible bit should be enough to rough this in, if not reversible it > is PATH_USE_UNIDIRECTIONAL_UD, otherwise it is PATH_USE_GMP. > > Recommend that userspace treat any unknown usage type as GMP (it is the > most general). That would be cool. If we add this attribute, should we remove reversible and numb_path attributes? Kaike > > Ira: Do you like 'path use' better as a name? I think I do.. > > Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > From: Jason Gunthorpe [mailto:jgunthorpe@obsidianresearch.com] > > Sent: Thursday, June 11, 2015 8:22 PM > > To: Wan, Kaike > > Cc: linux-rdma@vger.kernel.org; Fleck, John; Weiny, Ira > > Subject: Re: [PATCH v5 1/4] IB/netlink: Add defines for local service > > requests through netlink > > > > On Thu, Jun 11, 2015 at 11:45:46PM +0000, Wan, Kaike wrote: > > > Apparently, to achieve this goal, we need the following changes: > > > > I don't expect anything more than a proper hole in the UAPI to make > > this work. > > > > I already have patches that implement full kernel support for the > > 6-path format, and globally enable APM. With this netlink interface it > > becomes feasible to mainline them. That will require some reworking of > > what you've done, but I don't expect you to do that. > > > > The only remaining thing we need here is that tiny bit of policy > > information for the kernel to specify what the intended use of the path is: > > - For connected CM operation (6 path records) > > - For unidirectional UD (1 path record) > > - For 'misc' GMP like operation (at least 1 reversible path record) > > > > I'm happy if you call it PATH_USE_XX > > > > Testing the reversible bit should be enough to rough this in, if not > > reversible it is PATH_USE_UNIDIRECTIONAL_UD, otherwise it is > PATH_USE_GMP. > > PATH_USE_* is a better name. But I think the defines should be: PATH_USE_ANY (or ALL?) PATH_USE_UNIDIRECTIONAL PATH_USE_GMP > > Recommend that userspace treat any unknown usage type as GMP (it is > > the > > > most general). Why not treat unknown type as "ANY/ALL" and return all 6 if possible. If not possible/supported (ie like ibacm) return GMP which is always valid. > > That would be cool. If we add this attribute, should we remove reversible and > numb_path attributes? Yes. > > Kaike > > > > > Ira: Do you like 'path use' better as a name? I think I do.. Yea that is a much better name. Thanks, Ira -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jun 12, 2015 at 05:24:32AM +0000, Weiny, Ira wrote: > > PATH_USE_* is a better name. But I think the defines should be: > > PATH_USE_ANY (or ALL?) How about PATH_USE_CONNECTED? > Why not treat unknown type as "ANY/ALL" and return all 6 if > possible. If not possible/supported (ie like ibacm) return GMP > which is always valid. Make sense to me.. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> > On Fri, Jun 12, 2015 at 05:24:32AM +0000, Weiny, Ira wrote: > > > > PATH_USE_* is a better name. But I think the defines should be: > > > > PATH_USE_ANY (or ALL?) > > How about PATH_USE_CONNECTED? I don't want to bikeshed this too much as I'm not really good on names but... How does "Connected" relate to wanting all available Path options? Can't a connected transport use a reversible "GMP" path? I guess that is where I am getting hung up. Ira > > > Why not treat unknown type as "ANY/ALL" and return all 6 if possible. > > If not possible/supported (ie like ibacm) return GMP which is always > > valid. > > Make sense to me.. > > Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/uapi/rdma/rdma_netlink.h b/include/uapi/rdma/rdma_netlink.h index 6e4bb42..6aa9281 100644 --- a/include/uapi/rdma/rdma_netlink.h +++ b/include/uapi/rdma/rdma_netlink.h @@ -7,12 +7,14 @@ enum { RDMA_NL_RDMA_CM = 1, RDMA_NL_NES, RDMA_NL_C4IW, + RDMA_NL_SA, RDMA_NL_NUM_CLIENTS }; enum { RDMA_NL_GROUP_CM = 1, RDMA_NL_GROUP_IWPM, + RDMA_NL_GROUP_LS, RDMA_NL_NUM_GROUPS }; @@ -128,5 +130,85 @@ enum { IWPM_NLA_ERR_MAX }; +/* Local service group opcodes */ +enum { + RDMA_NL_LS_OP_RESOLVE = 0, + RDMA_NL_LS_OP_SET_TIMEOUT, + RDMA_NL_LS_NUM_OPS +}; + +/* Local service netlink message flags */ +#define RDMA_NL_LS_F_OK 0x0100 /* Success response */ +#define RDMA_NL_LS_F_ERR 0x0200 /* Failed response */ + +/* Local service attribute type */ +enum { + LS_NLA_TYPE_STATUS = 0, + LS_NLA_TYPE_PATH_RECORD, + LS_NLA_TYPE_TIMEOUT, + LS_NLA_TYPE_SERVICE_ID, + LS_NLA_TYPE_DGID, + LS_NLA_TYPE_SGID, + LS_NLA_TYPE_TCLASS, + LS_NLA_TYPE_REVERSIBLE, + LS_NLA_TYPE_NUMB_PATH, + LS_NLA_TYPE_PKEY, + LS_NLA_TYPE_QOS_CLASS, + LS_NLA_TYPE_MAX +}; + +/* Local service status attribute */ +enum { + LS_NLA_STATUS_SUCCESS = 0, + LS_NLA_STATUS_EINVAL, + LS_NLA_STATUS_ENODATA, + LS_NLA_STATUS_MAX +}; + +struct rdma_nla_ls_status { + __u32 status; +}; + +/* Local service pathrecord attribute: struct ib_path_rec_data */ + +/* Local service timeout attribute */ +struct rdma_nla_ls_timeout { + __u32 timeout; +}; + +/* Local Service ServiceID attribute */ +struct rdma_nla_ls_service_id { + __be64 service_id; +}; + +/* Local Service DGID/SGID attribute: big endian */ +struct rdma_nla_ls_gid { + __u8 gid[16]; +}; + +/* Local Service Traffic Class attribute */ +struct rdma_nla_ls_tclass { + __u8 tclass; +}; + +/* Local Service Reversible attribute */ +struct rdma_nla_ls_reversible { + __u32 reversible; +}; + +/* Local Service numb_path attribute */ +struct rdma_nla_ls_numb_path { + __u8 numb_path; +}; + +/* Local Service Pkey attribute*/ +struct rdma_nla_ls_pkey { + __be16 pkey; +}; + +/* Local Service Qos Class attribute */ +struct rdma_nla_ls_qos_class { + __be16 qos_class; +}; #endif /* _UAPI_RDMA_NETLINK_H */