Message ID | 20161223010752.GA20676@obsidianresearch.com (mailing list archive) |
---|---|
State | Accepted |
Headers | show |
On 12/22/2016 8:07 PM, Jason Gunthorpe wrote: > Crucially this means the kernel has never, set reversible > correctly in the path record request. It has always asked for > irreversible paths even if the ULP requests otherwise. Issuing path record query with reversible bit set to 0 means that any path (reversible or not) can be returned by SA. It is not requesting an irreversible path. In the response, the reversible bit indicates whether the path is reversible or not. > When the kernel is used with a SM that supports this feature, it > completely breaks communication management if reversible paths are > not properly requested. > > The only reason this ever worked is because opensm ignores the > reversible bit. OpenSM supports reversible PR queries. I tested all 3 cases (comp mask bit not set, comp mask bit set and reversible set to 1, and comp mask bit set and reversible set to 0) in a single subnet. If you are referring to multiple (IB routed) subnet case, OpenSM only has rudimentary support for this and path queries is just one area where shortcuts were done to demo primitive IB routers a long time ago so I would not be surprised if reversible was ignored in that scenario. Is that what you are referring to ? If so, it would be more accurate to add that into your description rather than a blanket statement that OpenSM ignores the reversible bit in path queries. Otherwise, please elaborate/provide more details on the issue you are seeing with OpenSM. -- Hal -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jan 03, 2017 at 02:16:42PM -0500, Hal Rosenstock wrote: > On 12/22/2016 8:07 PM, Jason Gunthorpe wrote: > > Crucially this means the kernel has never, set reversible > > correctly in the path record request. It has always asked for > > irreversible paths even if the ULP requests otherwise. > > Issuing path record query with reversible bit set to 0 means that any > path (reversible or not) can be returned by SA. It is not requesting an > irreversible path. In the response, the reversible bit indicates whether > the path is reversible or not. 'asked for' == 'indicated it would accept'. > > When the kernel is used with a SM that supports this feature, it > > completely breaks communication management if reversible paths are > > not properly requested. > > > > The only reason this ever worked is because opensm ignores the > > reversible bit. > > OpenSM supports reversible PR queries. I tested all 3 cases (comp mask > bit not set, comp mask bit set and reversible set to 1, and comp mask > bit set and reversible set to 0) in a single subnet. The purpose of the statement to elaborate why the bug has gone undetected for so long. opensm ignores the reversible bit in the sense that opensm *only* supports reversible paths internally and every PR response it generates has a reversible DLID/SLID/SL tuple, no matter what the reversible response bit might claim. (AFAIK this outcome is baked into the lid routers) If this was not the case then this bug would have been discovered long ago because the kernel RDMA CM simply *does not work* if a PR response contains an irreversible DLID/SLID/SL tuple. > Otherwise, please elaborate/provide more details on the issue you are > seeing with OpenSM. There is no issue with OpenSM. It simply does not make use of an optional spec feature in way that would expose this kernel bug. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2016-12-22 at 18:07 -0700, Jason Gunthorpe wrote: > The RDMA core uses ib_pack() to convert from unpacked CPU structs > to on-the-wire bitpacked structs. > > This process requires that 1 bit fields are declared as u8 in the > unpacked struct, otherwise the packing process does not read the > value properly and the packed result is wired to 0. Several > places wrongly used int. > > Crucially this means the kernel has never, set reversible > correctly in the path record request. It has always asked for > irreversible paths even if the ULP requests otherwise. > > When the kernel is used with a SM that supports this feature, it > completely breaks communication management if reversible paths are > not properly requested. > > The only reason this ever worked is because opensm ignores the > reversible bit. > > Cc: stable@vger.kernel.org > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") > Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Thanks, applied.
diff --git a/include/rdma/ib_sa.h b/include/rdma/ib_sa.h index 5ee7aab95eb849..fd0e53219f93e5 100644 --- a/include/rdma/ib_sa.h +++ b/include/rdma/ib_sa.h @@ -153,12 +153,12 @@ struct ib_sa_path_rec { union ib_gid sgid; __be16 dlid; __be16 slid; - int raw_traffic; + u8 raw_traffic; /* reserved */ __be32 flow_label; u8 hop_limit; u8 traffic_class; - int reversible; + u8 reversible; u8 numb_path; __be16 pkey; __be16 qos_class; @@ -220,7 +220,7 @@ struct ib_sa_mcmember_rec { u8 hop_limit; u8 scope; u8 join_state; - int proxy_join; + u8 proxy_join; }; /* Service Record Component Mask Sec 15.2.5.14 Ver 1.1 */
The RDMA core uses ib_pack() to convert from unpacked CPU structs to on-the-wire bitpacked structs. This process requires that 1 bit fields are declared as u8 in the unpacked struct, otherwise the packing process does not read the value properly and the packed result is wired to 0. Several places wrongly used int. Crucially this means the kernel has never, set reversible correctly in the path record request. It has always asked for irreversible paths even if the ULP requests otherwise. When the kernel is used with a SM that supports this feature, it completely breaks communication management if reversible paths are not properly requested. The only reason this ever worked is because opensm ignores the reversible bit. Cc: stable@vger.kernel.org Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> --- include/rdma/ib_sa.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) For 4.10 please!