IB: Fix incorrect structure packing for booleans

Message ID 20161223010752.GA20676@obsidianresearch.com (mailing list archive)
State Accepted

Commit Message

Jason Gunthorpe Dec. 23, 2016, 1:07 a.m. UTC
The RDMA core uses ib_pack() to convert from unpacked CPU structs
to on-the-wire bitpacked structs.

This process requires that 1 bit fields are declared as u8 in the
unpacked struct, otherwise the packing process does not read the
value properly and the packed result is wired to 0. Several
places wrongly used int.
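
The failure mode above can be sketched in plain C. This is a simplified
model, not the kernel's ib_pack() code: it assumes the packer reads
multi-byte unpacked fields as big-endian and then keeps only the low
bits that fit the wire field. The helper names (read_field, pack_bits,
pack_flag_from_int_le, pack_flag_from_u8) are hypothetical, and the
little-endian CPU is modelled with an explicit byte array so the result
does not depend on the host.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: read an unpacked field of `size` bytes.
 * ASSUMPTION: 4-byte fields are interpreted as big-endian values,
 * 1-byte fields are read directly.
 */
static uint32_t read_field(const unsigned char *p, int size)
{
	switch (size) {
	case 1:
		return p[0];
	case 4:
		return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
		       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
	default:
		return 0;
	}
}

/* Keep only the low `bits` bits, as for a narrow on-the-wire field. */
static uint32_t pack_bits(uint32_t val, int bits)
{
	assert(bits > 0 && bits < 32);
	return val & ((1u << bits) - 1);
}

/* A flag declared as int and set to 1 on a little-endian CPU. */
int pack_flag_from_int_le(void)
{
	const unsigned char as_int_le[4] = { 0x01, 0x00, 0x00, 0x00 };

	/* Read as big-endian this is 0x01000000, so the low bit is 0. */
	return (int)pack_bits(read_field(as_int_le, 4), 1);
}

/* The same flag declared as u8 and set to 1. */
int pack_flag_from_u8(void)
{
	const unsigned char as_u8[1] = { 0x01 };

	return (int)pack_bits(read_field(as_u8, 1), 1);
}
```

Under these assumptions the int-typed flag packs to 0 and the u8-typed
flag packs to 1, which is consistent with the packed result being wired
to 0 for the fields that were declared as int.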

Crucially this means the kernel has never set reversible
correctly in the path record request. It has always asked for
irreversible paths even if the ULP requests otherwise.

When the kernel is used with a SM that supports this feature, it
completely breaks communication management if reversible paths are
not properly requested.

The only reason this ever worked is because opensm ignores the
reversible bit.

Cc: stable@vger.kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
---
 include/rdma/ib_sa.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

For 4.10 please!

Comments

Hal Rosenstock Jan. 3, 2017, 7:16 p.m. UTC | #1
On 12/22/2016 8:07 PM, Jason Gunthorpe wrote:
> Crucially this means the kernel has never, set reversible
> correctly in the path record request. It has always asked for
> irreversible paths even if the ULP requests otherwise.

Issuing path record query with reversible bit set to 0 means that any
path (reversible or not) can be returned by SA. It is not requesting an
irreversible path. In the response, the reversible bit indicates whether
the path is reversible or not.

> When the kernel is used with a SM that supports this feature, it
> completely breaks communication management if reversible paths are
> not properly requested.
> 
> The only reason this ever worked is because opensm ignores the
> reversible bit.

OpenSM supports reversible PR queries. I tested all 3 cases (comp mask
bit not set, comp mask bit set and reversible set to 1, and comp mask
bit set and reversible set to 0) in a single subnet.

If you are referring to multiple (IB routed) subnet case, OpenSM only
has rudimentary support for this and path queries are just one area where
shortcuts were taken to demo primitive IB routers a long time ago so I
would not be surprised if reversible was ignored in that scenario.
Is that what you are referring to? If so, it would be more accurate to
add that into your description rather than a blanket statement that
OpenSM ignores the reversible bit in path queries.

Otherwise, please elaborate/provide more details on the issue you are
seeing with OpenSM.

-- Hal
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe Jan. 3, 2017, 7:46 p.m. UTC | #2
On Tue, Jan 03, 2017 at 02:16:42PM -0500, Hal Rosenstock wrote:
> On 12/22/2016 8:07 PM, Jason Gunthorpe wrote:
> > Crucially this means the kernel has never, set reversible
> > correctly in the path record request. It has always asked for
> > irreversible paths even if the ULP requests otherwise.
> 
> Issuing path record query with reversible bit set to 0 means that any
> path (reversible or not) can be returned by SA. It is not requesting an
> irreversible path. In the response, the reversible bit indicates whether
> the path is reversible or not.

'asked for' == 'indicated it would accept'.

> > When the kernel is used with a SM that supports this feature, it
> > completely breaks communication management if reversible paths are
> > not properly requested.
> > 
> > The only reason this ever worked is because opensm ignores the
> > reversible bit.
> 
> OpenSM supports reversible PR queries. I tested all 3 cases (comp mask
> bit not set, comp mask bit set and reversible set to 1, and comp mask
> bit set and reversible set to 0) in a single subnet.

The purpose of the statement is to elaborate why the bug has gone
undetected for so long.

opensm ignores the reversible bit in the sense that opensm *only*
supports reversible paths internally and every PR response it
generates has a reversible DLID/SLID/SL tuple, no matter what the
reversible response bit might claim. (AFAIK this outcome is baked into
the lid routers)

If this was not the case then this bug would have been discovered long
ago because the kernel RDMA CM simply *does not work* if a PR response
contains an irreversible DLID/SLID/SL tuple.

> Otherwise, please elaborate/provide more details on the issue you are
> seeing with OpenSM.

There is no issue with OpenSM. It simply does not make use of an
optional spec feature in a way that would expose this kernel bug.

Jason
Doug Ledford Jan. 12, 2017, 5:10 p.m. UTC | #3
On Thu, 2016-12-22 at 18:07 -0700, Jason Gunthorpe wrote:
> The RDMA core uses ib_pack() to convert from unpacked CPU structs
> to on-the-wire bitpacked structs.
> 
> This process requires that 1 bit fields are declared as u8 in the
> unpacked struct, otherwise the packing process does not read the
> value properly and the packed result is wired to 0. Several
> places wrongly used int.
> 
> Crucially this means the kernel has never, set reversible
> correctly in the path record request. It has always asked for
> irreversible paths even if the ULP requests otherwise.
> 
> When the kernel is used with a SM that supports this feature, it
> completely breaks communication management if reversible paths are
> not properly requested.
> 
> The only reason this ever worked is because opensm ignores the
> reversible bit.
> 
> Cc: stable@vger.kernel.org
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>

Thanks, applied.

Patch

diff --git a/include/rdma/ib_sa.h b/include/rdma/ib_sa.h
index 5ee7aab95eb849..fd0e53219f93e5 100644
--- a/include/rdma/ib_sa.h
+++ b/include/rdma/ib_sa.h
@@ -153,12 +153,12 @@  struct ib_sa_path_rec {
 	union ib_gid sgid;
 	__be16       dlid;
 	__be16       slid;
-	int          raw_traffic;
+	u8           raw_traffic;
 	/* reserved */
 	__be32       flow_label;
 	u8           hop_limit;
 	u8           traffic_class;
-	int          reversible;
+	u8           reversible;
 	u8           numb_path;
 	__be16       pkey;
 	__be16       qos_class;
@@ -220,7 +220,7 @@  struct ib_sa_mcmember_rec {
 	u8           hop_limit;
 	u8           scope;
 	u8           join_state;
-	int          proxy_join;
+	u8           proxy_join;
 };
 
 /* Service Record Component Mask Sec 15.2.5.14 Ver 1.1	*/