diff mbox series

RDMA/rxe: Fix parameter errors

Message ID 20230119180506.5197-1-rpearsonhpe@gmail.com (mailing list archive)
State Changes Requested

Commit Message

Bob Pearson Jan. 19, 2023, 6:05 p.m. UTC
Correct errors in rxe_param.h caused by extending the range of
indices for MRs allowing it to overlap the range for MWs. Since
the driver determines whether an rkey is for an MR or MW by comparing
the index part of the rkey with these ranges this can cause an
MR to be incorrectly determined to be an MW.

Additionally the parameters which determine the size of the index
ranges for MR, MW, QP and SRQ are incorrect since the actual
number of integers in the range [min, max] is (max - min + 1) not
(max - min).

This patch corrects these errors.

Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_param.h | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)
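
The two errors described in the commit message can be illustrated with a small sketch. This is not code from the driver: the helper names `sketch_index_count`, `sketch_ranges_overlap` and `sketch_index_is_mw` are hypothetical and exist only to make the arithmetic concrete.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Number of integers in the inclusive range [min, max].
 * The "+ 1" is the term the commit message says was missing.
 */
static inline uint32_t sketch_index_count(uint32_t min, uint32_t max)
{
	assert(max >= min);
	return max - min + 1;
}

/* True if the inclusive ranges [a_min, a_max] and [b_min, b_max] share an index. */
static inline bool sketch_ranges_overlap(uint32_t a_min, uint32_t a_max,
					 uint32_t b_min, uint32_t b_max)
{
	return a_min <= b_max && b_min <= a_max;
}

/*
 * Hypothetical classification by range: an index is treated as an MW
 * whenever it falls inside [mw_min, mw_max]. If the MR range overlaps
 * this range, an MR index can be misclassified as an MW.
 */
static inline bool sketch_index_is_mw(uint32_t index,
				      uint32_t mw_min, uint32_t mw_max)
{
	return index >= mw_min && index <= mw_max;
}
```

With the pre-patch values from the diff, an MR range of [0x00000001, DEFAULT_MAX_VALUE] and an MW range of [0x00010001, 0x00020000], `sketch_ranges_overlap()` returns true whenever DEFAULT_MAX_VALUE is at least 0x00010001, which is the overlap the patch removes.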

Comments

Jason Gunthorpe Jan. 19, 2023, 7:18 p.m. UTC | #1
On Thu, Jan 19, 2023 at 12:05:07PM -0600, Bob Pearson wrote:
> Correct errors in rxe_param.h caused by extending the range of
> indices for MRs allowing it to overlap the range for MWs. Since
> the driver determines whether an rkey is for an MR or MW by comparing
> the index part of the rkey with these ranges this can cause an
> MR to be incorrectly determined to be an MW.
> 
> Additionally the parameters which determine the size of the index
> ranges for MR, MW, QP and SRQ are incorrect since the actual
> number of integers in the range [min, max] is (max - min + 1) not
> (max - min).
> 
> This patch corrects these errors.
> 
> Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_param.h | 27 +++++++++++++++++++--------
>  1 file changed, 19 insertions(+), 8 deletions(-)

This

commit 1aefe5c177c1922119afb4ee443ddd6ac3140b37
Author: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Date:   Tue Dec 20 17:08:48 2022 +0900

    RDMA/rxe: Prevent faulty rkey generation
    
    If you create MRs more than 0x10000 times after loading the module,
    responder starts to reply NAKs for RDMA/Atomic operations because of rkey
    violation detected in check_rkey(). The root cause is that rkeys are
    incremented each time a new MR is created and the value overflows into the
    range reserved for MWs.
    
    This commit also increases the value of RXE_MAX_MW that has been limited
    unlike other parameters.
    
    Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
    Link: https://lore.kernel.org/r/20221220080848.253785-2-matsuda-daisuke@fujitsu.com
    Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
    Tested-by: Li Zhijian <lizhijian@fujitsu.com>
    Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


Is already in v6.2-rc and conflicts with this patch, it looks like it
is doing the same thing, can you sort it out please?

Thanks,
Jason
Bob Pearson Jan. 19, 2023, 8:18 p.m. UTC | #2
On 1/19/23 13:18, Jason Gunthorpe wrote:
> On Thu, Jan 19, 2023 at 12:05:07PM -0600, Bob Pearson wrote:
>> Correct errors in rxe_param.h caused by extending the range of
>> indices for MRs allowing it to overlap the range for MWs. Since
>> the driver determines whether an rkey is for an MR or MW by comparing
>> the index part of the rkey with these ranges this can cause an
>> MR to be incorrectly determined to be an MW.
>>
>> Additionally the parameters which determine the size of the index
>> ranges for MR, MW, QP and SRQ are incorrect since the actual
>> number of integers in the range [min, max] is (max - min + 1) not
>> (max - min).
>>
>> This patch corrects these errors.
>>
>> Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
>> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
>> ---
>>  drivers/infiniband/sw/rxe/rxe_param.h | 27 +++++++++++++++++++--------
>>  1 file changed, 19 insertions(+), 8 deletions(-)
> 
> This
> 
> commit 1aefe5c177c1922119afb4ee443ddd6ac3140b37
> Author: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
> Date:   Tue Dec 20 17:08:48 2022 +0900
> 
>     RDMA/rxe: Prevent faulty rkey generation
>     
>     If you create MRs more than 0x10000 times after loading the module,
>     responder starts to reply NAKs for RDMA/Atomic operations because of rkey
>     violation detected in check_rkey(). The root cause is that rkeys are
>     incremented each time a new MR is created and the value overflows into the
>     range reserved for MWs.
>     
>     This commit also increases the value of RXE_MAX_MW that has been limited
>     unlike other parameters.
>     
>     Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
>     Link: https://lore.kernel.org/r/20221220080848.253785-2-matsuda-daisuke@fujitsu.com
>     Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
>     Tested-by: Li Zhijian <lizhijian@fujitsu.com>
>     Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
>     Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> 
> 
> Is already in v6.2-rc and conflicts with this patch, it looks like it
> is doing the same thing, can you sort it out please?
> 
> Thanks,
> Jason

I missed that one. Yes, they are basically identical, except that he cut the range in half and gave one half each to MRs and MWs, whereas I doubled it. The other change I made still fixes a real bug, but a much less important one: the driver reports an incorrect max_xxx number in the HCA attributes, which has no ill effect.
We can leave it the way it is for now.

Bob
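
The difference between the two fixes, as described in the reply above, might be sketched as follows. The exact constants in the merged commit are not shown in this thread, so the struct and both helper functions are hypothetical and only model "cut the range in half" versus "doubled it" for a given total.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inclusive index ranges for MRs and MWs (hypothetical type). */
struct sketch_layout {
	uint32_t mr_min, mr_max;
	uint32_t mw_min, mw_max;
};

/* Split one range [1, total] in half: lower half for MRs, upper half for MWs. */
static inline struct sketch_layout sketch_layout_split(uint32_t total)
{
	assert(total >= 2);
	struct sketch_layout l = {
		.mr_min = 1,
		.mr_max = total / 2,
		.mw_min = total / 2 + 1,
		.mw_max = total,
	};
	return l;
}

/* Double the range: MRs keep [1, total], MWs get [total + 1, 2 * total]. */
static inline struct sketch_layout sketch_layout_doubled(uint32_t total)
{
	assert(total >= 1 && total <= UINT32_MAX / 2);
	struct sketch_layout l = {
		.mr_min = 1,
		.mr_max = total,
		.mw_min = total + 1,
		.mw_max = 2 * total,
	};
	return l;
}

/* The MR range lies entirely below the MW range, so they cannot overlap. */
static inline bool sketch_layout_is_disjoint(const struct sketch_layout *l)
{
	return l->mr_max < l->mw_min;
}
```

Either layout keeps the MR and MW index ranges disjoint, which is why the two patches are described as basically identical; they differ only in how many indices each object type ends up with.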
Bob Pearson March 1, 2023, 11:15 p.m. UTC | #3
On 1/19/23 13:18, Jason Gunthorpe wrote:
> On Thu, Jan 19, 2023 at 12:05:07PM -0600, Bob Pearson wrote:
>> Correct errors in rxe_param.h caused by extending the range of
>> indices for MRs allowing it to overlap the range for MWs. Since
>> the driver determines whether an rkey is for an MR or MW by comparing
>> the index part of the rkey with these ranges this can cause an
>> MR to be incorrectly determined to be an MW.
>>
>> Additionally the parameters which determine the size of the index
>> ranges for MR, MW, QP and SRQ are incorrect since the actual
>> number of integers in the range [min, max] is (max - min + 1) not
>> (max - min).
>>
>> This patch corrects these errors.
>>
>> Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
>> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
>> ---
>>  drivers/infiniband/sw/rxe/rxe_param.h | 27 +++++++++++++++++++--------
>>  1 file changed, 19 insertions(+), 8 deletions(-)
> 
> This
> 
> commit 1aefe5c177c1922119afb4ee443ddd6ac3140b37
> Author: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
> Date:   Tue Dec 20 17:08:48 2022 +0900
> 
>     RDMA/rxe: Prevent faulty rkey generation
>     
>     If you create MRs more than 0x10000 times after loading the module,
>     responder starts to reply NAKs for RDMA/Atomic operations because of rkey
>     violation detected in check_rkey(). The root cause is that rkeys are
>     incremented each time a new MR is created and the value overflows into the
>     range reserved for MWs.
>     
>     This commit also increases the value of RXE_MAX_MW that has been limited
>     unlike other parameters.
>     
>     Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
>     Link: https://lore.kernel.org/r/20221220080848.253785-2-matsuda-daisuke@fujitsu.com
>     Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
>     Tested-by: Li Zhijian <lizhijian@fujitsu.com>
>     Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
>     Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> 
> 
> Is already in v6.2-rc and conflicts with this patch, it looks like it
> is doing the same thing, can you sort it out please?
> 
> Thanks,
> Jason

Did this get lost? for-next is now at 6.2-rc3 and the bug is still in rxe_param.h.

Bob
Jason Gunthorpe March 6, 2023, 8:51 p.m. UTC | #4
On Wed, Mar 01, 2023 at 05:15:07PM -0600, Bob Pearson wrote:
> On 1/19/23 13:18, Jason Gunthorpe wrote:
> > On Thu, Jan 19, 2023 at 12:05:07PM -0600, Bob Pearson wrote:
> >> Correct errors in rxe_param.h caused by extending the range of
> >> indices for MRs allowing it to overlap the range for MWs. Since
> >> the driver determines whether an rkey is for an MR or MW by comparing
> >> the index part of the rkey with these ranges this can cause an
> >> MR to be incorrectly determined to be an MW.
> >>
> >> Additionally the parameters which determine the size of the index
> >> ranges for MR, MW, QP and SRQ are incorrect since the actual
> >> number of integers in the range [min, max] is (max - min + 1) not
> >> (max - min).
> >>
> >> This patch corrects these errors.
> >>
> >> Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
> >> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> >> ---
> >>  drivers/infiniband/sw/rxe/rxe_param.h | 27 +++++++++++++++++++--------
> >>  1 file changed, 19 insertions(+), 8 deletions(-)
> > 
> > This
> > 
> > commit 1aefe5c177c1922119afb4ee443ddd6ac3140b37
> > Author: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
> > Date:   Tue Dec 20 17:08:48 2022 +0900
> > 
> >     RDMA/rxe: Prevent faulty rkey generation
> >     
> >     If you create MRs more than 0x10000 times after loading the module,
> >     responder starts to reply NAKs for RDMA/Atomic operations because of rkey
> >     violation detected in check_rkey(). The root cause is that rkeys are
> >     incremented each time a new MR is created and the value overflows into the
> >     range reserved for MWs.
> >     
> >     This commit also increases the value of RXE_MAX_MW that has been limited
> >     unlike other parameters.
> >     
> >     Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
> >     Link: https://lore.kernel.org/r/20221220080848.253785-2-matsuda-daisuke@fujitsu.com
> >     Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
> >     Tested-by: Li Zhijian <lizhijian@fujitsu.com>
> >     Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
> >     Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > 
> > 
> > Is already in v6.2-rc and conflicts with this patch, it looks like it
> > is doing the same thing, can you sort it out please?
> > 
> > Thanks,
> > Jason
> 
> Did this get lost? for-next is now at 6.2-rc3 and the bug is
> still in rxe_param.h.

Check again, we are at v6.3-rc1 now. If something needs to be fixed,
send a new patch.

Jason
Bob Pearson March 13, 2023, 7:55 p.m. UTC | #5
On 3/6/23 14:51, Jason Gunthorpe wrote:
> On Wed, Mar 01, 2023 at 05:15:07PM -0600, Bob Pearson wrote:
>> On 1/19/23 13:18, Jason Gunthorpe wrote:
>>> On Thu, Jan 19, 2023 at 12:05:07PM -0600, Bob Pearson wrote:
>>>> Correct errors in rxe_param.h caused by extending the range of
>>>> indices for MRs allowing it to overlap the range for MWs. Since
>>>> the driver determines whether an rkey is for an MR or MW by comparing
>>>> the index part of the rkey with these ranges this can cause an
>>>> MR to be incorrectly determined to be an MW.
>>>>
>>>> Additionally the parameters which determine the size of the index
>>>> ranges for MR, MW, QP and SRQ are incorrect since the actual
>>>> number of integers in the range [min, max] is (max - min + 1) not
>>>> (max - min).
>>>>
>>>> This patch corrects these errors.
>>>>
>>>> Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
>>>> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
>>>> ---
>>>>  drivers/infiniband/sw/rxe/rxe_param.h | 27 +++++++++++++++++++--------
>>>>  1 file changed, 19 insertions(+), 8 deletions(-)
>>>
>>> This
>>>
>>> commit 1aefe5c177c1922119afb4ee443ddd6ac3140b37
>>> Author: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
>>> Date:   Tue Dec 20 17:08:48 2022 +0900
>>>
>>>     RDMA/rxe: Prevent faulty rkey generation
>>>     
>>>     If you create MRs more than 0x10000 times after loading the module,
>>>     responder starts to reply NAKs for RDMA/Atomic operations because of rkey
>>>     violation detected in check_rkey(). The root cause is that rkeys are
>>>     incremented each time a new MR is created and the value overflows into the
>>>     range reserved for MWs.
>>>     
>>>     This commit also increases the value of RXE_MAX_MW that has been limited
>>>     unlike other parameters.
>>>     
>>>     Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
>>>     Link: https://lore.kernel.org/r/20221220080848.253785-2-matsuda-daisuke@fujitsu.com
>>>     Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
>>>     Tested-by: Li Zhijian <lizhijian@fujitsu.com>
>>>     Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
>>>     Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>>>
>>>
>>> Is already in v6.2-rc and conflicts with this patch, it looks like it
>>> is doing the same thing, can you sort it out please?
>>>
>>> Thanks,
>>> Jason
>>
>> Did this get lost? for-next is now at 6.2-rc3 and the bug is
>> still in rxe_param.h.
> 
> Check again, we are at v6.3-rc1 now. If something needs to be fixed,
> send a new patch.
> 
> Jason

Just checked. It now looks good in for-next.

Thanks

Bob

Patch

diff --git a/drivers/infiniband/sw/rxe/rxe_param.h b/drivers/infiniband/sw/rxe/rxe_param.h
index a754fc902e3d..14baa84d1d9d 100644
--- a/drivers/infiniband/sw/rxe/rxe_param.h
+++ b/drivers/infiniband/sw/rxe/rxe_param.h
@@ -91,18 +91,29 @@  enum rxe_device_param {
 
 	RXE_MIN_QP_INDEX		= 16,
 	RXE_MAX_QP_INDEX		= DEFAULT_MAX_VALUE,
-	RXE_MAX_QP			= DEFAULT_MAX_VALUE - RXE_MIN_QP_INDEX,
+	RXE_MAX_QP			= RXE_MAX_QP_INDEX
+						- RXE_MIN_QP_INDEX + 1,
 
 	RXE_MIN_SRQ_INDEX		= 0x00020001,
 	RXE_MAX_SRQ_INDEX		= DEFAULT_MAX_VALUE,
-	RXE_MAX_SRQ			= DEFAULT_MAX_VALUE - RXE_MIN_SRQ_INDEX,
-
-	RXE_MIN_MR_INDEX		= 0x00000001,
+	RXE_MAX_SRQ			= RXE_MAX_SRQ_INDEX
+						- RXE_MIN_SRQ_INDEX + 1,
+
+	/*
+	 * MR and MW indices are converted to rkeys by shifting
+	 * left 8 bits and oring in an 8 bit key which either
+	 * belongs to the driver or the user depending on the
+	 * MR type. In order to determine if the rkey is an MR
+	 * or an MW the index ranges below must not overlap.
+	 */
+	RXE_MIN_MR_INDEX		= 1,
 	RXE_MAX_MR_INDEX		= DEFAULT_MAX_VALUE,
-	RXE_MAX_MR			= DEFAULT_MAX_VALUE - RXE_MIN_MR_INDEX,
-	RXE_MIN_MW_INDEX		= 0x00010001,
-	RXE_MAX_MW_INDEX		= 0x00020000,
-	RXE_MAX_MW			= 0x00001000,
+	RXE_MAX_MR			= RXE_MAX_MR_INDEX
+						- RXE_MIN_MR_INDEX + 1,
+	RXE_MIN_MW_INDEX		= DEFAULT_MAX_VALUE + 1,
+	RXE_MAX_MW_INDEX		= 2*DEFAULT_MAX_VALUE,
+	RXE_MAX_MW			= RXE_MAX_MW_INDEX
+						- RXE_MIN_MW_INDEX + 1,
 
 	RXE_MAX_PKT_PER_ACK		= 64,
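
The comment added by this patch describes how an index becomes an rkey (shift left 8 bits, OR in an 8-bit key) and why the MR and MW index ranges must not overlap. A minimal sketch of that scheme follows. Every `SKETCH_`/`sketch_` name is hypothetical, the ranges mirror the post-patch values in the diff above, and `SKETCH_MAX_VALUE` is an assumed stand-in for DEFAULT_MAX_VALUE, whose definition does not appear in this hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-in for DEFAULT_MAX_VALUE; the real value is not shown in the hunk. */
#define SKETCH_MAX_VALUE	(1u << 20)

/* Non-overlapping index ranges, mirroring the post-patch layout. */
#define SKETCH_MIN_MR_INDEX	1u
#define SKETCH_MAX_MR_INDEX	SKETCH_MAX_VALUE
#define SKETCH_MIN_MW_INDEX	(SKETCH_MAX_VALUE + 1u)
#define SKETCH_MAX_MW_INDEX	(2u * SKETCH_MAX_VALUE)

/* Build an rkey: index in the upper 24 bits, 8-bit key in the low byte. */
static inline uint32_t sketch_make_rkey(uint32_t index, uint8_t key)
{
	assert(index <= (UINT32_MAX >> 8));
	return (index << 8) | key;
}

/* Recover the index part of an rkey by dropping the 8-bit key. */
static inline uint32_t sketch_rkey_index(uint32_t rkey)
{
	return rkey >> 8;
}

/* Classify an rkey purely by which index range it falls into. */
static inline bool sketch_rkey_is_mw(uint32_t rkey)
{
	uint32_t index = sketch_rkey_index(rkey);

	return index >= SKETCH_MIN_MW_INDEX && index <= SKETCH_MAX_MW_INDEX;
}

static inline bool sketch_rkey_is_mr(uint32_t rkey)
{
	uint32_t index = sketch_rkey_index(rkey);

	return index >= SKETCH_MIN_MR_INDEX && index <= SKETCH_MAX_MR_INDEX;
}
```

Because the two ranges are disjoint, no rkey can satisfy both predicates, so the largest MR index is never mistaken for an MW regardless of the 8-bit key.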