
[v4,rdma-next,2/6] RDMA/mlx5: Remove explicit ODP cache entry

Message ID 20230115133454.29000-3-michaelgur@nvidia.com (mailing list archive)
State Superseded
Series RDMA/mlx5: Switch MR cache to use RB-tree

Commit Message

Michael Guralnik Jan. 15, 2023, 1:34 p.m. UTC
From: Aharon Landau <aharonl@nvidia.com>

Explicit ODP mkey doesn't have unique properties. It shares the same
properties as the order 18 cache entry. There is no need to devote a special
entry for that.

Signed-off-by: Aharon Landau <aharonl@nvidia.com>
---
 drivers/infiniband/hw/mlx5/odp.c | 20 +++++---------------
 include/linux/mlx5/driver.h      |  1 -
 2 files changed, 5 insertions(+), 16 deletions(-)

Comments

Jason Gunthorpe Jan. 16, 2023, 4:59 p.m. UTC | #1
On Sun, Jan 15, 2023 at 03:34:50PM +0200, Michael Guralnik wrote:
> From: Aharon Landau <aharonl@nvidia.com>
> 
> Explicit ODP mkey doesn't have unique properties. It shares the same
> properties as the order 18 cache entry. There is no need to devote a special
> entry for that.

IMR means "implicit MR" for implicit ODP; the commit message is wrong

> @@ -1591,20 +1593,8 @@ void mlx5_odp_init_mkey_cache_entry(struct mlx5_cache_ent *ent)
>  {
>  	if (!(ent->dev->odp_caps.general_caps & IB_ODP_SUPPORT_IMPLICIT))
>  		return;
> -
> -	switch (ent->order - 2) {
> -	case MLX5_IMR_MTT_CACHE_ENTRY:
> -		ent->ndescs = MLX5_IMR_MTT_ENTRIES;
> -		ent->access_mode = MLX5_MKC_ACCESS_MODE_MTT;
> -		ent->limit = 0;
> -		break;
> -
> -	case MLX5_IMR_KSM_CACHE_ENTRY:
> -		ent->ndescs = mlx5_imr_ksm_entries;
> -		ent->access_mode = MLX5_MKC_ACCESS_MODE_KSM;
> -		ent->limit = 0;
> -		break;
> -	}
> +	ent->ndescs = mlx5_imr_ksm_entries;
> +	ent->access_mode = MLX5_MKC_ACCESS_MODE_KSM;

And you didn't answer my question: is this UMRable?

Because I don't quite understand how this can work at this point: for
lower orders the access_mode is assumed to be MTT, so a KLM cannot be
put in a low-order entry.

Ideally you'd teach UMR to switch between MTT/KSM; then the cache is
fine, sizing the amount of space required based on the number of
bytes in the memory.

Jason
Michael Guralnik Jan. 16, 2023, 11:24 p.m. UTC | #2
On 1/16/2023 6:59 PM, Jason Gunthorpe wrote:
> On Sun, Jan 15, 2023 at 03:34:50PM +0200, Michael Guralnik wrote:
>> From: Aharon Landau <aharonl@nvidia.com>
>>
>> Explicit ODP mkey doesn't have unique properties. It shares the same
>> properties as the order 18 cache entry. There is no need to devote a special
>> entry for that.
> IMR is "implicit mr" for implicit ODP, the commit message is wrong

Yes. I'll change to: "IMR MTT mkeys don't have unique properties..."

>> @@ -1591,20 +1593,8 @@ void mlx5_odp_init_mkey_cache_entry(struct mlx5_cache_ent *ent)
>>   {
>>   	if (!(ent->dev->odp_caps.general_caps & IB_ODP_SUPPORT_IMPLICIT))
>>   		return;
>> -
>> -	switch (ent->order - 2) {
>> -	case MLX5_IMR_MTT_CACHE_ENTRY:
>> -		ent->ndescs = MLX5_IMR_MTT_ENTRIES;
>> -		ent->access_mode = MLX5_MKC_ACCESS_MODE_MTT;
>> -		ent->limit = 0;
>> -		break;
>> -
>> -	case MLX5_IMR_KSM_CACHE_ENTRY:
>> -		ent->ndescs = mlx5_imr_ksm_entries;
>> -		ent->access_mode = MLX5_MKC_ACCESS_MODE_KSM;
>> -		ent->limit = 0;
>> -		break;
>> -	}
>> +	ent->ndescs = mlx5_imr_ksm_entries;
>> +	ent->access_mode = MLX5_MKC_ACCESS_MODE_KSM;
> And you didn't answer my question, is this UMRable?
Yes, we can UMR between access modes.
> Because I don't quite understand how this can work at this point, for
> lower orders the access_mode is assumed to be MTT, a KLM cannot be put
> in a low order entry at this point.

In our current code, the only non-MTT mkeys using the cache are the IMR 
KSM mkeys, which this patch doesn't change.

> Ideally you'd teach UMR to switch between MTT/KSM and then the cache
> is fine, size the amount of space required based on the number of
> bytes in the memory.
>
> Jason

Agreed, access_mode and ndescs can be dropped from the rb_key that this 
series introduces, and instead we'll add the size of the descriptors as 
a cache entry property.
Doing this will reduce the number of entries in the RB tree but will 
add complexity to the dereg and rereg flows.

I'd prefer to look into this in a later stage.

Michael
Jason Gunthorpe Jan. 16, 2023, 11:45 p.m. UTC | #3
On Tue, Jan 17, 2023 at 01:24:34AM +0200, Michael Guralnik wrote:
> 
> On 1/16/2023 6:59 PM, Jason Gunthorpe wrote:
> > On Sun, Jan 15, 2023 at 03:34:50PM +0200, Michael Guralnik wrote:
> > > From: Aharon Landau <aharonl@nvidia.com>
> > > 
> > > Explicit ODP mkey doesn't have unique properties. It shares the same
> > > properties as the order 18 cache entry. There is no need to devote a special
> > > entry for that.
> > IMR is "implicit mr" for implicit ODP, the commit message is wrong
> 
> Yes. I'll change to: "IMR MTT mkeys don't have unique properties..."
> 
> > > @@ -1591,20 +1593,8 @@ void mlx5_odp_init_mkey_cache_entry(struct mlx5_cache_ent *ent)
> > >   {
> > >   	if (!(ent->dev->odp_caps.general_caps & IB_ODP_SUPPORT_IMPLICIT))
> > >   		return;
> > > -
> > > -	switch (ent->order - 2) {
> > > -	case MLX5_IMR_MTT_CACHE_ENTRY:
> > > -		ent->ndescs = MLX5_IMR_MTT_ENTRIES;
> > > -		ent->access_mode = MLX5_MKC_ACCESS_MODE_MTT;
> > > -		ent->limit = 0;
> > > -		break;
> > > -
> > > -	case MLX5_IMR_KSM_CACHE_ENTRY:
> > > -		ent->ndescs = mlx5_imr_ksm_entries;
> > > -		ent->access_mode = MLX5_MKC_ACCESS_MODE_KSM;
> > > -		ent->limit = 0;
> > > -		break;
> > > -	}
> > > +	ent->ndescs = mlx5_imr_ksm_entries;
> > > +	ent->access_mode = MLX5_MKC_ACCESS_MODE_KSM;
> > And you didn't answer my question, is this UMRable?
> Yes, we can UMR between access modes.
> > Because I don't quite understand how this can work at this point, for
> > lower orders the access_mode is assumed to be MTT, a KLM cannot be put
> > in a low order entry at this point.
> 
> In our current code, the only non-MTT mkeys using the cache are the IMR KSM
> that this patch doesn't change.

It does change it: the isolation between the special IMR and the
normal MTT order is removed right here.

Now it is broken.

> > Ideally you'd teach UMR to switch between MTT/KSM and then the cache
> > is fine, size the amount of space required based on the number of
> > bytes in the memory.

> Agreed, access_mode and ndescs can be dropped from the rb_key that this
> series introduces and instead we'll add the size of the descriptors as a
> cache entry property.
> Doing this will reduce number of entries in the RB tree but will add
> complexity to the dereg and rereg flows .

Not really, you just always set the access mode in the UMR like
everything else.

Jason
Michael Guralnik Jan. 17, 2023, 12:08 a.m. UTC | #4
On 1/17/2023 1:45 AM, Jason Gunthorpe wrote:
> On Tue, Jan 17, 2023 at 01:24:34AM +0200, Michael Guralnik wrote:
>> On 1/16/2023 6:59 PM, Jason Gunthorpe wrote:
>>> On Sun, Jan 15, 2023 at 03:34:50PM +0200, Michael Guralnik wrote:
>>>> From: Aharon Landau <aharonl@nvidia.com>
>>>>
>>>> Explicit ODP mkey doesn't have unique properties. It shares the same
>>>> properties as the order 18 cache entry. There is no need to devote a special
>>>> entry for that.
>>> IMR is "implicit mr" for implicit ODP, the commit message is wrong
>> Yes. I'll change to: "IMR MTT mkeys don't have unique properties..."
>>
>>>> @@ -1591,20 +1593,8 @@ void mlx5_odp_init_mkey_cache_entry(struct mlx5_cache_ent *ent)
>>>>    {
>>>>    	if (!(ent->dev->odp_caps.general_caps & IB_ODP_SUPPORT_IMPLICIT))
>>>>    		return;
>>>> -
>>>> -	switch (ent->order - 2) {
>>>> -	case MLX5_IMR_MTT_CACHE_ENTRY:
>>>> -		ent->ndescs = MLX5_IMR_MTT_ENTRIES;
>>>> -		ent->access_mode = MLX5_MKC_ACCESS_MODE_MTT;
>>>> -		ent->limit = 0;
>>>> -		break;
>>>> -
>>>> -	case MLX5_IMR_KSM_CACHE_ENTRY:
>>>> -		ent->ndescs = mlx5_imr_ksm_entries;
>>>> -		ent->access_mode = MLX5_MKC_ACCESS_MODE_KSM;
>>>> -		ent->limit = 0;
>>>> -		break;
>>>> -	}
>>>> +	ent->ndescs = mlx5_imr_ksm_entries;
>>>> +	ent->access_mode = MLX5_MKC_ACCESS_MODE_KSM;
>>> And you didn't answer my question, is this UMRable?
>> Yes, we can UMR between access modes.
>>> Because I don't quite understand how this can work at this point, for
>>> lower orders the access_mode is assumed to be MTT, a KLM cannot be put
>>> in a low order entry at this point.
>> In our current code, the only non-MTT mkeys using the cache are the IMR KSM
>> that this patch doesn't change.
> It does change it, the isolation between the special IMR and the
> normal MTT order is removed right here.
>
> Now it is broken

How does sharing a cache entry between IMR MTT mkeys and other MTT 
mkeys break anything?

>>> Ideally you'd teach UMR to switch between MTT/KSM and then the cache
>>> is fine, size the amount of space required based on the number of
>>> bytes in the memory.
>> Agreed, access_mode and ndescs can be dropped from the rb_key that this
>> series introduces and instead we'll add the size of the descriptors as a
>> cache entry property.
>> Doing this will reduce number of entries in the RB tree but will add
>> complexity to the dereg and rereg flows .
> Not really, you just always set the access mode in the UMR like
> everything else.
>
> Jason

OK, I'll give this a second look. If it's really only this, I can 
probably push it quickly.
BTW, this will mean that IMR KSM mkeys will also share an entry with 
other MTT mkeys.
Jason Gunthorpe Jan. 17, 2023, 2:49 p.m. UTC | #5
On Tue, Jan 17, 2023 at 02:08:35AM +0200, Michael Guralnik wrote:
> 
> On 1/17/2023 1:45 AM, Jason Gunthorpe wrote:
> > On Tue, Jan 17, 2023 at 01:24:34AM +0200, Michael Guralnik wrote:
> > > On 1/16/2023 6:59 PM, Jason Gunthorpe wrote:
> > > > On Sun, Jan 15, 2023 at 03:34:50PM +0200, Michael Guralnik wrote:
> > > > > From: Aharon Landau <aharonl@nvidia.com>
> > > > > 
> > > > > Explicit ODP mkey doesn't have unique properties. It shares the same
> > > > > properties as the order 18 cache entry. There is no need to devote a special
> > > > > entry for that.
> > > > IMR is "implicit mr" for implicit ODP, the commit message is wrong
> > > Yes. I'll change to: "IMR MTT mkeys don't have unique properties..."
> > > 
> > > > > @@ -1591,20 +1593,8 @@ void mlx5_odp_init_mkey_cache_entry(struct mlx5_cache_ent *ent)
> > > > >    {
> > > > >    	if (!(ent->dev->odp_caps.general_caps & IB_ODP_SUPPORT_IMPLICIT))
> > > > >    		return;
> > > > > -
> > > > > -	switch (ent->order - 2) {
> > > > > -	case MLX5_IMR_MTT_CACHE_ENTRY:
> > > > > -		ent->ndescs = MLX5_IMR_MTT_ENTRIES;
> > > > > -		ent->access_mode = MLX5_MKC_ACCESS_MODE_MTT;
> > > > > -		ent->limit = 0;
> > > > > -		break;
> > > > > -
> > > > > -	case MLX5_IMR_KSM_CACHE_ENTRY:
> > > > > -		ent->ndescs = mlx5_imr_ksm_entries;
> > > > > -		ent->access_mode = MLX5_MKC_ACCESS_MODE_KSM;
> > > > > -		ent->limit = 0;
> > > > > -		break;
> > > > > -	}
> > > > > +	ent->ndescs = mlx5_imr_ksm_entries;
> > > > > +	ent->access_mode = MLX5_MKC_ACCESS_MODE_KSM;
> > > > And you didn't answer my question, is this UMRable?
> > > Yes, we can UMR between access modes.
> > > > Because I don't quite understand how this can work at this point, for
> > > > lower orders the access_mode is assumed to be MTT, a KLM cannot be put
> > > > in a low order entry at this point.
> > > In our current code, the only non-MTT mkeys using the cache are the IMR KSM
> > > that this patch doesn't change.
> > It does change it, the isolation between the special IMR and the
> > normal MTT order is removed right here.
> > 
> > Now it is broken
> 
> How do IMR MTT mkeys sharing a cache entry with other MTT mkeys break
> anything?

Oh, I read it wrong; this still keeps the high-order
MLX5_IMR_KSM_CACHE_ENTRY.

> > > > Ideally you'd teach UMR to switch between MTT/KSM and then the cache
> > > > is fine, size the amount of space required based on the number of
> > > > bytes in the memory.
> > > Agreed, access_mode and ndescs can be dropped from the rb_key that this
> > > series introduces and instead we'll add the size of the descriptors as a
> > > cache entry property.
> > > Doing this will reduce number of entries in the RB tree but will add
> > > complexity to the dereg and rereg flows .
> > Not really, you just always set the access mode in the UMR like
> > everything else.
> > 
> > Jason
> 
> ok, I'll give this a second look. if it's really only this, I can probably
> push this quickly.
> BTW, this will mean that IMR KSM mkeys will also share an entry with other
> MTT mkeys

That would be perfect; you should definitely do it.

But it seems there isn't an issue here, so a follow-up is OK.

Jason

Patch

diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index 8a78580a2a72..72044f8ec883 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -405,6 +405,7 @@  static void mlx5_ib_page_fault_resume(struct mlx5_ib_dev *dev,
 static struct mlx5_ib_mr *implicit_get_child_mr(struct mlx5_ib_mr *imr,
 						unsigned long idx)
 {
+	int order = order_base_2(MLX5_IMR_MTT_ENTRIES);
 	struct mlx5_ib_dev *dev = mr_to_mdev(imr);
 	struct ib_umem_odp *odp;
 	struct mlx5_ib_mr *mr;
@@ -417,7 +418,8 @@  static struct mlx5_ib_mr *implicit_get_child_mr(struct mlx5_ib_mr *imr,
 	if (IS_ERR(odp))
 		return ERR_CAST(odp);
 
-	mr = mlx5_mr_cache_alloc(dev, &dev->cache.ent[MLX5_IMR_MTT_CACHE_ENTRY],
+	BUILD_BUG_ON(order > MKEY_CACHE_LAST_STD_ENTRY);
+	mr = mlx5_mr_cache_alloc(dev, &dev->cache.ent[order],
 				 imr->access_flags);
 	if (IS_ERR(mr)) {
 		ib_umem_odp_release(odp);
@@ -1591,20 +1593,8 @@  void mlx5_odp_init_mkey_cache_entry(struct mlx5_cache_ent *ent)
 {
 	if (!(ent->dev->odp_caps.general_caps & IB_ODP_SUPPORT_IMPLICIT))
 		return;
-
-	switch (ent->order - 2) {
-	case MLX5_IMR_MTT_CACHE_ENTRY:
-		ent->ndescs = MLX5_IMR_MTT_ENTRIES;
-		ent->access_mode = MLX5_MKC_ACCESS_MODE_MTT;
-		ent->limit = 0;
-		break;
-
-	case MLX5_IMR_KSM_CACHE_ENTRY:
-		ent->ndescs = mlx5_imr_ksm_entries;
-		ent->access_mode = MLX5_MKC_ACCESS_MODE_KSM;
-		ent->limit = 0;
-		break;
-	}
+	ent->ndescs = mlx5_imr_ksm_entries;
+	ent->access_mode = MLX5_MKC_ACCESS_MODE_KSM;
 }
 
 static const struct ib_device_ops mlx5_ib_dev_odp_ops = {
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index d476255c9a3f..f79c20d50eb4 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -734,7 +734,6 @@  enum {
 
 enum {
 	MKEY_CACHE_LAST_STD_ENTRY = 20,
-	MLX5_IMR_MTT_CACHE_ENTRY,
 	MLX5_IMR_KSM_CACHE_ENTRY,
 	MAX_MKEY_CACHE_ENTRIES
 };