Message ID | 20230115133454.29000-3-michaelgur@nvidia.com
---|---
State | Superseded
Series | RDMA/mlx5: Switch MR cache to use RB-tree
On Sun, Jan 15, 2023 at 03:34:50PM +0200, Michael Guralnik wrote:
> From: Aharon Landau <aharonl@nvidia.com>
>
> Explicit ODP mkey doesn't have unique properties. It shares the same
> properties as the order 18 cache entry. There is no need to devote a
> special entry for that.

IMR is "implicit MR" for implicit ODP; the commit message is wrong.

> @@ -1591,20 +1593,8 @@ void mlx5_odp_init_mkey_cache_entry(struct mlx5_cache_ent *ent)
>  {
>  	if (!(ent->dev->odp_caps.general_caps & IB_ODP_SUPPORT_IMPLICIT))
>  		return;
> -
> -	switch (ent->order - 2) {
> -	case MLX5_IMR_MTT_CACHE_ENTRY:
> -		ent->ndescs = MLX5_IMR_MTT_ENTRIES;
> -		ent->access_mode = MLX5_MKC_ACCESS_MODE_MTT;
> -		ent->limit = 0;
> -		break;
> -
> -	case MLX5_IMR_KSM_CACHE_ENTRY:
> -		ent->ndescs = mlx5_imr_ksm_entries;
> -		ent->access_mode = MLX5_MKC_ACCESS_MODE_KSM;
> -		ent->limit = 0;
> -		break;
> -	}
> +	ent->ndescs = mlx5_imr_ksm_entries;
> +	ent->access_mode = MLX5_MKC_ACCESS_MODE_KSM;

And you didn't answer my question: is this UMRable?

Because I don't quite understand how this can work at this point; for
lower orders the access_mode is assumed to be MTT, and a KLM cannot be
put in a low order entry at this point.

Ideally you'd teach UMR to switch between MTT/KSM, and then the cache
is fine: size the amount of space required based on the number of
bytes in the memory.

Jason
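Jason's sizing suggestion can be made concrete. Below is a minimal
sketch, assuming the standard mlx5 descriptor sizes (8-byte MTT,
16-byte KSM); the helper and macro names are hypothetical, not driver
code:

/*
 * Hypothetical sketch: if UMR can rewrite the access mode when an mkey
 * is reused, the only property the cache must match on is how much
 * descriptor memory the mkey owns, not which mode created it.
 */
#define MTT_DESC_SIZE	8	/* sizeof(struct mlx5_mtt) */
#define KSM_DESC_SIZE	16	/* sizeof(struct mlx5_ksm) */

static size_t mkey_descriptor_bytes(int access_mode, unsigned int ndescs)
{
	size_t desc_size = access_mode == MLX5_MKC_ACCESS_MODE_KSM ?
				   KSM_DESC_SIZE : MTT_DESC_SIZE;

	/* an mkey fits any cache entry with at least this many bytes */
	return (size_t)ndescs * desc_size;
}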
On 1/16/2023 6:59 PM, Jason Gunthorpe wrote:
> On Sun, Jan 15, 2023 at 03:34:50PM +0200, Michael Guralnik wrote:
>> From: Aharon Landau <aharonl@nvidia.com>
>>
>> Explicit ODP mkey doesn't have unique properties. It shares the same
>> properties as the order 18 cache entry. There is no need to devote a
>> special entry for that.
>
> IMR is "implicit MR" for implicit ODP; the commit message is wrong.

Yes. I'll change it to: "IMR MTT mkeys don't have unique properties..."

>> @@ -1591,20 +1593,8 @@ void mlx5_odp_init_mkey_cache_entry(struct mlx5_cache_ent *ent)
>> [...]
>> +	ent->ndescs = mlx5_imr_ksm_entries;
>> +	ent->access_mode = MLX5_MKC_ACCESS_MODE_KSM;
>
> And you didn't answer my question: is this UMRable?

Yes, we can UMR between access modes.

> Because I don't quite understand how this can work at this point; for
> lower orders the access_mode is assumed to be MTT, and a KLM cannot be
> put in a low order entry at this point.

In our current code, the only non-MTT mkeys using the cache are the IMR
KSM mkeys, which this patch doesn't change.

> Ideally you'd teach UMR to switch between MTT/KSM, and then the cache
> is fine: size the amount of space required based on the number of
> bytes in the memory.

Agreed. access_mode and ndescs can be dropped from the rb_key that this
series introduces, and we'll instead add the size of the descriptors as
a cache entry property. Doing this will reduce the number of entries in
the RB tree but will add complexity to the dereg and rereg flows, so
I'd prefer to look into it at a later stage.

Michael
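To make the proposed rb_key change concrete, here is a rough sketch of
the key this series introduces and the reduced key Michael describes;
the struct and field names are illustrative, not the series' final
code:

/* Shape of the cache key this series introduces (illustrative). */
struct mlx5r_cache_rb_key {
	u8 ats:1;
	unsigned int access_mode;
	unsigned int access_flags;
	unsigned int ndescs;
};

/*
 * Reduced key once UMR programs the access mode on every reuse:
 * access_mode and ndescs collapse into a descriptor byte count.
 */
struct mlx5r_cache_rb_key_reduced {
	u8 ats:1;
	unsigned int access_flags;
	size_t desc_bytes;	/* size of the descriptor buffer */
};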
On Tue, Jan 17, 2023 at 01:24:34AM +0200, Michael Guralnik wrote:
> On 1/16/2023 6:59 PM, Jason Gunthorpe wrote:
> > And you didn't answer my question: is this UMRable?
>
> Yes, we can UMR between access modes.
>
> > Because I don't quite understand how this can work at this point; for
> > lower orders the access_mode is assumed to be MTT, and a KLM cannot be
> > put in a low order entry at this point.
>
> In our current code, the only non-MTT mkeys using the cache are the
> IMR KSM mkeys, which this patch doesn't change.

It does change it: the isolation between the special IMR entry and the
normal MTT orders is removed right here.

Now it is broken.

> > Ideally you'd teach UMR to switch between MTT/KSM, and then the cache
> > is fine: size the amount of space required based on the number of
> > bytes in the memory.
>
> Agreed. access_mode and ndescs can be dropped from the rb_key that this
> series introduces, and we'll instead add the size of the descriptors as
> a cache entry property. Doing this will reduce the number of entries in
> the RB tree but will add complexity to the dereg and rereg flows.

Not really; you just always set the access mode in the UMR like
everything else.

Jason
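"Always set the access mode in the UMR" maps onto the mkey context
fields the driver already programs at mkey creation: the mlx5 interface
splits the 5-bit access mode across two mkc fields. A hedged sketch,
with a hypothetical helper name:

/*
 * Hypothetical helper, not driver code: program the translation mode
 * into a mkey context.  The two mkc fields below exist in mlx5_ifc and
 * together carry the 5-bit access mode.
 */
static void set_mkc_access_mode(void *mkc, unsigned int access_mode)
{
	MLX5_SET(mkc, mkc, access_mode_1_0, access_mode & 0x3);
	MLX5_SET(mkc, mkc, access_mode_4_2, (access_mode >> 2) & 0x7);
}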
On 1/17/2023 1:45 AM, Jason Gunthorpe wrote:
> On Tue, Jan 17, 2023 at 01:24:34AM +0200, Michael Guralnik wrote:
>> In our current code, the only non-MTT mkeys using the cache are the
>> IMR KSM mkeys, which this patch doesn't change.
>
> It does change it: the isolation between the special IMR entry and the
> normal MTT orders is removed right here.
>
> Now it is broken.

How do IMR MTT mkeys sharing a cache entry with other MTT mkeys break
anything?

>>> Ideally you'd teach UMR to switch between MTT/KSM, and then the cache
>>> is fine: size the amount of space required based on the number of
>>> bytes in the memory.
>>
>> Agreed. access_mode and ndescs can be dropped from the rb_key that
>> this series introduces, and we'll instead add the size of the
>> descriptors as a cache entry property.
>
> Not really; you just always set the access mode in the UMR like
> everything else.
>
> Jason

OK, I'll give this a second look. If it's really only this, I can
probably push it quickly.

BTW, this will mean that IMR KSM mkeys will also share an entry with
other MTT mkeys.
On Tue, Jan 17, 2023 at 02:08:35AM +0200, Michael Guralnik wrote:
> On 1/17/2023 1:45 AM, Jason Gunthorpe wrote:
> > It does change it: the isolation between the special IMR entry and
> > the normal MTT orders is removed right here.
> >
> > Now it is broken.
>
> How do IMR MTT mkeys sharing a cache entry with other MTT mkeys break
> anything?

Oh, I read it wrong; this is still keeping the high order
MLX5_IMR_KSM_CACHE_ENTRY.

> > Not really; you just always set the access mode in the UMR like
> > everything else.
>
> OK, I'll give this a second look. If it's really only this, I can
> probably push it quickly.
>
> BTW, this will mean that IMR KSM mkeys will also share an entry with
> other MTT mkeys.

That would be perfect, you should definitely do it.

But it seems there is not an issue here, so a followup is OK.

Jason
diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index 8a78580a2a72..72044f8ec883 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -405,6 +405,7 @@ static void mlx5_ib_page_fault_resume(struct mlx5_ib_dev *dev,
 static struct mlx5_ib_mr *implicit_get_child_mr(struct mlx5_ib_mr *imr,
 						unsigned long idx)
 {
+	int order = order_base_2(MLX5_IMR_MTT_ENTRIES);
 	struct mlx5_ib_dev *dev = mr_to_mdev(imr);
 	struct ib_umem_odp *odp;
 	struct mlx5_ib_mr *mr;
@@ -417,7 +418,8 @@ static struct mlx5_ib_mr *implicit_get_child_mr(struct mlx5_ib_mr *imr,
 	if (IS_ERR(odp))
 		return ERR_CAST(odp);
 
-	mr = mlx5_mr_cache_alloc(dev, &dev->cache.ent[MLX5_IMR_MTT_CACHE_ENTRY],
+	BUILD_BUG_ON(order > MKEY_CACHE_LAST_STD_ENTRY);
+	mr = mlx5_mr_cache_alloc(dev, &dev->cache.ent[order],
 				 imr->access_flags);
 	if (IS_ERR(mr)) {
 		ib_umem_odp_release(odp);
@@ -1591,20 +1593,8 @@ void mlx5_odp_init_mkey_cache_entry(struct mlx5_cache_ent *ent)
 {
 	if (!(ent->dev->odp_caps.general_caps & IB_ODP_SUPPORT_IMPLICIT))
 		return;
-
-	switch (ent->order - 2) {
-	case MLX5_IMR_MTT_CACHE_ENTRY:
-		ent->ndescs = MLX5_IMR_MTT_ENTRIES;
-		ent->access_mode = MLX5_MKC_ACCESS_MODE_MTT;
-		ent->limit = 0;
-		break;
-
-	case MLX5_IMR_KSM_CACHE_ENTRY:
-		ent->ndescs = mlx5_imr_ksm_entries;
-		ent->access_mode = MLX5_MKC_ACCESS_MODE_KSM;
-		ent->limit = 0;
-		break;
-	}
+	ent->ndescs = mlx5_imr_ksm_entries;
+	ent->access_mode = MLX5_MKC_ACCESS_MODE_KSM;
 }
 
 static const struct ib_device_ops mlx5_ib_dev_odp_ops = {
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index d476255c9a3f..f79c20d50eb4 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -734,7 +734,6 @@ enum {
 enum {
 	MKEY_CACHE_LAST_STD_ENTRY = 20,
-	MLX5_IMR_MTT_CACHE_ENTRY,
 	MLX5_IMR_KSM_CACHE_ENTRY,
 	MAX_MKEY_CACHE_ENTRIES
 };
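As a sanity check on the "order 18" wording in the commit message and
the BUILD_BUG_ON above, here is the arithmetic behind
order_base_2(MLX5_IMR_MTT_ENTRIES); a sketch assuming 4K pages, with
macro values mirroring the driver's ODP definitions:

#define PAGE_SHIFT		12			/* 4K pages assumed */
#define MLX5_IMR_MTT_BITS	(30 - PAGE_SHIFT)	/* = 18 */
#define MLX5_IMR_MTT_ENTRIES	(1ULL << MLX5_IMR_MTT_BITS)	/* = 262144 */
/*
 * Each implicit-ODP child MR covers 1GB = 2^18 4K pages, so
 * order_base_2(MLX5_IMR_MTT_ENTRIES) == 18, which is at most
 * MKEY_CACHE_LAST_STD_ENTRY (20): the IMR MTT mkey fits a standard
 * cache entry and needs no dedicated one.
 */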