Message ID | 9c6478b70dc23cfec3a7bfc345c30ff817e7e799.1631660866.git.leonro@nvidia.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Jason Gunthorpe |
Series | [rdma-rc] RDMA/mlx5: Add dummy umem to IB_MR_TYPE_DM |
On Wed, Sep 15, 2021 at 02:08:25AM +0300, Leon Romanovsky wrote:
> From: Alaa Hleihel <alaa@nvidia.com>
>
> After the cited patch, and for the case of IB_MR_TYPE_DM that doesn't
> have a umem (even though it is a user MR), function mlx5_free_priv_descs()
> will think that it's a kernel MR, leading to wrongly accessing mr->descs
> that will get wrong values in the union which leads to attempting to
> release resources that were not allocated in the first place.
>
> For example:
> DMA-API: mlx5_core 0000:08:00.1: device driver tries to free DMA memory it has not allocated [device address=0x0000000000000000] [size=0 bytes]
> WARNING: CPU: 8 PID: 1021 at kernel/dma/debug.c:961 check_unmap+0x54f/0x8b0
> RIP: 0010:check_unmap+0x54f/0x8b0
> Call Trace:
> debug_dma_unmap_page+0x57/0x60
> mlx5_free_priv_descs+0x57/0x70 [mlx5_ib]
> mlx5_ib_dereg_mr+0x1fb/0x3d0 [mlx5_ib]
> ib_dereg_mr_user+0x60/0x140 [ib_core]
> uverbs_destroy_uobject+0x59/0x210 [ib_uverbs]
> uobj_destroy+0x3f/0x80 [ib_uverbs]
> ib_uverbs_cmd_verbs+0x435/0xd10 [ib_uverbs]
> ? uverbs_finalize_object+0x50/0x50 [ib_uverbs]
> ? lock_acquire+0xc4/0x2e0
> ? lock_acquired+0x12/0x380
> ? lock_acquire+0xc4/0x2e0
> ? lock_acquire+0xc4/0x2e0
> ? ib_uverbs_ioctl+0x7c/0x140 [ib_uverbs]
> ? lock_release+0x28a/0x400
> ib_uverbs_ioctl+0xc0/0x140 [ib_uverbs]
> ? ib_uverbs_ioctl+0x7c/0x140 [ib_uverbs]
> __x64_sys_ioctl+0x7f/0xb0
> do_syscall_64+0x38/0x90
>
> Fix it by adding a dummy umem to IB_MR_TYPE_DM MRs.
>
> Fixes: f18ec4223117 ("RDMA/mlx5: Use a union inside mlx5_ib_mr")
> Signed-off-by: Alaa Hleihel <alaa@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
>  drivers/infiniband/core/umem.c  | 21 +++++++++++++++++++++
>  drivers/infiniband/hw/mlx5/mr.c |  5 +++++
>  include/rdma/ib_umem.h          |  5 +++++
>  3 files changed, 31 insertions(+)

Please drop this patch, it seems that the proposed solution is too naive.

Thanks
diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 44a0f0b2570f..518682a64daf 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -299,6 +299,27 @@ struct ib_umem *ib_umem_get_peer(struct ib_device *device, unsigned long addr,
 }
 EXPORT_SYMBOL(ib_umem_get_peer);
 
+/**
+ * ib_umem_get_dummy - Create an empty umem
+ *
+ * @device: IB device to connect UMEM
+ */
+struct ib_umem *ib_umem_get_dummy(struct ib_device *device)
+{
+	struct ib_umem *umem;
+
+	umem = kzalloc(sizeof(*umem), GFP_KERNEL);
+	if (!umem)
+		return ERR_PTR(-ENOMEM);
+
+	umem->ibdev = device;
+	umem->owning_mm = current->mm;
+	mmgrab(umem->owning_mm);
+
+	return umem;
+}
+EXPORT_SYMBOL(ib_umem_get_dummy);
+
 /**
  * ib_umem_release - release memory pinned with ib_umem_get
  * @umem: umem struct to release
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 94f2c0c0f42c..2d54db152e54 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1386,6 +1386,11 @@ static struct ib_mr *mlx5_ib_get_dm_mr(struct ib_pd *pd, u64 start_addr,
 	kfree(in);
 
 	set_mr_fields(dev, mr, length, acc);
+	mr->umem = ib_umem_get_dummy(&dev->ib_dev);
+	if (IS_ERR(mr->umem)) {
+		err = PTR_ERR(mr->umem);
+		goto err_free;
+	}
 
 	return &mr->ibmr;
 
diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
index bd64e6749951..18ea9c25207d 100644
--- a/include/rdma/ib_umem.h
+++ b/include/rdma/ib_umem.h
@@ -106,6 +106,7 @@ static inline void __rdma_umem_block_iter_start(struct ib_block_iter *biter,
 
 struct ib_umem *ib_umem_get(struct ib_device *device, unsigned long addr,
 			    size_t size, int access);
+struct ib_umem *ib_umem_get_dummy(struct ib_device *device);
 void ib_umem_release(struct ib_umem *umem);
 int ib_umem_copy_from(void *dst, struct ib_umem *umem, size_t offset,
 		      size_t length);
@@ -167,6 +168,10 @@ static inline struct ib_umem *ib_umem_get(struct ib_device *device,
 {
 	return ERR_PTR(-EOPNOTSUPP);
 }
+static struct ib_umem *ib_umem_get_dummy(struct ib_device *device)
+{
+	return ERR_PTR(-EOPNOTSUPP);
+}
 static inline void ib_umem_release(struct ib_umem *umem) { }
 static inline int ib_umem_copy_from(void *dst, struct ib_umem *umem, size_t offset,
 		      		    size_t length) {
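
The failure mode described in the commit message can be illustrated with a small user-space sketch. Everything in it is hypothetical: `toy_mr`, `toy_umem`, and the helper functions are simplified stand-ins and do not reproduce the real `struct mlx5_ib_mr` layout or the body of `mlx5_free_priv_descs()`. It only assumes what the commit message states, namely that a missing umem is read as "kernel MR" and that the kernel-only and user-only fields share a union.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct ib_umem. */
struct toy_umem {
	int placeholder;
};

/* Hypothetical, simplified stand-in for an MR with a kernel/user union. */
struct toy_mr {
	struct toy_umem *umem;	/* NULL is read as "this is a kernel MR" */
	union {
		/* Meaningful only for kernel MRs. */
		struct {
			unsigned long descs_addr;
			int max_descs;
		} kernel;
		/* Meaningful only for user MRs. */
		struct {
			unsigned long iova;
			int access_flags;
		} user;
	};
};

/*
 * Returns 1 when teardown would try to release kernel descriptors.
 * The only "kernel MR" signal is the absence of a umem.
 */
static int toy_would_free_descs(const struct toy_mr *mr)
{
	return mr->umem == NULL && mr->kernel.descs_addr != 0;
}

/* Models a user-owned MR that fills in only the user half of the union. */
static struct toy_mr toy_make_dm_mr(unsigned long iova, struct toy_umem *umem)
{
	struct toy_mr mr = { .umem = umem };

	mr.user.iova = iova;
	return mr;
}
```

With `umem == NULL`, the value written to `user.iova` is read back through `kernel.descs_addr`, so `toy_would_free_descs()` reports that descriptors should be released even though none were allocated. Passing any non-NULL umem suppresses that path, which is the effect the posted patch aimed for; the reply above asks for the patch to be dropped, so this sketch illustrates the bug rather than an accepted fix.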