Message ID: 9ba5a611ceac86774d3d0fda12704cecc30606f9.1618753038.git.leonro@nvidia.com (mailing list archive)
State: Changes Requested
Series: [rdma-next] RDMA/restrack: Delay QP deletion till all users are gone
On Sun, Apr 18, 2021 at 04:37:35PM +0300, Leon Romanovsky wrote:
> From: Shay Drory <shayd@nvidia.com>
>
> Currently, in case of QP, the following use-after-free is possible:
>
>       cpu0                             cpu1
>       ----                             ----
> res_get_common_dumpit()
>   rdma_restrack_get()
>   fill_res_qp_entry()
>                                  ib_destroy_qp_user()
>                                    rdma_restrack_del()
>                                    qp->device->ops.destroy_qp()
>   ib_query_qp()
>     qp->device->ops.query_qp()
>     --> use-after-free-qp
>
> This is because rdma_restrack_del(), in case of QP, isn't waiting until
> all users are gone.
>
> Fix it by making rdma_restrack_del() wait until all users are gone for
> QPs as well.
>
> Fixes: 13ef5539def7 ("RDMA/restrack: Count references to the verbs objects")
> Signed-off-by: Shay Drory <shayd@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
>  drivers/infiniband/core/restrack.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
> index ffabaf327242..def0c5b0efe9 100644
> --- a/drivers/infiniband/core/restrack.c
> +++ b/drivers/infiniband/core/restrack.c
> @@ -340,7 +340,7 @@ void rdma_restrack_del(struct rdma_restrack_entry *res)
>  	rt = &dev->res[res->type];
>
>  	old = xa_erase(&rt->xa, res->id);
> -	if (res->type == RDMA_RESTRACK_MR || res->type == RDMA_RESTRACK_QP)
> +	if (res->type == RDMA_RESTRACK_MR)
>  		return;

Why is MR skipping this? It also calls into the driver under its
dumpit, at the very least this needs a comment.

Jason
On Tue, Apr 20, 2021 at 09:39:50AM -0300, Jason Gunthorpe wrote:
> On Sun, Apr 18, 2021 at 04:37:35PM +0300, Leon Romanovsky wrote:
> > From: Shay Drory <shayd@nvidia.com>
> >
> > Currently, in case of QP, the following use-after-free is possible:
> >
> >       cpu0                             cpu1
> >       ----                             ----
> > res_get_common_dumpit()
> >   rdma_restrack_get()
> >   fill_res_qp_entry()
> >                                  ib_destroy_qp_user()
> >                                    rdma_restrack_del()
> >                                    qp->device->ops.destroy_qp()
> >   ib_query_qp()
> >     qp->device->ops.query_qp()
> >     --> use-after-free-qp
> >
> > This is because rdma_restrack_del(), in case of QP, isn't waiting until
> > all users are gone.
> >
> > Fix it by making rdma_restrack_del() wait until all users are gone for
> > QPs as well.
> >
> > Fixes: 13ef5539def7 ("RDMA/restrack: Count references to the verbs objects")
> > Signed-off-by: Shay Drory <shayd@nvidia.com>
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > ---
> >  drivers/infiniband/core/restrack.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
> > index ffabaf327242..def0c5b0efe9 100644
> > --- a/drivers/infiniband/core/restrack.c
> > +++ b/drivers/infiniband/core/restrack.c
> > @@ -340,7 +340,7 @@ void rdma_restrack_del(struct rdma_restrack_entry *res)
> >  	rt = &dev->res[res->type];
> >
> >  	old = xa_erase(&rt->xa, res->id);
> > -	if (res->type == RDMA_RESTRACK_MR || res->type == RDMA_RESTRACK_QP)
> > +	if (res->type == RDMA_RESTRACK_MR)
> >  		return;
>
> Why is MR skipping this?

I don't remember the justification for RDMA_RESTRACK_MR || RDMA_RESTRACK_QP check.
My guess that it is related to the allocation flow (both are not converted) and
have internal objects to the drivers.

> It also calls into the driver under its dumpit, at the very least this
> needs a comment.
>
> Jason
On Tue, Apr 20, 2021 at 04:06:03PM +0300, Leon Romanovsky wrote:
> On Tue, Apr 20, 2021 at 09:39:50AM -0300, Jason Gunthorpe wrote:
> > On Sun, Apr 18, 2021 at 04:37:35PM +0300, Leon Romanovsky wrote:
> > > From: Shay Drory <shayd@nvidia.com>
> > >
> > > Currently, in case of QP, the following use-after-free is possible:
> > >
> > >       cpu0                             cpu1
> > >       ----                             ----
> > > res_get_common_dumpit()
> > >   rdma_restrack_get()
> > >   fill_res_qp_entry()
> > >                                  ib_destroy_qp_user()
> > >                                    rdma_restrack_del()
> > >                                    qp->device->ops.destroy_qp()
> > >   ib_query_qp()
> > >     qp->device->ops.query_qp()
> > >
> > > This is because rdma_restrack_del(), in case of QP, isn't waiting until
> > > all users are gone.
> > >
> > > Fix it by making rdma_restrack_del() wait until all users are gone for
> > > QPs as well.
> > >
> > > Fixes: 13ef5539def7 ("RDMA/restrack: Count references to the verbs objects")
> > > Signed-off-by: Shay Drory <shayd@nvidia.com>
> > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > >  drivers/infiniband/core/restrack.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
> > > index ffabaf327242..def0c5b0efe9 100644
> > > +++ b/drivers/infiniband/core/restrack.c
> > > @@ -340,7 +340,7 @@ void rdma_restrack_del(struct rdma_restrack_entry *res)
> > >  	rt = &dev->res[res->type];
> > >
> > >  	old = xa_erase(&rt->xa, res->id);
> > > -	if (res->type == RDMA_RESTRACK_MR || res->type == RDMA_RESTRACK_QP)
> > > +	if (res->type == RDMA_RESTRACK_MR)
> > >  		return;
> >
> > Why is MR skipping this?
>
> I don't remember the justification for RDMA_RESTRACK_MR || RDMA_RESTRACK_QP check.
> My guess that it is related to the allocation flow (both are not converted) and
> have internal objects to the drivers.

Well, I don't understand why QP has a bug but MR would not have the
same one??

Jason
On Tue, Apr 20, 2021 at 12:25:41PM -0300, Jason Gunthorpe wrote:
> On Tue, Apr 20, 2021 at 04:06:03PM +0300, Leon Romanovsky wrote:
> > On Tue, Apr 20, 2021 at 09:39:50AM -0300, Jason Gunthorpe wrote:
> > > On Sun, Apr 18, 2021 at 04:37:35PM +0300, Leon Romanovsky wrote:
> > > > From: Shay Drory <shayd@nvidia.com>
> > > >
> > > > Currently, in case of QP, the following use-after-free is possible:
> > > >
> > > >       cpu0                             cpu1
> > > >       ----                             ----
> > > > res_get_common_dumpit()
> > > >   rdma_restrack_get()
> > > >   fill_res_qp_entry()
> > > >                                  ib_destroy_qp_user()
> > > >                                    rdma_restrack_del()
> > > >                                    qp->device->ops.destroy_qp()
> > > >   ib_query_qp()
> > > >     qp->device->ops.query_qp()
> > > >
> > > > This is because rdma_restrack_del(), in case of QP, isn't waiting until
> > > > all users are gone.
> > > >
> > > > Fix it by making rdma_restrack_del() wait until all users are gone for
> > > > QPs as well.
> > > >
> > > > Fixes: 13ef5539def7 ("RDMA/restrack: Count references to the verbs objects")
> > > > Signed-off-by: Shay Drory <shayd@nvidia.com>
> > > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > > >  drivers/infiniband/core/restrack.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
> > > > index ffabaf327242..def0c5b0efe9 100644
> > > > +++ b/drivers/infiniband/core/restrack.c
> > > > @@ -340,7 +340,7 @@ void rdma_restrack_del(struct rdma_restrack_entry *res)
> > > >  	rt = &dev->res[res->type];
> > > >
> > > >  	old = xa_erase(&rt->xa, res->id);
> > > > -	if (res->type == RDMA_RESTRACK_MR || res->type == RDMA_RESTRACK_QP)
> > > > +	if (res->type == RDMA_RESTRACK_MR)
> > > >  		return;
> > >
> > > Why is MR skipping this?
> >
> > I don't remember the justification for RDMA_RESTRACK_MR || RDMA_RESTRACK_QP check.
> > My guess that it is related to the allocation flow (both are not converted) and
> > have internal objects to the drivers.
>
> Well, I don't understand why QP has a bug but MR would not have the
> same one??

I didn't understand when reviewed either, but decided to post it anyway
to get possible explanation for this RDMA_RESTRACK_MR || RDMA_RESTRACK_QP
check.

Thanks

>
> Jason
On Wed, Apr 21, 2021 at 08:03:22AM +0300, Leon Romanovsky wrote:

> I didn't understand when reviewed either, but decided to post it anyway
> to get possible explanation for this RDMA_RESTRACK_MR || RDMA_RESTRACK_QP
> check.

I think the whole thing should look more like this and we delete the
if entirely.

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 2b9ffc21cbc4ad..479b16b8f6723a 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1860,6 +1860,7 @@ static void _destroy_id(struct rdma_id_private *id_priv,
 		iw_destroy_cm_id(id_priv->cm_id.iw);
 	}
 	cma_leave_mc_groups(id_priv);
+	rdma_restrack_del(&id_priv->res);
 	cma_release_dev(id_priv);
 }

@@ -1873,7 +1874,6 @@ static void _destroy_id(struct rdma_id_private *id_priv,
 	kfree(id_priv->id.route.path_rec);

 	put_net(id_priv->id.route.addr.dev_addr.net);
-	rdma_restrack_del(&id_priv->res);
 	kfree(id_priv);
 }

diff --git a/drivers/infiniband/core/counters.c b/drivers/infiniband/core/counters.c
index 15493357cfef09..1808bc2533f155 100644
--- a/drivers/infiniband/core/counters.c
+++ b/drivers/infiniband/core/counters.c
@@ -176,6 +176,8 @@ static void rdma_counter_free(struct rdma_counter *counter)
 {
 	struct rdma_port_counter *port_counter;

+	rdma_restrack_del(&counter->res);
+
 	port_counter = &counter->device->port_data[counter->port].port_counter;
 	mutex_lock(&port_counter->lock);
 	port_counter->num_counters--;
@@ -185,7 +187,6 @@ static void rdma_counter_free(struct rdma_counter *counter)

 	mutex_unlock(&port_counter->lock);

-	rdma_restrack_del(&counter->res);
 	kfree(counter->stats);
 	kfree(counter);
 }
diff --git a/drivers/infiniband/core/cq.c b/drivers/infiniband/core/cq.c
index 433b426729d4ce..3884db637d33ab 100644
--- a/drivers/infiniband/core/cq.c
+++ b/drivers/infiniband/core/cq.c
@@ -339,11 +339,15 @@ void ib_free_cq(struct ib_cq *cq)
 		WARN_ON_ONCE(1);
 	}

+	rdma_restrack_prepare_del(&cq->res);
 	rdma_dim_destroy(cq);
 	trace_cq_free(cq);
 	ret = cq->device->ops.destroy_cq(cq, NULL);
-	WARN_ONCE(ret, "Destroy of kernel CQ shouldn't fail");
-	rdma_restrack_del(&cq->res);
+	if (WARN_ON(ret)) {
+		rdma_restrack_abort_del(&cq->res);
+		return;
+	}
+	rdma_restrack_finish_del(&cq->res);
 	kfree(cq->wc);
 	kfree(cq);
 }
diff --git a/drivers/infiniband/core/rdma_core.c b/drivers/infiniband/core/rdma_core.c
index 94d83b665a2fe0..f57234ced93ca1 100644
--- a/drivers/infiniband/core/rdma_core.c
+++ b/drivers/infiniband/core/rdma_core.c
@@ -854,6 +854,8 @@ static void ufile_destroy_ucontext(struct ib_uverbs_file *ufile,
 	struct ib_ucontext *ucontext = ufile->ucontext;
 	struct ib_device *ib_dev = ucontext->device;

+	rdma_restrack_prepare_del(&ucontext->res);
+
 	/*
 	 * If we are closing the FD then the user mmap VMAs must have
 	 * already been destroyed as they hold on to the filep, otherwise
@@ -868,9 +870,8 @@ static void ufile_destroy_ucontext(struct ib_uverbs_file *ufile,
 	ib_rdmacg_uncharge(&ucontext->cg_obj, ib_dev,
 			   RDMACG_RESOURCE_HCA_HANDLE);

-	rdma_restrack_del(&ucontext->res);
-
 	ib_dev->ops.dealloc_ucontext(ucontext);
+	rdma_restrack_finish_del(&ucontext->res);
 	WARN_ON(!xa_empty(&ucontext->mmap_xa));
 	kfree(ucontext);

diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
index 033207882c82ce..8aa1dae40f38a7 100644
--- a/drivers/infiniband/core/restrack.c
+++ b/drivers/infiniband/core/restrack.c
@@ -315,11 +315,49 @@ int rdma_restrack_put(struct rdma_restrack_entry *res)
 }
 EXPORT_SYMBOL(rdma_restrack_put);

-/**
- * rdma_restrack_del() - delete object from the reource tracking database
- * @res: resource entry
- */
-void rdma_restrack_del(struct rdma_restrack_entry *res)
+void rdma_restrack_prepare_del(struct rdma_restrack_entry *res)
+{
+	struct rdma_restrack_entry *old;
+	struct rdma_restrack_root *rt;
+	struct ib_device *dev;
+
+	if (!res->valid)
+		return;
+
+	dev = res_to_dev(res);
+	rt = &dev->res[res->type];
+
+	if (!res->no_track) {
+		old = xa_cmpxchg(&rt->xa, res->id, res, XA_ZERO_ENTRY,
+				 GFP_KERNEL);
+		WARN_ON(old != res);
+	}
+
+	/* Fence access from all concurrent threads, like netlink */
+	rdma_restrack_put(res);
+	wait_for_completion(&res->comp);
+}
+EXPORT_SYMBOL(rdma_restrack_prepare_del);
+
+void rdma_restrack_abort_del(struct rdma_restrack_entry *res)
+{
+	struct rdma_restrack_entry *old;
+	struct rdma_restrack_root *rt;
+	struct ib_device *dev;
+
+	if (!res->valid || res->no_track)
+		return;
+
+	dev = res_to_dev(res);
+	rt = &dev->res[res->type];
+
+	kref_init(&res->kref);
+	old = xa_cmpxchg(&rt->xa, res->id, XA_ZERO_ENTRY, res, GFP_KERNEL);
+	WARN_ON(old != res);
+}
+EXPORT_SYMBOL(rdma_restrack_abort_del);
+
+void rdma_restrack_finish_del(struct rdma_restrack_entry *res)
 {
 	struct rdma_restrack_entry *old;
 	struct rdma_restrack_root *rt;
@@ -332,24 +370,22 @@ void rdma_restrack_del(struct rdma_restrack_entry *res)
 		}
 		return;
 	}
-
-	if (res->no_track)
-		goto out;
+	res->valid = false;

 	dev = res_to_dev(res);
-	if (WARN_ON(!dev))
-		return;
-
 	rt = &dev->res[res->type];
-
 	old = xa_erase(&rt->xa, res->id);
-	if (res->type == RDMA_RESTRACK_MR || res->type == RDMA_RESTRACK_QP)
-		return;
-	WARN_ON(old != res);
+	WARN_ON(old != NULL);
+}
+EXPORT_SYMBOL(rdma_restrack_finish_del);

-out:
-	res->valid = false;
-	rdma_restrack_put(res);
-	wait_for_completion(&res->comp);
+/**
+ * rdma_restrack_del() - delete object from the reource tracking database
+ * @res: resource entry
+ */
+void rdma_restrack_del(struct rdma_restrack_entry *res)
+{
+	rdma_restrack_prepare_del(res);
+	rdma_restrack_finish_del(res);
 }
 EXPORT_SYMBOL(rdma_restrack_del);
diff --git a/drivers/infiniband/core/restrack.h b/drivers/infiniband/core/restrack.h
index 6a04fc41f73801..87a670334acabe 100644
--- a/drivers/infiniband/core/restrack.h
+++ b/drivers/infiniband/core/restrack.h
@@ -27,6 +27,9 @@ int rdma_restrack_init(struct ib_device *dev);
 void rdma_restrack_clean(struct ib_device *dev);
 void rdma_restrack_add(struct rdma_restrack_entry *res);
 void rdma_restrack_del(struct rdma_restrack_entry *res);
+void rdma_restrack_prepare_del(struct rdma_restrack_entry *res);
+void rdma_restrack_finish_del(struct rdma_restrack_entry *res);
+void rdma_restrack_abort_del(struct rdma_restrack_entry *res);
 void rdma_restrack_new(struct rdma_restrack_entry *res,
 		       enum rdma_restrack_type type);
 void rdma_restrack_set_name(struct rdma_restrack_entry *res,
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 2b0798151fb753..2e761a7eb92847 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -346,13 +346,16 @@ int ib_dealloc_pd_user(struct ib_pd *pd, struct ib_udata *udata)
 	 */
 	WARN_ON(atomic_read(&pd->usecnt));

+	rdma_restrack_prepare_del(&pd->res);
 	ret = pd->device->ops.dealloc_pd(pd, udata);
-	if (ret)
+	if (ret) {
+		rdma_restrack_abort_del(&pd->res);
 		return ret;
+	}

-	rdma_restrack_del(&pd->res);
+	rdma_restrack_finish_del(&pd->res);
 	kfree(pd);
-	return ret;
+	return 0;
 }
 EXPORT_SYMBOL(ib_dealloc_pd_user);

@@ -1085,19 +1088,22 @@ int ib_destroy_srq_user(struct ib_srq *srq, struct ib_udata *udata)
 	if (atomic_read(&srq->usecnt))
 		return -EBUSY;

+	rdma_restrack_prepare_del(&srq->res);
 	ret = srq->device->ops.destroy_srq(srq, udata);
-	if (ret)
+	if (ret) {
+		rdma_restrack_abort_del(&srq->res);
 		return ret;
+	}
+	rdma_restrack_finish_del(&srq->res);

 	atomic_dec(&srq->pd->usecnt);
 	if (srq->srq_type == IB_SRQT_XRC)
 		atomic_dec(&srq->ext.xrc.xrcd->usecnt);
 	if (ib_srq_has_cq(srq->srq_type))
 		atomic_dec(&srq->ext.cq->usecnt);
-	rdma_restrack_del(&srq->res);
 	kfree(srq);

-	return ret;
+	return 0;
 }
 EXPORT_SYMBOL(ib_destroy_srq_user);

@@ -1963,31 +1969,33 @@ int ib_destroy_qp_user(struct ib_qp *qp, struct ib_udata *udata)
 		rdma_rw_cleanup_mrs(qp);

 	rdma_counter_unbind_qp(qp, true);
-	rdma_restrack_del(&qp->res);
+	rdma_restrack_prepare_del(&qp->res);
 	ret = qp->device->ops.destroy_qp(qp, udata);
-	if (!ret) {
-		if (alt_path_sgid_attr)
-			rdma_put_gid_attr(alt_path_sgid_attr);
-		if (av_sgid_attr)
-			rdma_put_gid_attr(av_sgid_attr);
-		if (pd)
-			atomic_dec(&pd->usecnt);
-		if (scq)
-			atomic_dec(&scq->usecnt);
-		if (rcq)
-			atomic_dec(&rcq->usecnt);
-		if (srq)
-			atomic_dec(&srq->usecnt);
-		if (ind_tbl)
-			atomic_dec(&ind_tbl->usecnt);
-		if (sec)
-			ib_destroy_qp_security_end(sec);
-	} else {
+	if (ret) {
+		rdma_restrack_abort_del(&qp->res);
 		if (sec)
 			ib_destroy_qp_security_abort(sec);
+		return ret;
 	}

-	return ret;
+	rdma_restrack_finish_del(&qp->res);
+	if (alt_path_sgid_attr)
+		rdma_put_gid_attr(alt_path_sgid_attr);
+	if (av_sgid_attr)
+		rdma_put_gid_attr(av_sgid_attr);
+	if (pd)
+		atomic_dec(&pd->usecnt);
+	if (scq)
+		atomic_dec(&scq->usecnt);
+	if (rcq)
+		atomic_dec(&rcq->usecnt);
+	if (srq)
+		atomic_dec(&srq->usecnt);
+	if (ind_tbl)
+		atomic_dec(&ind_tbl->usecnt);
+	if (sec)
+		ib_destroy_qp_security_end(sec);
+	return 0;
 }
 EXPORT_SYMBOL(ib_destroy_qp_user);

@@ -2050,13 +2058,16 @@ int ib_destroy_cq_user(struct ib_cq *cq, struct ib_udata *udata)
 	if (atomic_read(&cq->usecnt))
 		return -EBUSY;

+	rdma_restrack_prepare_del(&cq->res);
 	ret = cq->device->ops.destroy_cq(cq, udata);
-	if (ret)
+	if (ret) {
+		rdma_restrack_abort_del(&cq->res);
 		return ret;
+	}

-	rdma_restrack_del(&cq->res);
+	rdma_restrack_finish_del(&cq->res);
 	kfree(cq);
-	return ret;
+	return 0;
 }
 EXPORT_SYMBOL(ib_destroy_cq_user);

@@ -2126,16 +2137,19 @@ int ib_dereg_mr_user(struct ib_mr *mr, struct ib_udata *udata)
 	int ret;

 	trace_mr_dereg(mr);
-	rdma_restrack_del(&mr->res);
+	rdma_restrack_prepare_del(&mr->res);
 	ret = mr->device->ops.dereg_mr(mr, udata);
-	if (!ret) {
-		atomic_dec(&pd->usecnt);
-		if (dm)
-			atomic_dec(&dm->usecnt);
-		kfree(sig_attrs);
+	if (ret) {
+		rdma_restrack_abort_del(&mr->res);
+		return ret;
 	}

-	return ret;
+	rdma_restrack_finish_del(&mr->res);
+	atomic_dec(&pd->usecnt);
+	if (dm)
+		atomic_dec(&dm->usecnt);
+	kfree(sig_attrs);
+	return 0;
 }
 EXPORT_SYMBOL(ib_dereg_mr_user);
On Thu, Apr 22, 2021 at 11:29:39AM -0300, Jason Gunthorpe wrote:
> On Wed, Apr 21, 2021 at 08:03:22AM +0300, Leon Romanovsky wrote:
>
> > I didn't understand when reviewed either, but decided to post it anyway
> > to get possible explanation for this RDMA_RESTRACK_MR || RDMA_RESTRACK_QP
> > check.
>
> I think the whole thing should look more like this and we delete the
> if entirely.

I have mixed feelings about this approach. Before "destroy can fail disaster",
the restrack goal was to provide the following flow:
1. create new memory object - rdma_restrack_new()
2. create new HW object - .create_XXX() callback in the driver
3. add HW object to the DB - rdma_restrack_del()
....
4. wait for any work on this HW object to complete - rdma_restrack_del()
5. safely destroy HW object - .destroy_XXX()

I really would like to stay with this flow and block any access to the
object that failed to destruct - maybe add to some zombie list.

The proposed prepare/abort/finish flow is much harder to implement correctly.
Let's take as an example ib_destroy_qp_user(), we called to rdma_rw_cleanup_mrs(),
but didn't restore them after .destroy_qp() failure.
Thanks
On Sun, Apr 25, 2021 at 04:03:47PM +0300, Leon Romanovsky wrote:
> On Thu, Apr 22, 2021 at 11:29:39AM -0300, Jason Gunthorpe wrote:
> > On Wed, Apr 21, 2021 at 08:03:22AM +0300, Leon Romanovsky wrote:
> >
> > > I didn't understand when reviewed either, but decided to post it anyway
> > > to get possible explanation for this RDMA_RESTRACK_MR || RDMA_RESTRACK_QP
> > > check.
> >
> > I think the whole thing should look more like this and we delete the
> > if entirely.
>
> I have mixed feelings about this approach. Before "destroy can fail disaster",
> the restrack goal was to provide the following flow:
> 1. create new memory object - rdma_restrack_new()
> 2. create new HW object - .create_XXX() callback in the driver
> 3. add HW object to the DB - rdma_restrack_del()
> ....
> 4. wait for any work on this HW object to complete - rdma_restrack_del()
> 5. safely destroy HW object - .destroy_XXX()
>
> I really would like to stay with this flow and block any access to the
> object that failed to destruct - maybe add to some zombie list.

That isn't the semantic we now have for destroy.

> The proposed prepare/abort/finish flow is much harder to implement correctly.
> Let's take as an example ib_destroy_qp_user(), we called to rdma_rw_cleanup_mrs(),
> but didn't restore them after .destroy_qp() failure.

I think it is a bug we call rdma_rw code in a user path.

Jason
On Sun, Apr 25, 2021 at 10:08:57AM -0300, Jason Gunthorpe wrote:
> On Sun, Apr 25, 2021 at 04:03:47PM +0300, Leon Romanovsky wrote:
> > On Thu, Apr 22, 2021 at 11:29:39AM -0300, Jason Gunthorpe wrote:
> > > On Wed, Apr 21, 2021 at 08:03:22AM +0300, Leon Romanovsky wrote:
> > > >
> > > > I didn't understand when reviewed either, but decided to post it anyway
> > > > to get possible explanation for this RDMA_RESTRACK_MR || RDMA_RESTRACK_QP
> > > > check.
> > >
> > > I think the whole thing should look more like this and we delete the
> > > if entirely.
> >
> > I have mixed feelings about this approach. Before "destroy can fail disaster",
> > the restrack goal was to provide the following flow:
> > 1. create new memory object - rdma_restrack_new()
> > 2. create new HW object - .create_XXX() callback in the driver
> > 3. add HW object to the DB - rdma_restrack_del()
> > ....
> > 4. wait for any work on this HW object to complete - rdma_restrack_del()
> > 5. safely destroy HW object - .destroy_XXX()
> >
> > I really would like to stay with this flow and block any access to the
> > object that failed to destruct - maybe add to some zombie list.
>
> That isn't the semantic we now have for destroy.

I would say that it is my mistake introduced when changed destroy to
return an error.

> > The proposed prepare/abort/finish flow is much harder to implement correctly.
> > Let's take as an example ib_destroy_qp_user(), we called to rdma_rw_cleanup_mrs(),
> > but didn't restore them after .destroy_qp() failure.
>
> I think it is a bug we call rdma_rw code in a user path.

It was an example of a flow that wasn't restored properly.
The same goes for ib_dealloc_pd_user(), release of __internal_mr.

Of course, these flows shouldn't fail because of being kernel flows, but it is not clear
from the code.

Thanks

>
> Jason
On Sun, Apr 25, 2021 at 04:44:55PM +0300, Leon Romanovsky wrote:
> > > The proposed prepare/abort/finish flow is much harder to implement correctly.
> > > Let's take as an example ib_destroy_qp_user(): we called rdma_rw_cleanup_mrs(),
> > > but didn't restore them after a .destroy_qp() failure.
> >
> > I think it is a bug that we call rdma_rw code in a user path.
>
> It was an example of a flow that wasn't restored properly.
> The same goes for ib_dealloc_pd_user() and the release of __internal_mr.
>
> Of course, these flows shouldn't fail, since they are kernel flows, but that
> is not clear from the code.

Well, exactly: user flows are not allowed to do extra stuff before
calling the driver destroy.

So the arrangement I gave is reasonable and makes sense; it is
certainly better than the hodgepodge of ordering that we have today.

Jason
On Sun, Apr 25, 2021 at 02:22:54PM -0300, Jason Gunthorpe wrote:
> On Sun, Apr 25, 2021 at 04:44:55PM +0300, Leon Romanovsky wrote:
> > > > The proposed prepare/abort/finish flow is much harder to implement correctly.
> > > > Let's take as an example ib_destroy_qp_user(): we called rdma_rw_cleanup_mrs(),
> > > > but didn't restore them after a .destroy_qp() failure.
> > >
> > > I think it is a bug that we call rdma_rw code in a user path.
> >
> > It was an example of a flow that wasn't restored properly.
> > The same goes for ib_dealloc_pd_user() and the release of __internal_mr.
> >
> > Of course, these flows shouldn't fail, since they are kernel flows, but that
> > is not clear from the code.
>
> Well, exactly: user flows are not allowed to do extra stuff before
> calling the driver destroy.
>
> So the arrangement I gave is reasonable and makes sense; it is
> certainly better than the hodgepodge of ordering that we have today.

I thought about a simpler solution: move rdma_restrack_del() before the
.destroy() callbacks, together with an attempt to re-add the res object
if destroy fails.

Thanks

>
> Jason
On Sun, Apr 25, 2021 at 08:38:57PM +0300, Leon Romanovsky wrote:
> On Sun, Apr 25, 2021 at 02:22:54PM -0300, Jason Gunthorpe wrote:
> > On Sun, Apr 25, 2021 at 04:44:55PM +0300, Leon Romanovsky wrote:
> > > > > The proposed prepare/abort/finish flow is much harder to implement correctly.
> > > > > Let's take as an example ib_destroy_qp_user(): we called rdma_rw_cleanup_mrs(),
> > > > > but didn't restore them after a .destroy_qp() failure.
> > > >
> > > > I think it is a bug that we call rdma_rw code in a user path.
> > >
> > > It was an example of a flow that wasn't restored properly.
> > > The same goes for ib_dealloc_pd_user() and the release of __internal_mr.
> > >
> > > Of course, these flows shouldn't fail, since they are kernel flows, but that
> > > is not clear from the code.
> >
> > Well, exactly: user flows are not allowed to do extra stuff before
> > calling the driver destroy.
> >
> > So the arrangement I gave is reasonable and makes sense; it is
> > certainly better than the hodgepodge of ordering that we have today.
>
> I thought about a simpler solution: move rdma_restrack_del() before the
> .destroy() callbacks, together with an attempt to re-add the res object
> if destroy fails.

It isn't simpler; now add can fail and can't be recovered.

Jason
On Mon, Apr 26, 2021 at 09:03:49AM -0300, Jason Gunthorpe wrote:
> On Sun, Apr 25, 2021 at 08:38:57PM +0300, Leon Romanovsky wrote:
> > On Sun, Apr 25, 2021 at 02:22:54PM -0300, Jason Gunthorpe wrote:
> > > On Sun, Apr 25, 2021 at 04:44:55PM +0300, Leon Romanovsky wrote:
> > > > > > The proposed prepare/abort/finish flow is much harder to implement correctly.
> > > > > > Let's take as an example ib_destroy_qp_user(): we called rdma_rw_cleanup_mrs(),
> > > > > > but didn't restore them after a .destroy_qp() failure.
> > > > >
> > > > > I think it is a bug that we call rdma_rw code in a user path.
> > > >
> > > > It was an example of a flow that wasn't restored properly.
> > > > The same goes for ib_dealloc_pd_user() and the release of __internal_mr.
> > > >
> > > > Of course, these flows shouldn't fail, since they are kernel flows, but that
> > > > is not clear from the code.
> > >
> > > Well, exactly: user flows are not allowed to do extra stuff before
> > > calling the driver destroy.
> > >
> > > So the arrangement I gave is reasonable and makes sense; it is
> > > certainly better than the hodgepodge of ordering that we have today.
> >
> > I thought about a simpler solution: move rdma_restrack_del() before the
> > .destroy() callbacks, together with an attempt to re-add the res object
> > if destroy fails.
>
> It isn't simpler; now add can fail and can't be recovered.

It is no different from a failure during the first call to rdma_restrack_add().
You didn't like the idea of being strict with the restrack addition, but
want to be strict on reinsert.

Thanks

>
> Jason
On Mon, Apr 26, 2021 at 04:08:42PM +0300, Leon Romanovsky wrote:
> On Mon, Apr 26, 2021 at 09:03:49AM -0300, Jason Gunthorpe wrote:
> > On Sun, Apr 25, 2021 at 08:38:57PM +0300, Leon Romanovsky wrote:
> > > On Sun, Apr 25, 2021 at 02:22:54PM -0300, Jason Gunthorpe wrote:
> > > > On Sun, Apr 25, 2021 at 04:44:55PM +0300, Leon Romanovsky wrote:
> > > > > > > The proposed prepare/abort/finish flow is much harder to implement correctly.
> > > > > > > Let's take as an example ib_destroy_qp_user(): we called rdma_rw_cleanup_mrs(),
> > > > > > > but didn't restore them after a .destroy_qp() failure.
> > > > > >
> > > > > > I think it is a bug that we call rdma_rw code in a user path.
> > > > >
> > > > > It was an example of a flow that wasn't restored properly.
> > > > > The same goes for ib_dealloc_pd_user() and the release of __internal_mr.
> > > > >
> > > > > Of course, these flows shouldn't fail, since they are kernel flows, but that
> > > > > is not clear from the code.
> > > >
> > > > Well, exactly: user flows are not allowed to do extra stuff before
> > > > calling the driver destroy.
> > > >
> > > > So the arrangement I gave is reasonable and makes sense; it is
> > > > certainly better than the hodgepodge of ordering that we have today.
> > >
> > > I thought about a simpler solution: move rdma_restrack_del() before the
> > > .destroy() callbacks, together with an attempt to re-add the res object
> > > if destroy fails.
> >
> > It isn't simpler; now add can fail and can't be recovered.
>
> It is no different from a failure during the first call to rdma_restrack_add().
> You didn't like the idea of being strict with the restrack addition, but
> want to be strict on reinsert.

It is ugly that we couldn't fix the add side; let's not repeat that
ugliness in other places.

Jason
On Mon, Apr 26, 2021 at 10:11:07AM -0300, Jason Gunthorpe wrote:
> On Mon, Apr 26, 2021 at 04:08:42PM +0300, Leon Romanovsky wrote:
> > On Mon, Apr 26, 2021 at 09:03:49AM -0300, Jason Gunthorpe wrote:
> > > On Sun, Apr 25, 2021 at 08:38:57PM +0300, Leon Romanovsky wrote:
> > > > On Sun, Apr 25, 2021 at 02:22:54PM -0300, Jason Gunthorpe wrote:
> > > > > On Sun, Apr 25, 2021 at 04:44:55PM +0300, Leon Romanovsky wrote:
> > > > > > > > The proposed prepare/abort/finish flow is much harder to implement correctly.
> > > > > > > > Let's take as an example ib_destroy_qp_user(): we called rdma_rw_cleanup_mrs(),
> > > > > > > > but didn't restore them after a .destroy_qp() failure.
> > > > > > >
> > > > > > > I think it is a bug that we call rdma_rw code in a user path.
> > > > > >
> > > > > > It was an example of a flow that wasn't restored properly.
> > > > > > The same goes for ib_dealloc_pd_user() and the release of __internal_mr.
> > > > > >
> > > > > > Of course, these flows shouldn't fail, since they are kernel flows, but that
> > > > > > is not clear from the code.
> > > > >
> > > > > Well, exactly: user flows are not allowed to do extra stuff before
> > > > > calling the driver destroy.
> > > > >
> > > > > So the arrangement I gave is reasonable and makes sense; it is
> > > > > certainly better than the hodgepodge of ordering that we have today.
> > > >
> > > > I thought about a simpler solution: move rdma_restrack_del() before the
> > > > .destroy() callbacks, together with an attempt to re-add the res object
> > > > if destroy fails.
> > >
> > > It isn't simpler; now add can fail and can't be recovered.
> >
> > It is no different from a failure during the first call to rdma_restrack_add().
> > You didn't like the idea of being strict with the restrack addition, but
> > want to be strict on reinsert.
>
> It is ugly that we couldn't fix the add side; let's not repeat that
> ugliness in other places.

Why can't we fix _add?

Thanks

>
> Jason
On Thu, Apr 22, 2021 at 11:29:39AM -0300, Jason Gunthorpe wrote:
> On Wed, Apr 21, 2021 at 08:03:22AM +0300, Leon Romanovsky wrote:
>
> > I didn't understand when reviewed either, but decided to post it anyway
> > to get a possible explanation for this RDMA_RESTRACK_MR || RDMA_RESTRACK_QP
> > check.
>
> I think the whole thing should look more like this and we delete the
> if entirely.

Finally, I had time to work on it and got an immediate reminder of why
we have the "if ...".

>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 2b9ffc21cbc4ad..479b16b8f6723a 100644

<...>

> @@ -1963,31 +1969,33 @@ int ib_destroy_qp_user(struct ib_qp *qp, struct ib_udata *udata)
>  	rdma_rw_cleanup_mrs(qp);
>
>  	rdma_counter_unbind_qp(qp, true);
> -	rdma_restrack_del(&qp->res);
> +	rdma_restrack_prepare_del(&qp->res);
>  	ret = qp->device->ops.destroy_qp(qp, udata);
> -	if (!ret) {
> -		if (alt_path_sgid_attr)
> -			rdma_put_gid_attr(alt_path_sgid_attr);
> -		if (av_sgid_attr)
> -			rdma_put_gid_attr(av_sgid_attr);
> -		if (pd)
> -			atomic_dec(&pd->usecnt);
> -		if (scq)
> -			atomic_dec(&scq->usecnt);
> -		if (rcq)
> -			atomic_dec(&rcq->usecnt);
> -		if (srq)
> -			atomic_dec(&srq->usecnt);
> -		if (ind_tbl)
> -			atomic_dec(&ind_tbl->usecnt);
> -		if (sec)
> -			ib_destroy_qp_security_end(sec);
> -	} else {
> +	if (ret) {
> +		rdma_restrack_abort_del(&qp->res);
>  		if (sec)
>  			ib_destroy_qp_security_abort(sec);
> +		return ret;
>  	}
>
> -	return ret;
> +	rdma_restrack_finish_del(&qp->res);

This causes a use-after-free, because "qp" is released in the driver:

[leonro@vm ~]$ mkt test
.s....
[ 24.350657] ==================================================================
[ 24.350917] BUG: KASAN: use-after-free in rdma_restrack_finish_del+0x1a7/0x210 [ib_core]
[ 24.351303] Read of size 1 at addr ffff88800c59f118 by task python3/243
[ 24.351416]
[ 24.351495] CPU: 4 PID: 243 Comm: python3 Not tainted 5.12.0-rc2+ #2923
[ 24.351644] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 24.351832] Call Trace:
[ 24.351886]  dump_stack+0x93/0xc2
[ 24.351965]  print_address_description.constprop.0+0x18/0x140
[ 24.352074]  ? rdma_restrack_finish_del+0x1a7/0x210 [ib_core]
[ 24.352204]  ? rdma_restrack_finish_del+0x1a7/0x210 [ib_core]
[ 24.352333]  kasan_report.cold+0x7c/0xd8
[ 24.352413]  ? rdma_restrack_finish_del+0x1a7/0x210 [ib_core]
[ 24.352543]  rdma_restrack_finish_del+0x1a7/0x210 [ib_core]
[ 24.352654]  ib_destroy_qp_user+0x297/0x540 [ib_core]
[ 24.352769]  uverbs_free_qp+0x7d/0x150 [ib_uverbs]
[ 24.352864]  uverbs_destroy_uobject+0x164/0x6c0 [ib_uverbs]
[ 24.352949]  uobj_destroy+0x72/0xf0 [ib_uverbs]
[ 24.353041]  ib_uverbs_cmd_verbs+0x19b9/0x2ee0 [ib_uverbs]
[ 24.353147]  ? uverbs_free_qp+0x150/0x150 [ib_uverbs]
[ 24.353246]  ? lockdep_hardirqs_on_prepare+0x3e0/0x3e0
[ 24.353340]  ? uverbs_fill_udata+0x540/0x540 [ib_uverbs]
[ 24.353455]  ? filemap_map_pages+0x2d9/0xf30
[ 24.353552]  ? lock_acquire+0x1a9/0x6d0
[ 24.353620]  ? ib_uverbs_ioctl+0x11e/0x260 [ib_uverbs]
[ 24.353720]  ? __might_fault+0xba/0x160
[ 24.353790]  ? lock_release+0x6c0/0x6c0
[ 24.353866]  ib_uverbs_ioctl+0x169/0x260 [ib_uverbs]
[ 24.353958]  ? ib_uverbs_ioctl+0x11e/0x260 [ib_uverbs]
[ 24.354049]  ? ib_uverbs_cmd_verbs+0x2ee0/0x2ee0 [ib_uverbs]
[ 24.354158]  __x64_sys_ioctl+0x7e6/0x1050
[ 24.354217]  ? generic_block_fiemap+0x60/0x60
[ 24.354305]  ? down_write_nested+0x150/0x150
[ 24.354400]  ? do_user_addr_fault+0x219/0xdc0
[ 24.354497]  ? lockdep_hardirqs_on_prepare+0x273/0x3e0
[ 24.354584]  ? syscall_enter_from_user_mode+0x1d/0x50
[ 24.354678]  do_syscall_64+0x2d/0x40
[ 24.354747]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 24.354837] RIP: 0033:0x7fbfd862b5db
[ 24.354911] Code: 89 d8 49 8d 3c 1c 48 f7 d8 49 39 c4 72 b5 e8 1c ff ff ff 85 c0 78 ba 4c 89 e0 5b 5d 41 5c c3 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 6d b8 0c 00 f7 d8 64 89 01 48
[ 24.355200] RSP: 002b:00007fff7e9fde58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 24.355336] RAX: ffffffffffffffda RBX: 00007fff7e9fdf88 RCX: 00007fbfd862b5db
[ 24.355468] RDX: 00007fff7e9fdf70 RSI: 00000000c0181b01 RDI: 0000000000000005
[ 24.355587] RBP: 00007fff7e9fdf50 R08: 000055d6a4f8b3f0 R09: 00007fbfd800d000
[ 24.355713] R10: 00007fbfd800d0f0 R11: 0000000000000246 R12: 00007fff7e9fdf50
[ 24.355836] R13: 00007fff7e9fdf2c R14: 000055d6a4e59d90 R15: 000055d6a4f8b530
[ 24.355973]

So let's finish the memory allocation conversion patches first.

Thanks
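The splat above occurs because `rdma_restrack_finish_del()` dereferences `qp->res` after the driver's `destroy_qp()` has freed the qp (the tracking entry is embedded in a driver-owned allocation). One way to see the constraint: anything the post-destroy step needs must be captured before ownership passes to destroy. The userspace model below is a hedged sketch of that pattern only — `hw_obj`, `del_ctx`, `prepare_del()` and friends are invented names, not the kernel API, and the eventual in-kernel fix may look quite different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Model: the tracking entry is embedded in the HW object, so the
 * driver's destroy frees the entry together with the object. */
struct hw_obj {
	int id;
	bool tracked; /* embedded "res" state */
};

/* Anything needed after destroy must be snapshotted up front, because
 * once driver_destroy() runs the embedded entry is gone. */
struct del_ctx {
	int id;
	bool was_tracked;
};

static void prepare_del(const struct hw_obj *qp, struct del_ctx *ctx)
{
	ctx->id = qp->id; /* snapshot before ownership passes */
	ctx->was_tracked = qp->tracked;
}

static int driver_destroy(struct hw_obj *qp)
{
	free(qp); /* qp and its embedded entry no longer exist */
	return 0;
}

/* The finish step works only on the snapshot, so it cannot touch the
 * freed object - no use-after-free is possible by construction. */
static int finish_del(const struct del_ctx *ctx)
{
	return ctx->was_tracked ? ctx->id : -1;
}
```

In the posted sketch, `rdma_restrack_finish_del(&qp->res)` takes the pointer into the already-freed object instead of a snapshot, which is exactly the window KASAN caught.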
diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
index ffabaf327242..def0c5b0efe9 100644
--- a/drivers/infiniband/core/restrack.c
+++ b/drivers/infiniband/core/restrack.c
@@ -340,7 +340,7 @@ void rdma_restrack_del(struct rdma_restrack_entry *res)
 	rt = &dev->res[res->type];
 
 	old = xa_erase(&rt->xa, res->id);
-	if (res->type == RDMA_RESTRACK_MR || res->type == RDMA_RESTRACK_QP)
+	if (res->type == RDMA_RESTRACK_MR)
 		return;
 
 	WARN_ON(old != res);
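For reference, the control flow this one-line hunk changes can be sketched as follows: before the patch both MR and QP returned right after the xarray erase, skipping the wait-for-users step; after it, only MR does. This is a hedged userspace model of that branch only — the real `rdma_restrack_del()` also cleans up the kernel name and uses `xa_erase()` plus a completion.

```c
#include <assert.h>
#include <stdbool.h>

/* Subset of the resource types tracked by restrack, for illustration. */
enum restrack_type { RESTRACK_PD, RESTRACK_QP, RESTRACK_MR };

struct entry {
	enum restrack_type type;
};

/* Model of rdma_restrack_del() after the patch: only MR still takes
 * the early return after the erase; QP falls through to the wait. */
static bool del_waits_for_users(const struct entry *res)
{
	/* xa_erase(&rt->xa, res->id) would happen here */
	if (res->type == RESTRACK_MR)
		return false; /* early return: no wait */
	/* wait_for_completion(&res->comp) would happen here */
	return true;
}
```

With the pre-patch condition (`MR || QP`), a QP entry would also return `false` here, which is what left the window for the dumpit use-after-free described in the cover text.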