Message ID | 20240809131538.944907-3-haris.iqbal@ionos.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Misc patches for RTRS | expand |
On Fri, Aug 09, 2024 at 03:15:27PM +0200, Md Haris Iqbal wrote: > From: Jack Wang <jinpu.wang@ionos.com> > > In case of error happening during session stablishment, close_work is > running. A new RDMA CM event may arrive since we don't destroy cm_id > before destroying qp. To fix this, we first destroy cm_id after drain_qp, > so no new RDMA CM event will arrive afterwards. > > Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality") > Signed-off-by: Jack Wang <jinpu.wang@ionos.com> > Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com> > Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> > --- > drivers/infiniband/ulp/rtrs/rtrs-srv.c | 2 +- > drivers/infiniband/ulp/rtrs/rtrs.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c > index fb67b58a7f62..90ea25ad6720 100644 > --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c > +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c > @@ -1540,6 +1540,7 @@ static void rtrs_srv_close_work(struct work_struct *work) > con = to_srv_con(srv_path->s.con[i]); > rdma_disconnect(con->c.cm_id); > ib_drain_qp(con->c.qp); > + rdma_destroy_id(con->c.cm_id); > } > > /* > @@ -1564,7 +1565,6 @@ static void rtrs_srv_close_work(struct work_struct *work) > continue; > con = to_srv_con(srv_path->s.con[i]); > rtrs_cq_qp_destroy(&con->c); > - rdma_destroy_id(con->c.cm_id); > kfree(con); > } > rtrs_ib_dev_put(srv_path->s.dev); > diff --git a/drivers/infiniband/ulp/rtrs/rtrs.c b/drivers/infiniband/ulp/rtrs/rtrs.c > index 4e17d546d4cc..44167fd1c958 100644 > --- a/drivers/infiniband/ulp/rtrs/rtrs.c > +++ b/drivers/infiniband/ulp/rtrs/rtrs.c > @@ -318,7 +318,7 @@ EXPORT_SYMBOL_GPL(rtrs_cq_qp_create); > void rtrs_cq_qp_destroy(struct rtrs_con *con) > { > if (con->qp) { > - rdma_destroy_qp(con->cm_id); > + ib_destroy_qp(con->qp); You created that QP with rdma_create_qp() and you should destroy it with rdma_destroy_qp(). Thanks > con->qp = NULL; > } > destroy_cq(con); > -- > 2.25.1 >
Hi, On Sun, Aug 11, 2024 at 10:41 AM Leon Romanovsky <leon@kernel.org> wrote: > > On Fri, Aug 09, 2024 at 03:15:27PM +0200, Md Haris Iqbal wrote: > > From: Jack Wang <jinpu.wang@ionos.com> > > > > In case of error happening during session stablishment, close_work is > > running. A new RDMA CM event may arrive since we don't destroy cm_id > > before destroying qp. To fix this, we first destroy cm_id after drain_qp, > > so no new RDMA CM event will arrive afterwards. > > > > Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality") > > Signed-off-by: Jack Wang <jinpu.wang@ionos.com> > > Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com> > > Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> > > --- > > drivers/infiniband/ulp/rtrs/rtrs-srv.c | 2 +- > > drivers/infiniband/ulp/rtrs/rtrs.c | 2 +- > > 2 files changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c > > index fb67b58a7f62..90ea25ad6720 100644 > > --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c > > +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c > > @@ -1540,6 +1540,7 @@ static void rtrs_srv_close_work(struct work_struct *work) > > con = to_srv_con(srv_path->s.con[i]); > > rdma_disconnect(con->c.cm_id); > > ib_drain_qp(con->c.qp); > > + rdma_destroy_id(con->c.cm_id); > > } > > > > /* > > @@ -1564,7 +1565,6 @@ static void rtrs_srv_close_work(struct work_struct *work) > > continue; > > con = to_srv_con(srv_path->s.con[i]); > > rtrs_cq_qp_destroy(&con->c); > > - rdma_destroy_id(con->c.cm_id); > > kfree(con); > > } > > rtrs_ib_dev_put(srv_path->s.dev); > > diff --git a/drivers/infiniband/ulp/rtrs/rtrs.c b/drivers/infiniband/ulp/rtrs/rtrs.c > > index 4e17d546d4cc..44167fd1c958 100644 > > --- a/drivers/infiniband/ulp/rtrs/rtrs.c > > +++ b/drivers/infiniband/ulp/rtrs/rtrs.c > > @@ -318,7 +318,7 @@ EXPORT_SYMBOL_GPL(rtrs_cq_qp_create); > > void rtrs_cq_qp_destroy(struct rtrs_con *con) > > { > > if (con->qp) { > > - rdma_destroy_qp(con->cm_id); > > + ib_destroy_qp(con->qp); > > You created that QP with rdma_create_qp() and you should destroy it with rdma_destroy_qp(). We can't do it, as we move rdma_destroy_id before rtrs_cq_qp_destroy, if we still call rdma_destroy_qp, which will lead to use after free as cm_id could already be free-ed. > > Thanks Thx! > > > con->qp = NULL; > > } > > destroy_cq(con); > > -- > > 2.25.1 > >
On Mon, Aug 12, 2024 at 12:52:25PM +0200, Jinpu Wang wrote: > Hi, > > On Sun, Aug 11, 2024 at 10:41 AM Leon Romanovsky <leon@kernel.org> wrote: > > > > On Fri, Aug 09, 2024 at 03:15:27PM +0200, Md Haris Iqbal wrote: > > > From: Jack Wang <jinpu.wang@ionos.com> > > > > > > In case of error happening during session stablishment, close_work is > > > running. A new RDMA CM event may arrive since we don't destroy cm_id > > > before destroying qp. To fix this, we first destroy cm_id after drain_qp, > > > so no new RDMA CM event will arrive afterwards. > > > > > > Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality") > > > Signed-off-by: Jack Wang <jinpu.wang@ionos.com> > > > Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com> > > > Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> > > > --- > > > drivers/infiniband/ulp/rtrs/rtrs-srv.c | 2 +- > > > drivers/infiniband/ulp/rtrs/rtrs.c | 2 +- > > > 2 files changed, 2 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c > > > index fb67b58a7f62..90ea25ad6720 100644 > > > --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c > > > +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c > > > @@ -1540,6 +1540,7 @@ static void rtrs_srv_close_work(struct work_struct *work) > > > con = to_srv_con(srv_path->s.con[i]); > > > rdma_disconnect(con->c.cm_id); > > > ib_drain_qp(con->c.qp); > > > + rdma_destroy_id(con->c.cm_id); > > > } > > > > > > /* > > > @@ -1564,7 +1565,6 @@ static void rtrs_srv_close_work(struct work_struct *work) > > > continue; > > > con = to_srv_con(srv_path->s.con[i]); > > > rtrs_cq_qp_destroy(&con->c); > > > - rdma_destroy_id(con->c.cm_id); > > > kfree(con); > > > } > > > rtrs_ib_dev_put(srv_path->s.dev); > > > diff --git a/drivers/infiniband/ulp/rtrs/rtrs.c b/drivers/infiniband/ulp/rtrs/rtrs.c > > > index 4e17d546d4cc..44167fd1c958 100644 > > > --- a/drivers/infiniband/ulp/rtrs/rtrs.c > > > +++ b/drivers/infiniband/ulp/rtrs/rtrs.c > > > @@ -318,7 +318,7 @@ EXPORT_SYMBOL_GPL(rtrs_cq_qp_create); > > > void rtrs_cq_qp_destroy(struct rtrs_con *con) > > > { > > > if (con->qp) { > > > - rdma_destroy_qp(con->cm_id); > > > + ib_destroy_qp(con->qp); > > > > You created that QP with rdma_create_qp() and you should destroy it with rdma_destroy_qp(). > We can't do it, as we move rdma_destroy_id before rtrs_cq_qp_destroy, > if we still call rdma_destroy_qp, which will lead to use after free as > cm_id could already be free-ed. It is a hint that you are doing something wrong. Thanks > > > > > Thanks > Thx! > > > > > con->qp = NULL; > > > } > > > destroy_cq(con); > > > -- > > > 2.25.1 > > > >
On Mon, Aug 12, 2024 at 1:01 PM Leon Romanovsky <leon@kernel.org> wrote: > > On Mon, Aug 12, 2024 at 12:52:25PM +0200, Jinpu Wang wrote: > > Hi, > > > > On Sun, Aug 11, 2024 at 10:41 AM Leon Romanovsky <leon@kernel.org> wrote: > > > > > > On Fri, Aug 09, 2024 at 03:15:27PM +0200, Md Haris Iqbal wrote: > > > > From: Jack Wang <jinpu.wang@ionos.com> > > > > > > > > In case of error happening during session stablishment, close_work is > > > > running. A new RDMA CM event may arrive since we don't destroy cm_id > > > > before destroying qp. To fix this, we first destroy cm_id after drain_qp, > > > > so no new RDMA CM event will arrive afterwards. > > > > > > > > Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality") > > > > Signed-off-by: Jack Wang <jinpu.wang@ionos.com> > > > > Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com> > > > > Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> > > > > --- > > > > drivers/infiniband/ulp/rtrs/rtrs-srv.c | 2 +- > > > > drivers/infiniband/ulp/rtrs/rtrs.c | 2 +- > > > > 2 files changed, 2 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c > > > > index fb67b58a7f62..90ea25ad6720 100644 > > > > --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c > > > > +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c > > > > @@ -1540,6 +1540,7 @@ static void rtrs_srv_close_work(struct work_struct *work) > > > > con = to_srv_con(srv_path->s.con[i]); > > > > rdma_disconnect(con->c.cm_id); > > > > ib_drain_qp(con->c.qp); > > > > + rdma_destroy_id(con->c.cm_id); > > > > } > > > > > > > > /* > > > > @@ -1564,7 +1565,6 @@ static void rtrs_srv_close_work(struct work_struct *work) > > > > continue; > > > > con = to_srv_con(srv_path->s.con[i]); > > > > rtrs_cq_qp_destroy(&con->c); > > > > - rdma_destroy_id(con->c.cm_id); > > > > kfree(con); > > > > } > > > > rtrs_ib_dev_put(srv_path->s.dev); > > > > diff --git a/drivers/infiniband/ulp/rtrs/rtrs.c b/drivers/infiniband/ulp/rtrs/rtrs.c > > > > index 4e17d546d4cc..44167fd1c958 100644 > > > > --- a/drivers/infiniband/ulp/rtrs/rtrs.c > > > > +++ b/drivers/infiniband/ulp/rtrs/rtrs.c > > > > @@ -318,7 +318,7 @@ EXPORT_SYMBOL_GPL(rtrs_cq_qp_create); > > > > void rtrs_cq_qp_destroy(struct rtrs_con *con) > > > > { > > > > if (con->qp) { > > > > - rdma_destroy_qp(con->cm_id); > > > > + ib_destroy_qp(con->qp); > > > > > > You created that QP with rdma_create_qp() and you should destroy it with rdma_destroy_qp(). > > We can't do it, as we move rdma_destroy_id before rtrs_cq_qp_destroy, > > if we still call rdma_destroy_qp, which will lead to use after free as > > cm_id could already be free-ed. > > It is a hint that you are doing something wrong. will drop it for now, will see if there is other way to fix it. > > Thanks Thx > > > > > > > > > Thanks > > Thx! > > > > > > > con->qp = NULL; > > > > } > > > > destroy_cq(con); > > > > -- > > > > 2.25.1 > > > > > >
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c index fb67b58a7f62..90ea25ad6720 100644 --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c @@ -1540,6 +1540,7 @@ static void rtrs_srv_close_work(struct work_struct *work) con = to_srv_con(srv_path->s.con[i]); rdma_disconnect(con->c.cm_id); ib_drain_qp(con->c.qp); + rdma_destroy_id(con->c.cm_id); } /* @@ -1564,7 +1565,6 @@ static void rtrs_srv_close_work(struct work_struct *work) continue; con = to_srv_con(srv_path->s.con[i]); rtrs_cq_qp_destroy(&con->c); - rdma_destroy_id(con->c.cm_id); kfree(con); } rtrs_ib_dev_put(srv_path->s.dev); diff --git a/drivers/infiniband/ulp/rtrs/rtrs.c b/drivers/infiniband/ulp/rtrs/rtrs.c index 4e17d546d4cc..44167fd1c958 100644 --- a/drivers/infiniband/ulp/rtrs/rtrs.c +++ b/drivers/infiniband/ulp/rtrs/rtrs.c @@ -318,7 +318,7 @@ EXPORT_SYMBOL_GPL(rtrs_cq_qp_create); void rtrs_cq_qp_destroy(struct rtrs_con *con) { if (con->qp) { - rdma_destroy_qp(con->cm_id); + ib_destroy_qp(con->qp); con->qp = NULL; } destroy_cq(con);