Message ID | 1579845165-18002-3-git-send-email-devesh.sharma@broadcom.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Refactor control path of bnxt_re driver | expand |
On Fri, Jan 24, 2020 at 12:52:40AM -0500, Devesh Sharma wrote: > static void bnxt_re_destroy_chip_ctx(struct bnxt_re_dev *rdev) > { > + struct bnxt_qplib_chip_ctx *chip_ctx; > + > + if (!rdev->chip_ctx) > + return; > + chip_ctx = rdev->chip_ctx; > + rdev->chip_ctx = NULL; > rdev->rcfw.res = NULL; > rdev->qplib_res.cctx = NULL; > + kfree(chip_ctx); > } Are you sure this kfree is late enough? I couldn't deduce if it was really safe to NULL chip_ctx here. IMHO you should free it as part of the ib device dealloc. Jason
On Sat, Jan 25, 2020 at 11:33 PM Jason Gunthorpe <jgg@mellanox.com> wrote: > > On Fri, Jan 24, 2020 at 12:52:40AM -0500, Devesh Sharma wrote: > > static void bnxt_re_destroy_chip_ctx(struct bnxt_re_dev *rdev) > > { > > + struct bnxt_qplib_chip_ctx *chip_ctx; > > + > > + if (!rdev->chip_ctx) > > + return; > > + chip_ctx = rdev->chip_ctx; > > + rdev->chip_ctx = NULL; > > rdev->rcfw.res = NULL; > > rdev->qplib_res.cctx = NULL; > > + kfree(chip_ctx); > > } > > Are you sure this kfree is late enough? I couldn't deduce if it was > really safe to NULL chip_ctx here. With the current design its okay to free this here because bnxt_re_destroy_chip_ctx is indeed the last deallocation performed before ib_device_dealloc() in any exit path. Further, the call to bnxt_re_destroy_chip_ctx is protected by rtnl. following is the exit sequence anyewere in the driver control path bnxt_re_ib_unreg(rdev); --->> the last deallocation in this func is destroy_chip_ctx(). bnxt_re_remove_one(rdev); -->> this is a single line function just to put pci device reference bnxt_re_dev_unreg(rdev); -->> the first deallocation in this func is ib_device_dealloc(). > > IMHO you should free it as part of the ib device dealloc. > > Jason
On Mon, Jan 27, 2020 at 01:09:46PM +0530, Devesh Sharma wrote: > On Sat, Jan 25, 2020 at 11:33 PM Jason Gunthorpe <jgg@mellanox.com> wrote: > > > > On Fri, Jan 24, 2020 at 12:52:40AM -0500, Devesh Sharma wrote: > > > static void bnxt_re_destroy_chip_ctx(struct bnxt_re_dev *rdev) > > > { > > > + struct bnxt_qplib_chip_ctx *chip_ctx; > > > + > > > + if (!rdev->chip_ctx) > > > + return; > > > + chip_ctx = rdev->chip_ctx; > > > + rdev->chip_ctx = NULL; > > > rdev->rcfw.res = NULL; > > > rdev->qplib_res.cctx = NULL; > > > + kfree(chip_ctx); > > > } > > > > Are you sure this kfree is late enough? I couldn't deduce if it was > > really safe to NULL chip_ctx here. > With the current design its okay to free this here because > bnxt_re_destroy_chip_ctx is indeed the last deallocation performed > before ib_device_dealloc() in any exit path. Further, the call to > bnxt_re_destroy_chip_ctx is protected by rtnl. ??? Why does device destroy path need global lock to already existing protection from ib_core? Thanks
On Mon, Jan 27, 2020 at 1:34 PM Leon Romanovsky <leon@kernel.org> wrote: > > On Mon, Jan 27, 2020 at 01:09:46PM +0530, Devesh Sharma wrote: > > On Sat, Jan 25, 2020 at 11:33 PM Jason Gunthorpe <jgg@mellanox.com> wrote: > > > > > > On Fri, Jan 24, 2020 at 12:52:40AM -0500, Devesh Sharma wrote: > > > > static void bnxt_re_destroy_chip_ctx(struct bnxt_re_dev *rdev) > > > > { > > > > + struct bnxt_qplib_chip_ctx *chip_ctx; > > > > + > > > > + if (!rdev->chip_ctx) > > > > + return; > > > > + chip_ctx = rdev->chip_ctx; > > > > + rdev->chip_ctx = NULL; > > > > rdev->rcfw.res = NULL; > > > > rdev->qplib_res.cctx = NULL; > > > > + kfree(chip_ctx); > > > > } > > > > > > Are you sure this kfree is late enough? I couldn't deduce if it was > > > really safe to NULL chip_ctx here. > > With the current design its okay to free this here because > > bnxt_re_destroy_chip_ctx is indeed the last deallocation performed > > before ib_device_dealloc() in any exit path. Further, the call to > > bnxt_re_destroy_chip_ctx is protected by rtnl. > > ??? Why does device destroy path need global lock to already existing > protection from ib_core? Oh! probably I confused you, What I meant to say was exit path is executed under netdev-notifier context which already holds the lock. RoCE driver is not taking that lock explicitly. > > Thanks
On Mon, Jan 27, 2020 at 01:09:46PM +0530, Devesh Sharma wrote: > On Sat, Jan 25, 2020 at 11:33 PM Jason Gunthorpe <jgg@mellanox.com> wrote: > > > > On Fri, Jan 24, 2020 at 12:52:40AM -0500, Devesh Sharma wrote: > > > static void bnxt_re_destroy_chip_ctx(struct bnxt_re_dev *rdev) > > > { > > > + struct bnxt_qplib_chip_ctx *chip_ctx; > > > + > > > + if (!rdev->chip_ctx) > > > + return; > > > + chip_ctx = rdev->chip_ctx; > > > + rdev->chip_ctx = NULL; > > > rdev->rcfw.res = NULL; > > > rdev->qplib_res.cctx = NULL; > > > + kfree(chip_ctx); > > > } > > > > Are you sure this kfree is late enough? I couldn't deduce if it was > > really safe to NULL chip_ctx here. > With the current design its okay to free this here because > bnxt_re_destroy_chip_ctx is indeed the last deallocation performed > before ib_device_dealloc() in any exit path. Further, the call to > bnxt_re_destroy_chip_ctx is protected by rtnl. > following is the exit sequence anyewere in the driver control path > bnxt_re_ib_unreg(rdev); --->> the last deallocation in this func is > destroy_chip_ctx(). > bnxt_re_remove_one(rdev); -->> this is a single line function just to > put pci device reference > bnxt_re_dev_unreg(rdev); -->> the first deallocation in this func is > ib_device_dealloc(). It makes more sense to me to put all the memory deallocation together in one place, then there is no concern about ordering. We now have the dealloc_driver callback for this purpose. It is not 'last deallocation' that matters, but what all the other stuff is doing between destroy_chip_ctx() and ib_device_dealloc() Jason
On Wed, Jan 29, 2020 at 1:45 AM Jason Gunthorpe <jgg@mellanox.com> wrote: > > On Mon, Jan 27, 2020 at 01:09:46PM +0530, Devesh Sharma wrote: > > On Sat, Jan 25, 2020 at 11:33 PM Jason Gunthorpe <jgg@mellanox.com> wrote: > > > > > > On Fri, Jan 24, 2020 at 12:52:40AM -0500, Devesh Sharma wrote: > > > > static void bnxt_re_destroy_chip_ctx(struct bnxt_re_dev *rdev) > > > > { > > > > + struct bnxt_qplib_chip_ctx *chip_ctx; > > > > + > > > > + if (!rdev->chip_ctx) > > > > + return; > > > > + chip_ctx = rdev->chip_ctx; > > > > + rdev->chip_ctx = NULL; > > > > rdev->rcfw.res = NULL; > > > > rdev->qplib_res.cctx = NULL; > > > > + kfree(chip_ctx); > > > > } > > > > > > Are you sure this kfree is late enough? I couldn't deduce if it was > > > really safe to NULL chip_ctx here. > > With the current design its okay to free this here because > > bnxt_re_destroy_chip_ctx is indeed the last deallocation performed > > before ib_device_dealloc() in any exit path. Further, the call to > > bnxt_re_destroy_chip_ctx is protected by rtnl. > > following is the exit sequence anyewere in the driver control path > > bnxt_re_ib_unreg(rdev); --->> the last deallocation in this func is > > destroy_chip_ctx(). > > bnxt_re_remove_one(rdev); -->> this is a single line function just to > > put pci device reference > > bnxt_re_dev_unreg(rdev); -->> the first deallocation in this func is > > ib_device_dealloc(). > > It makes more sense to me to put all the memory deallocation together > in one place, then there is no concern about ordering. > > We now have the dealloc_driver callback for this purpose. > > It is not 'last deallocation' that matters, but what all the other > stuff is doing between destroy_chip_ctx() and ib_device_dealloc() As far as this series is concerned, driver is saving crashes however in a nasty way. I would like to move forward with what I have in this patch and submit a new patch series which would implement your suggestion. > > Jason
diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h index c2805384..86274f4 100644 --- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h +++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h @@ -133,7 +133,7 @@ struct bnxt_re_dev { #define BNXT_RE_FLAG_ISSUE_ROCE_STATS 29 struct net_device *netdev; unsigned int version, major, minor; - struct bnxt_qplib_chip_ctx chip_ctx; + struct bnxt_qplib_chip_ctx *chip_ctx; struct bnxt_en_dev *en_dev; struct bnxt_msix_entry msix_entries[BNXT_RE_MAX_MSIX]; int num_msix; diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c index 127eb8f..865728c 100644 --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c @@ -865,7 +865,7 @@ static int bnxt_re_init_user_qp(struct bnxt_re_dev *rdev, struct bnxt_re_pd *pd, bytes = (qplib_qp->sq.max_wqe * BNXT_QPLIB_MAX_SQE_ENTRY_SIZE); /* Consider mapping PSN search memory only for RC QPs. */ if (qplib_qp->type == CMDQ_CREATE_QP_TYPE_RC) { - psn_sz = bnxt_qplib_is_chip_gen_p5(&rdev->chip_ctx) ? + psn_sz = bnxt_qplib_is_chip_gen_p5(rdev->chip_ctx) ? sizeof(struct sq_psn_search_ext) : sizeof(struct sq_psn_search); bytes += (qplib_qp->sq.max_wqe * psn_sz); @@ -1065,6 +1065,7 @@ static void bnxt_re_adjust_gsi_rq_attr(struct bnxt_re_qp *qp) qplqp->rq.max_sge = dev_attr->max_qp_sges; if (qplqp->rq.max_sge > dev_attr->max_qp_sges) qplqp->rq.max_sge = dev_attr->max_qp_sges; + qplqp->rq.max_sge = 6; } static void bnxt_re_init_sq_attr(struct bnxt_re_qp *qp, @@ -1128,7 +1129,7 @@ static int bnxt_re_init_qp_type(struct bnxt_re_dev *rdev, struct bnxt_qplib_chip_ctx *chip_ctx; int qptype; - chip_ctx = &rdev->chip_ctx; + chip_ctx = rdev->chip_ctx; qptype = __from_ib_qp_type(init_attr->qp_type); if (qptype == IB_QPT_MAX) { @@ -1348,7 +1349,7 @@ struct ib_qp *bnxt_re_create_qp(struct ib_pd *ib_pd, goto fail; if (qp_init_attr->qp_type == IB_QPT_GSI && - !(bnxt_qplib_is_chip_gen_p5(&rdev->chip_ctx))) { + !(bnxt_qplib_is_chip_gen_p5(rdev->chip_ctx))) { rc = bnxt_re_create_gsi_qp(qp, pd, qp_init_attr); if (rc == -ENODEV) goto qp_destroy; @@ -3821,10 +3822,10 @@ int bnxt_re_alloc_ucontext(struct ib_ucontext *ctx, struct ib_udata *udata) spin_lock_init(&uctx->sh_lock); resp.comp_mask = BNXT_RE_UCNTX_CMASK_HAVE_CCTX; - chip_met_rev_num = rdev->chip_ctx.chip_num; - chip_met_rev_num |= ((u32)rdev->chip_ctx.chip_rev & 0xFF) << + chip_met_rev_num = rdev->chip_ctx->chip_num; + chip_met_rev_num |= ((u32)rdev->chip_ctx->chip_rev & 0xFF) << BNXT_RE_CHIP_ID0_CHIP_REV_SFT; - chip_met_rev_num |= ((u32)rdev->chip_ctx.chip_metal & 0xFF) << + chip_met_rev_num |= ((u32)rdev->chip_ctx->chip_metal & 0xFF) << BNXT_RE_CHIP_ID0_CHIP_MET_SFT; resp.chip_id0 = chip_met_rev_num; /* Future extension of chip info */ diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c index 82a5350..390daeb 100644 --- a/drivers/infiniband/hw/bnxt_re/main.c +++ b/drivers/infiniband/hw/bnxt_re/main.c @@ -82,22 +82,35 @@ static void bnxt_re_destroy_chip_ctx(struct bnxt_re_dev *rdev) { + struct bnxt_qplib_chip_ctx *chip_ctx; + + if (!rdev->chip_ctx) + return; + chip_ctx = rdev->chip_ctx; + rdev->chip_ctx = NULL; rdev->rcfw.res = NULL; rdev->qplib_res.cctx = NULL; + kfree(chip_ctx); } static int bnxt_re_setup_chip_ctx(struct bnxt_re_dev *rdev) { + struct bnxt_qplib_chip_ctx *chip_ctx; struct bnxt_en_dev *en_dev; struct bnxt *bp; en_dev = rdev->en_dev; bp = netdev_priv(en_dev->net); - rdev->chip_ctx.chip_num = bp->chip_num; + chip_ctx = kzalloc(sizeof(*chip_ctx), GFP_KERNEL); + if (!chip_ctx) + return -ENOMEM; + chip_ctx->chip_num = bp->chip_num; + + rdev->chip_ctx = chip_ctx; /* rest members to follow eventually */ - rdev->qplib_res.cctx = &rdev->chip_ctx; + rdev->qplib_res.cctx = rdev->chip_ctx; rdev->rcfw.res = &rdev->qplib_res; return 0; @@ -136,7 +149,7 @@ static void bnxt_re_limit_pf_res(struct bnxt_re_dev *rdev) ctx->srqc_count = min_t(u32, BNXT_RE_MAX_SRQC_COUNT, attr->max_srq); ctx->cq_count = min_t(u32, BNXT_RE_MAX_CQ_COUNT, attr->max_cq); - if (!bnxt_qplib_is_chip_gen_p5(&rdev->chip_ctx)) + if (!bnxt_qplib_is_chip_gen_p5(rdev->chip_ctx)) for (i = 0; i < MAX_TQM_ALLOC_REQ; i++) rdev->qplib_ctx.tqm_count[i] = rdev->dev_attr.tqm_alloc_reqs[i]; @@ -185,7 +198,7 @@ static void bnxt_re_set_resource_limits(struct bnxt_re_dev *rdev) memset(&rdev->qplib_ctx.vf_res, 0, sizeof(struct bnxt_qplib_vf_res)); bnxt_re_limit_pf_res(rdev); - num_vfs = bnxt_qplib_is_chip_gen_p5(&rdev->chip_ctx) ? + num_vfs = bnxt_qplib_is_chip_gen_p5(rdev->chip_ctx) ? BNXT_RE_GEN_P5_MAX_VF : rdev->num_vfs; if (num_vfs) bnxt_re_limit_vf_res(&rdev->qplib_ctx, num_vfs); @@ -208,7 +221,7 @@ static void bnxt_re_sriov_config(void *p, int num_vfs) return; rdev->num_vfs = num_vfs; - if (!bnxt_qplib_is_chip_gen_p5(&rdev->chip_ctx)) { + if (!bnxt_qplib_is_chip_gen_p5(rdev->chip_ctx)) { bnxt_re_set_resource_limits(rdev); bnxt_qplib_set_func_resources(&rdev->qplib_res, &rdev->rcfw, &rdev->qplib_ctx); @@ -916,7 +929,7 @@ static int bnxt_re_cqn_handler(struct bnxt_qplib_nq *nq, #define BNXT_RE_GEN_P5_VF_NQ_DB 0x4000 static u32 bnxt_re_get_nqdb_offset(struct bnxt_re_dev *rdev, u16 indx) { - return bnxt_qplib_is_chip_gen_p5(&rdev->chip_ctx) ? + return bnxt_qplib_is_chip_gen_p5(rdev->chip_ctx) ? (rdev->is_virtfn ? BNXT_RE_GEN_P5_VF_NQ_DB : BNXT_RE_GEN_P5_PF_NQ_DB) : rdev->msix_entries[indx].db_offset; @@ -967,7 +980,7 @@ static void bnxt_re_free_nq_res(struct bnxt_re_dev *rdev) int i; for (i = 0; i < rdev->num_msix - 1; i++) { - type = bnxt_qplib_get_ring_type(&rdev->chip_ctx); + type = bnxt_qplib_get_ring_type(rdev->chip_ctx); bnxt_re_net_ring_free(rdev, rdev->nq[i].ring_id, type); rdev->nq[i].res = NULL; bnxt_qplib_free_nq(&rdev->nq[i]); @@ -1025,7 +1038,7 @@ static int bnxt_re_alloc_res(struct bnxt_re_dev *rdev) i, rc); goto free_nq; } - type = bnxt_qplib_get_ring_type(&rdev->chip_ctx); + type = bnxt_qplib_get_ring_type(rdev->chip_ctx); pg_map = rdev->nq[i].hwq.pbl[PBL_LVL_0].pg_map_arr; pages = rdev->nq[i].hwq.pbl[rdev->nq[i].hwq.level].pg_count; rc = bnxt_re_net_ring_alloc(rdev, pg_map, pages, type, @@ -1044,7 +1057,7 @@ static int bnxt_re_alloc_res(struct bnxt_re_dev *rdev) return 0; free_nq: for (i = num_vec_created; i >= 0; i--) { - type = bnxt_qplib_get_ring_type(&rdev->chip_ctx); + type = bnxt_qplib_get_ring_type(rdev->chip_ctx); bnxt_re_net_ring_free(rdev, rdev->nq[i].ring_id, type); bnxt_qplib_free_nq(&rdev->nq[i]); } @@ -1324,7 +1337,7 @@ static void bnxt_re_ib_unreg(struct bnxt_re_dev *rdev) bnxt_re_net_stats_ctx_free(rdev, rdev->qplib_ctx.stats.fw_id); bnxt_qplib_free_ctx(rdev->en_dev->pdev, &rdev->qplib_ctx); bnxt_qplib_disable_rcfw_channel(&rdev->rcfw); - type = bnxt_qplib_get_ring_type(&rdev->chip_ctx); + type = bnxt_qplib_get_ring_type(rdev->chip_ctx); bnxt_re_net_ring_free(rdev, rdev->rcfw.creq_ring_id, type); bnxt_qplib_free_rcfw_channel(&rdev->rcfw); } @@ -1405,7 +1418,8 @@ static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev) pr_err("Failed to allocate RCFW Channel: %#x\n", rc); goto fail; } - type = bnxt_qplib_get_ring_type(&rdev->chip_ctx); + + type = bnxt_qplib_get_ring_type(rdev->chip_ctx); pg_map = rdev->rcfw.creq.pbl[PBL_LVL_0].pg_map_arr; pages = rdev->rcfw.creq.pbl[rdev->rcfw.creq.level].pg_count; ridx = rdev->msix_entries[BNXT_RE_AEQ_IDX].ring_idx; @@ -1434,7 +1448,7 @@ static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev) bnxt_re_set_resource_limits(rdev); rc = bnxt_qplib_alloc_ctx(rdev->en_dev->pdev, &rdev->qplib_ctx, 0, - bnxt_qplib_is_chip_gen_p5(&rdev->chip_ctx)); + bnxt_qplib_is_chip_gen_p5(rdev->chip_ctx)); if (rc) { pr_err("Failed to allocate QPLIB context: %#x\n", rc); goto disable_rcfw; @@ -1504,7 +1518,7 @@ static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev) disable_rcfw: bnxt_qplib_disable_rcfw_channel(&rdev->rcfw); free_ring: - type = bnxt_qplib_get_ring_type(&rdev->chip_ctx); + type = bnxt_qplib_get_ring_type(rdev->chip_ctx); bnxt_re_net_ring_free(rdev, rdev->rcfw.creq_ring_id, type); free_rcfw: bnxt_qplib_free_rcfw_channel(&rdev->rcfw); diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c index 958c1ff..bc9f747 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c @@ -2426,7 +2426,7 @@ static int bnxt_qplib_cq_process_res_ud(struct bnxt_qplib_cq *cq, } cqe = *pcqe; cqe->opcode = hwcqe->cqe_type_toggle & CQ_BASE_CQE_TYPE_MASK; - cqe->length = (u32)le16_to_cpu(hwcqe->length); + cqe->length = le16_to_cpu(hwcqe->length) & CQ_RES_UD_LENGTH_MASK; cqe->cfa_meta = le16_to_cpu(hwcqe->cfa_metadata); cqe->invrkey = le32_to_cpu(hwcqe->imm_data); cqe->flags = le16_to_cpu(hwcqe->flags);