From patchwork Tue Jun 6 05:50:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cheng Xu X-Patchwork-Id: 13268258 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5C42AC77B73 for ; Tue, 6 Jun 2023 05:50:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233891AbjFFFua (ORCPT ); Tue, 6 Jun 2023 01:50:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50760 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234050AbjFFFuT (ORCPT ); Tue, 6 Jun 2023 01:50:19 -0400 Received: from out30-124.freemail.mail.aliyun.com (out30-124.freemail.mail.aliyun.com [115.124.30.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D6617E6A for ; Mon, 5 Jun 2023 22:50:12 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R701e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046049;MF=chengyou@linux.alibaba.com;NM=1;PH=DS;RN=4;SR=0;TI=SMTPD_---0VkV2uPa_1686030608; Received: from localhost(mailfrom:chengyou@linux.alibaba.com fp:SMTPD_---0VkV2uPa_1686030608) by smtp.aliyun-inc.com; Tue, 06 Jun 2023 13:50:09 +0800 From: Cheng Xu To: jgg@ziepe.ca, leon@kernel.org Cc: linux-rdma@vger.kernel.org, KaiShen@linux.alibaba.com Subject: [PATCH for-next 1/4] RDMA/erdma: Configure PAGE_SIZE to hardware Date: Tue, 6 Jun 2023 13:50:02 +0800 Message-Id: <20230606055005.80729-2-chengyou@linux.alibaba.com> X-Mailer: git-send-email 2.37.0 In-Reply-To: <20230606055005.80729-1-chengyou@linux.alibaba.com> References: <20230606055005.80729-1-chengyou@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org Add a new CMDQ message to configure hardware. Initially, the page size (expressed as a shift) is passed to the hardware so that it can organize its MMIO space properly. The message is issued only if the hardware supports it.
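To make the bit layout concrete, here is a minimal stand-alone C sketch of how the cfg word carried by this command is packed: PAGE_SHIFT goes into the low five bits and bit 31 enables the setting, matching the ERDMA_CMD_CONFIG_DEVICE_*_MASK definitions added in the patch below. The field_prep() helper is a simplified stand-in for the kernel's FIELD_PREP macro and the example assumes 4 KiB host pages; it is an illustration, not the driver's code.

#include <stdint.h>
#include <stdio.h>

/* Masks mirroring ERDMA_CMD_CONFIG_DEVICE_*_MASK from the patch below. */
#define CONFIG_DEVICE_PS_EN_MASK   0x80000000u /* BIT(31) */
#define CONFIG_DEVICE_PGSHIFT_MASK 0x0000001fu /* GENMASK(4, 0) */

/* Simplified FIELD_PREP: shift the value into the mask's bit position. */
static uint32_t field_prep(uint32_t mask, uint32_t val)
{
	return (val << __builtin_ctz(mask)) & mask;
}

int main(void)
{
	uint32_t page_shift = 12; /* assume 4 KiB pages, i.e. PAGE_SHIFT == 12 */
	uint32_t cfg;

	cfg = field_prep(CONFIG_DEVICE_PGSHIFT_MASK, page_shift) |
	      field_prep(CONFIG_DEVICE_PS_EN_MASK, 1);

	printf("cfg = 0x%08x\n", cfg); /* prints 0x8000000c */
	return 0;
}

Telling the device the host page size up front matters because the doorbell regions introduced later in this series are handed out in host-page-sized units, so the hardware needs to know the stride in order to lay out its MMIO space.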
Signed-off-by: Cheng Xu --- drivers/infiniband/hw/erdma/erdma_hw.h | 12 ++++++++++++ drivers/infiniband/hw/erdma/erdma_main.c | 20 ++++++++++++++++++++ 2 files changed, 32 insertions(+) diff --git a/drivers/infiniband/hw/erdma/erdma_hw.h b/drivers/infiniband/hw/erdma/erdma_hw.h index 76ce2856be28..670796c22bcc 100644 --- a/drivers/infiniband/hw/erdma/erdma_hw.h +++ b/drivers/infiniband/hw/erdma/erdma_hw.h @@ -159,6 +159,7 @@ enum CMDQ_COMMON_OPCODE { CMDQ_OPCODE_DESTROY_EQ = 1, CMDQ_OPCODE_QUERY_FW_INFO = 2, CMDQ_OPCODE_CONF_MTU = 3, + CMDQ_OPCODE_CONF_DEVICE = 5, }; /* cmdq-SQE HDR */ @@ -196,6 +197,16 @@ struct erdma_cmdq_destroy_eq_req { u8 qtype; }; +/* config device cfg */ +#define ERDMA_CMD_CONFIG_DEVICE_PS_EN_MASK BIT(31) +#define ERDMA_CMD_CONFIG_DEVICE_PGSHIFT_MASK GENMASK(4, 0) + +struct erdma_cmdq_config_device_req { + u64 hdr; + u32 cfg; + u32 rsvd[5]; +}; + struct erdma_cmdq_config_mtu_req { u64 hdr; u32 mtu; @@ -329,6 +340,7 @@ struct erdma_cmdq_reflush_req { enum { ERDMA_DEV_CAP_FLAGS_ATOMIC = 1 << 7, + ERDMA_DEV_CAP_FLAGS_EXTEND_DB = 1 << 3, }; #define ERDMA_CMD_INFO0_FW_VER_MASK GENMASK_ULL(31, 0) diff --git a/drivers/infiniband/hw/erdma/erdma_main.c b/drivers/infiniband/hw/erdma/erdma_main.c index 7c74abeee864..525edea987b2 100644 --- a/drivers/infiniband/hw/erdma/erdma_main.c +++ b/drivers/infiniband/hw/erdma/erdma_main.c @@ -426,6 +426,22 @@ static int erdma_dev_attrs_init(struct erdma_dev *dev) return err; } +static int erdma_device_config(struct erdma_dev *dev) +{ + struct erdma_cmdq_config_device_req req = {}; + + if (!(dev->attrs.cap_flags & ERDMA_DEV_CAP_FLAGS_EXTEND_DB)) + return 0; + + erdma_cmdq_build_reqhdr(&req.hdr, CMDQ_SUBMOD_COMMON, + CMDQ_OPCODE_CONF_DEVICE); + + req.cfg = FIELD_PREP(ERDMA_CMD_CONFIG_DEVICE_PGSHIFT_MASK, PAGE_SHIFT) | + FIELD_PREP(ERDMA_CMD_CONFIG_DEVICE_PS_EN_MASK, 1); + + return erdma_post_cmd_wait(&dev->cmdq, &req, sizeof(req), NULL, NULL); +} + static int erdma_res_cb_init(struct erdma_dev *dev) { int i, j; @@ -512,6 +528,10 @@ static int erdma_ib_device_add(struct pci_dev *pdev) if (ret) return ret; + ret = erdma_device_config(dev); + if (ret) + return ret; + ibdev->node_type = RDMA_NODE_RNIC; memcpy(ibdev->node_desc, ERDMA_NODE_DESC, sizeof(ERDMA_NODE_DESC)); From patchwork Tue Jun 6 05:50:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cheng Xu X-Patchwork-Id: 13268260 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 68C5CC77B73 for ; Tue, 6 Jun 2023 05:50:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233695AbjFFFud (ORCPT ); Tue, 6 Jun 2023 01:50:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50784 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234238AbjFFFuV (ORCPT ); Tue, 6 Jun 2023 01:50:21 -0400 Received: from out30-130.freemail.mail.aliyun.com (out30-130.freemail.mail.aliyun.com [115.124.30.130]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CB7A0E71 for ; Mon, 5 Jun 2023 22:50:14 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R831e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045176;MF=chengyou@linux.alibaba.com;NM=1;PH=DS;RN=4;SR=0;TI=SMTPD_---0VkV2uQy_1686030609; Received: from localhost(mailfrom:chengyou@linux.alibaba.com 
fp:SMTPD_---0VkV2uQy_1686030609) by smtp.aliyun-inc.com; Tue, 06 Jun 2023 13:50:10 +0800 From: Cheng Xu To: jgg@ziepe.ca, leon@kernel.org Cc: linux-rdma@vger.kernel.org, KaiShen@linux.alibaba.com Subject: [PATCH for-next 2/4] RDMA/erdma: Allocate doorbell resources from hardware Date: Tue, 6 Jun 2023 13:50:03 +0800 Message-Id: <20230606055005.80729-3-chengyou@linux.alibaba.com> X-Mailer: git-send-email 2.37.0 In-Reply-To: <20230606055005.80729-1-chengyou@linux.alibaba.com> References: <20230606055005.80729-1-chengyou@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org Each ucontext will try to allocate doorbell resources in the extended bar space from hardware. For compatibility, we change nothing for the original bar space, and it will be used only for applications with CAP_SYS_RAWIO authority in the older HW/FW environments. Signed-off-by: Cheng Xu --- drivers/infiniband/hw/erdma/erdma.h | 2 + drivers/infiniband/hw/erdma/erdma_hw.h | 22 ++++ drivers/infiniband/hw/erdma/erdma_verbs.c | 117 ++++++++++++++++++---- drivers/infiniband/hw/erdma/erdma_verbs.h | 9 ++ 4 files changed, 131 insertions(+), 19 deletions(-) diff --git a/drivers/infiniband/hw/erdma/erdma.h b/drivers/infiniband/hw/erdma/erdma.h index e819e4032490..a361d4bcd714 100644 --- a/drivers/infiniband/hw/erdma/erdma.h +++ b/drivers/infiniband/hw/erdma/erdma.h @@ -268,6 +268,8 @@ static inline u32 erdma_reg_read32_filed(struct erdma_dev *dev, u32 reg, return FIELD_GET(filed_mask, val); } +#define ERDMA_GET(val, name) FIELD_GET(ERDMA_CMD_##name##_MASK, val) + int erdma_cmdq_init(struct erdma_dev *dev); void erdma_finish_cmdq_init(struct erdma_dev *dev); void erdma_cmdq_destroy(struct erdma_dev *dev); diff --git a/drivers/infiniband/hw/erdma/erdma_hw.h b/drivers/infiniband/hw/erdma/erdma_hw.h index 670796c22bcc..812fc40de64b 100644 --- a/drivers/infiniband/hw/erdma/erdma_hw.h +++ b/drivers/infiniband/hw/erdma/erdma_hw.h @@ -160,6 +160,8 @@ enum CMDQ_COMMON_OPCODE { CMDQ_OPCODE_QUERY_FW_INFO = 2, CMDQ_OPCODE_CONF_MTU = 3, CMDQ_OPCODE_CONF_DEVICE = 5, + CMDQ_OPCODE_ALLOC_DB = 8, + CMDQ_OPCODE_FREE_DB = 9, }; /* cmdq-SQE HDR */ @@ -212,6 +214,26 @@ struct erdma_cmdq_config_mtu_req { u32 mtu; }; +/* ext db requests(alloc and free) cfg */ +#define ERDMA_CMD_EXT_DB_CQ_EN_MASK BIT(2) +#define ERDMA_CMD_EXT_DB_RQ_EN_MASK BIT(1) +#define ERDMA_CMD_EXT_DB_SQ_EN_MASK BIT(0) + +struct erdma_cmdq_ext_db_req { + u64 hdr; + u32 cfg; + u16 rdb_off; + u16 sdb_off; + u16 rsvd0; + u16 cdb_off; + u32 rsvd1[3]; +}; + +/* alloc db response qword 0 definition */ +#define ERDMA_CMD_ALLOC_DB_RESP_RDB_MASK GENMASK_ULL(63, 48) +#define ERDMA_CMD_ALLOC_DB_RESP_CDB_MASK GENMASK_ULL(47, 32) +#define ERDMA_CMD_ALLOC_DB_RESP_SDB_MASK GENMASK_ULL(15, 0) + /* create_cq cfg0 */ #define ERDMA_CMD_CREATE_CQ_DEPTH_MASK GENMASK(31, 24) #define ERDMA_CMD_CREATE_CQ_PAGESIZE_MASK GENMASK(23, 20) diff --git a/drivers/infiniband/hw/erdma/erdma_verbs.c b/drivers/infiniband/hw/erdma/erdma_verbs.c index 83e1b0d55977..376f70219ecd 100644 --- a/drivers/infiniband/hw/erdma/erdma_verbs.c +++ b/drivers/infiniband/hw/erdma/erdma_verbs.c @@ -1188,6 +1188,60 @@ static void alloc_db_resources(struct erdma_dev *dev, ctx->sdb = dev->func_bar_addr + (ctx->sdb_page_idx << PAGE_SHIFT); } +static int alloc_ext_db_resources(struct erdma_dev *dev, + struct erdma_ucontext *ctx) +{ + struct erdma_cmdq_ext_db_req req = {}; + u64 val0, val1; + int ret; + + erdma_cmdq_build_reqhdr(&req.hdr, CMDQ_SUBMOD_COMMON, + CMDQ_OPCODE_ALLOC_DB); + + req.cfg 
= FIELD_PREP(ERDMA_CMD_EXT_DB_CQ_EN_MASK, 1) | + FIELD_PREP(ERDMA_CMD_EXT_DB_RQ_EN_MASK, 1) | + FIELD_PREP(ERDMA_CMD_EXT_DB_SQ_EN_MASK, 1); + + ret = erdma_post_cmd_wait(&dev->cmdq, &req, sizeof(req), &val0, &val1); + if (ret) + return ret; + + ctx->ext_db.enable = true; + ctx->ext_db.sdb_off = ERDMA_GET(val0, ALLOC_DB_RESP_SDB); + ctx->ext_db.rdb_off = ERDMA_GET(val0, ALLOC_DB_RESP_RDB); + ctx->ext_db.cdb_off = ERDMA_GET(val0, ALLOC_DB_RESP_CDB); + + ctx->sdb_type = ERDMA_SDB_PAGE; + ctx->sdb = dev->func_bar_addr + (ctx->ext_db.sdb_off << PAGE_SHIFT); + ctx->cdb = dev->func_bar_addr + (ctx->ext_db.rdb_off << PAGE_SHIFT); + ctx->rdb = dev->func_bar_addr + (ctx->ext_db.cdb_off << PAGE_SHIFT); + + return 0; +} + +static void free_ext_db_resources(struct erdma_dev *dev, + struct erdma_ucontext *ctx) +{ + struct erdma_cmdq_ext_db_req req = {}; + int ret; + + erdma_cmdq_build_reqhdr(&req.hdr, CMDQ_SUBMOD_COMMON, + CMDQ_OPCODE_FREE_DB); + + req.cfg = FIELD_PREP(ERDMA_CMD_EXT_DB_CQ_EN_MASK, 1) | + FIELD_PREP(ERDMA_CMD_EXT_DB_RQ_EN_MASK, 1) | + FIELD_PREP(ERDMA_CMD_EXT_DB_SQ_EN_MASK, 1); + + req.sdb_off = ctx->ext_db.sdb_off; + req.rdb_off = ctx->ext_db.rdb_off; + req.cdb_off = ctx->ext_db.cdb_off; + + ret = erdma_post_cmd_wait(&dev->cmdq, &req, sizeof(req), NULL, NULL); + if (ret) + ibdev_err_ratelimited(&dev->ibdev, + "free db resources failed %d", ret); +} + static void erdma_uctx_user_mmap_entries_remove(struct erdma_ucontext *uctx) { rdma_user_mmap_entry_remove(uctx->sq_db_mmap_entry); @@ -1201,44 +1255,60 @@ int erdma_alloc_ucontext(struct ib_ucontext *ibctx, struct ib_udata *udata) struct erdma_dev *dev = to_edev(ibctx->device); int ret; struct erdma_uresp_alloc_ctx uresp = {}; + bool ext_db_en; if (atomic_inc_return(&dev->num_ctx) > ERDMA_MAX_CONTEXT) { ret = -ENOMEM; goto err_out; } + if (udata->outlen < sizeof(uresp)) { + ret = -EINVAL; + goto err_out; + } + INIT_LIST_HEAD(&ctx->dbrecords_page_list); mutex_init(&ctx->dbrecords_page_mutex); - alloc_db_resources(dev, ctx); - - ctx->rdb = dev->func_bar_addr + ERDMA_BAR_RQDB_SPACE_OFFSET; - ctx->cdb = dev->func_bar_addr + ERDMA_BAR_CQDB_SPACE_OFFSET; - - if (udata->outlen < sizeof(uresp)) { - ret = -EINVAL; + /* + * CAP_SYS_RAWIO is required if hardware does not support extend + * doorbell mechanism. 
+ */ + ext_db_en = !!(dev->attrs.cap_flags & ERDMA_DEV_CAP_FLAGS_EXTEND_DB); + if (!ext_db_en && !capable(CAP_SYS_RAWIO)) { + ret = -EPERM; goto err_out; } + if (ext_db_en) { + ret = alloc_ext_db_resources(dev, ctx); + if (ret) + goto err_out; + } else { + alloc_db_resources(dev, ctx); + ctx->rdb = dev->func_bar_addr + ERDMA_BAR_RQDB_SPACE_OFFSET; + ctx->cdb = dev->func_bar_addr + ERDMA_BAR_CQDB_SPACE_OFFSET; + } + ctx->sq_db_mmap_entry = erdma_user_mmap_entry_insert( ctx, (void *)ctx->sdb, PAGE_SIZE, ERDMA_MMAP_IO_NC, &uresp.sdb); if (!ctx->sq_db_mmap_entry) { ret = -ENOMEM; - goto err_out; + goto err_free_ext_db; } ctx->rq_db_mmap_entry = erdma_user_mmap_entry_insert( ctx, (void *)ctx->rdb, PAGE_SIZE, ERDMA_MMAP_IO_NC, &uresp.rdb); if (!ctx->rq_db_mmap_entry) { ret = -EINVAL; - goto err_out; + goto err_put_mmap_entries; } ctx->cq_db_mmap_entry = erdma_user_mmap_entry_insert( ctx, (void *)ctx->cdb, PAGE_SIZE, ERDMA_MMAP_IO_NC, &uresp.cdb); if (!ctx->cq_db_mmap_entry) { ret = -EINVAL; - goto err_out; + goto err_put_mmap_entries; } uresp.dev_id = dev->pdev->device; @@ -1247,12 +1317,18 @@ int erdma_alloc_ucontext(struct ib_ucontext *ibctx, struct ib_udata *udata) ret = ib_copy_to_udata(udata, &uresp, sizeof(uresp)); if (ret) - goto err_out; + goto err_put_mmap_entries; return 0; -err_out: +err_put_mmap_entries: erdma_uctx_user_mmap_entries_remove(ctx); + +err_free_ext_db: + if (ext_db_en) + free_ext_db_resources(dev, ctx); + +err_out: atomic_dec(&dev->num_ctx); return ret; } @@ -1262,15 +1338,18 @@ void erdma_dealloc_ucontext(struct ib_ucontext *ibctx) struct erdma_ucontext *ctx = to_ectx(ibctx); struct erdma_dev *dev = to_edev(ibctx->device); - spin_lock(&dev->db_bitmap_lock); - if (ctx->sdb_type == ERDMA_SDB_PAGE) - clear_bit(ctx->sdb_idx, dev->sdb_page); - else if (ctx->sdb_type == ERDMA_SDB_ENTRY) - clear_bit(ctx->sdb_idx, dev->sdb_entry); - erdma_uctx_user_mmap_entries_remove(ctx); - spin_unlock(&dev->db_bitmap_lock); + if (ctx->ext_db.enable) { + free_ext_db_resources(dev, ctx); + } else { + spin_lock(&dev->db_bitmap_lock); + if (ctx->sdb_type == ERDMA_SDB_PAGE) + clear_bit(ctx->sdb_idx, dev->sdb_page); + else if (ctx->sdb_type == ERDMA_SDB_ENTRY) + clear_bit(ctx->sdb_idx, dev->sdb_entry); + spin_unlock(&dev->db_bitmap_lock); + } atomic_dec(&dev->num_ctx); } diff --git a/drivers/infiniband/hw/erdma/erdma_verbs.h b/drivers/infiniband/hw/erdma/erdma_verbs.h index 131cf5f40982..252106679d36 100644 --- a/drivers/infiniband/hw/erdma/erdma_verbs.h +++ b/drivers/infiniband/hw/erdma/erdma_verbs.h @@ -31,9 +31,18 @@ struct erdma_user_mmap_entry { u8 mmap_flag; }; +struct erdma_ext_db_info { + bool enable; + u16 sdb_off; + u16 rdb_off; + u16 cdb_off; +}; + struct erdma_ucontext { struct ib_ucontext ibucontext; + struct erdma_ext_db_info ext_db; + u32 sdb_type; u32 sdb_idx; u32 sdb_page_idx; From patchwork Tue Jun 6 05:50:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cheng Xu X-Patchwork-Id: 13268259 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 535DCC77B7A for ; Tue, 6 Jun 2023 05:50:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231670AbjFFFub (ORCPT ); Tue, 6 Jun 2023 01:50:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50586 "EHLO lindbergh.monkeyblade.net" 
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234224AbjFFFuU (ORCPT ); Tue, 6 Jun 2023 01:50:20 -0400 Received: from out30-112.freemail.mail.aliyun.com (out30-112.freemail.mail.aliyun.com [115.124.30.112]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A901FE6F for ; Mon, 5 Jun 2023 22:50:14 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R301e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046059;MF=chengyou@linux.alibaba.com;NM=1;PH=DS;RN=4;SR=0;TI=SMTPD_---0VkV1mr6_1686030610; Received: from localhost(mailfrom:chengyou@linux.alibaba.com fp:SMTPD_---0VkV1mr6_1686030610) by smtp.aliyun-inc.com; Tue, 06 Jun 2023 13:50:11 +0800 From: Cheng Xu To: jgg@ziepe.ca, leon@kernel.org Cc: linux-rdma@vger.kernel.org, KaiShen@linux.alibaba.com Subject: [PATCH for-next 3/4] RDMA/erdma: Associate QPs/CQs with doorbells for authorization Date: Tue, 6 Jun 2023 13:50:04 +0800 Message-Id: <20230606055005.80729-4-chengyou@linux.alibaba.com> X-Mailer: git-send-email 2.37.0 In-Reply-To: <20230606055005.80729-1-chengyou@linux.alibaba.com> References: <20230606055005.80729-1-chengyou@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org For the isolation requirement, each QP/CQ can only issue doorbells from the allocated mmio space. Configure the relationship between QPs/CQs and mmio doorbell spaces to hardware in create_qp/create_cq interfaces. Signed-off-by: Cheng Xu --- drivers/infiniband/hw/erdma/erdma_hw.h | 17 ++++++++++++- drivers/infiniband/hw/erdma/erdma_verbs.c | 31 ++++++++++++++++++----- 2 files changed, 41 insertions(+), 7 deletions(-) diff --git a/drivers/infiniband/hw/erdma/erdma_hw.h b/drivers/infiniband/hw/erdma/erdma_hw.h index 812fc40de64b..cf7629bfe534 100644 --- a/drivers/infiniband/hw/erdma/erdma_hw.h +++ b/drivers/infiniband/hw/erdma/erdma_hw.h @@ -134,7 +134,7 @@ /* CMDQ related. */ #define ERDMA_CMDQ_MAX_OUTSTANDING 128 -#define ERDMA_CMDQ_SQE_SIZE 64 +#define ERDMA_CMDQ_SQE_SIZE 128 /* cmdq sub module definition. 
*/ enum CMDQ_WQE_SUB_MOD { @@ -242,8 +242,12 @@ struct erdma_cmdq_ext_db_req { /* create_cq cfg1 */ #define ERDMA_CMD_CREATE_CQ_MTT_CNT_MASK GENMASK(31, 16) #define ERDMA_CMD_CREATE_CQ_MTT_TYPE_MASK BIT(15) +#define ERDMA_CMD_CREATE_CQ_MTT_DB_CFG_MASK BIT(11) #define ERDMA_CMD_CREATE_CQ_EQN_MASK GENMASK(9, 0) +/* create_cq cfg2 */ +#define ERDMA_CMD_CREATE_CQ_DB_CFG_MASK GENMASK(15, 0) + struct erdma_cmdq_create_cq_req { u64 hdr; u32 cfg0; @@ -252,6 +256,7 @@ struct erdma_cmdq_create_cq_req { u32 cfg1; u64 cq_db_info_addr; u32 first_page_offset; + u32 cfg2; }; /* regmr/deregmr cfg0 */ @@ -311,6 +316,7 @@ struct erdma_cmdq_modify_qp_req { /* create qp cqn_mtt_cfg */ #define ERDMA_CMD_CREATE_QP_PAGE_SIZE_MASK GENMASK(31, 28) +#define ERDMA_CMD_CREATE_QP_DB_CFG_MASK BIT(25) #define ERDMA_CMD_CREATE_QP_CQN_MASK GENMASK(23, 0) /* create qp mtt_cfg */ @@ -318,6 +324,10 @@ struct erdma_cmdq_modify_qp_req { #define ERDMA_CMD_CREATE_QP_MTT_CNT_MASK GENMASK(11, 1) #define ERDMA_CMD_CREATE_QP_MTT_TYPE_MASK BIT(0) +/* create qp db cfg */ +#define ERDMA_CMD_CREATE_QP_SQDB_CFG_MASK GENMASK(31, 16) +#define ERDMA_CMD_CREATE_QP_RQDB_CFG_MASK GENMASK(15, 0) + #define ERDMA_CMDQ_CREATE_QP_RESP_COOKIE_MASK GENMASK_ULL(31, 0) struct erdma_cmdq_create_qp_req { @@ -332,6 +342,11 @@ struct erdma_cmdq_create_qp_req { u32 rq_mtt_cfg; u64 sq_db_info_dma_addr; u64 rq_db_info_dma_addr; + + u64 sq_mtt_entry[3]; + u64 rq_mtt_entry[3]; + + u32 db_cfg; }; struct erdma_cmdq_destroy_qp_req { diff --git a/drivers/infiniband/hw/erdma/erdma_verbs.c b/drivers/infiniband/hw/erdma/erdma_verbs.c index 376f70219ecd..ffc05ddc98ae 100644 --- a/drivers/infiniband/hw/erdma/erdma_verbs.c +++ b/drivers/infiniband/hw/erdma/erdma_verbs.c @@ -19,10 +19,11 @@ #include "erdma_cm.h" #include "erdma_verbs.h" -static int create_qp_cmd(struct erdma_dev *dev, struct erdma_qp *qp) +static int create_qp_cmd(struct erdma_ucontext *uctx, struct erdma_qp *qp) { - struct erdma_cmdq_create_qp_req req; + struct erdma_dev *dev = to_edev(qp->ibqp.device); struct erdma_pd *pd = to_epd(qp->ibqp.pd); + struct erdma_cmdq_create_qp_req req; struct erdma_uqp *user_qp; u64 resp0, resp1; int err; @@ -93,6 +94,16 @@ static int create_qp_cmd(struct erdma_dev *dev, struct erdma_qp *qp) req.sq_db_info_dma_addr = user_qp->sq_db_info_dma_addr; req.rq_db_info_dma_addr = user_qp->rq_db_info_dma_addr; + + if (uctx->ext_db.enable) { + req.sq_cqn_mtt_cfg |= + FIELD_PREP(ERDMA_CMD_CREATE_QP_DB_CFG_MASK, 1); + req.db_cfg = + FIELD_PREP(ERDMA_CMD_CREATE_QP_SQDB_CFG_MASK, + uctx->ext_db.sdb_off) | + FIELD_PREP(ERDMA_CMD_CREATE_QP_RQDB_CFG_MASK, + uctx->ext_db.rdb_off); + } } err = erdma_post_cmd_wait(&dev->cmdq, &req, sizeof(req), &resp0, @@ -146,11 +157,12 @@ static int regmr_cmd(struct erdma_dev *dev, struct erdma_mr *mr) return erdma_post_cmd_wait(&dev->cmdq, &req, sizeof(req), NULL, NULL); } -static int create_cq_cmd(struct erdma_dev *dev, struct erdma_cq *cq) +static int create_cq_cmd(struct erdma_ucontext *uctx, struct erdma_cq *cq) { + struct erdma_dev *dev = to_edev(cq->ibcq.device); struct erdma_cmdq_create_cq_req req; - u32 page_size; struct erdma_mem *mtt; + u32 page_size; erdma_cmdq_build_reqhdr(&req.hdr, CMDQ_SUBMOD_RDMA, CMDQ_OPCODE_CREATE_CQ); @@ -192,6 +204,13 @@ static int create_cq_cmd(struct erdma_dev *dev, struct erdma_cq *cq) req.first_page_offset = mtt->page_offset; req.cq_db_info_addr = cq->user_cq.db_info_dma_addr; + + if (uctx->ext_db.enable) { + req.cfg1 |= FIELD_PREP( + ERDMA_CMD_CREATE_CQ_MTT_DB_CFG_MASK, 1); + req.cfg2 = 
FIELD_PREP(ERDMA_CMD_CREATE_CQ_DB_CFG_MASK, + uctx->ext_db.cdb_off); + } } return erdma_post_cmd_wait(&dev->cmdq, &req, sizeof(req), NULL, NULL); @@ -753,7 +772,7 @@ int erdma_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *attrs, qp->attrs.state = ERDMA_QP_STATE_IDLE; INIT_DELAYED_WORK(&qp->reflush_dwork, erdma_flush_worker); - ret = create_qp_cmd(dev, qp); + ret = create_qp_cmd(uctx, qp); if (ret) goto err_out_cmd; @@ -1517,7 +1536,7 @@ int erdma_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, goto err_out_xa; } - ret = create_cq_cmd(dev, cq); + ret = create_cq_cmd(ctx, cq); if (ret) goto err_free_res; From patchwork Tue Jun 6 05:50:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cheng Xu X-Patchwork-Id: 13268261 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F1BAFC77B7A for ; Tue, 6 Jun 2023 05:50:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234319AbjFFFug (ORCPT ); Tue, 6 Jun 2023 01:50:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50698 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234091AbjFFFuX (ORCPT ); Tue, 6 Jun 2023 01:50:23 -0400 Received: from out30-101.freemail.mail.aliyun.com (out30-101.freemail.mail.aliyun.com [115.124.30.101]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6FE19E42 for ; Mon, 5 Jun 2023 22:50:15 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R101e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045170;MF=chengyou@linux.alibaba.com;NM=1;PH=DS;RN=4;SR=0;TI=SMTPD_---0VkV2uT4_1686030611; Received: from localhost(mailfrom:chengyou@linux.alibaba.com fp:SMTPD_---0VkV2uT4_1686030611) by smtp.aliyun-inc.com; Tue, 06 Jun 2023 13:50:12 +0800 From: Cheng Xu To: jgg@ziepe.ca, leon@kernel.org Cc: linux-rdma@vger.kernel.org, KaiShen@linux.alibaba.com Subject: [PATCH for-next 4/4] RDMA/erdma: Refactor the original doorbell allocation mechanism Date: Tue, 6 Jun 2023 13:50:05 +0800 Message-Id: <20230606055005.80729-5-chengyou@linux.alibaba.com> X-Mailer: git-send-email 2.37.0 In-Reply-To: <20230606055005.80729-1-chengyou@linux.alibaba.com> References: <20230606055005.80729-1-chengyou@linux.alibaba.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org The original doorbell allocation mechanism is complex and does not meet the isolation requirement. So we introduce a new doorbell mechanism and the original mechanism (only be used with CAP_SYS_RAWIO if hardware does not support the new mechanism) needs to be kept as simple as possible for compatibility. 
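As context for the consolidation below, here is a rough, self-contained C sketch of the decision the rewritten alloc_db_resources() makes: if the device advertises extended doorbells, the three per-context doorbell pages are obtained from an ALLOC_DB command and located via page offsets packed into the response qword (masks from patch 2); otherwise the context falls back to the fixed legacy BAR offsets, which the real code additionally gates on CAP_SYS_RAWIO. The stub command, the BAR base, the legacy offsets and the straightforward offset-to-doorbell mapping here are all stand-ins for illustration, not the driver's actual values; field_get() is a simplified stand-in for FIELD_GET.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Response-qword layout mirroring ERDMA_CMD_ALLOC_DB_RESP_*_MASK (patch 2). */
#define ALLOC_DB_RESP_SDB_MASK 0x000000000000ffffULL /* GENMASK_ULL(15, 0) */
#define ALLOC_DB_RESP_CDB_MASK 0x0000ffff00000000ULL /* GENMASK_ULL(47, 32) */
#define ALLOC_DB_RESP_RDB_MASK 0xffff000000000000ULL /* GENMASK_ULL(63, 48) */

/* Hypothetical legacy offsets; the real ERDMA_BAR_*DB_SPACE_OFFSET differ. */
#define LEGACY_SQDB_OFF 0x0000
#define LEGACY_RQDB_OFF 0x1000
#define LEGACY_CQDB_OFF 0x2000

#define EXAMPLE_PAGE_SHIFT 12

static uint64_t field_get(uint64_t mask, uint64_t val)
{
	return (val & mask) >> __builtin_ctzll(mask);
}

/* Stand-in for posting CMDQ_OPCODE_ALLOC_DB and waiting for the response. */
static uint64_t fake_alloc_db_cmd(void)
{
	return 0x0003000200000001ULL; /* rdb_off = 3, cdb_off = 2, sdb_off = 1 */
}

static void alloc_db_addresses(uint64_t bar, bool ext_db_en,
			       uint64_t *sdb, uint64_t *rdb, uint64_t *cdb)
{
	uint64_t resp;

	if (!ext_db_en) {
		/* Legacy path: fixed, shared doorbell regions in the BAR. */
		*sdb = bar + LEGACY_SQDB_OFF;
		*rdb = bar + LEGACY_RQDB_OFF;
		*cdb = bar + LEGACY_CQDB_OFF;
		return;
	}

	/* Extended path: per-context doorbell pages allocated by firmware. */
	resp = fake_alloc_db_cmd();
	*sdb = bar + (field_get(ALLOC_DB_RESP_SDB_MASK, resp) << EXAMPLE_PAGE_SHIFT);
	*rdb = bar + (field_get(ALLOC_DB_RESP_RDB_MASK, resp) << EXAMPLE_PAGE_SHIFT);
	*cdb = bar + (field_get(ALLOC_DB_RESP_CDB_MASK, resp) << EXAMPLE_PAGE_SHIFT);
}

int main(void)
{
	uint64_t sdb, rdb, cdb;

	alloc_db_addresses(0x80000000ULL, true, &sdb, &rdb, &cdb);
	printf("sdb=%#llx rdb=%#llx cdb=%#llx\n",
	       (unsigned long long)sdb, (unsigned long long)rdb,
	       (unsigned long long)cdb);
	return 0;
}

Folding both paths into a single alloc_db_resources()/free_db_resources() pair is what lets the refactor drop the bitmap bookkeeping and the sdb_type/sdb_idx state entirely.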
Signed-off-by: Cheng Xu --- drivers/infiniband/hw/erdma/erdma.h | 14 --- drivers/infiniband/hw/erdma/erdma_hw.h | 13 --- drivers/infiniband/hw/erdma/erdma_main.c | 33 ------ drivers/infiniband/hw/erdma/erdma_verbs.c | 126 +++++----------------- drivers/infiniband/hw/erdma/erdma_verbs.h | 4 - 5 files changed, 27 insertions(+), 163 deletions(-) diff --git a/drivers/infiniband/hw/erdma/erdma.h b/drivers/infiniband/hw/erdma/erdma.h index a361d4bcd714..f190111840e9 100644 --- a/drivers/infiniband/hw/erdma/erdma.h +++ b/drivers/infiniband/hw/erdma/erdma.h @@ -128,13 +128,8 @@ struct erdma_devattr { int numa_node; enum erdma_cc_alg cc; - u32 grp_num; u32 irq_num; - bool disable_dwqe; - u16 dwqe_pages; - u16 dwqe_entries; - u32 max_qp; u32 max_send_wr; u32 max_recv_wr; @@ -215,15 +210,6 @@ struct erdma_dev { u32 next_alloc_qpn; u32 next_alloc_cqn; - spinlock_t db_bitmap_lock; - /* We provide max 64 uContexts that each has one SQ doorbell Page. */ - DECLARE_BITMAP(sdb_page, ERDMA_DWQE_TYPE0_CNT); - /* - * We provide max 496 uContexts that each has one SQ normal Db, - * and one directWQE db. - */ - DECLARE_BITMAP(sdb_entry, ERDMA_DWQE_TYPE1_CNT); - atomic_t num_ctx; struct list_head cep_list; }; diff --git a/drivers/infiniband/hw/erdma/erdma_hw.h b/drivers/infiniband/hw/erdma/erdma_hw.h index cf7629bfe534..a882b57aa118 100644 --- a/drivers/infiniband/hw/erdma/erdma_hw.h +++ b/drivers/infiniband/hw/erdma/erdma_hw.h @@ -82,19 +82,6 @@ #define ERDMA_BAR_CQDB_SPACE_OFFSET \ (ERDMA_BAR_RQDB_SPACE_OFFSET + ERDMA_BAR_RQDB_SPACE_SIZE) -/* Doorbell page resources related. */ -/* - * Max # of parallelly issued directSQE is 3072 per device, - * hardware organizes this into 24 group, per group has 128 credits. - */ -#define ERDMA_DWQE_MAX_GRP_CNT 24 -#define ERDMA_DWQE_NUM_PER_GRP 128 - -#define ERDMA_DWQE_TYPE0_CNT 64 -#define ERDMA_DWQE_TYPE1_CNT 496 -/* type1 DB contains 2 DBs, takes 256Byte. */ -#define ERDMA_DWQE_TYPE1_CNT_PER_PAGE 16 - #define ERDMA_SDB_SHARED_PAGE_INDEX 95 /* Doorbell related. */ diff --git a/drivers/infiniband/hw/erdma/erdma_main.c b/drivers/infiniband/hw/erdma/erdma_main.c index 525edea987b2..0880c79a978c 100644 --- a/drivers/infiniband/hw/erdma/erdma_main.c +++ b/drivers/infiniband/hw/erdma/erdma_main.c @@ -130,33 +130,6 @@ static irqreturn_t erdma_comm_irq_handler(int irq, void *data) return IRQ_HANDLED; } -static void erdma_dwqe_resource_init(struct erdma_dev *dev) -{ - int total_pages, type0, type1; - - dev->attrs.grp_num = erdma_reg_read32(dev, ERDMA_REGS_GRP_NUM_REG); - - if (dev->attrs.grp_num < 4) - dev->attrs.disable_dwqe = true; - else - dev->attrs.disable_dwqe = false; - - /* One page contains 4 goups. 
*/ - total_pages = dev->attrs.grp_num * 4; - - if (dev->attrs.grp_num >= ERDMA_DWQE_MAX_GRP_CNT) { - dev->attrs.grp_num = ERDMA_DWQE_MAX_GRP_CNT; - type0 = ERDMA_DWQE_TYPE0_CNT; - type1 = ERDMA_DWQE_TYPE1_CNT / ERDMA_DWQE_TYPE1_CNT_PER_PAGE; - } else { - type1 = total_pages / 3; - type0 = total_pages - type1 - 1; - } - - dev->attrs.dwqe_pages = type0; - dev->attrs.dwqe_entries = type1 * ERDMA_DWQE_TYPE1_CNT_PER_PAGE; -} - static int erdma_request_vectors(struct erdma_dev *dev) { int expect_irq_num = min(num_possible_cpus() + 1, ERDMA_NUM_MSIX_VEC); @@ -199,8 +172,6 @@ static int erdma_device_init(struct erdma_dev *dev, struct pci_dev *pdev) { int ret; - erdma_dwqe_resource_init(dev); - ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(ERDMA_PCI_WIDTH)); if (ret) @@ -557,10 +528,6 @@ static int erdma_ib_device_add(struct pci_dev *pdev) if (ret) return ret; - spin_lock_init(&dev->db_bitmap_lock); - bitmap_zero(dev->sdb_page, ERDMA_DWQE_TYPE0_CNT); - bitmap_zero(dev->sdb_entry, ERDMA_DWQE_TYPE1_CNT); - atomic_set(&dev->num_ctx, 0); mac = erdma_reg_read32(dev, ERDMA_REGS_NETDEV_MAC_L_REG); diff --git a/drivers/infiniband/hw/erdma/erdma_verbs.c b/drivers/infiniband/hw/erdma/erdma_verbs.c index ffc05ddc98ae..517676fbb8b1 100644 --- a/drivers/infiniband/hw/erdma/erdma_verbs.c +++ b/drivers/infiniband/hw/erdma/erdma_verbs.c @@ -1149,71 +1149,27 @@ void erdma_mmap_free(struct rdma_user_mmap_entry *rdma_entry) kfree(entry); } -#define ERDMA_SDB_PAGE 0 -#define ERDMA_SDB_ENTRY 1 -#define ERDMA_SDB_SHARED 2 - -static void alloc_db_resources(struct erdma_dev *dev, - struct erdma_ucontext *ctx) -{ - u32 bitmap_idx; - struct erdma_devattr *attrs = &dev->attrs; - - if (attrs->disable_dwqe) - goto alloc_normal_db; - - /* Try to alloc independent SDB page. */ - spin_lock(&dev->db_bitmap_lock); - bitmap_idx = find_first_zero_bit(dev->sdb_page, attrs->dwqe_pages); - if (bitmap_idx != attrs->dwqe_pages) { - set_bit(bitmap_idx, dev->sdb_page); - spin_unlock(&dev->db_bitmap_lock); - - ctx->sdb_type = ERDMA_SDB_PAGE; - ctx->sdb_idx = bitmap_idx; - ctx->sdb_page_idx = bitmap_idx; - ctx->sdb = dev->func_bar_addr + ERDMA_BAR_SQDB_SPACE_OFFSET + - (bitmap_idx << PAGE_SHIFT); - ctx->sdb_page_off = 0; - - return; - } - - bitmap_idx = find_first_zero_bit(dev->sdb_entry, attrs->dwqe_entries); - if (bitmap_idx != attrs->dwqe_entries) { - set_bit(bitmap_idx, dev->sdb_entry); - spin_unlock(&dev->db_bitmap_lock); - - ctx->sdb_type = ERDMA_SDB_ENTRY; - ctx->sdb_idx = bitmap_idx; - ctx->sdb_page_idx = attrs->dwqe_pages + - bitmap_idx / ERDMA_DWQE_TYPE1_CNT_PER_PAGE; - ctx->sdb_page_off = bitmap_idx % ERDMA_DWQE_TYPE1_CNT_PER_PAGE; - - ctx->sdb = dev->func_bar_addr + ERDMA_BAR_SQDB_SPACE_OFFSET + - (ctx->sdb_page_idx << PAGE_SHIFT); - - return; - } - - spin_unlock(&dev->db_bitmap_lock); - -alloc_normal_db: - ctx->sdb_type = ERDMA_SDB_SHARED; - ctx->sdb_idx = 0; - ctx->sdb_page_idx = ERDMA_SDB_SHARED_PAGE_INDEX; - ctx->sdb_page_off = 0; - - ctx->sdb = dev->func_bar_addr + (ctx->sdb_page_idx << PAGE_SHIFT); -} - -static int alloc_ext_db_resources(struct erdma_dev *dev, - struct erdma_ucontext *ctx) +static int alloc_db_resources(struct erdma_dev *dev, struct erdma_ucontext *ctx, + bool ext_db_en) { struct erdma_cmdq_ext_db_req req = {}; u64 val0, val1; int ret; + /* + * CAP_SYS_RAWIO is required if hardware does not support extend + * doorbell mechanism. 
+ */ + if (!ext_db_en && !capable(CAP_SYS_RAWIO)) + return -EPERM; + + if (!ext_db_en) { + ctx->sdb = dev->func_bar_addr + ERDMA_BAR_SQDB_SPACE_OFFSET; + ctx->rdb = dev->func_bar_addr + ERDMA_BAR_RQDB_SPACE_OFFSET; + ctx->cdb = dev->func_bar_addr + ERDMA_BAR_CQDB_SPACE_OFFSET; + return 0; + } + erdma_cmdq_build_reqhdr(&req.hdr, CMDQ_SUBMOD_COMMON, CMDQ_OPCODE_ALLOC_DB); @@ -1230,7 +1186,6 @@ static int alloc_ext_db_resources(struct erdma_dev *dev, ctx->ext_db.rdb_off = ERDMA_GET(val0, ALLOC_DB_RESP_RDB); ctx->ext_db.cdb_off = ERDMA_GET(val0, ALLOC_DB_RESP_CDB); - ctx->sdb_type = ERDMA_SDB_PAGE; ctx->sdb = dev->func_bar_addr + (ctx->ext_db.sdb_off << PAGE_SHIFT); ctx->cdb = dev->func_bar_addr + (ctx->ext_db.rdb_off << PAGE_SHIFT); ctx->rdb = dev->func_bar_addr + (ctx->ext_db.cdb_off << PAGE_SHIFT); @@ -1238,12 +1193,14 @@ static int alloc_ext_db_resources(struct erdma_dev *dev, return 0; } -static void free_ext_db_resources(struct erdma_dev *dev, - struct erdma_ucontext *ctx) +static void free_db_resources(struct erdma_dev *dev, struct erdma_ucontext *ctx) { struct erdma_cmdq_ext_db_req req = {}; int ret; + if (!ctx->ext_db.enable) + return; + erdma_cmdq_build_reqhdr(&req.hdr, CMDQ_SUBMOD_COMMON, CMDQ_OPCODE_FREE_DB); @@ -1274,7 +1231,6 @@ int erdma_alloc_ucontext(struct ib_ucontext *ibctx, struct ib_udata *udata) struct erdma_dev *dev = to_edev(ibctx->device); int ret; struct erdma_uresp_alloc_ctx uresp = {}; - bool ext_db_en; if (atomic_inc_return(&dev->num_ctx) > ERDMA_MAX_CONTEXT) { ret = -ENOMEM; @@ -1289,25 +1245,11 @@ int erdma_alloc_ucontext(struct ib_ucontext *ibctx, struct ib_udata *udata) INIT_LIST_HEAD(&ctx->dbrecords_page_list); mutex_init(&ctx->dbrecords_page_mutex); - /* - * CAP_SYS_RAWIO is required if hardware does not support extend - * doorbell mechanism. 
- */ - ext_db_en = !!(dev->attrs.cap_flags & ERDMA_DEV_CAP_FLAGS_EXTEND_DB); - if (!ext_db_en && !capable(CAP_SYS_RAWIO)) { - ret = -EPERM; + ret = alloc_db_resources(dev, ctx, + !!(dev->attrs.cap_flags & + ERDMA_DEV_CAP_FLAGS_EXTEND_DB)); + if (ret) goto err_out; - } - - if (ext_db_en) { - ret = alloc_ext_db_resources(dev, ctx); - if (ret) - goto err_out; - } else { - alloc_db_resources(dev, ctx); - ctx->rdb = dev->func_bar_addr + ERDMA_BAR_RQDB_SPACE_OFFSET; - ctx->cdb = dev->func_bar_addr + ERDMA_BAR_CQDB_SPACE_OFFSET; - } ctx->sq_db_mmap_entry = erdma_user_mmap_entry_insert( ctx, (void *)ctx->sdb, PAGE_SIZE, ERDMA_MMAP_IO_NC, &uresp.sdb); @@ -1331,8 +1273,6 @@ int erdma_alloc_ucontext(struct ib_ucontext *ibctx, struct ib_udata *udata) } uresp.dev_id = dev->pdev->device; - uresp.sdb_type = ctx->sdb_type; - uresp.sdb_offset = ctx->sdb_page_off; ret = ib_copy_to_udata(udata, &uresp, sizeof(uresp)); if (ret) @@ -1344,8 +1284,7 @@ int erdma_alloc_ucontext(struct ib_ucontext *ibctx, struct ib_udata *udata) erdma_uctx_user_mmap_entries_remove(ctx); err_free_ext_db: - if (ext_db_en) - free_ext_db_resources(dev, ctx); + free_db_resources(dev, ctx); err_out: atomic_dec(&dev->num_ctx); @@ -1354,22 +1293,11 @@ int erdma_alloc_ucontext(struct ib_ucontext *ibctx, struct ib_udata *udata) void erdma_dealloc_ucontext(struct ib_ucontext *ibctx) { - struct erdma_ucontext *ctx = to_ectx(ibctx); struct erdma_dev *dev = to_edev(ibctx->device); + struct erdma_ucontext *ctx = to_ectx(ibctx); erdma_uctx_user_mmap_entries_remove(ctx); - - if (ctx->ext_db.enable) { - free_ext_db_resources(dev, ctx); - } else { - spin_lock(&dev->db_bitmap_lock); - if (ctx->sdb_type == ERDMA_SDB_PAGE) - clear_bit(ctx->sdb_idx, dev->sdb_page); - else if (ctx->sdb_type == ERDMA_SDB_ENTRY) - clear_bit(ctx->sdb_idx, dev->sdb_entry); - spin_unlock(&dev->db_bitmap_lock); - } - + free_db_resources(dev, ctx); atomic_dec(&dev->num_ctx); } diff --git a/drivers/infiniband/hw/erdma/erdma_verbs.h b/drivers/infiniband/hw/erdma/erdma_verbs.h index 252106679d36..429fc3063f98 100644 --- a/drivers/infiniband/hw/erdma/erdma_verbs.h +++ b/drivers/infiniband/hw/erdma/erdma_verbs.h @@ -43,10 +43,6 @@ struct erdma_ucontext { struct erdma_ext_db_info ext_db; - u32 sdb_type; - u32 sdb_idx; - u32 sdb_page_idx; - u32 sdb_page_off; u64 sdb; u64 rdb; u64 cdb;
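Looking back at patch 3, a short stand-alone sketch of how a QP's doorbell binding is expressed once the extended mechanism is active: the enable bit lands in the cqn/mtt configuration word and the per-context SQ/RQ doorbell page offsets are packed into the 16-bit halves of db_cfg. The masks mirror the ERDMA_CMD_CREATE_QP_DB_CFG_MASK and ERDMA_CMD_CREATE_QP_*DB_CFG_MASK definitions from that patch; field_prep() is a simplified stand-in for FIELD_PREP and the offsets are made-up example values.

#include <stdint.h>
#include <stdio.h>

/* Masks mirroring the create_qp doorbell fields added in patch 3. */
#define CREATE_QP_DB_CFG_MASK   0x02000000u /* BIT(25) in sq_cqn_mtt_cfg */
#define CREATE_QP_SQDB_CFG_MASK 0xffff0000u /* GENMASK(31, 16) */
#define CREATE_QP_RQDB_CFG_MASK 0x0000ffffu /* GENMASK(15, 0) */

/* Simplified FIELD_PREP: shift the value into the mask's bit position. */
static uint32_t field_prep(uint32_t mask, uint32_t val)
{
	return (val << __builtin_ctz(mask)) & mask;
}

int main(void)
{
	uint16_t sdb_off = 1, rdb_off = 2; /* hypothetical per-context offsets */
	uint32_t sq_cqn_mtt_cfg = 0, db_cfg;

	/* Tell hardware this QP carries an explicit doorbell binding ... */
	sq_cqn_mtt_cfg |= field_prep(CREATE_QP_DB_CFG_MASK, 1);

	/* ... and which SQ/RQ doorbell pages it is allowed to ring. */
	db_cfg = field_prep(CREATE_QP_SQDB_CFG_MASK, sdb_off) |
		 field_prep(CREATE_QP_RQDB_CFG_MASK, rdb_off);

	printf("sq_cqn_mtt_cfg=%#x db_cfg=%#x\n", sq_cqn_mtt_cfg, db_cfg);
	return 0;
}

Binding each QP (and, analogously, each CQ) to the doorbell pages of its owning context is what gives the hardware enough information to reject doorbell rings issued from a page the context does not own, which is the isolation goal stated in the cover of this series.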