From patchwork Thu Jul 29 02:19:12 2021
X-Patchwork-Submitter: Wenpeng Liang
X-Patchwork-Id: 12407477
X-Patchwork-Delegate: jgg@ziepe.ca
From: Wenpeng Liang
Subject: [PATCH v4 for-next 01/12] RDMA/hns: Introduce DCA for RC QP
Date: Thu, 29 Jul 2021 10:19:12 +0800
Message-ID: <1627525163-1683-2-git-send-email-liangwenpeng@huawei.com>
In-Reply-To: <1627525163-1683-1-git-send-email-liangwenpeng@huawei.com>
X-Mailing-List: linux-rdma@vger.kernel.org

From: Xi Wang

The hip09 introduces the DCA (Dynamic Context Attachment) feature, which
allows many RC QPs to share WQE buffers from a memory pool and reduces
memory consumption when many QPs are inactive.

If a QP enables DCA, its WQE buffer is not allocated at creation time.
When the user starts to post WRs, the hns driver allocates a buffer from
the memory pool and fills in WQEs tagged with this QP's number. Once the
user has polled all CQEs of a DCA QP, the hns ROCEE stops accessing the
WQE buffer and the driver recycles the buffer back to the memory pool.

This patch adds a group of methods that let userspace register buffers
into a memory pool owned by the user context. The hns kernel driver
updates the page states in this pool when the user calls the post/poll
methods, and the userspace driver can obtain a QP's WQE buffer address
from the key and offset queried from the kernel.
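For reference, a userspace provider could drive this register method
through rdma-core's ioctl interface. A minimal sketch, assuming
rdma-core's <infiniband/cmd_ioctl.h> helpers; the hns_dca_mem_reg()
wrapper name and its error handling are illustrative, not actual libhns
code:

/*
 * Hypothetical userspace sketch: register one buffer into the DCA pool
 * via HNS_IB_METHOD_DCA_MEM_REG. 'key' is chosen by the caller and is
 * used later to match this buffer when the kernel reports it back.
 */
#include <stdint.h>
#include <infiniband/cmd_ioctl.h>
#include <rdma/hns-abi.h>

static int hns_dca_mem_reg(struct ibv_context *ctx, void *addr, __u32 size,
			   __u64 key, __u32 *handle)
{
	DECLARE_COMMAND_BUFFER(cmd, HNS_IB_OBJECT_DCA_MEM,
			       HNS_IB_METHOD_DCA_MEM_REG, 4);
	struct ib_uverbs_attr *attr =
		fill_attr_out_obj(cmd, HNS_IB_ATTR_DCA_MEM_REG_HANDLE);
	int ret;

	fill_attr_in_uint32(cmd, HNS_IB_ATTR_DCA_MEM_REG_LEN, size);
	fill_attr_in_uint64(cmd, HNS_IB_ATTR_DCA_MEM_REG_ADDR,
			    (uintptr_t)addr);
	fill_attr_in_uint64(cmd, HNS_IB_ATTR_DCA_MEM_REG_KEY, key);

	ret = execute_ioctl(ctx, cmd);
	if (ret)
		return ret;

	*handle = read_attr_obj(HNS_IB_ATTR_DCA_MEM_REG_HANDLE, attr);
	return 0;
}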
Signed-off-by: Xi Wang
Signed-off-by: Wenpeng Liang
---
 drivers/infiniband/hw/hns/Makefile          |   2 +-
 drivers/infiniband/hw/hns/hns_roce_dca.c    | 343 ++++++++++++++++++++++++++++
 drivers/infiniband/hw/hns/hns_roce_dca.h    |  22 ++
 drivers/infiniband/hw/hns/hns_roce_device.h |   9 +
 drivers/infiniband/hw/hns/hns_roce_main.c   |  27 ++-
 include/uapi/rdma/hns-abi.h                 |  27 +++
 6 files changed, 427 insertions(+), 3 deletions(-)
 create mode 100644 drivers/infiniband/hw/hns/hns_roce_dca.c
 create mode 100644 drivers/infiniband/hw/hns/hns_roce_dca.h

diff --git a/drivers/infiniband/hw/hns/Makefile b/drivers/infiniband/hw/hns/Makefile
index e105945..9962b23 100644
--- a/drivers/infiniband/hw/hns/Makefile
+++ b/drivers/infiniband/hw/hns/Makefile
@@ -6,7 +6,7 @@ ccflags-y := -I $(srctree)/drivers/net/ethernet/hisilicon/hns3
 hns-roce-objs := hns_roce_main.o hns_roce_cmd.o hns_roce_pd.o \
-	hns_roce_ah.o hns_roce_hem.o hns_roce_mr.o hns_roce_qp.o \
+	hns_roce_ah.o hns_roce_hem.o hns_roce_mr.o hns_roce_qp.o hns_roce_dca.o \
 	hns_roce_cq.o hns_roce_alloc.o hns_roce_db.o hns_roce_srq.o hns_roce_restrack.o
 ifdef CONFIG_INFINIBAND_HNS_HIP06

diff --git a/drivers/infiniband/hw/hns/hns_roce_dca.c b/drivers/infiniband/hw/hns/hns_roce_dca.c
new file mode 100644
index 0000000..6a3ff12
--- /dev/null
+++ b/drivers/infiniband/hw/hns/hns_roce_dca.c
@@ -0,0 +1,343 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/*
+ * Copyright (c) 2021 HiSilicon Limited. All rights reserved.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include "hns_roce_device.h"
+#include "hns_roce_dca.h"
+
+#define UVERBS_MODULE_NAME hns_ib
+#include
+
+/* DCA memory */
+struct dca_mem {
+#define DCA_MEM_FLAGS_ALLOCED BIT(0)
+#define DCA_MEM_FLAGS_REGISTERED BIT(1)
+	u32 flags;
+	struct list_head list; /* link to mem list in dca context */
+	spinlock_t lock; /* protect the @flags and @list */
+	int page_count; /* page count in this mem obj */
+	u64 key; /* registered by the caller */
+	u32 size; /* bytes in this mem object */
+	struct hns_dca_page_state *states; /* record each page's state */
+	void *pages; /* memory handle for getting dma address */
+};
+
+struct dca_mem_attr {
+	u64 key;
+	u64 addr;
+	u32 size;
+};
+
+static void *alloc_dca_pages(struct hns_roce_dev *hr_dev, struct dca_mem *mem,
+			     struct dca_mem_attr *attr)
+{
+	struct ib_device *ibdev = &hr_dev->ib_dev;
+	struct ib_umem *umem;
+
+	umem = ib_umem_get(ibdev, attr->addr, attr->size, 0);
+	if (IS_ERR(umem)) {
+		ibdev_err(ibdev, "failed to get uDCA pages, ret = %ld.\n",
+			  PTR_ERR(umem));
+		return NULL;
+	}
+
+	mem->page_count = ib_umem_num_dma_blocks(umem, HNS_HW_PAGE_SIZE);
+
+	return umem;
+}
+
+static void init_dca_umem_states(struct hns_dca_page_state *states, int count,
+				 struct ib_umem *umem)
+{
+	dma_addr_t pre_addr, cur_addr;
+	struct ib_block_iter biter;
+	int i = 0;
+
+	pre_addr = 0;
+	rdma_for_each_block(umem->sg_head.sgl, &biter, umem->nmap,
+			    HNS_HW_PAGE_SIZE) {
+		if (i >= count)
+			return;
+
+		/* In a continuous address range, the first page's head flag is 1 */
+		cur_addr = rdma_block_iter_dma_address(&biter);
+		if ((i == 0) || (cur_addr - pre_addr != HNS_HW_PAGE_SIZE))
+			states[i].head = 1;
+
+		pre_addr = cur_addr;
+		i++;
+	}
+}
+
+static struct hns_dca_page_state *alloc_dca_states(void *pages, int count)
+{
+	struct hns_dca_page_state *states;
+
+	states = kcalloc(count, sizeof(*states), GFP_KERNEL);
+	if (!states)
+		return NULL;
+
+	init_dca_umem_states(states, count, pages);
+
+	return states;
+}
+
+/* user DCA is managed by ucontext */
+#define to_hr_dca_ctx(uctx)
(&(uctx)->dca_ctx) + +static void unregister_dca_mem(struct hns_roce_ucontext *uctx, + struct dca_mem *mem) +{ + struct hns_roce_dca_ctx *ctx = to_hr_dca_ctx(uctx); + unsigned long flags; + void *states, *pages; + + spin_lock_irqsave(&ctx->pool_lock, flags); + + spin_lock(&mem->lock); + mem->flags &= ~DCA_MEM_FLAGS_REGISTERED; + mem->page_count = 0; + pages = mem->pages; + mem->pages = NULL; + states = mem->states; + mem->states = NULL; + spin_unlock(&mem->lock); + + ctx->free_mems--; + ctx->free_size -= mem->size; + + ctx->total_size -= mem->size; + spin_unlock_irqrestore(&ctx->pool_lock, flags); + + kfree(states); + ib_umem_release(pages); +} + +static int register_dca_mem(struct hns_roce_dev *hr_dev, + struct hns_roce_ucontext *uctx, + struct dca_mem *mem, struct dca_mem_attr *attr) +{ + struct hns_roce_dca_ctx *ctx = to_hr_dca_ctx(uctx); + void *states, *pages; + unsigned long flags; + + pages = alloc_dca_pages(hr_dev, mem, attr); + if (!pages) + return -ENOMEM; + + states = alloc_dca_states(pages, mem->page_count); + if (!states) { + ib_umem_release(pages); + return -ENOMEM; + } + + spin_lock_irqsave(&ctx->pool_lock, flags); + + spin_lock(&mem->lock); + mem->pages = pages; + mem->states = states; + mem->key = attr->key; + mem->size = attr->size; + mem->flags |= DCA_MEM_FLAGS_REGISTERED; + spin_unlock(&mem->lock); + + ctx->free_mems++; + ctx->free_size += attr->size; + ctx->total_size += attr->size; + spin_unlock_irqrestore(&ctx->pool_lock, flags); + + return 0; +} + +static void init_dca_context(struct hns_roce_dca_ctx *ctx) +{ + INIT_LIST_HEAD(&ctx->pool); + spin_lock_init(&ctx->pool_lock); + ctx->total_size = 0; +} + +static void cleanup_dca_context(struct hns_roce_dev *hr_dev, + struct hns_roce_dca_ctx *ctx) +{ + struct dca_mem *mem, *tmp; + unsigned long flags; + + spin_lock_irqsave(&ctx->pool_lock, flags); + list_for_each_entry_safe(mem, tmp, &ctx->pool, list) { + list_del(&mem->list); + mem->flags = 0; + spin_unlock_irqrestore(&ctx->pool_lock, flags); + + kfree(mem->states); + ib_umem_release(mem->pages); + kfree(mem); + + spin_lock_irqsave(&ctx->pool_lock, flags); + } + ctx->total_size = 0; + spin_unlock_irqrestore(&ctx->pool_lock, flags); +} + +void hns_roce_register_udca(struct hns_roce_dev *hr_dev, + struct hns_roce_ucontext *uctx) +{ + init_dca_context(&uctx->dca_ctx); +} + +void hns_roce_unregister_udca(struct hns_roce_dev *hr_dev, + struct hns_roce_ucontext *uctx) +{ + cleanup_dca_context(hr_dev, &uctx->dca_ctx); +} + +static struct dca_mem *alloc_dca_mem(struct hns_roce_dca_ctx *ctx) +{ + struct dca_mem *mem, *tmp, *found = NULL; + unsigned long flags; + + spin_lock_irqsave(&ctx->pool_lock, flags); + list_for_each_entry_safe(mem, tmp, &ctx->pool, list) { + spin_lock(&mem->lock); + if (!mem->flags) { + found = mem; + mem->flags |= DCA_MEM_FLAGS_ALLOCED; + spin_unlock(&mem->lock); + goto done; + } + spin_unlock(&mem->lock); + } + +done: + spin_unlock_irqrestore(&ctx->pool_lock, flags); + + if (found) + return found; + + mem = kzalloc(sizeof(*mem), GFP_NOWAIT); + if (!mem) + return NULL; + + spin_lock_init(&mem->lock); + INIT_LIST_HEAD(&mem->list); + + mem->flags |= DCA_MEM_FLAGS_ALLOCED; + + spin_lock_irqsave(&ctx->pool_lock, flags); + list_add(&mem->list, &ctx->pool); + spin_unlock_irqrestore(&ctx->pool_lock, flags); + return mem; +} + +static void free_dca_mem(struct dca_mem *mem) +{ + /* We cannot hold the whole pool's lock during the DCA is working + * until cleanup the context in cleanup_dca_context(), so we just + * set the DCA mem state as free when destroying DCA 
mem object. + */ + spin_lock(&mem->lock); + mem->flags = 0; + spin_unlock(&mem->lock); +} + +static struct hns_roce_ucontext * +uverbs_attr_to_hr_uctx(struct uverbs_attr_bundle *attrs) +{ + return rdma_udata_to_drv_context(&attrs->driver_udata, + struct hns_roce_ucontext, ibucontext); +} + +static int UVERBS_HANDLER(HNS_IB_METHOD_DCA_MEM_REG)( + struct uverbs_attr_bundle *attrs) +{ + struct hns_roce_ucontext *uctx = uverbs_attr_to_hr_uctx(attrs); + struct hns_roce_dev *hr_dev = to_hr_dev(uctx->ibucontext.device); + struct ib_uobject *uobj = + uverbs_attr_get_uobject(attrs, HNS_IB_ATTR_DCA_MEM_REG_HANDLE); + struct dca_mem_attr init_attr = {}; + struct dca_mem *mem; + int ret; + + if (uverbs_copy_from(&init_attr.addr, attrs, + HNS_IB_ATTR_DCA_MEM_REG_ADDR) || + uverbs_copy_from(&init_attr.size, attrs, + HNS_IB_ATTR_DCA_MEM_REG_LEN) || + uverbs_copy_from(&init_attr.key, attrs, + HNS_IB_ATTR_DCA_MEM_REG_KEY)) + return -EFAULT; + + mem = alloc_dca_mem(to_hr_dca_ctx(uctx)); + if (!mem) + return -ENOMEM; + + ret = register_dca_mem(hr_dev, uctx, mem, &init_attr); + if (ret) { + free_dca_mem(mem); + return ret; + } + + uobj->object = mem; + + return 0; +} + +static int dca_cleanup(struct ib_uobject *uobject, enum rdma_remove_reason why, + struct uverbs_attr_bundle *attrs) +{ + struct hns_roce_ucontext *uctx = uverbs_attr_to_hr_uctx(attrs); + struct dca_mem *mem; + + /* One DCA MEM maybe shared by many QPs, so the DCA mem uobject must + * be destroyed before all QP uobjects, and we will destroy the DCA + * uobjects when cleanup DCA context by calling hns_roce_cleanup_dca(). + */ + if (why == RDMA_REMOVE_CLOSE || why == RDMA_REMOVE_DRIVER_REMOVE) + return 0; + + mem = uobject->object; + unregister_dca_mem(uctx, mem); + free_dca_mem(mem); + + return 0; +} + +DECLARE_UVERBS_NAMED_METHOD( + HNS_IB_METHOD_DCA_MEM_REG, + UVERBS_ATTR_IDR(HNS_IB_ATTR_DCA_MEM_REG_HANDLE, HNS_IB_OBJECT_DCA_MEM, + UVERBS_ACCESS_NEW, UA_MANDATORY), + UVERBS_ATTR_PTR_IN(HNS_IB_ATTR_DCA_MEM_REG_LEN, UVERBS_ATTR_TYPE(u32), + UA_MANDATORY), + UVERBS_ATTR_PTR_IN(HNS_IB_ATTR_DCA_MEM_REG_ADDR, UVERBS_ATTR_TYPE(u64), + UA_MANDATORY), + UVERBS_ATTR_PTR_IN(HNS_IB_ATTR_DCA_MEM_REG_KEY, UVERBS_ATTR_TYPE(u64), + UA_MANDATORY)); + +DECLARE_UVERBS_NAMED_METHOD_DESTROY( + HNS_IB_METHOD_DCA_MEM_DEREG, + UVERBS_ATTR_IDR(HNS_IB_ATTR_DCA_MEM_DEREG_HANDLE, HNS_IB_OBJECT_DCA_MEM, + UVERBS_ACCESS_DESTROY, UA_MANDATORY)); + +DECLARE_UVERBS_NAMED_OBJECT(HNS_IB_OBJECT_DCA_MEM, + UVERBS_TYPE_ALLOC_IDR(dca_cleanup), + &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_REG), + &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_DEREG)); + +static bool dca_is_supported(struct ib_device *device) +{ + struct hns_roce_dev *dev = to_hr_dev(device); + + return dev->caps.flags & HNS_ROCE_CAP_FLAG_DCA_MODE; +} + +const struct uapi_definition hns_roce_dca_uapi_defs[] = { + UAPI_DEF_CHAIN_OBJ_TREE_NAMED( + HNS_IB_OBJECT_DCA_MEM, + UAPI_DEF_IS_OBJ_SUPPORTED(dca_is_supported)), + {} +}; diff --git a/drivers/infiniband/hw/hns/hns_roce_dca.h b/drivers/infiniband/hw/hns/hns_roce_dca.h new file mode 100644 index 0000000..aad198c --- /dev/null +++ b/drivers/infiniband/hw/hns/hns_roce_dca.h @@ -0,0 +1,22 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* + * Copyright (c) 2021 HiSilicon Limited. All rights reserved. + */ + +#ifndef __HNS_ROCE_DCA_H +#define __HNS_ROCE_DCA_H + +/* DCA page state (32 bit) */ +struct hns_dca_page_state { + u32 buf_id : 29; /* If zero, means page can be used by any buffer. */ + u32 lock : 1; /* @buf_id locked this page to prepare access. 
*/ + u32 active : 1; /* @buf_id is accessing this page. */ + u32 head : 1; /* This page is the head in a continuous address range. */ +}; + +void hns_roce_register_udca(struct hns_roce_dev *hr_dev, + struct hns_roce_ucontext *uctx); +void hns_roce_unregister_udca(struct hns_roce_dev *hr_dev, + struct hns_roce_ucontext *uctx); + +#endif diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h index 991f652..d18bc22 100644 --- a/drivers/infiniband/hw/hns/hns_roce_device.h +++ b/drivers/infiniband/hw/hns/hns_roce_device.h @@ -225,11 +225,20 @@ struct hns_roce_uar { unsigned long logic_idx; }; +struct hns_roce_dca_ctx { + struct list_head pool; /* all DCA mems link to @pool */ + spinlock_t pool_lock; /* protect @pool */ + unsigned int free_mems; /* free mem num in pool */ + size_t free_size; /* free mem size in pool */ + size_t total_size; /* total size in pool */ +}; + struct hns_roce_ucontext { struct ib_ucontext ibucontext; struct hns_roce_uar uar; struct list_head page_list; struct mutex page_mutex; + struct hns_roce_dca_ctx dca_ctx; }; struct hns_roce_pd { diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c index 078a971..3df95d4 100644 --- a/drivers/infiniband/hw/hns/hns_roce_main.c +++ b/drivers/infiniband/hw/hns/hns_roce_main.c @@ -37,10 +37,12 @@ #include #include #include +#include #include #include "hns_roce_common.h" #include "hns_roce_device.h" #include "hns_roce_hem.h" +#include "hns_roce_dca.h" static int hns_roce_set_mac(struct hns_roce_dev *hr_dev, u32 port, u8 *addr) { @@ -294,16 +296,17 @@ static int hns_roce_modify_device(struct ib_device *ib_dev, int mask, static int hns_roce_alloc_ucontext(struct ib_ucontext *uctx, struct ib_udata *udata) { - int ret; struct hns_roce_ucontext *context = to_hr_ucontext(uctx); - struct hns_roce_ib_alloc_ucontext_resp resp = {}; struct hns_roce_dev *hr_dev = to_hr_dev(uctx->device); + struct hns_roce_ib_alloc_ucontext_resp resp = {}; + int ret; if (!hr_dev->active) return -EAGAIN; resp.qp_tab_size = hr_dev->caps.num_qps; resp.srq_tab_size = hr_dev->caps.num_srqs; + resp.cap_flags = hr_dev->caps.flags; ret = hns_roce_uar_alloc(hr_dev, &context->uar); if (ret) @@ -315,6 +318,9 @@ static int hns_roce_alloc_ucontext(struct ib_ucontext *uctx, mutex_init(&context->page_mutex); } + if (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_DCA_MODE) + hns_roce_register_udca(hr_dev, context); + resp.cqe_size = hr_dev->caps.cqe_sz; ret = ib_copy_to_udata(udata, &resp, @@ -325,6 +331,9 @@ static int hns_roce_alloc_ucontext(struct ib_ucontext *uctx, return 0; error_fail_copy_to_udata: + if (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_DCA_MODE) + hns_roce_unregister_udca(hr_dev, context); + hns_roce_uar_free(hr_dev, &context->uar); error_fail_uar_alloc: @@ -334,8 +343,12 @@ static int hns_roce_alloc_ucontext(struct ib_ucontext *uctx, static void hns_roce_dealloc_ucontext(struct ib_ucontext *ibcontext) { struct hns_roce_ucontext *context = to_hr_ucontext(ibcontext); + struct hns_roce_dev *hr_dev = to_hr_dev(ibcontext->device); hns_roce_uar_free(to_hr_dev(ibcontext->device), &context->uar); + + if (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_DCA_MODE) + hns_roce_unregister_udca(hr_dev, context); } static int hns_roce_mmap(struct ib_ucontext *context, @@ -417,6 +430,12 @@ static void hns_roce_unregister_device(struct hns_roce_dev *hr_dev) ib_unregister_device(&hr_dev->ib_dev); } +extern const struct uapi_definition hns_roce_dca_uapi_defs[]; +static const struct uapi_definition 
hns_roce_uapi_defs[] = {
+	UAPI_DEF_CHAIN(hns_roce_dca_uapi_defs),
+	{}
+};
+
 static const struct ib_device_ops hns_roce_dev_ops = {
 	.owner = THIS_MODULE,
 	.driver_id = RDMA_DRIVER_HNS,
@@ -526,6 +545,10 @@ static int hns_roce_register_device(struct hns_roce_dev *hr_dev)
 	ib_set_device_ops(ib_dev, hr_dev->hw->hns_roce_dev_ops);
 	ib_set_device_ops(ib_dev, &hns_roce_dev_ops);
+
+	if (IS_ENABLED(CONFIG_INFINIBAND_USER_ACCESS))
+		ib_dev->driver_def = hns_roce_uapi_defs;
+
 	for (i = 0; i < hr_dev->caps.num_ports; i++) {
 		if (!hr_dev->iboe.netdevs[i])
 			continue;
diff --git a/include/uapi/rdma/hns-abi.h b/include/uapi/rdma/hns-abi.h
index 42b1776..c17ec91 100644
--- a/include/uapi/rdma/hns-abi.h
+++ b/include/uapi/rdma/hns-abi.h
@@ -83,15 +83,42 @@ struct hns_roce_ib_create_qp_resp {
 	__aligned_u64 cap_flags;
 };
+enum {
+	HNS_ROCE_CAP_FLAG_DCA_MODE = 1 << 15,
+};
+
 struct hns_roce_ib_alloc_ucontext_resp {
 	__u32	qp_tab_size;
 	__u32	cqe_size;
 	__u32	srq_tab_size;
 	__u32	reserved;
+	__aligned_u64 cap_flags;
 };
 struct hns_roce_ib_alloc_pd_resp {
 	__u32 pdn;
 };
+#define UVERBS_ID_NS_MASK 0xF000
+#define UVERBS_ID_NS_SHIFT 12
+
+enum hns_ib_objects {
+	HNS_IB_OBJECT_DCA_MEM = (1U << UVERBS_ID_NS_SHIFT),
+};
+
+enum hns_ib_dca_mem_methods {
+	HNS_IB_METHOD_DCA_MEM_REG = (1U << UVERBS_ID_NS_SHIFT),
+	HNS_IB_METHOD_DCA_MEM_DEREG,
+};
+
+enum hns_ib_dca_mem_reg_attrs {
+	HNS_IB_ATTR_DCA_MEM_REG_HANDLE = (1U << UVERBS_ID_NS_SHIFT),
+	HNS_IB_ATTR_DCA_MEM_REG_LEN,
+	HNS_IB_ATTR_DCA_MEM_REG_ADDR,
+	HNS_IB_ATTR_DCA_MEM_REG_KEY,
+};
+
+enum hns_ib_dca_mem_dereg_attrs {
+	HNS_IB_ATTR_DCA_MEM_DEREG_HANDLE = (1U << UVERBS_ID_NS_SHIFT),
+};
 #endif /* HNS_ABI_USER_H */

From patchwork Thu Jul 29 02:19:13 2021
X-Patchwork-Submitter: Wenpeng Liang
X-Patchwork-Id: 12407467
X-Patchwork-Delegate: jgg@ziepe.ca
From: Wenpeng Liang
Subject: [PATCH v4 for-next 02/12] RDMA/hns: Add method for shrinking DCA memory pool
Date: Thu, 29 Jul 2021 10:19:13 +0800
Message-ID: <1627525163-1683-3-git-send-email-liangwenpeng@huawei.com>
In-Reply-To: <1627525163-1683-1-git-send-email-liangwenpeng@huawei.com>
X-Mailing-List: linux-rdma@vger.kernel.org

From: Xi Wang

If no QP is using a DCA mem object, the userspace driver can destroy it.
Add a new method 'HNS_IB_METHOD_DCA_MEM_SHRINK' that allows the userspace
driver to remove such an object from the DCA memory pool. Once a DCA mem
object has been shrunk, the userspace driver can destroy it with the
'HNS_IB_METHOD_DCA_MEM_DEREG' method and free the buffer that was
allocated in userspace.
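The intended userspace flow, sketched below under the same assumptions as
the earlier example (rdma-core ioctl helpers; the wrapper name is
illustrative, not libhns code): call the shrink method with the pool size
to keep, then deregister and free the buffer whose key comes back as idle.

/*
 * Hypothetical userspace sketch: ask the kernel for one shrinkable DCA
 * mem object. On return, if *free_mems > 0, the object registered with
 * key *free_key has no page in use and can be deregistered via
 * HNS_IB_METHOD_DCA_MEM_DEREG and then freed in userspace.
 */
static int hns_dca_mem_shrink(struct ibv_context *ctx, __u32 handle,
			      __u64 reserved_size, __u64 *free_key,
			      __u32 *free_mems)
{
	DECLARE_COMMAND_BUFFER(cmd, HNS_IB_OBJECT_DCA_MEM,
			       HNS_IB_METHOD_DCA_MEM_SHRINK, 4);

	fill_attr_in_obj(cmd, HNS_IB_ATTR_DCA_MEM_SHRINK_HANDLE, handle);
	fill_attr_in_uint64(cmd, HNS_IB_ATTR_DCA_MEM_SHRINK_RESERVED_SIZE,
			    reserved_size);
	fill_attr_out_ptr(cmd, HNS_IB_ATTR_DCA_MEM_SHRINK_OUT_FREE_KEY,
			  free_key);
	fill_attr_out_ptr(cmd, HNS_IB_ATTR_DCA_MEM_SHRINK_OUT_FREE_MEMS,
			  free_mems);

	return execute_ioctl(ctx, cmd);
}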
Signed-off-by: Xi Wang
Signed-off-by: Wenpeng Liang
---
 drivers/infiniband/hw/hns/hns_roce_dca.c | 136 ++++++++++++++++++++++++++++++-
 drivers/infiniband/hw/hns/hns_roce_dca.h |   7 ++
 include/uapi/rdma/hns-abi.h              |   9 ++
 3 files changed, 151 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_dca.c b/drivers/infiniband/hw/hns/hns_roce_dca.c
index 6a3ff12..92256784a 100644
--- a/drivers/infiniband/hw/hns/hns_roce_dca.c
+++ b/drivers/infiniband/hw/hns/hns_roce_dca.c
@@ -35,6 +35,10 @@ struct dca_mem_attr {
 	u32 size;
 };
+#define dca_page_is_free(s) ((s)->buf_id == HNS_DCA_INVALID_BUF_ID)
+#define dca_mem_is_available(m) \
+	((m)->flags == (DCA_MEM_FLAGS_ALLOCED | DCA_MEM_FLAGS_REGISTERED))
+
 static void *alloc_dca_pages(struct hns_roce_dev *hr_dev, struct dca_mem *mem,
 			     struct dca_mem_attr *attr)
 {
@@ -89,6 +93,41 @@ static struct hns_dca_page_state *alloc_dca_states(void *pages, int count)
 	return states;
 }
+#define DCA_MEM_STOP_ITERATE -1
+#define DCA_MEM_NEXT_ITERATE -2
+static void travel_dca_pages(struct hns_roce_dca_ctx *ctx, void *param,
+			     int (*cb)(struct dca_mem *, int, void *))
+{
+	struct dca_mem *mem, *tmp;
+	unsigned long flags;
+	bool avail;
+	int ret;
+	int i;
+
+	spin_lock_irqsave(&ctx->pool_lock, flags);
+	list_for_each_entry_safe(mem, tmp, &ctx->pool, list) {
+		spin_unlock_irqrestore(&ctx->pool_lock, flags);
+
+		spin_lock(&mem->lock);
+		avail = dca_mem_is_available(mem);
+		ret = 0;
+		for (i = 0; avail && i < mem->page_count; i++) {
+			ret = cb(mem, i, param);
+			if (ret == DCA_MEM_STOP_ITERATE ||
+			    ret == DCA_MEM_NEXT_ITERATE)
+				break;
+		}
+		spin_unlock(&mem->lock);
+		spin_lock_irqsave(&ctx->pool_lock, flags);
+
+		if (ret == DCA_MEM_STOP_ITERATE)
+			goto done;
+	}
+
+done:
+	spin_unlock_irqrestore(&ctx->pool_lock, flags);
+}
+
 /* user DCA is managed by ucontext */
 #define to_hr_dca_ctx(uctx) (&(uctx)->dca_ctx)
@@ -156,6 +195,63 @@ static int register_dca_mem(struct hns_roce_dev *hr_dev,
 	return 0;
 }
+struct dca_mem_shrink_attr {
+	u64 shrink_key;
+	u32 shrink_mems;
+};
+
+static int shrink_dca_page_proc(struct dca_mem *mem, int index, void *param)
+{
+	struct dca_mem_shrink_attr *attr = param;
+	struct hns_dca_page_state *state;
+	int i, free_pages;
+
+	free_pages = 0;
+	for (i = 0; i < mem->page_count; i++) {
+		state = &mem->states[i];
+		if (dca_page_is_free(state))
+			free_pages++;
+	}
+
+	/* No pages are in use */
+	if (free_pages == mem->page_count) {
+		/* unregister first empty DCA
mem */ + if (!attr->shrink_mems) { + mem->flags &= ~DCA_MEM_FLAGS_REGISTERED; + attr->shrink_key = mem->key; + } + + attr->shrink_mems++; + } + + if (attr->shrink_mems > 1) + return DCA_MEM_STOP_ITERATE; + else + return DCA_MEM_NEXT_ITERATE; +} + +static int shrink_dca_mem(struct hns_roce_dev *hr_dev, + struct hns_roce_ucontext *uctx, u64 reserved_size, + struct hns_dca_shrink_resp *resp) +{ + struct hns_roce_dca_ctx *ctx = to_hr_dca_ctx(uctx); + struct dca_mem_shrink_attr attr = {}; + unsigned long flags; + bool need_shink; + + spin_lock_irqsave(&ctx->pool_lock, flags); + need_shink = ctx->free_mems > 0 && ctx->free_size > reserved_size; + spin_unlock_irqrestore(&ctx->pool_lock, flags); + if (!need_shink) + return 0; + + travel_dca_pages(ctx, &attr, shrink_dca_page_proc); + resp->free_mems = attr.shrink_mems; + resp->free_key = attr.shrink_key; + + return 0; +} + static void init_dca_context(struct hns_roce_dca_ctx *ctx) { INIT_LIST_HEAD(&ctx->pool); @@ -323,10 +419,48 @@ DECLARE_UVERBS_NAMED_METHOD_DESTROY( UVERBS_ATTR_IDR(HNS_IB_ATTR_DCA_MEM_DEREG_HANDLE, HNS_IB_OBJECT_DCA_MEM, UVERBS_ACCESS_DESTROY, UA_MANDATORY)); +static int UVERBS_HANDLER(HNS_IB_METHOD_DCA_MEM_SHRINK)( + struct uverbs_attr_bundle *attrs) +{ + struct hns_roce_ucontext *uctx = uverbs_attr_to_hr_uctx(attrs); + struct hns_dca_shrink_resp resp = {}; + u64 reserved_size = 0; + int ret; + + if (uverbs_copy_from(&reserved_size, attrs, + HNS_IB_ATTR_DCA_MEM_SHRINK_RESERVED_SIZE)) + return -EFAULT; + + ret = shrink_dca_mem(to_hr_dev(uctx->ibucontext.device), uctx, + reserved_size, &resp); + if (ret) + return ret; + + if (uverbs_copy_to(attrs, HNS_IB_ATTR_DCA_MEM_SHRINK_OUT_FREE_KEY, + &resp.free_key, sizeof(resp.free_key)) || + uverbs_copy_to(attrs, HNS_IB_ATTR_DCA_MEM_SHRINK_OUT_FREE_MEMS, + &resp.free_mems, sizeof(resp.free_mems))) + return -EFAULT; + + return 0; +} + +DECLARE_UVERBS_NAMED_METHOD( + HNS_IB_METHOD_DCA_MEM_SHRINK, + UVERBS_ATTR_IDR(HNS_IB_ATTR_DCA_MEM_SHRINK_HANDLE, + HNS_IB_OBJECT_DCA_MEM, UVERBS_ACCESS_WRITE, + UA_MANDATORY), + UVERBS_ATTR_PTR_IN(HNS_IB_ATTR_DCA_MEM_SHRINK_RESERVED_SIZE, + UVERBS_ATTR_TYPE(u64), UA_MANDATORY), + UVERBS_ATTR_PTR_OUT(HNS_IB_ATTR_DCA_MEM_SHRINK_OUT_FREE_KEY, + UVERBS_ATTR_TYPE(u64), UA_MANDATORY), + UVERBS_ATTR_PTR_OUT(HNS_IB_ATTR_DCA_MEM_SHRINK_OUT_FREE_MEMS, + UVERBS_ATTR_TYPE(u32), UA_MANDATORY)); DECLARE_UVERBS_NAMED_OBJECT(HNS_IB_OBJECT_DCA_MEM, UVERBS_TYPE_ALLOC_IDR(dca_cleanup), &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_REG), - &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_DEREG)); + &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_DEREG), + &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_SHRINK)); static bool dca_is_supported(struct ib_device *device) { diff --git a/drivers/infiniband/hw/hns/hns_roce_dca.h b/drivers/infiniband/hw/hns/hns_roce_dca.h index aad198c..a82ed5e 100644 --- a/drivers/infiniband/hw/hns/hns_roce_dca.h +++ b/drivers/infiniband/hw/hns/hns_roce_dca.h @@ -14,6 +14,13 @@ struct hns_dca_page_state { u32 head : 1; /* This page is the head in a continuous address range. 
 */
 };
+
+struct hns_dca_shrink_resp {
+	u64 free_key;  /* key of the free buffer registered by the user */
+	u32 free_mems; /* count of free buffers that no QP is using */
+};
+
+#define HNS_DCA_INVALID_BUF_ID 0UL
+
 void hns_roce_register_udca(struct hns_roce_dev *hr_dev,
 			    struct hns_roce_ucontext *uctx);
 void hns_roce_unregister_udca(struct hns_roce_dev *hr_dev,
 			      struct hns_roce_ucontext *uctx);
diff --git a/include/uapi/rdma/hns-abi.h b/include/uapi/rdma/hns-abi.h
index c17ec91..bcca5be 100644
--- a/include/uapi/rdma/hns-abi.h
+++ b/include/uapi/rdma/hns-abi.h
@@ -109,6 +109,7 @@ enum hns_ib_objects {
 enum hns_ib_dca_mem_methods {
 	HNS_IB_METHOD_DCA_MEM_REG = (1U << UVERBS_ID_NS_SHIFT),
 	HNS_IB_METHOD_DCA_MEM_DEREG,
+	HNS_IB_METHOD_DCA_MEM_SHRINK,
 };
 enum hns_ib_dca_mem_reg_attrs {
@@ -121,4 +122,12 @@ enum hns_ib_dca_mem_reg_attrs {
 enum hns_ib_dca_mem_dereg_attrs {
 	HNS_IB_ATTR_DCA_MEM_DEREG_HANDLE = (1U << UVERBS_ID_NS_SHIFT),
 };
+
+enum hns_ib_dca_mem_shrink_attrs {
+	HNS_IB_ATTR_DCA_MEM_SHRINK_HANDLE = (1U << UVERBS_ID_NS_SHIFT),
+	HNS_IB_ATTR_DCA_MEM_SHRINK_RESERVED_SIZE,
+	HNS_IB_ATTR_DCA_MEM_SHRINK_OUT_FREE_KEY,
+	HNS_IB_ATTR_DCA_MEM_SHRINK_OUT_FREE_MEMS,
+};
+
 #endif /* HNS_ABI_USER_H */

From patchwork Thu Jul 29 02:19:14 2021
X-Patchwork-Submitter: Wenpeng Liang
X-Patchwork-Id: 12407475
X-Patchwork-Delegate: jgg@ziepe.ca
From: Wenpeng Liang
Subject: [PATCH v4 for-next 03/12] RDMA/hns: Configure DCA mode for the userspace QP
Date: Thu, 29 Jul 2021 10:19:14 +0800
Message-ID: <1627525163-1683-4-git-send-email-liangwenpeng@huawei.com>
In-Reply-To: <1627525163-1683-1-git-send-email-liangwenpeng@huawei.com>
X-Mailing-List: linux-rdma@vger.kernel.org

From: Xi Wang

If the userspace driver assigns NULL to the 'buf_addr' field of
'struct hns_roce_ib_create_qp' when creating a QP, it means the kernel
driver needs to set the QP up in DCA mode. So add a QP capability bit to
the response to indicate to the userspace driver that DCA mode has been
enabled.
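On the userspace side this negotiation amounts to leaving 'buf_addr'
empty in the create command and checking the returned capability bit. A
minimal sketch, assuming only the hns ABI structures from
<rdma/hns-abi.h>; the helper name and the elided create-QP plumbing are
illustrative:

/*
 * Hypothetical userspace sketch: request DCA mode at QP creation time by
 * leaving 'buf_addr' empty, then check the capability bit the kernel
 * returns. The create-QP command plumbing between the steps is elided.
 */
#include <stdbool.h>
#include <rdma/hns-abi.h>

static bool create_qp_in_dca_mode(struct hns_roce_ib_create_qp *ucmd,
				  const struct hns_roce_ib_create_qp_resp *resp)
{
	ucmd->buf_addr = 0; /* no WQE buffer: ask the kernel for DCA mode */

	/* ... issue the create-QP command; the kernel fills 'resp' ... */

	return resp->cap_flags & HNS_ROCE_QP_CAP_DYNAMIC_CTX_ATTACH;
}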
Signed-off-by: Xi Wang
Signed-off-by: Wenpeng Liang
---
 drivers/infiniband/hw/hns/hns_roce_dca.c    |  15 ++++
 drivers/infiniband/hw/hns/hns_roce_dca.h    |   6 +-
 drivers/infiniband/hw/hns/hns_roce_device.h |   5 ++
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c  |  19 ++++-
 drivers/infiniband/hw/hns/hns_roce_qp.c     | 106 ++++++++++++++++++++++------
 include/uapi/rdma/hns-abi.h                 |   1 +
 6 files changed, 127 insertions(+), 25 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_dca.c b/drivers/infiniband/hw/hns/hns_roce_dca.c
index 92256784a..741e009 100644
--- a/drivers/infiniband/hw/hns/hns_roce_dca.c
+++ b/drivers/infiniband/hw/hns/hns_roce_dca.c
@@ -342,6 +342,21 @@ static void free_dca_mem(struct dca_mem *mem)
 	spin_unlock(&mem->lock);
 }
+void hns_roce_enable_dca(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp)
+{
+	struct hns_roce_dca_cfg *cfg = &hr_qp->dca_cfg;
+
+	cfg->buf_id = HNS_DCA_INVALID_BUF_ID;
+}
+
+void hns_roce_disable_dca(struct hns_roce_dev *hr_dev,
+			  struct hns_roce_qp *hr_qp)
+{
+	struct hns_roce_dca_cfg *cfg = &hr_qp->dca_cfg;
+
+	cfg->buf_id = HNS_DCA_INVALID_BUF_ID;
+}
+
 static struct hns_roce_ucontext *
 uverbs_attr_to_hr_uctx(struct uverbs_attr_bundle *attrs)
 {
diff --git a/drivers/infiniband/hw/hns/hns_roce_dca.h b/drivers/infiniband/hw/hns/hns_roce_dca.h
index a82ed5e..a13a2d6 100644
--- a/drivers/infiniband/hw/hns/hns_roce_dca.h
+++ b/drivers/infiniband/hw/hns/hns_roce_dca.h
@@ -14,16 +14,20 @@ struct hns_dca_page_state {
 	u32 head : 1; /* This page is the head in a continuous address range. */
 };
+#define HNS_DCA_INVALID_BUF_ID 0UL
 struct hns_dca_shrink_resp {
 	u64 free_key;  /* key of the free buffer registered by the user */
 	u32 free_mems; /* count of free buffers that no QP is using */
 };
-#define HNS_DCA_INVALID_BUF_ID 0UL
 void hns_roce_register_udca(struct hns_roce_dev *hr_dev,
 			    struct hns_roce_ucontext *uctx);
 void hns_roce_unregister_udca(struct hns_roce_dev *hr_dev,
 			      struct hns_roce_ucontext *uctx);
+void hns_roce_enable_dca(struct hns_roce_dev *hr_dev,
+			 struct hns_roce_qp *hr_qp);
+void hns_roce_disable_dca(struct hns_roce_dev *hr_dev,
+			  struct hns_roce_qp *hr_qp);
 #endif
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index d18bc22..00f80b3 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -331,6 +331,10 @@ struct hns_roce_mtr {
 	struct hns_roce_hem_cfg hem_cfg; /* config for hardware addressing */
 };
+struct hns_roce_dca_cfg {
+	u32 buf_id;
+};
+
 struct hns_roce_mw {
 	struct ib_mw ibmw;
 	u32 pdn;
@@ -631,6 +635,7 @@ struct hns_roce_qp {
 	struct hns_roce_wq sq;
 	struct hns_roce_mtr mtr;
+	struct hns_roce_dca_cfg dca_cfg;
 	u32 buff_size;
 	struct mutex mutex;
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 594d4ce..ced0c44 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -345,6 +345,12 @@ static int set_rwqe_data_seg(struct ib_qp *ibqp, const struct ib_send_wr *wr,
 	return 0;
 }
+static bool check_dca_attach_enable(struct hns_roce_qp *hr_qp)
+{
+	return hr_qp->en_flags & HNS_ROCE_QP_CAP_DYNAMIC_CTX_ATTACH;
+}
+
+
 static int check_send_valid(struct hns_roce_dev *hr_dev,
 			    struct hns_roce_qp *hr_qp)
 {
@@ -4371,6 +4377,17 @@ static int modify_qp_init_to_rtr(struct ib_qp *ibqp,
 	hr_reg_write(context, QPC_TRRL_BA_H, trrl_ba >> (32 + 16 + 4));
 	hr_reg_clear(qpc_mask, QPC_TRRL_BA_H);
+	/* hip09 reused the IRRL_HEAD fields in hip08 */
+	if (hr_dev->pci_dev->revision >= PCI_REVISION_ID_HIP09) {
+		if (check_dca_attach_enable(hr_qp)) {
+			hr_reg_enable(context, QPC_DCA_MODE);
+			hr_reg_clear(qpc_mask, QPC_DCA_MODE);
+		}
+	} else {
+		/* reset IRRL_HEAD */
+		hr_reg_clear(qpc_mask, QPC_V2_IRRL_HEAD);
+	}
+
 	context->irrl_ba = cpu_to_le32(irrl_ba >> 6);
 	qpc_mask->irrl_ba = 0;
 	hr_reg_write(context, QPC_IRRL_BA_H, irrl_ba >> (32 + 6));
@@ -4489,8 +4506,6 @@ static int modify_qp_rtr_to_rts(struct ib_qp *ibqp,
 	hr_reg_write(context, QPC_LSN, 0x100);
 	hr_reg_clear(qpc_mask, QPC_LSN);
-	hr_reg_clear(qpc_mask, QPC_V2_IRRL_HEAD);
-
 	return 0;
 }
diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c
index b101b7e5..91311a0 100644
--- a/drivers/infiniband/hw/hns/hns_roce_qp.c
+++ b/drivers/infiniband/hw/hns/hns_roce_qp.c
@@ -39,6 +39,7 @@
 #include "hns_roce_common.h"
 #include "hns_roce_device.h"
 #include "hns_roce_hem.h"
+#include "hns_roce_dca.h"
 static void flush_work_handle(struct work_struct *work)
 {
@@ -604,8 +605,21 @@ static int set_user_sq_size(struct hns_roce_dev *hr_dev,
 	return 0;
 }
+static bool check_dca_is_enable(struct hns_roce_dev *hr_dev, bool is_user,
+				unsigned long addr)
+{
+	if (!(hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_DCA_MODE))
+		return false;
+
+	/* If the user QP's buffer addr is 0, the DCA mode should be enabled */
+	if (is_user)
+		return !addr;
+
+	return false;
+}
+
 static int set_wqe_buf_attr(struct hns_roce_dev *hr_dev,
-			    struct hns_roce_qp *hr_qp,
+			    struct hns_roce_qp *hr_qp, bool dca_en,
 			    struct hns_roce_buf_attr
*buf_attr) { int buf_size; @@ -649,9 +663,21 @@ static int set_wqe_buf_attr(struct hns_roce_dev *hr_dev, if (hr_qp->buff_size < 1) return -EINVAL; - buf_attr->page_shift = HNS_HW_PAGE_SHIFT + hr_dev->caps.mtt_buf_pg_sz; buf_attr->region_count = idx; + if (dca_en) { + /* + * When enable DCA, there's no need to alloc buffer now, and + * the page shift should be fixed to 4K. + */ + buf_attr->mtt_only = true; + buf_attr->page_shift = HNS_HW_PAGE_SHIFT; + } else { + buf_attr->mtt_only = false; + buf_attr->page_shift = HNS_HW_PAGE_SHIFT + + hr_dev->caps.mtt_buf_pg_sz; + } + return 0; } @@ -748,12 +774,48 @@ static void free_rq_inline_buf(struct hns_roce_qp *hr_qp) kfree(hr_qp->rq_inl_buf.wqe_list); } -static int alloc_qp_buf(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp, +static int alloc_wqe_buf(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp, + bool dca_en, struct hns_roce_buf_attr *buf_attr, + struct ib_udata *udata, unsigned long addr) +{ + struct ib_device *ibdev = &hr_dev->ib_dev; + int ret; + + if (dca_en) { + /* DCA must be enabled after the buffer attr is configured. */ + hns_roce_enable_dca(hr_dev, hr_qp); + + hr_qp->en_flags |= HNS_ROCE_QP_CAP_DYNAMIC_CTX_ATTACH; + } + + ret = hns_roce_mtr_create(hr_dev, &hr_qp->mtr, buf_attr, + HNS_HW_PAGE_SHIFT + hr_dev->caps.mtt_ba_pg_sz, + udata, addr); + if (ret) { + ibdev_err(ibdev, "failed to create WQE mtr, ret = %d.\n", ret); + if (dca_en) + hns_roce_disable_dca(hr_dev, hr_qp); + } + + return ret; +} + +static void free_wqe_buf(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp, + struct ib_udata *udata) +{ + hns_roce_mtr_destroy(hr_dev, &hr_qp->mtr); + + if (hr_qp->en_flags & HNS_ROCE_QP_CAP_DYNAMIC_CTX_ATTACH) + hns_roce_disable_dca(hr_dev, hr_qp); +} + +static int alloc_qp_wqe(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp, struct ib_qp_init_attr *init_attr, struct ib_udata *udata, unsigned long addr) { struct ib_device *ibdev = &hr_dev->ib_dev; struct hns_roce_buf_attr buf_attr = {}; + bool dca_en; int ret; if (!udata && hr_qp->rq_inl_buf.wqe_cnt) { @@ -768,16 +830,16 @@ static int alloc_qp_buf(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp, hr_qp->rq_inl_buf.wqe_list = NULL; } - ret = set_wqe_buf_attr(hr_dev, hr_qp, &buf_attr); + dca_en = check_dca_is_enable(hr_dev, !!udata, addr); + ret = set_wqe_buf_attr(hr_dev, hr_qp, dca_en, &buf_attr); if (ret) { - ibdev_err(ibdev, "failed to split WQE buf, ret = %d.\n", ret); + ibdev_err(ibdev, "failed to set WQE attr, ret = %d.\n", ret); goto err_inline; } - ret = hns_roce_mtr_create(hr_dev, &hr_qp->mtr, &buf_attr, - PAGE_SHIFT + hr_dev->caps.mtt_ba_pg_sz, - udata, addr); + + ret = alloc_wqe_buf(hr_dev, hr_qp, dca_en, &buf_attr, udata, addr); if (ret) { - ibdev_err(ibdev, "failed to create WQE mtr, ret = %d.\n", ret); + ibdev_err(ibdev, "failed to alloc WQE buf, ret = %d.\n", ret); goto err_inline; } @@ -788,9 +850,10 @@ static int alloc_qp_buf(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp, return ret; } -static void free_qp_buf(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp) +static void free_qp_wqe(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp, + struct ib_udata *udata) { - hns_roce_mtr_destroy(hr_dev, &hr_qp->mtr); + free_wqe_buf(hr_dev, hr_qp, udata); free_rq_inline_buf(hr_qp); } @@ -848,7 +911,6 @@ static int alloc_qp_db(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp, goto err_out; } hr_qp->en_flags |= HNS_ROCE_QP_CAP_SQ_RECORD_DB; - resp->cap_flags |= HNS_ROCE_QP_CAP_SQ_RECORD_DB; } if (user_qp_has_rdb(hr_dev, init_attr, 
udata, resp)) {
@@ -861,7 +923,6 @@ static int alloc_qp_db(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp,
 			goto err_sdb;
 		}
 		hr_qp->en_flags |= HNS_ROCE_QP_CAP_RQ_RECORD_DB;
-		resp->cap_flags |= HNS_ROCE_QP_CAP_RQ_RECORD_DB;
 		}
 	} else {
 		if (hr_dev->pci_dev->revision >= PCI_REVISION_ID_HIP09)
@@ -1040,18 +1101,18 @@ static int hns_roce_create_qp_common(struct hns_roce_dev *hr_dev,
 		}
 	}
-	ret = alloc_qp_buf(hr_dev, hr_qp, init_attr, udata, ucmd.buf_addr);
-	if (ret) {
-		ibdev_err(ibdev, "failed to alloc QP buffer, ret = %d.\n", ret);
-		goto err_buf;
-	}
-
 	ret = alloc_qpn(hr_dev, hr_qp);
 	if (ret) {
 		ibdev_err(ibdev, "failed to alloc QPN, ret = %d.\n", ret);
 		goto err_qpn;
 	}
+	ret = alloc_qp_wqe(hr_dev, hr_qp, init_attr, udata, ucmd.buf_addr);
+	if (ret) {
+		ibdev_err(ibdev, "failed to alloc QP buffer, ret = %d.\n", ret);
+		goto err_buf;
+	}
+
 	ret = alloc_qp_db(hr_dev, hr_qp, init_attr, udata, &ucmd, &resp);
 	if (ret) {
 		ibdev_err(ibdev, "failed to alloc QP doorbell, ret = %d.\n",
@@ -1073,6 +1134,7 @@ static int hns_roce_create_qp_common(struct hns_roce_dev *hr_dev,
 	}
 	if (udata) {
+		resp.cap_flags = hr_qp->en_flags;
 		ret = ib_copy_to_udata(udata, &resp,
 				       min(udata->outlen, sizeof(resp)));
 		if (ret) {
@@ -1101,10 +1163,10 @@ static int hns_roce_create_qp_common(struct hns_roce_dev *hr_dev,
 err_qpc:
 	free_qp_db(hr_dev, hr_qp, udata);
 err_db:
+	free_qp_wqe(hr_dev, hr_qp, udata);
+err_buf:
 	free_qpn(hr_dev, hr_qp);
 err_qpn:
-	free_qp_buf(hr_dev, hr_qp);
-err_buf:
 	free_kernel_wrid(hr_qp);
 	return ret;
 }
@@ -1118,7 +1180,7 @@ void hns_roce_qp_destroy(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp,
 	free_qpc(hr_dev, hr_qp);
 	free_qpn(hr_dev, hr_qp);
-	free_qp_buf(hr_dev, hr_qp);
+	free_qp_wqe(hr_dev, hr_qp, udata);
 	free_kernel_wrid(hr_qp);
 	free_qp_db(hr_dev, hr_qp, udata);
diff --git a/include/uapi/rdma/hns-abi.h b/include/uapi/rdma/hns-abi.h
index bcca5be..4452b17 100644
--- a/include/uapi/rdma/hns-abi.h
+++ b/include/uapi/rdma/hns-abi.h
@@ -77,6 +77,7 @@ enum hns_roce_qp_cap_flags {
 	HNS_ROCE_QP_CAP_RQ_RECORD_DB = 1 << 0,
 	HNS_ROCE_QP_CAP_SQ_RECORD_DB = 1 << 1,
 	HNS_ROCE_QP_CAP_OWNER_DB = 1 << 2,
+	HNS_ROCE_QP_CAP_DYNAMIC_CTX_ATTACH = 1 << 4,
 };
 struct hns_roce_ib_create_qp_resp {

From patchwork Thu Jul 29 02:19:15 2021
X-Patchwork-Submitter: Wenpeng Liang
X-Patchwork-Id: 12407469
X-Patchwork-Delegate: jgg@ziepe.ca
From: Wenpeng Liang
Subject: [PATCH v4 for-next 04/12] RDMA/hns: Refactor QP modify flow
Date: Thu, 29 Jul 2021 10:19:15 +0800
Message-ID: <1627525163-1683-5-git-send-email-liangwenpeng@huawei.com>
In-Reply-To: <1627525163-1683-1-git-send-email-liangwenpeng@huawei.com>
X-Mailing-List: linux-rdma@vger.kernel.org

From: Xi Wang

Wrap the QP modify checking logic in a function to make the code more
readable.

Signed-off-by: Xi Wang
Signed-off-by: Wenpeng Liang
---
 drivers/infiniband/hw/hns/hns_roce_qp.c | 50 ++++++++++++++++++++++-----------
 1 file changed, 34 insertions(+), 16 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c
index 91311a0..4b8e850 100644
--- a/drivers/infiniband/hw/hns/hns_roce_qp.c
+++ b/drivers/infiniband/hw/hns/hns_roce_qp.c
@@ -1343,21 +1343,17 @@ static int hns_roce_check_qp_attr(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 	return 0;
 }
-int hns_roce_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
-		       int attr_mask, struct ib_udata *udata)
+static int check_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
+			   int attr_mask, enum ib_qp_state cur_state,
+			   enum ib_qp_state new_state)
 {
-	struct hns_roce_dev *hr_dev = to_hr_dev(ibqp->device);
 	struct hns_roce_qp *hr_qp = to_hr_qp(ibqp);
-	enum ib_qp_state cur_state, new_state;
-	int ret = -EINVAL;
-
-	mutex_lock(&hr_qp->mutex);
-
-	if (attr_mask & IB_QP_CUR_STATE && attr->cur_qp_state != hr_qp->state)
-		goto out;
+	int ret;
-	cur_state = hr_qp->state;
-	new_state = attr_mask & IB_QP_STATE ? attr->qp_state : cur_state;
+	if (attr_mask & IB_QP_CUR_STATE && attr->cur_qp_state != hr_qp->state) {
+		ibdev_err(ibqp->device, "failed to check modify curr state\n");
+		return -EINVAL;
+	}
 	if (ibqp->uobject &&
 	    (attr_mask & IB_QP_STATE) && new_state == IB_QPS_ERR) {
@@ -1367,19 +1363,41 @@ int hns_roce_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 		if (hr_qp->en_flags & HNS_ROCE_QP_CAP_RQ_RECORD_DB)
 			hr_qp->rq.head = *(int *)(hr_qp->rdb.virt_addr);
 	} else {
-		ibdev_warn(&hr_dev->ib_dev,
+		ibdev_warn(ibqp->device,
 			   "flush cqe is not supported in userspace!\n");
-		goto out;
+		return -EINVAL;
 	}
 }
 	if (!ib_modify_qp_is_ok(cur_state, new_state, ibqp->qp_type,
 				attr_mask)) {
-		ibdev_err(&hr_dev->ib_dev, "ib_modify_qp_is_ok failed\n");
-		goto out;
+		ibdev_err(ibqp->device, "failed to check modify qp state\n");
+		return -EINVAL;
 	}
 	ret = hns_roce_check_qp_attr(ibqp, attr, attr_mask);
+	if (ret) {
+		ibdev_err(ibqp->device, "failed to check modify qp attr\n");
+		return ret;
+	}
+
+	return 0;
+}
+
+int hns_roce_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
+		       int attr_mask, struct ib_udata *udata)
+{
+	struct hns_roce_dev *hr_dev = to_hr_dev(ibqp->device);
+	struct hns_roce_qp *hr_qp = to_hr_qp(ibqp);
+	enum ib_qp_state cur_state, new_state;
+	int ret;
+
+	mutex_lock(&hr_qp->mutex);
+
+	cur_state = hr_qp->state;
+	new_state = attr_mask & IB_QP_STATE ? attr->qp_state : cur_state;
+
+	ret = check_modify_qp(ibqp, attr, attr_mask, cur_state, new_state);
 	if (ret)
 		goto out;

From patchwork Thu Jul 29 02:19:16 2021
X-Patchwork-Submitter: Wenpeng Liang
X-Patchwork-Id: 12407487
X-Patchwork-Delegate: jgg@ziepe.ca
From: Wenpeng Liang
Subject: [PATCH v4 for-next 05/12] RDMA/hns: Add method for attaching WQE buffer
Date: Thu, 29 Jul 2021 10:19:16 +0800
Message-ID: <1627525163-1683-6-git-send-email-liangwenpeng@huawei.com>
In-Reply-To: <1627525163-1683-1-git-send-email-liangwenpeng@huawei.com>
X-Mailing-List: linux-rdma@vger.kernel.org

From: Xi Wang

If a uQP works in DCA mode, the userspace driver needs to configure the
WQE buffer by calling the 'HNS_IB_METHOD_DCA_MEM_ATTACH' method before
filling in WQEs. This method allocates a group of pages from the DCA
memory pool and writes the addressing configuration to the QPC.
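Before posting WRs, the userspace driver would attach a buffer with the
QP's current SQ/SGE/RQ offsets and learn how many pages were allocated.
A minimal sketch under the same rdma-core ioctl assumptions as the
earlier examples; the wrapper name is illustrative, not libhns code:

/*
 * Hypothetical userspace sketch: attach WQE buffer pages to a DCA QP.
 * On success, *alloc_flags carries HNS_IB_ATTACH_FLAGS_NEW_BUFFER when
 * a new buffer was attached; *alloc_pages == 0 means the pool was too
 * small and should be grown (register more mem) before retrying.
 */
static int hns_dca_mem_attach(struct ibv_context *ctx, struct ibv_qp *qp,
			      __u32 sq_off, __u32 sge_off, __u32 rq_off,
			      __u32 *alloc_flags, __u32 *alloc_pages)
{
	DECLARE_COMMAND_BUFFER(cmd, HNS_IB_OBJECT_DCA_MEM,
			       HNS_IB_METHOD_DCA_MEM_ATTACH, 6);

	fill_attr_in_obj(cmd, HNS_IB_ATTR_DCA_MEM_ATTACH_HANDLE, qp->handle);
	fill_attr_in_uint32(cmd, HNS_IB_ATTR_DCA_MEM_ATTACH_SQ_OFFSET, sq_off);
	fill_attr_in_uint32(cmd, HNS_IB_ATTR_DCA_MEM_ATTACH_SGE_OFFSET,
			    sge_off);
	fill_attr_in_uint32(cmd, HNS_IB_ATTR_DCA_MEM_ATTACH_RQ_OFFSET, rq_off);
	fill_attr_out_ptr(cmd, HNS_IB_ATTR_DCA_MEM_ATTACH_OUT_ALLOC_FLAGS,
			  alloc_flags);
	fill_attr_out_ptr(cmd, HNS_IB_ATTR_DCA_MEM_ATTACH_OUT_ALLOC_PAGES,
			  alloc_pages);

	return execute_ioctl(ctx, cmd);
}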
Signed-off-by: Xi Wang
Signed-off-by: Wenpeng Liang
---
 drivers/infiniband/hw/hns/hns_roce_dca.c    | 494 +++++++++++++++++++++++++++-
 drivers/infiniband/hw/hns/hns_roce_dca.h    |  25 ++
 drivers/infiniband/hw/hns/hns_roce_device.h |  10 +
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c  |  29 +-
 include/uapi/rdma/hns-abi.h                 |  11 +
 5 files changed, 557 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_dca.c b/drivers/infiniband/hw/hns/hns_roce_dca.c
index 741e009..45fe163 100644
--- a/drivers/infiniband/hw/hns/hns_roce_dca.c
+++ b/drivers/infiniband/hw/hns/hns_roce_dca.c
@@ -35,7 +35,40 @@ struct dca_mem_attr {
 	u32 size;
 };
+static inline void set_dca_page_to_free(struct hns_dca_page_state *state)
+{
+	state->buf_id = HNS_DCA_INVALID_BUF_ID;
+	state->active = 0;
+	state->lock = 0;
+}
+
+static inline void lock_dca_page_to_attach(struct hns_dca_page_state *state,
+					   u32 buf_id)
+{
+	state->buf_id = HNS_DCA_ID_MASK & buf_id;
+	state->active = 0;
+	state->lock = 1;
+}
+
+static inline void unlock_dca_page_to_active(struct hns_dca_page_state *state,
+					     u32 buf_id)
+{
+	state->buf_id = HNS_DCA_ID_MASK & buf_id;
+	state->active = 1;
+	state->lock = 0;
+}
+
 #define dca_page_is_free(s) ((s)->buf_id == HNS_DCA_INVALID_BUF_ID)
+
+/* only the owner bits need to be matched.
*/ +#define dca_page_is_attached(s, id) \ + ((HNS_DCA_OWN_MASK & (id)) == (HNS_DCA_OWN_MASK & (s)->buf_id)) + +#define dca_page_is_allocated(s, id) \ + (dca_page_is_attached(s, id) && (s)->lock) + +#define dca_page_is_inactive(s) (!(s)->lock && !(s)->active) + #define dca_mem_is_available(m) \ ((m)->flags == (DCA_MEM_FLAGS_ALLOCED | DCA_MEM_FLAGS_REGISTERED)) @@ -342,11 +375,408 @@ static void free_dca_mem(struct dca_mem *mem) spin_unlock(&mem->lock); } +static inline struct hns_roce_dca_ctx *hr_qp_to_dca_ctx(struct hns_roce_qp *qp) +{ + return to_hr_dca_ctx(to_hr_ucontext(qp->ibqp.pd->uobject->context)); +} + +struct dca_page_clear_attr { + u32 buf_id; + u32 max_pages; + u32 clear_pages; +}; + +static int clear_dca_pages_proc(struct dca_mem *mem, int index, void *param) +{ + struct hns_dca_page_state *state = &mem->states[index]; + struct dca_page_clear_attr *attr = param; + + if (dca_page_is_attached(state, attr->buf_id)) { + set_dca_page_to_free(state); + attr->clear_pages++; + } + + if (attr->clear_pages >= attr->max_pages) + return DCA_MEM_STOP_ITERATE; + else + return 0; +} + +static void clear_dca_pages(struct hns_roce_dca_ctx *ctx, u32 buf_id, u32 count) +{ + struct dca_page_clear_attr attr = {}; + + attr.buf_id = buf_id; + attr.max_pages = count; + travel_dca_pages(ctx, &attr, clear_dca_pages_proc); +} + +struct dca_page_assign_attr { + u32 buf_id; + int unit; + int total; + int max; +}; + +static bool dca_page_is_allocable(struct hns_dca_page_state *state, bool head) +{ + bool is_free = dca_page_is_free(state) || dca_page_is_inactive(state); + + return head ? is_free : is_free && !state->head; +} + +static int assign_dca_pages_proc(struct dca_mem *mem, int index, void *param) +{ + struct dca_page_assign_attr *attr = param; + struct hns_dca_page_state *state; + int checked_pages = 0; + int start_index = 0; + int free_pages = 0; + int i; + + /* Check the continuous pages count is not smaller than unit count */ + for (i = index; free_pages < attr->unit && i < mem->page_count; i++) { + checked_pages++; + state = &mem->states[i]; + if (dca_page_is_allocable(state, free_pages == 0)) { + if (free_pages == 0) + start_index = i; + + free_pages++; + } else { + free_pages = 0; + } + } + + if (free_pages < attr->unit) + return DCA_MEM_NEXT_ITERATE; + + for (i = 0; i < free_pages; i++) { + state = &mem->states[start_index + i]; + lock_dca_page_to_attach(state, attr->buf_id); + attr->total++; + } + + if (attr->total >= attr->max) + return DCA_MEM_STOP_ITERATE; + + return checked_pages; +} + +static u32 assign_dca_pages(struct hns_roce_dca_ctx *ctx, u32 buf_id, u32 count, + u32 unit) +{ + struct dca_page_assign_attr attr = {}; + + attr.buf_id = buf_id; + attr.unit = unit; + attr.max = count; + travel_dca_pages(ctx, &attr, assign_dca_pages_proc); + return attr.total; +} + +struct dca_page_active_attr { + u32 buf_id; + u32 max_pages; + u32 alloc_pages; + u32 dirty_mems; +}; + +static int active_dca_pages_proc(struct dca_mem *mem, int index, void *param) +{ + struct dca_page_active_attr *attr = param; + struct hns_dca_page_state *state; + bool changed = false; + bool stop = false; + int i, free_pages; + + free_pages = 0; + for (i = 0; !stop && i < mem->page_count; i++) { + state = &mem->states[i]; + if (dca_page_is_free(state)) { + free_pages++; + } else if (dca_page_is_allocated(state, attr->buf_id)) { + free_pages++; + /* Change matched pages state */ + unlock_dca_page_to_active(state, attr->buf_id); + changed = true; + attr->alloc_pages++; + if (attr->alloc_pages == attr->max_pages) + stop = 
true; + } + } + + for (; changed && i < mem->page_count; i++) + if (dca_page_is_free(state)) + free_pages++; + + /* Clean mem changed to dirty */ + if (changed && free_pages == mem->page_count) + attr->dirty_mems++; + + return stop ? DCA_MEM_STOP_ITERATE : DCA_MEM_NEXT_ITERATE; +} + +static u32 active_dca_pages(struct hns_roce_dca_ctx *ctx, u32 buf_id, u32 count) +{ + struct dca_page_active_attr attr = {}; + unsigned long flags; + + attr.buf_id = buf_id; + attr.max_pages = count; + travel_dca_pages(ctx, &attr, active_dca_pages_proc); + + /* Update free size */ + spin_lock_irqsave(&ctx->pool_lock, flags); + ctx->free_mems -= attr.dirty_mems; + ctx->free_size -= attr.alloc_pages << HNS_HW_PAGE_SHIFT; + spin_unlock_irqrestore(&ctx->pool_lock, flags); + + return attr.alloc_pages; +} + +struct dca_get_alloced_pages_attr { + u32 buf_id; + dma_addr_t *pages; + u32 total; + u32 max; +}; + +static int get_alloced_umem_proc(struct dca_mem *mem, int index, void *param) + +{ + struct dca_get_alloced_pages_attr *attr = param; + struct hns_dca_page_state *states = mem->states; + struct ib_umem *umem = mem->pages; + struct ib_block_iter biter; + u32 i = 0; + + rdma_for_each_block(umem->sg_head.sgl, &biter, umem->nmap, + HNS_HW_PAGE_SIZE) { + if (dca_page_is_allocated(&states[i], attr->buf_id)) { + attr->pages[attr->total++] = + rdma_block_iter_dma_address(&biter); + if (attr->total >= attr->max) + return DCA_MEM_STOP_ITERATE; + } + i++; + } + + return DCA_MEM_NEXT_ITERATE; +} + +static int config_dca_qpc(struct hns_roce_dev *hr_dev, + struct hns_roce_qp *hr_qp, dma_addr_t *pages, + int page_count) +{ + struct ib_device *ibdev = &hr_dev->ib_dev; + struct hns_roce_mtr *mtr = &hr_qp->mtr; + int ret; + + ret = hns_roce_mtr_map(hr_dev, mtr, pages, page_count); + if (ret) { + ibdev_err(ibdev, "failed to map DCA pages, ret = %d.\n", ret); + return ret; + } + + if (hr_dev->hw->set_dca_buf) { + ret = hr_dev->hw->set_dca_buf(hr_dev, hr_qp); + if (ret) { + ibdev_err(ibdev, "failed to set DCA to HW, ret = %d.\n", + ret); + return ret; + } + } + + return 0; +} + +static int setup_dca_buf_to_hw(struct hns_roce_dev *hr_dev, + struct hns_roce_qp *hr_qp, + struct hns_roce_dca_ctx *ctx, u32 buf_id, + u32 count) +{ + struct dca_get_alloced_pages_attr attr = {}; + dma_addr_t *pages; + int ret; + + /* alloc a tmp array to store buffer's dma address */ + pages = kvcalloc(count, sizeof(dma_addr_t), GFP_ATOMIC); + if (!pages) + return -ENOMEM; + + attr.buf_id = buf_id; + attr.pages = pages; + attr.max = count; + + if (hr_qp->ibqp.uobject) + travel_dca_pages(ctx, &attr, get_alloced_umem_proc); + + if (attr.total != count) { + ibdev_err(&hr_dev->ib_dev, "failed to get DCA page %u != %u.\n", + attr.total, count); + ret = -ENOMEM; + goto err_get_pages; + } + + ret = config_dca_qpc(hr_dev, hr_qp, pages, count); +err_get_pages: + /* drop tmp array */ + kvfree(pages); + + return ret; +} + +static int sync_dca_buf_offset(struct hns_roce_dev *hr_dev, + struct hns_roce_qp *hr_qp, + struct hns_dca_attach_attr *attr) +{ + struct ib_device *ibdev = &hr_dev->ib_dev; + + if (hr_qp->sq.wqe_cnt > 0) { + if (attr->sq_offset >= hr_qp->sge.offset) { + ibdev_err(ibdev, "failed to check SQ offset = %u\n", + attr->sq_offset); + return -EINVAL; + } + hr_qp->sq.wqe_offset = hr_qp->sq.offset + attr->sq_offset; + } + + if (hr_qp->sge.sge_cnt > 0) { + if (attr->sge_offset >= hr_qp->rq.offset) { + ibdev_err(ibdev, "failed to check exSGE offset = %u\n", + attr->sge_offset); + return -EINVAL; + } + hr_qp->sge.wqe_offset = hr_qp->sge.offset + 
attr->sge_offset; + } + + if (hr_qp->rq.wqe_cnt > 0) { + if (attr->rq_offset >= hr_qp->buff_size) { + ibdev_err(ibdev, "failed to check RQ offset = %u\n", + attr->rq_offset); + return -EINVAL; + } + hr_qp->rq.wqe_offset = hr_qp->rq.offset + attr->rq_offset; + } + + return 0; +} + +static u32 alloc_buf_from_dca_mem(struct hns_roce_qp *hr_qp, + struct hns_roce_dca_ctx *ctx) +{ + u32 buf_pages, unit_pages, alloc_pages; + u32 buf_id; + + buf_pages = hr_qp->dca_cfg.npages; + /* Gen new buf id */ + buf_id = HNS_DCA_TO_BUF_ID(hr_qp->qpn, hr_qp->dca_cfg.attach_count); + + /* Assign pages from free pages */ + unit_pages = hr_qp->mtr.hem_cfg.is_direct ? buf_pages : 1; + alloc_pages = assign_dca_pages(ctx, buf_id, buf_pages, unit_pages); + if (buf_pages != alloc_pages) { + if (alloc_pages > 0) + clear_dca_pages(ctx, buf_id, alloc_pages); + return HNS_DCA_INVALID_BUF_ID; + } + + return buf_id; +} + +static int active_alloced_buf(struct hns_roce_qp *hr_qp, + struct hns_roce_dca_ctx *ctx, + struct hns_dca_attach_attr *attr, u32 buf_id) +{ + struct hns_roce_dev *hr_dev = to_hr_dev(hr_qp->ibqp.device); + struct ib_device *ibdev = &hr_dev->ib_dev; + u32 active_pages, alloc_pages; + int ret; + + alloc_pages = hr_qp->dca_cfg.npages; + ret = sync_dca_buf_offset(hr_dev, hr_qp, attr); + if (ret) { + ibdev_err(ibdev, "failed to sync DCA offset, ret = %d\n", ret); + goto active_fail; + } + + ret = setup_dca_buf_to_hw(hr_dev, hr_qp, ctx, buf_id, alloc_pages); + if (ret) { + ibdev_err(ibdev, "failed to setup DCA buf, ret = %d.\n", ret); + goto active_fail; + } + + active_pages = active_dca_pages(ctx, buf_id, alloc_pages); + if (active_pages != alloc_pages) { + ibdev_err(ibdev, "failed to active DCA pages, %u != %u.\n", + active_pages, alloc_pages); + ret = -ENOBUFS; + goto active_fail; + } + + return 0; +active_fail: + clear_dca_pages(ctx, buf_id, alloc_pages); + return ret; +} + +static int attach_dca_mem(struct hns_roce_dev *hr_dev, + struct hns_roce_qp *hr_qp, + struct hns_dca_attach_attr *attr, + struct hns_dca_attach_resp *resp) +{ + struct hns_roce_dca_ctx *ctx = hr_qp_to_dca_ctx(hr_qp); + struct hns_roce_dca_cfg *cfg = &hr_qp->dca_cfg; + u32 buf_id; + int ret; + + resp->alloc_flags = 0; + spin_lock(&cfg->lock); + buf_id = cfg->buf_id; + /* Already attached */ + if (buf_id != HNS_DCA_INVALID_BUF_ID) { + resp->alloc_pages = cfg->npages; + spin_unlock(&cfg->lock); + return 0; + } + + /* Start to new attach */ + resp->alloc_pages = 0; + buf_id = alloc_buf_from_dca_mem(hr_qp, ctx); + if (buf_id == HNS_DCA_INVALID_BUF_ID) { + spin_unlock(&cfg->lock); + /* No report fail, need try again after the pool increased */ + return 0; + } + + ret = active_alloced_buf(hr_qp, ctx, attr, buf_id); + if (ret) { + spin_unlock(&cfg->lock); + ibdev_err(&hr_dev->ib_dev, + "failed to active DCA buf for QP-%lu, ret = %d.\n", + hr_qp->qpn, ret); + return ret; + } + + /* Attach ok */ + cfg->buf_id = buf_id; + cfg->attach_count++; + spin_unlock(&cfg->lock); + + resp->alloc_flags |= HNS_IB_ATTACH_FLAGS_NEW_BUFFER; + resp->alloc_pages = cfg->npages; + + return 0; +} + void hns_roce_enable_dca(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp) { struct hns_roce_dca_cfg *cfg = &hr_qp->dca_cfg; + spin_lock_init(&cfg->lock); cfg->buf_id = HNS_DCA_INVALID_BUF_ID; + cfg->npages = hr_qp->buff_size >> HNS_HW_PAGE_SHIFT; } void hns_roce_disable_dca(struct hns_roce_dev *hr_dev, @@ -471,11 +901,73 @@ DECLARE_UVERBS_NAMED_METHOD( UVERBS_ATTR_TYPE(u64), UA_MANDATORY), UVERBS_ATTR_PTR_OUT(HNS_IB_ATTR_DCA_MEM_SHRINK_OUT_FREE_MEMS, 
UVERBS_ATTR_TYPE(u32), UA_MANDATORY)); + +static inline struct hns_roce_qp * +uverbs_attr_to_hr_qp(struct uverbs_attr_bundle *attrs) +{ + struct ib_uobject *uobj = + uverbs_attr_get_uobject(attrs, 1U << UVERBS_ID_NS_SHIFT); + + if (uobj_get_object_id(uobj) == UVERBS_OBJECT_QP) + return to_hr_qp(uobj->object); + + return NULL; +} + +static int UVERBS_HANDLER(HNS_IB_METHOD_DCA_MEM_ATTACH)( + struct uverbs_attr_bundle *attrs) +{ + struct hns_roce_qp *hr_qp = uverbs_attr_to_hr_qp(attrs); + struct hns_dca_attach_attr attr = {}; + struct hns_dca_attach_resp resp = {}; + int ret; + + if (!hr_qp) + return -EINVAL; + + if (uverbs_copy_from(&attr.sq_offset, attrs, + HNS_IB_ATTR_DCA_MEM_ATTACH_SQ_OFFSET) || + uverbs_copy_from(&attr.sge_offset, attrs, + HNS_IB_ATTR_DCA_MEM_ATTACH_SGE_OFFSET) || + uverbs_copy_from(&attr.rq_offset, attrs, + HNS_IB_ATTR_DCA_MEM_ATTACH_RQ_OFFSET)) + return -EFAULT; + + ret = attach_dca_mem(to_hr_dev(hr_qp->ibqp.device), hr_qp, &attr, + &resp); + if (ret) + return ret; + + if (uverbs_copy_to(attrs, HNS_IB_ATTR_DCA_MEM_ATTACH_OUT_ALLOC_FLAGS, + &resp.alloc_flags, sizeof(resp.alloc_flags)) || + uverbs_copy_to(attrs, HNS_IB_ATTR_DCA_MEM_ATTACH_OUT_ALLOC_PAGES, + &resp.alloc_pages, sizeof(resp.alloc_pages))) + return -EFAULT; + + return 0; +} + +DECLARE_UVERBS_NAMED_METHOD( + HNS_IB_METHOD_DCA_MEM_ATTACH, + UVERBS_ATTR_IDR(HNS_IB_ATTR_DCA_MEM_ATTACH_HANDLE, UVERBS_OBJECT_QP, + UVERBS_ACCESS_WRITE, UA_MANDATORY), + UVERBS_ATTR_PTR_IN(HNS_IB_ATTR_DCA_MEM_ATTACH_SQ_OFFSET, + UVERBS_ATTR_TYPE(u32), UA_MANDATORY), + UVERBS_ATTR_PTR_IN(HNS_IB_ATTR_DCA_MEM_ATTACH_SGE_OFFSET, + UVERBS_ATTR_TYPE(u32), UA_MANDATORY), + UVERBS_ATTR_PTR_IN(HNS_IB_ATTR_DCA_MEM_ATTACH_RQ_OFFSET, + UVERBS_ATTR_TYPE(u32), UA_MANDATORY), + UVERBS_ATTR_PTR_OUT(HNS_IB_ATTR_DCA_MEM_ATTACH_OUT_ALLOC_FLAGS, + UVERBS_ATTR_TYPE(u32), UA_MANDATORY), + UVERBS_ATTR_PTR_OUT(HNS_IB_ATTR_DCA_MEM_ATTACH_OUT_ALLOC_PAGES, + UVERBS_ATTR_TYPE(u32), UA_MANDATORY)); + DECLARE_UVERBS_NAMED_OBJECT(HNS_IB_OBJECT_DCA_MEM, UVERBS_TYPE_ALLOC_IDR(dca_cleanup), &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_REG), &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_DEREG), - &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_SHRINK)); + &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_SHRINK), + &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_ATTACH)); static bool dca_is_supported(struct ib_device *device) { diff --git a/drivers/infiniband/hw/hns/hns_roce_dca.h b/drivers/infiniband/hw/hns/hns_roce_dca.h index a13a2d6..d66cb74 100644 --- a/drivers/infiniband/hw/hns/hns_roce_dca.h +++ b/drivers/infiniband/hw/hns/hns_roce_dca.h @@ -21,6 +21,31 @@ struct hns_dca_shrink_resp { }; +/* + * buffer id(29b) = tag(7b) + owner(22b) + * [28:22] tag : indicate the QP config update times. + * [21: 0] owner: indicate the QP to which the page belongs. 
+ */ +#define HNS_DCA_ID_MASK GENMASK(28, 0) +#define HNS_DCA_TAG_MASK GENMASK(28, 22) +#define HNS_DCA_OWN_MASK GENMASK(21, 0) + +#define HNS_DCA_BUF_ID_TO_TAG(buf_id) (((buf_id) & HNS_DCA_TAG_MASK) >> 22) +#define HNS_DCA_BUF_ID_TO_QPN(buf_id) ((buf_id) & HNS_DCA_OWN_MASK) +#define HNS_DCA_TO_BUF_ID(qpn, tag) (((qpn) & HNS_DCA_OWN_MASK) | \ + (((tag) << 22) & HNS_DCA_TAG_MASK)) + +struct hns_dca_attach_attr { + u32 sq_offset; + u32 sge_offset; + u32 rq_offset; +}; + +struct hns_dca_attach_resp { + u32 alloc_flags; + u32 alloc_pages; +}; + void hns_roce_register_udca(struct hns_roce_dev *hr_dev, struct hns_roce_ucontext *uctx); void hns_roce_unregister_udca(struct hns_roce_dev *hr_dev, diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h index 00f80b3..50dc894 100644 --- a/drivers/infiniband/hw/hns/hns_roce_device.h +++ b/drivers/infiniband/hw/hns/hns_roce_device.h @@ -332,9 +332,13 @@ struct hns_roce_mtr { }; struct hns_roce_dca_cfg { + spinlock_t lock; u32 buf_id; + u16 attach_count; + u32 npages; }; + struct hns_roce_mw { struct ib_mw ibmw; u32 pdn; @@ -375,6 +379,7 @@ struct hns_roce_wq { u32 max_gs; u32 rsv_sge; int offset; + int wqe_offset; int wqe_shift; /* WQE size */ u32 head; u32 tail; @@ -385,6 +390,7 @@ struct hns_roce_sge { unsigned int sge_cnt; /* SGE num */ int offset; int sge_shift; /* SGE size */ + int wqe_offset; }; struct hns_roce_buf_list { @@ -924,6 +930,10 @@ struct hns_roce_hw { int (*clear_hem)(struct hns_roce_dev *hr_dev, struct hns_roce_hem_table *table, int obj, int step_idx); + int (*set_dca_buf)(struct hns_roce_dev *hr_dev, + struct hns_roce_qp *hr_qp); + int (*query_qp)(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr, + int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr); int (*modify_qp)(struct ib_qp *ibqp, const struct ib_qp_attr *attr, int attr_mask, enum ib_qp_state cur_state, enum ib_qp_state new_state); diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index ced0c44..b31b493 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -4195,8 +4195,8 @@ static int config_qp_rq_buf(struct hns_roce_dev *hr_dev, int count; /* Search qp buf's mtts */ - count = hns_roce_mtr_find(hr_dev, &hr_qp->mtr, hr_qp->rq.offset, mtts, - MTT_MIN_COUNT, &wqe_sge_ba); + count = hns_roce_mtr_find(hr_dev, &hr_qp->mtr, hr_qp->rq.wqe_offset, + mtts, ARRAY_SIZE(mtts), &wqe_sge_ba); if (hr_qp->rq.wqe_cnt && count < 1) { ibdev_err(&hr_dev->ib_dev, "failed to find RQ WQE, QPN = 0x%lx.\n", hr_qp->qpn); @@ -4246,12 +4246,15 @@ static int config_qp_rq_buf(struct hns_roce_dev *hr_dev, upper_32_bits(to_hr_hw_page_addr(mtts[0]))); hr_reg_clear(qpc_mask, QPC_RQ_CUR_BLK_ADDR_H); - context->rq_nxt_blk_addr = cpu_to_le32(to_hr_hw_page_addr(mtts[1])); - qpc_mask->rq_nxt_blk_addr = 0; - - hr_reg_write(context, QPC_RQ_NXT_BLK_ADDR_H, - upper_32_bits(to_hr_hw_page_addr(mtts[1]))); - hr_reg_clear(qpc_mask, QPC_RQ_NXT_BLK_ADDR_H); + // The rq next block address is only valid for HIP08 QPC. 
+ if (hr_dev->pci_dev->revision == PCI_REVISION_ID_HIP08) { + context->rq_nxt_blk_addr = + cpu_to_le32(to_hr_hw_page_addr(mtts[1])); + qpc_mask->rq_nxt_blk_addr = 0; + hr_reg_write(context, QPC_RQ_NXT_BLK_ADDR_H, + upper_32_bits(to_hr_hw_page_addr(mtts[1]))); + hr_reg_clear(qpc_mask, QPC_RQ_NXT_BLK_ADDR_H); + } return 0; } @@ -4267,7 +4270,8 @@ static int config_qp_sq_buf(struct hns_roce_dev *hr_dev, int count; /* search qp buf's mtts */ - count = hns_roce_mtr_find(hr_dev, &hr_qp->mtr, 0, &sq_cur_blk, 1, NULL); + count = hns_roce_mtr_find(hr_dev, &hr_qp->mtr, hr_qp->sq.wqe_offset, + &sq_cur_blk, 1, NULL); if (count < 1) { ibdev_err(ibdev, "failed to find QP(0x%lx) SQ buf.\n", hr_qp->qpn); @@ -4275,8 +4279,8 @@ static int config_qp_sq_buf(struct hns_roce_dev *hr_dev, } if (hr_qp->sge.sge_cnt > 0) { count = hns_roce_mtr_find(hr_dev, &hr_qp->mtr, - hr_qp->sge.offset, - &sge_cur_blk, 1, NULL); + hr_qp->sge.wqe_offset, &sge_cur_blk, + 1, NULL); if (count < 1) { ibdev_err(ibdev, "failed to find QP(0x%lx) SGE buf.\n", hr_qp->qpn); @@ -4342,6 +4346,7 @@ static int modify_qp_init_to_rtr(struct ib_qp *ibqp, int mtu; int ret; + hr_qp->rq.wqe_offset = hr_qp->rq.offset; ret = config_qp_rq_buf(hr_dev, hr_qp, context, qpc_mask); if (ret) { ibdev_err(ibdev, "failed to config rq buf, ret = %d.\n", ret); @@ -4476,6 +4481,8 @@ static int modify_qp_rtr_to_rts(struct ib_qp *ibqp, return -EINVAL; } + hr_qp->sq.wqe_offset = hr_qp->sq.offset; + hr_qp->sge.wqe_offset = hr_qp->sge.offset; ret = config_qp_sq_buf(hr_dev, hr_qp, context, qpc_mask); if (ret) { ibdev_err(ibdev, "failed to config sq buf, ret = %d.\n", ret); diff --git a/include/uapi/rdma/hns-abi.h b/include/uapi/rdma/hns-abi.h index 4452b17..18ef96e 100644 --- a/include/uapi/rdma/hns-abi.h +++ b/include/uapi/rdma/hns-abi.h @@ -111,6 +111,7 @@ enum hns_ib_dca_mem_methods { HNS_IB_METHOD_DCA_MEM_REG = (1U << UVERBS_ID_NS_SHIFT), HNS_IB_METHOD_DCA_MEM_DEREG, HNS_IB_METHOD_DCA_MEM_SHRINK, + HNS_IB_METHOD_DCA_MEM_ATTACH, }; enum hns_ib_dca_mem_reg_attrs { @@ -131,4 +132,14 @@ enum hns_ib_dca_mem_shrink_attrs { HNS_IB_ATTR_DCA_MEM_SHRINK_OUT_FREE_MEMS, }; +#define HNS_IB_ATTACH_FLAGS_NEW_BUFFER 1U + +enum hns_ib_dca_mem_attach_attrs { + HNS_IB_ATTR_DCA_MEM_ATTACH_HANDLE = (1U << UVERBS_ID_NS_SHIFT), + HNS_IB_ATTR_DCA_MEM_ATTACH_SQ_OFFSET, + HNS_IB_ATTR_DCA_MEM_ATTACH_SGE_OFFSET, + HNS_IB_ATTR_DCA_MEM_ATTACH_RQ_OFFSET, + HNS_IB_ATTR_DCA_MEM_ATTACH_OUT_ALLOC_FLAGS, + HNS_IB_ATTR_DCA_MEM_ATTACH_OUT_ALLOC_PAGES, +}; #endif /* HNS_ABI_USER_H */ From patchwork Thu Jul 29 02:19:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wenpeng Liang X-Patchwork-Id: 12407471 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D214BC43216 for ; Thu, 29 Jul 2021 02:23:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B672E6103A for ; Thu, 29 Jul 2021 02:23:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233516AbhG2CXH (ORCPT ); Wed, 28 Jul 2021 
22:23:07 -0400 Received: from szxga01-in.huawei.com ([45.249.212.187]:16020 "EHLO szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233507AbhG2CXG (ORCPT ); Wed, 28 Jul 2021 22:23:06 -0400 Received: from dggemv704-chm.china.huawei.com (unknown [172.30.72.53]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4GZvP20yQhzZtcr; Thu, 29 Jul 2021 10:19:34 +0800 (CST) Received: from dggpeml500017.china.huawei.com (7.185.36.243) by dggemv704-chm.china.huawei.com (10.3.19.47) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Thu, 29 Jul 2021 10:23:00 +0800 Received: from localhost.localdomain (10.67.165.24) by dggpeml500017.china.huawei.com (7.185.36.243) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Thu, 29 Jul 2021 10:23:00 +0800 From: Wenpeng Liang To: , CC: , , , , Xi Wang Subject: [PATCH v4 for-next 06/12] RDMA/hns: Setup the configuration of WQE addressing to QPC Date: Thu, 29 Jul 2021 10:19:17 +0800 Message-ID: <1627525163-1683-7-git-send-email-liangwenpeng@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1627525163-1683-1-git-send-email-liangwenpeng@huawei.com> References: <1627525163-1683-1-git-send-email-liangwenpeng@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.165.24] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To dggpeml500017.china.huawei.com (7.185.36.243) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Xi Wang Add a new command to update the configuration of WQE buffer addressing to QPC in DCA mode. Signed-off-by: Xi Wang Signed-off-by: Wenpeng Liang --- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 82 +++++++++++++++++++++++++++--- drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 1 + 2 files changed, 77 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index b31b493..7e44128 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -2782,6 +2782,17 @@ static void hns_roce_v2_exit(struct hns_roce_dev *hr_dev) free_dip_list(hr_dev); } +static inline void mbox_desc_init(struct hns_roce_post_mbox *mb, u64 in_param, + u64 out_param, u32 in_modifier, + u8 op_modifier, u16 op) +{ + mb->in_param_l = cpu_to_le32(in_param); + mb->in_param_h = cpu_to_le32(in_param >> 32); + mb->out_param_l = cpu_to_le32(out_param); + mb->out_param_h = cpu_to_le32(out_param >> 32); + mb->cmd_tag = cpu_to_le32(in_modifier << 8 | op); +} + static int hns_roce_mbox_post(struct hns_roce_dev *hr_dev, u64 in_param, u64 out_param, u32 in_modifier, u8 op_modifier, u16 op, u16 token, int event) @@ -2790,17 +2801,34 @@ static int hns_roce_mbox_post(struct hns_roce_dev *hr_dev, u64 in_param, struct hns_roce_post_mbox *mb = (struct hns_roce_post_mbox *)desc.data; hns_roce_cmq_setup_basic_desc(&desc, HNS_ROCE_OPC_POST_MB, false); - - mb->in_param_l = cpu_to_le32(in_param); - mb->in_param_h = cpu_to_le32(in_param >> 32); - mb->out_param_l = cpu_to_le32(out_param); - mb->out_param_h = cpu_to_le32(out_param >> 32); - mb->cmd_tag = cpu_to_le32(in_modifier << 8 | op); + mbox_desc_init(mb, in_param, out_param, in_modifier, op_modifier, op); mb->token_event_en = cpu_to_le32(event << 16 | token); return hns_roce_cmq_send(hr_dev, &desc, 1); } +static int hns_roce_mbox_send(struct hns_roce_dev *hr_dev, u64 in_param, + u64 out_param, u32 in_modifier, u8 op_modifier, + u16 op) 
+{ + struct hns_roce_cmq_desc desc; + struct hns_roce_post_mbox *mb = (struct hns_roce_post_mbox *)desc.data; + + hns_roce_cmq_setup_basic_desc(&desc, HNS_ROCE_OPC_SYNC_MB, false); + + mbox_desc_init(mb, in_param, out_param, in_modifier, op_modifier, op); + + /* The hardware doesn't care about the token fields when working in + * sync mode. + */ + mb->token_event_en = 0; + + /* The cmdq send returns 0 indicates that the hardware has already + * finished the operation defined in this mbox. + */ + return hns_roce_cmq_send(hr_dev, &desc, 1); +} + static int v2_wait_mbox_complete(struct hns_roce_dev *hr_dev, u32 timeout, u8 *complete_status) { @@ -5062,6 +5090,47 @@ static int hns_roce_v2_modify_qp(struct ib_qp *ibqp, return ret; } +static int hns_roce_v2_set_dca_buf(struct hns_roce_dev *hr_dev, + struct hns_roce_qp *hr_qp) +{ + struct ib_device *ibdev = &hr_dev->ib_dev; + struct hns_roce_v2_qp_context *qpc, *msk; + dma_addr_t dma_handle; + int qpc_sz; + int ret; + + qpc_sz = hr_dev->caps.qpc_sz; + WARN_ON(2 * qpc_sz > HNS_ROCE_MAILBOX_SIZE); + qpc = dma_pool_alloc(hr_dev->cmd.pool, GFP_NOWAIT, &dma_handle); + if (!qpc) + return -ENOMEM; + + msk = (struct hns_roce_v2_qp_context *)((void *)qpc + qpc_sz); + memset(msk, 0xff, qpc_sz); + + ret = config_qp_rq_buf(hr_dev, hr_qp, qpc, msk); + if (ret) { + ibdev_err(ibdev, "failed to config rq qpc, ret = %d.\n", ret); + goto done; + } + + ret = config_qp_sq_buf(hr_dev, hr_qp, qpc, msk); + if (ret) { + ibdev_err(ibdev, "failed to config sq qpc, ret = %d.\n", ret); + goto done; + } + + ret = hns_roce_mbox_send(hr_dev, dma_handle, 0, hr_qp->qpn, 0, + HNS_ROCE_CMD_MODIFY_QPC); + if (ret) + ibdev_err(ibdev, "failed to modify DCA buf, ret = %d.\n", ret); + +done: + dma_pool_free(hr_dev->cmd.pool, qpc, dma_handle); + + return ret; +} + static int to_ib_qp_st(enum hns_roce_v2_qp_state state) { static const enum ib_qp_state map[] = { @@ -6239,6 +6308,7 @@ static const struct hns_roce_hw hns_roce_hw_v2 = { .write_cqc = hns_roce_v2_write_cqc, .set_hem = hns_roce_v2_set_hem, .clear_hem = hns_roce_v2_clear_hem, + .set_dca_buf = hns_roce_v2_set_dca_buf, .modify_qp = hns_roce_v2_modify_qp, .qp_flow_control_init = hns_roce_v2_qp_flow_control_init, .init_eq = hns_roce_v2_init_eq_table, diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h index b8a09d4..3f758d6 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h @@ -257,6 +257,7 @@ enum hns_roce_opcode_type { HNS_ROCE_OPC_QUERY_VF_RES = 0x850e, HNS_ROCE_OPC_CFG_GMV_TBL = 0x850f, HNS_ROCE_OPC_CFG_GMV_BT = 0x8510, + HNS_ROCE_OPC_SYNC_MB = 0x8511, HNS_ROCE_OPC_EXT_CFG = 0x8512, HNS_SWITCH_PARAMETER_CFG = 0x1033, }; From patchwork Thu Jul 29 02:19:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wenpeng Liang X-Patchwork-Id: 12407473 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 04DA8C432BE for ; Thu, 29 Jul 2021 02:23:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org 
[23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DCFE361053 for ; Thu, 29 Jul 2021 02:23:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233256AbhG2CXF (ORCPT ); Wed, 28 Jul 2021 22:23:05 -0400 Received: from szxga01-in.huawei.com ([45.249.212.187]:7764 "EHLO szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233324AbhG2CXF (ORCPT ); Wed, 28 Jul 2021 22:23:05 -0400 Received: from dggemv703-chm.china.huawei.com (unknown [172.30.72.56]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4GZvL76G7fzYhmy; Thu, 29 Jul 2021 10:17:03 +0800 (CST) Received: from dggpeml500017.china.huawei.com (7.185.36.243) by dggemv703-chm.china.huawei.com (10.3.19.46) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Thu, 29 Jul 2021 10:23:00 +0800 Received: from localhost.localdomain (10.67.165.24) by dggpeml500017.china.huawei.com (7.185.36.243) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Thu, 29 Jul 2021 10:23:00 +0800 From: Wenpeng Liang To: , CC: , , , , Xi Wang Subject: [PATCH v4 for-next 07/12] RDMA/hns: Add method to detach WQE buffer Date: Thu, 29 Jul 2021 10:19:18 +0800 Message-ID: <1627525163-1683-8-git-send-email-liangwenpeng@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1627525163-1683-1-git-send-email-liangwenpeng@huawei.com> References: <1627525163-1683-1-git-send-email-liangwenpeng@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.165.24] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To dggpeml500017.china.huawei.com (7.185.36.243) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Xi Wang If a uQP works in DCA mode, the userspace driver needs to drop the WQE buffer by calling the 'HNS_IB_METHOD_DCA_MEM_DETACH' method when the QP's CI is equal to its PI; at that point the hns ROCEE will no longer access the WQE buffer, so the userspace driver can free it. This method starts a delayed worker to recycle the WQE buffer in kernel space: if the WQE buffer is indeed no longer being accessed by the hns ROCEE, the worker marks its pages as free in the DCA memory pool.
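For example, the expected user-side ordering looks roughly like the following sketch (the hns_user_qp struct and the exec_dca_detach() wrapper are hypothetical stand-ins for the userspace driver's real types, not part of this patch):

/* Sketch: ask the kernel to recycle the WQE buffer once HW is done. */
struct hns_user_qp {
	unsigned int sq_pi;      /* producer index, advanced by post_send */
	unsigned int sq_ci;      /* consumer index, advanced by poll_cq */
	unsigned int sq_wqe_cnt; /* power-of-two SQ depth */
};

static void try_detach_dca_buf(struct hns_user_qp *qp,
			       void (*exec_dca_detach)(unsigned int sq_idx))
{
	/*
	 * CI == PI means every posted WQE has been consumed, so the
	 * ROCEE will not touch the WQE buffer again. The kernel worker
	 * still re-checks the QPC before marking the pages free.
	 */
	if (qp->sq_ci == qp->sq_pi)
		exec_dca_detach(qp->sq_pi & (qp->sq_wqe_cnt - 1));
}
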
Signed-off-by: Xi Wang Signed-off-by: Wenpeng Liang --- drivers/infiniband/hw/hns/hns_roce_dca.c | 245 +++++++++++++++++++++++++++- drivers/infiniband/hw/hns/hns_roce_dca.h | 9 +- drivers/infiniband/hw/hns/hns_roce_device.h | 14 +- drivers/infiniband/hw/hns/hns_roce_hw_v1.c | 12 +- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 43 ++++- drivers/infiniband/hw/hns/hns_roce_qp.c | 6 +- include/uapi/rdma/hns-abi.h | 7 + 7 files changed, 321 insertions(+), 15 deletions(-) diff --git a/drivers/infiniband/hw/hns/hns_roce_dca.c b/drivers/infiniband/hw/hns/hns_roce_dca.c index 45fe163..dd1d6c3 100644 --- a/drivers/infiniband/hw/hns/hns_roce_dca.c +++ b/drivers/infiniband/hw/hns/hns_roce_dca.c @@ -15,6 +15,9 @@ #define UVERBS_MODULE_NAME hns_ib #include +/* DCA mem ageing interval time */ +#define DCA_MEM_AGEING_MSES 1000 + /* DCA memory */ struct dca_mem { #define DCA_MEM_FLAGS_ALLOCED BIT(0) @@ -42,6 +45,12 @@ static inline void set_dca_page_to_free(struct hns_dca_page_state *state) state->lock = 0; } +static inline void set_dca_page_to_inactive(struct hns_dca_page_state *state) +{ + state->active = 0; + state->lock = 0; +} + static inline void lock_dca_page_to_attach(struct hns_dca_page_state *state, u32 buf_id) { @@ -285,11 +294,94 @@ static int shrink_dca_mem(struct hns_roce_dev *hr_dev, return 0; } +static void restart_aging_dca_mem(struct hns_roce_dev *hr_dev, + struct hns_roce_dca_ctx *ctx) +{ + spin_lock(&ctx->aging_lock); + ctx->exit_aging = false; + if (!list_empty(&ctx->aging_new_list)) + queue_delayed_work(hr_dev->irq_workq, &ctx->aging_dwork, + msecs_to_jiffies(DCA_MEM_AGEING_MSES)); + + spin_unlock(&ctx->aging_lock); +} + +static void stop_aging_dca_mem(struct hns_roce_dca_ctx *ctx, + struct hns_roce_dca_cfg *cfg, bool stop_worker) +{ + spin_lock(&ctx->aging_lock); + if (stop_worker) { + ctx->exit_aging = true; + cancel_delayed_work(&ctx->aging_dwork); + } + + spin_lock(&cfg->lock); + + if (!list_empty(&cfg->aging_node)) + list_del_init(&cfg->aging_node); + + spin_unlock(&cfg->lock); + spin_unlock(&ctx->aging_lock); +} + +static void free_buf_from_dca_mem(struct hns_roce_dca_ctx *ctx, + struct hns_roce_dca_cfg *cfg); +static void process_aging_dca_mem(struct hns_roce_dev *hr_dev, + struct hns_roce_dca_ctx *ctx) +{ + struct hns_roce_dca_cfg *cfg, *tmp_cfg; + struct hns_roce_qp *hr_qp; + + spin_lock(&ctx->aging_lock); + list_for_each_entry_safe(cfg, tmp_cfg, &ctx->aging_new_list, aging_node) + list_move(&cfg->aging_node, &ctx->aging_proc_list); + + while (!ctx->exit_aging && !list_empty(&ctx->aging_proc_list)) { + cfg = list_first_entry(&ctx->aging_proc_list, + struct hns_roce_dca_cfg, aging_node); + list_del_init_careful(&cfg->aging_node); + hr_qp = container_of(cfg, struct hns_roce_qp, dca_cfg); + spin_unlock(&ctx->aging_lock); + + if (hr_dev->hw->chk_dca_buf_inactive(hr_dev, hr_qp)) + free_buf_from_dca_mem(ctx, cfg); + + spin_lock(&ctx->aging_lock); + + spin_lock(&cfg->lock); + /* If buf not free then add the buf to next aging list */ + if (cfg->buf_id != HNS_DCA_INVALID_BUF_ID) + list_move(&cfg->aging_node, &ctx->aging_new_list); + + spin_unlock(&cfg->lock); + } + spin_unlock(&ctx->aging_lock); +} + +static void udca_mem_aging_work(struct work_struct *work) +{ + struct hns_roce_dca_ctx *ctx = container_of(work, + struct hns_roce_dca_ctx, aging_dwork.work); + struct hns_roce_ucontext *uctx = container_of(ctx, + struct hns_roce_ucontext, dca_ctx); + struct hns_roce_dev *hr_dev = to_hr_dev(uctx->ibucontext.device); + + cancel_delayed_work(&ctx->aging_dwork); + 
process_aging_dca_mem(hr_dev, ctx); + if (!ctx->exit_aging) + restart_aging_dca_mem(hr_dev, ctx); +} + static void init_dca_context(struct hns_roce_dca_ctx *ctx) { INIT_LIST_HEAD(&ctx->pool); spin_lock_init(&ctx->pool_lock); ctx->total_size = 0; + INIT_LIST_HEAD(&ctx->aging_new_list); + INIT_LIST_HEAD(&ctx->aging_proc_list); + spin_lock_init(&ctx->aging_lock); + ctx->exit_aging = false; + INIT_DELAYED_WORK(&ctx->aging_dwork, udca_mem_aging_work); } static void cleanup_dca_context(struct hns_roce_dev *hr_dev, @@ -298,6 +390,10 @@ static void cleanup_dca_context(struct hns_roce_dev *hr_dev, struct dca_mem *mem, *tmp; unsigned long flags; + spin_lock(&ctx->aging_lock); + cancel_delayed_work_sync(&ctx->aging_dwork); + spin_unlock(&ctx->aging_lock); + spin_lock_irqsave(&ctx->pool_lock, flags); list_for_each_entry_safe(mem, tmp, &ctx->pool, list) { list_del(&mem->list); @@ -731,7 +827,11 @@ static int attach_dca_mem(struct hns_roce_dev *hr_dev, u32 buf_id; int ret; + if (hr_qp->en_flags & HNS_ROCE_QP_CAP_DYNAMIC_CTX_DETACH) + stop_aging_dca_mem(ctx, cfg, false); + resp->alloc_flags = 0; + spin_lock(&cfg->lock); buf_id = cfg->buf_id; /* Already attached */ @@ -770,23 +870,138 @@ static int attach_dca_mem(struct hns_roce_dev *hr_dev, return 0; } +struct dca_page_free_buf_attr { + u32 buf_id; + u32 max_pages; + u32 free_pages; + u32 clean_mems; +}; + +static int free_buffer_pages_proc(struct dca_mem *mem, int index, void *param) +{ + struct dca_page_free_buf_attr *attr = param; + struct hns_dca_page_state *state; + bool changed = false; + bool stop = false; + int i, free_pages; + + free_pages = 0; + for (i = 0; !stop && i < mem->page_count; i++) { + state = &mem->states[i]; + /* Change matched pages state */ + if (dca_page_is_attached(state, attr->buf_id)) { + set_dca_page_to_free(state); + changed = true; + attr->free_pages++; + if (attr->free_pages == attr->max_pages) + stop = true; + } + + if (dca_page_is_free(state)) + free_pages++; + } + + for (; changed && i < mem->page_count; i++) + if (dca_page_is_free(state)) + free_pages++; + + if (changed && free_pages == mem->page_count) + attr->clean_mems++; + + return stop ? 
DCA_MEM_STOP_ITERATE : DCA_MEM_NEXT_ITERATE; +} + +static void free_buf_from_dca_mem(struct hns_roce_dca_ctx *ctx, + struct hns_roce_dca_cfg *cfg) +{ + struct dca_page_free_buf_attr attr = {}; + unsigned long flags; + u32 buf_id; + + spin_lock(&cfg->lock); + buf_id = cfg->buf_id; + cfg->buf_id = HNS_DCA_INVALID_BUF_ID; + spin_unlock(&cfg->lock); + if (buf_id == HNS_DCA_INVALID_BUF_ID) + return; + + attr.buf_id = buf_id; + attr.max_pages = cfg->npages; + travel_dca_pages(ctx, &attr, free_buffer_pages_proc); + + /* Update free size */ + spin_lock_irqsave(&ctx->pool_lock, flags); + ctx->free_mems += attr.clean_mems; + ctx->free_size += attr.free_pages << HNS_HW_PAGE_SHIFT; + spin_unlock_irqrestore(&ctx->pool_lock, flags); +} + +static void detach_dca_mem(struct hns_roce_dev *hr_dev, + struct hns_roce_qp *hr_qp, + struct hns_dca_detach_attr *attr) +{ + struct hns_roce_dca_ctx *ctx = hr_qp_to_dca_ctx(hr_qp); + struct hns_roce_dca_cfg *cfg = &hr_qp->dca_cfg; + + stop_aging_dca_mem(ctx, cfg, true); + + spin_lock(&ctx->aging_lock); + spin_lock(&cfg->lock); + cfg->sq_idx = attr->sq_idx; + list_add_tail(&cfg->aging_node, &ctx->aging_new_list); + spin_unlock(&cfg->lock); + spin_unlock(&ctx->aging_lock); + + restart_aging_dca_mem(hr_dev, ctx); +} + +static void kick_dca_buf(struct hns_roce_dev *hr_dev, + struct hns_roce_dca_cfg *cfg, + struct hns_roce_dca_ctx *ctx) +{ + stop_aging_dca_mem(ctx, cfg, true); + free_buf_from_dca_mem(ctx, cfg); + restart_aging_dca_mem(hr_dev, ctx); +} + void hns_roce_enable_dca(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp) { struct hns_roce_dca_cfg *cfg = &hr_qp->dca_cfg; spin_lock_init(&cfg->lock); + INIT_LIST_HEAD(&cfg->aging_node); cfg->buf_id = HNS_DCA_INVALID_BUF_ID; cfg->npages = hr_qp->buff_size >> HNS_HW_PAGE_SHIFT; + /* Support dynamic detach when rq is empty */ + if (!hr_qp->rq.wqe_cnt) + hr_qp->en_flags |= HNS_ROCE_QP_CAP_DYNAMIC_CTX_DETACH; } void hns_roce_disable_dca(struct hns_roce_dev *hr_dev, - struct hns_roce_qp *hr_qp) + struct hns_roce_qp *hr_qp, struct ib_udata *udata) { + struct hns_roce_ucontext *uctx = rdma_udata_to_drv_context(udata, + struct hns_roce_ucontext, ibucontext); + struct hns_roce_dca_ctx *ctx = to_hr_dca_ctx(uctx); struct hns_roce_dca_cfg *cfg = &hr_qp->dca_cfg; + kick_dca_buf(hr_dev, cfg, ctx); cfg->buf_id = HNS_DCA_INVALID_BUF_ID; } +void hns_roce_modify_dca(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp, + struct ib_udata *udata) +{ + struct hns_roce_ucontext *uctx = rdma_udata_to_drv_context(udata, + struct hns_roce_ucontext, ibucontext); + struct hns_roce_dca_ctx *ctx = to_hr_dca_ctx(uctx); + struct hns_roce_dca_cfg *cfg = &hr_qp->dca_cfg; + + if (hr_qp->state == IB_QPS_RESET || hr_qp->state == IB_QPS_ERR) + kick_dca_buf(hr_dev, cfg, ctx); + +} + static struct hns_roce_ucontext * uverbs_attr_to_hr_uctx(struct uverbs_attr_bundle *attrs) { @@ -962,12 +1177,38 @@ DECLARE_UVERBS_NAMED_METHOD( UVERBS_ATTR_PTR_OUT(HNS_IB_ATTR_DCA_MEM_ATTACH_OUT_ALLOC_PAGES, UVERBS_ATTR_TYPE(u32), UA_MANDATORY)); +static int UVERBS_HANDLER(HNS_IB_METHOD_DCA_MEM_DETACH)( + struct uverbs_attr_bundle *attrs) +{ + struct hns_roce_qp *hr_qp = uverbs_attr_to_hr_qp(attrs); + struct hns_dca_detach_attr attr = {}; + + if (!hr_qp) + return -EINVAL; + + if (uverbs_copy_from(&attr.sq_idx, attrs, + HNS_IB_ATTR_DCA_MEM_DETACH_SQ_INDEX)) + return -EFAULT; + + detach_dca_mem(to_hr_dev(hr_qp->ibqp.device), hr_qp, &attr); + + return 0; +} + +DECLARE_UVERBS_NAMED_METHOD( + HNS_IB_METHOD_DCA_MEM_DETACH, + UVERBS_ATTR_IDR(HNS_IB_ATTR_DCA_MEM_DETACH_HANDLE, 
UVERBS_OBJECT_QP, + UVERBS_ACCESS_WRITE, UA_MANDATORY), + UVERBS_ATTR_PTR_IN(HNS_IB_ATTR_DCA_MEM_DETACH_SQ_INDEX, + UVERBS_ATTR_TYPE(u32), UA_MANDATORY)); + DECLARE_UVERBS_NAMED_OBJECT(HNS_IB_OBJECT_DCA_MEM, UVERBS_TYPE_ALLOC_IDR(dca_cleanup), &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_REG), &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_DEREG), &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_SHRINK), - &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_ATTACH)); + &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_ATTACH), + &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_DETACH)); static bool dca_is_supported(struct ib_device *device) { diff --git a/drivers/infiniband/hw/hns/hns_roce_dca.h b/drivers/infiniband/hw/hns/hns_roce_dca.h index d66cb74..4493854 100644 --- a/drivers/infiniband/hw/hns/hns_roce_dca.h +++ b/drivers/infiniband/hw/hns/hns_roce_dca.h @@ -46,6 +46,10 @@ struct hns_dca_attach_resp { u32 alloc_pages; }; +struct hns_dca_detach_attr { + u32 sq_idx; +}; + void hns_roce_register_udca(struct hns_roce_dev *hr_dev, struct hns_roce_ucontext *uctx); void hns_roce_unregister_udca(struct hns_roce_dev *hr_dev, @@ -54,5 +58,8 @@ void hns_roce_unregister_udca(struct hns_roce_dev *hr_dev, void hns_roce_enable_dca(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp); void hns_roce_disable_dca(struct hns_roce_dev *hr_dev, - struct hns_roce_qp *hr_qp); + struct hns_roce_qp *hr_qp, struct ib_udata *udata); +void hns_roce_modify_dca(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp, + struct ib_udata *udata); + #endif diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h index 50dc894..fcaa004 100644 --- a/drivers/infiniband/hw/hns/hns_roce_device.h +++ b/drivers/infiniband/hw/hns/hns_roce_device.h @@ -231,6 +231,12 @@ struct hns_roce_dca_ctx { unsigned int free_mems; /* free mem num in pool */ size_t free_size; /* free mem size in pool */ size_t total_size; /* total size in pool */ + + bool exit_aging; + struct list_head aging_proc_list; + struct list_head aging_new_list; + spinlock_t aging_lock; + struct delayed_work aging_dwork; }; struct hns_roce_ucontext { @@ -336,9 +342,11 @@ struct hns_roce_dca_cfg { u32 buf_id; u16 attach_count; u32 npages; + u32 sq_idx; + bool aging_enable; + struct list_head aging_node; }; - struct hns_roce_mw { struct ib_mw ibmw; u32 pdn; @@ -932,11 +940,13 @@ struct hns_roce_hw { int step_idx); int (*set_dca_buf)(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp); + bool (*chk_dca_buf_inactive)(struct hns_roce_dev *hr_dev, + struct hns_roce_qp *hr_qp); int (*query_qp)(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr, int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr); int (*modify_qp)(struct ib_qp *ibqp, const struct ib_qp_attr *attr, int attr_mask, enum ib_qp_state cur_state, - enum ib_qp_state new_state); + enum ib_qp_state new_state, struct ib_udata *udata); int (*qp_flow_control_init)(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp); int (*dereg_mr)(struct hns_roce_dev *hr_dev, struct hns_roce_mr *mr, diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c index a3305d1..82fcf6b 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c @@ -897,21 +897,21 @@ static int hns_roce_v1_rsv_lp_qp(struct hns_roce_dev *hr_dev) rdma_ah_set_dgid_raw(&attr.ah_attr, dgid.raw); ret = hr_dev->hw->modify_qp(&hr_qp->ibqp, &attr, attr_mask, - IB_QPS_RESET, IB_QPS_INIT); + IB_QPS_RESET, IB_QPS_INIT, NULL); if (ret) { dev_err(dev, "modify qp failed(%d)!\n", ret); goto 
create_lp_qp_failed; } ret = hr_dev->hw->modify_qp(&hr_qp->ibqp, &attr, IB_QP_DEST_QPN, - IB_QPS_INIT, IB_QPS_RTR); + IB_QPS_INIT, IB_QPS_RTR, NULL); if (ret) { dev_err(dev, "modify qp failed(%d)!\n", ret); goto create_lp_qp_failed; } ret = hr_dev->hw->modify_qp(&hr_qp->ibqp, &attr, attr_mask, - IB_QPS_RTR, IB_QPS_RTS); + IB_QPS_RTR, IB_QPS_RTS, NULL); if (ret) { dev_err(dev, "modify qp failed(%d)!\n", ret); goto create_lp_qp_failed; @@ -3326,7 +3326,8 @@ static int hns_roce_v1_m_qp(struct ib_qp *ibqp, const struct ib_qp_attr *attr, static int hns_roce_v1_modify_qp(struct ib_qp *ibqp, const struct ib_qp_attr *attr, int attr_mask, enum ib_qp_state cur_state, - enum ib_qp_state new_state) + enum ib_qp_state new_state, + struct ib_udata *udata) { if (attr_mask & ~IB_QP_ATTR_STANDARD_BITS) return -EOPNOTSUPP; @@ -3612,7 +3613,8 @@ int hns_roce_v1_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata) struct hns_roce_cq *send_cq, *recv_cq; int ret; - ret = hns_roce_v1_modify_qp(ibqp, NULL, 0, hr_qp->state, IB_QPS_RESET); + ret = hns_roce_v1_modify_qp(ibqp, NULL, 0, hr_qp->state, IB_QPS_RESET, + NULL); if (ret) return ret; diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index 7e44128..4677c48 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -46,6 +46,7 @@ #include "hns_roce_device.h" #include "hns_roce_cmd.h" #include "hns_roce_hem.h" +#include "hns_roce_dca.h" #include "hns_roce_hw_v2.h" enum { @@ -5026,7 +5027,8 @@ static void v2_set_flushed_fields(struct ib_qp *ibqp, static int hns_roce_v2_modify_qp(struct ib_qp *ibqp, const struct ib_qp_attr *attr, int attr_mask, enum ib_qp_state cur_state, - enum ib_qp_state new_state) + enum ib_qp_state new_state, + struct ib_udata *udata) { struct hns_roce_dev *hr_dev = to_hr_dev(ibqp->device); struct hns_roce_qp *hr_qp = to_hr_qp(ibqp); @@ -5086,6 +5088,9 @@ static int hns_roce_v2_modify_qp(struct ib_qp *ibqp, if (new_state == IB_QPS_RESET && !ibqp->uobject) clear_qp(hr_qp); + if (check_dca_attach_enable(hr_qp)) + hns_roce_modify_dca(hr_dev, hr_qp, udata); + out: return ret; } @@ -5272,6 +5277,39 @@ static int hns_roce_v2_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr, return ret; } +static bool hns_roce_v2_chk_dca_buf_inactive(struct hns_roce_dev *hr_dev, + struct hns_roce_qp *hr_qp) +{ + struct hns_roce_dca_cfg *cfg = &hr_qp->dca_cfg; + struct hns_roce_v2_qp_context context = {}; + struct ib_device *ibdev = &hr_dev->ib_dev; + u32 tmp, sq_idx; + int state; + int ret; + + ret = hns_roce_v2_query_qpc(hr_dev, hr_qp, &context); + if (ret) { + ibdev_err(ibdev, "failed to query DCA QPC, ret = %d.\n", ret); + return false; + } + + state = hr_reg_read(&context, QPC_QP_ST); + if (state == HNS_ROCE_QP_ST_ERR || state == HNS_ROCE_QP_ST_RST) + return true; + + if (hr_qp->sq.wqe_cnt > 0) { + tmp = (u32)hr_reg_read(&context, QPC_RETRY_MSG_MSN); + sq_idx = tmp & (hr_qp->sq.wqe_cnt - 1); + /* If SQ-PI equals to retry_msg_msn in QPC, the QP is + * inactive. 
+ */ + if (sq_idx != cfg->sq_idx) + return false; + } + + return true; +} + static inline int modify_qp_is_ok(struct hns_roce_qp *hr_qp) { return ((hr_qp->ibqp.qp_type == IB_QPT_RC || @@ -5293,7 +5331,7 @@ static int hns_roce_v2_destroy_qp_common(struct hns_roce_dev *hr_dev, if (modify_qp_is_ok(hr_qp)) { /* Modify qp to reset before destroying qp */ ret = hns_roce_v2_modify_qp(&hr_qp->ibqp, NULL, 0, - hr_qp->state, IB_QPS_RESET); + hr_qp->state, IB_QPS_RESET, udata); if (ret) ibdev_err(ibdev, "failed to modify QP to RST, ret = %d.\n", @@ -6309,6 +6347,7 @@ static const struct hns_roce_hw hns_roce_hw_v2 = { .set_hem = hns_roce_v2_set_hem, .clear_hem = hns_roce_v2_clear_hem, .set_dca_buf = hns_roce_v2_set_dca_buf, + .chk_dca_buf_inactive = hns_roce_v2_chk_dca_buf_inactive, .modify_qp = hns_roce_v2_modify_qp, .qp_flow_control_init = hns_roce_v2_qp_flow_control_init, .init_eq = hns_roce_v2_init_eq_table, diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c index 4b8e850..0918c97 100644 --- a/drivers/infiniband/hw/hns/hns_roce_qp.c +++ b/drivers/infiniband/hw/hns/hns_roce_qp.c @@ -794,7 +794,7 @@ static int alloc_wqe_buf(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp, if (ret) { ibdev_err(ibdev, "failed to create WQE mtr, ret = %d.\n", ret); if (dca_en) - hns_roce_disable_dca(hr_dev, hr_qp); + hns_roce_disable_dca(hr_dev, hr_qp, udata); } return ret; @@ -806,7 +806,7 @@ static void free_wqe_buf(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp, hns_roce_mtr_destroy(hr_dev, &hr_qp->mtr); if (hr_qp->en_flags & HNS_ROCE_QP_CAP_DYNAMIC_CTX_ATTACH) - hns_roce_disable_dca(hr_dev, hr_qp); + hns_roce_disable_dca(hr_dev, hr_qp, udata); } static int alloc_qp_wqe(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp, @@ -1414,7 +1414,7 @@ int hns_roce_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, } ret = hr_dev->hw->modify_qp(ibqp, attr, attr_mask, cur_state, - new_state); + new_state, udata); out: mutex_unlock(&hr_qp->mutex); diff --git a/include/uapi/rdma/hns-abi.h b/include/uapi/rdma/hns-abi.h index 18ef96e..97ab795 100644 --- a/include/uapi/rdma/hns-abi.h +++ b/include/uapi/rdma/hns-abi.h @@ -78,6 +78,7 @@ enum hns_roce_qp_cap_flags { HNS_ROCE_QP_CAP_SQ_RECORD_DB = 1 << 1, HNS_ROCE_QP_CAP_OWNER_DB = 1 << 2, HNS_ROCE_QP_CAP_DYNAMIC_CTX_ATTACH = 1 << 4, + HNS_ROCE_QP_CAP_DYNAMIC_CTX_DETACH = 1 << 6, }; struct hns_roce_ib_create_qp_resp { @@ -112,6 +113,7 @@ enum hns_ib_dca_mem_methods { HNS_IB_METHOD_DCA_MEM_DEREG, HNS_IB_METHOD_DCA_MEM_SHRINK, HNS_IB_METHOD_DCA_MEM_ATTACH, + HNS_IB_METHOD_DCA_MEM_DETACH, }; enum hns_ib_dca_mem_reg_attrs { @@ -142,4 +144,9 @@ enum hns_ib_dca_mem_attach_attrs { HNS_IB_ATTR_DCA_MEM_ATTACH_OUT_ALLOC_FLAGS, HNS_IB_ATTR_DCA_MEM_ATTACH_OUT_ALLOC_PAGES, }; + +enum hns_ib_dca_mem_detach_attrs { + HNS_IB_ATTR_DCA_MEM_DETACH_HANDLE = (1U << UVERBS_ID_NS_SHIFT), + HNS_IB_ATTR_DCA_MEM_DETACH_SQ_INDEX, +}; #endif /* HNS_ABI_USER_H */ From patchwork Thu Jul 29 02:19:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wenpeng Liang X-Patchwork-Id: 12407463 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no 
version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 78A62C4338F for ; Thu, 29 Jul 2021 02:23:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 590B661052 for ; Thu, 29 Jul 2021 02:23:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233500AbhG2CXF (ORCPT ); Wed, 28 Jul 2021 22:23:05 -0400 Received: from szxga01-in.huawei.com ([45.249.212.187]:7763 "EHLO szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233256AbhG2CXE (ORCPT ); Wed, 28 Jul 2021 22:23:04 -0400 Received: from dggemv703-chm.china.huawei.com (unknown [172.30.72.56]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4GZvL76XnqzYhn3; Thu, 29 Jul 2021 10:17:03 +0800 (CST) Received: from dggpeml500017.china.huawei.com (7.185.36.243) by dggemv703-chm.china.huawei.com (10.3.19.46) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Thu, 29 Jul 2021 10:23:00 +0800 Received: from localhost.localdomain (10.67.165.24) by dggpeml500017.china.huawei.com (7.185.36.243) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Thu, 29 Jul 2021 10:23:00 +0800 From: Wenpeng Liang To: , CC: , , , , Xi Wang Subject: [PATCH v4 for-next 08/12] RDMA/hns: Add method to query WQE buffer's address Date: Thu, 29 Jul 2021 10:19:19 +0800 Message-ID: <1627525163-1683-9-git-send-email-liangwenpeng@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1627525163-1683-1-git-send-email-liangwenpeng@huawei.com> References: <1627525163-1683-1-git-send-email-liangwenpeng@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.165.24] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To dggpeml500017.china.huawei.com (7.185.36.243) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Xi Wang If a uQP works in DCA mode, the userspace driver needs to get the buffer's address in the DCA memory pool by calling the 'HNS_IB_METHOD_DCA_MEM_QUERY' method after the QP has been attached by the 'HNS_IB_METHOD_DCA_MEM_ATTACH' method. This method returns the DCA mem object's key and an offset, from which the userspace driver can compute the WQE's virtual address in the DCA memory pool.
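To make the intended usage concrete, here is a minimal user-side sketch that resolves an attached WQE buffer one contiguous page range at a time (the exec_dca_query() and dca_mem_to_va() callbacks are hypothetical wrappers around the uverbs method and the user's registered-buffer table; DCA_HW_PAGE_SIZE is assumed to match the driver's 4KB HNS_HW_PAGE_SIZE):

#define DCA_HW_PAGE_SIZE 4096 /* assumption: matches HNS_HW_PAGE_SIZE */

struct dca_query_out {
	unsigned long long mem_key; /* key of the registered DCA mem object */
	unsigned int mem_ofs;       /* byte offset of the first matched page */
	unsigned int page_count;    /* contiguous active pages at mem_ofs */
};

/* Resolve the virtual address of every page of an attached WQE buffer. */
static int resolve_wqe_buf(unsigned int npages, void **page_va,
			   int (*exec_dca_query)(unsigned int page_idx,
						 struct dca_query_out *out),
			   void *(*dca_mem_to_va)(unsigned long long key))
{
	struct dca_query_out out;
	unsigned int idx = 0, i;

	while (idx < npages) {
		if (exec_dca_query(idx, &out) || !out.page_count)
			return -1; /* QP detached or pages recycled */

		for (i = 0; i < out.page_count; i++)
			page_va[idx + i] = (char *)dca_mem_to_va(out.mem_key) +
					   out.mem_ofs + i * DCA_HW_PAGE_SIZE;
		idx += out.page_count;
	}

	return 0;
}
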
Signed-off-by: Xi Wang Signed-off-by: Wenpeng Liang --- drivers/infiniband/hw/hns/hns_roce_dca.c | 110 ++++++++++++++++++++++++++++++- drivers/infiniband/hw/hns/hns_roce_dca.h | 6 ++ include/uapi/rdma/hns-abi.h | 10 +++ 3 files changed, 124 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/hns/hns_roce_dca.c b/drivers/infiniband/hw/hns/hns_roce_dca.c index dd1d6c3..7d59744 100644 --- a/drivers/infiniband/hw/hns/hns_roce_dca.c +++ b/drivers/infiniband/hw/hns/hns_roce_dca.c @@ -74,7 +74,11 @@ static inline void unlock_dca_page_to_active(struct hns_dca_page_state *state, ((HNS_DCA_OWN_MASK & (id)) == (HNS_DCA_OWN_MASK & (s)->buf_id)) #define dca_page_is_allocated(s, id) \ - (dca_page_is_attached(s, id) && (s)->lock) + (dca_page_is_attached(s, id) && (s)->lock) + +/* all buf id bits must be matched */ +#define dca_page_is_active(s, id) ((HNS_DCA_ID_MASK & (id)) == \ + (s)->buf_id && !(s)->lock && (s)->active) #define dca_page_is_inactive(s) (!(s)->lock && !(s)->active) @@ -870,6 +874,64 @@ static int attach_dca_mem(struct hns_roce_dev *hr_dev, return 0; } +struct dca_page_query_active_attr { + u32 buf_id; + u32 curr_index; + u32 start_index; + u32 page_index; + u32 page_count; + u64 mem_key; +}; + +static int query_dca_active_pages_proc(struct dca_mem *mem, int index, + void *param) +{ + struct hns_dca_page_state *state = &mem->states[index]; + struct dca_page_query_active_attr *attr = param; + + if (!dca_page_is_active(state, attr->buf_id)) + return 0; + + if (attr->curr_index < attr->start_index) { + attr->curr_index++; + return 0; + } else if (attr->curr_index > attr->start_index) { + return DCA_MEM_STOP_ITERATE; + } + + /* Search first page in DCA mem */ + attr->page_index = index; + attr->mem_key = mem->key; + /* Search active pages in continuous addresses */ + while (index < mem->page_count) { + state = &mem->states[index]; + if (!dca_page_is_active(state, attr->buf_id)) + break; + + index++; + attr->page_count++; + } + + return DCA_MEM_STOP_ITERATE; +} + +static int query_dca_mem(struct hns_roce_qp *hr_qp, u32 page_index, + struct hns_dca_query_resp *resp) +{ + struct hns_roce_dca_ctx *ctx = hr_qp_to_dca_ctx(hr_qp); + struct dca_page_query_active_attr attr = {}; + + attr.buf_id = hr_qp->dca_cfg.buf_id; + attr.start_index = page_index; + travel_dca_pages(ctx, &attr, query_dca_active_pages_proc); + + resp->mem_key = attr.mem_key; + resp->mem_ofs = attr.page_index << HNS_HW_PAGE_SHIFT; + resp->page_count = attr.page_count; + + return attr.page_count ? 
0 : -ENOMEM; +} + struct dca_page_free_buf_attr { u32 buf_id; u32 max_pages; @@ -1202,13 +1264,57 @@ DECLARE_UVERBS_NAMED_METHOD( UVERBS_ATTR_PTR_IN(HNS_IB_ATTR_DCA_MEM_DETACH_SQ_INDEX, UVERBS_ATTR_TYPE(u32), UA_MANDATORY)); +static int UVERBS_HANDLER(HNS_IB_METHOD_DCA_MEM_QUERY)( + struct uverbs_attr_bundle *attrs) +{ + struct hns_roce_qp *hr_qp = uverbs_attr_to_hr_qp(attrs); + struct hns_dca_query_resp resp = {}; + u32 page_idx; + int ret; + + if (!hr_qp) + return -EINVAL; + + if (uverbs_copy_from(&page_idx, attrs, + HNS_IB_ATTR_DCA_MEM_QUERY_PAGE_INDEX)) + return -EFAULT; + + ret = query_dca_mem(hr_qp, page_idx, &resp); + if (ret) + return ret; + + if (uverbs_copy_to(attrs, HNS_IB_ATTR_DCA_MEM_QUERY_OUT_KEY, + &resp.mem_key, sizeof(resp.mem_key)) || + uverbs_copy_to(attrs, HNS_IB_ATTR_DCA_MEM_QUERY_OUT_OFFSET, + &resp.mem_ofs, sizeof(resp.mem_ofs)) || + uverbs_copy_to(attrs, HNS_IB_ATTR_DCA_MEM_QUERY_OUT_PAGE_COUNT, + &resp.page_count, sizeof(resp.page_count))) + return -EFAULT; + + return 0; +} + +DECLARE_UVERBS_NAMED_METHOD( + HNS_IB_METHOD_DCA_MEM_QUERY, + UVERBS_ATTR_IDR(HNS_IB_ATTR_DCA_MEM_QUERY_HANDLE, UVERBS_OBJECT_QP, + UVERBS_ACCESS_READ, UA_MANDATORY), + UVERBS_ATTR_PTR_IN(HNS_IB_ATTR_DCA_MEM_QUERY_PAGE_INDEX, + UVERBS_ATTR_TYPE(u32), UA_MANDATORY), + UVERBS_ATTR_PTR_OUT(HNS_IB_ATTR_DCA_MEM_QUERY_OUT_KEY, + UVERBS_ATTR_TYPE(u64), UA_MANDATORY), + UVERBS_ATTR_PTR_OUT(HNS_IB_ATTR_DCA_MEM_QUERY_OUT_OFFSET, + UVERBS_ATTR_TYPE(u32), UA_MANDATORY), + UVERBS_ATTR_PTR_OUT(HNS_IB_ATTR_DCA_MEM_QUERY_OUT_PAGE_COUNT, + UVERBS_ATTR_TYPE(u32), UA_MANDATORY)); + DECLARE_UVERBS_NAMED_OBJECT(HNS_IB_OBJECT_DCA_MEM, UVERBS_TYPE_ALLOC_IDR(dca_cleanup), &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_REG), &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_DEREG), &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_SHRINK), &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_ATTACH), - &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_DETACH)); + &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_DETACH), + &UVERBS_METHOD(HNS_IB_METHOD_DCA_MEM_QUERY)); static bool dca_is_supported(struct ib_device *device) { diff --git a/drivers/infiniband/hw/hns/hns_roce_dca.h b/drivers/infiniband/hw/hns/hns_roce_dca.h index 4493854..cb7bd6a 100644 --- a/drivers/infiniband/hw/hns/hns_roce_dca.h +++ b/drivers/infiniband/hw/hns/hns_roce_dca.h @@ -50,6 +50,12 @@ struct hns_dca_detach_attr { u32 sq_idx; }; +struct hns_dca_query_resp { + u64 mem_key; + u32 mem_ofs; + u32 page_count; +}; + void hns_roce_register_udca(struct hns_roce_dev *hr_dev, struct hns_roce_ucontext *uctx); void hns_roce_unregister_udca(struct hns_roce_dev *hr_dev, diff --git a/include/uapi/rdma/hns-abi.h b/include/uapi/rdma/hns-abi.h index 97ab795..7f5d2d5 100644 --- a/include/uapi/rdma/hns-abi.h +++ b/include/uapi/rdma/hns-abi.h @@ -114,6 +114,7 @@ enum hns_ib_dca_mem_methods { HNS_IB_METHOD_DCA_MEM_SHRINK, HNS_IB_METHOD_DCA_MEM_ATTACH, HNS_IB_METHOD_DCA_MEM_DETACH, + HNS_IB_METHOD_DCA_MEM_QUERY, }; enum hns_ib_dca_mem_reg_attrs { @@ -149,4 +150,13 @@ enum hns_ib_dca_mem_detach_attrs { HNS_IB_ATTR_DCA_MEM_DETACH_HANDLE = (1U << UVERBS_ID_NS_SHIFT), HNS_IB_ATTR_DCA_MEM_DETACH_SQ_INDEX, }; + +enum hns_ib_dca_mem_query_attrs { + HNS_IB_ATTR_DCA_MEM_QUERY_HANDLE = (1U << UVERBS_ID_NS_SHIFT), + HNS_IB_ATTR_DCA_MEM_QUERY_PAGE_INDEX, + HNS_IB_ATTR_DCA_MEM_QUERY_OUT_KEY, + HNS_IB_ATTR_DCA_MEM_QUERY_OUT_OFFSET, + HNS_IB_ATTR_DCA_MEM_QUERY_OUT_PAGE_COUNT, +}; + #endif /* HNS_ABI_USER_H */ From patchwork Thu Jul 29 02:19:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: Wenpeng Liang X-Patchwork-Id: 12407485 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 49415C43214 for ; Thu, 29 Jul 2021 02:23:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 31C1B6103C for ; Thu, 29 Jul 2021 02:23:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233694AbhG2CXM (ORCPT ); Wed, 28 Jul 2021 22:23:12 -0400 Received: from szxga02-in.huawei.com ([45.249.212.188]:7889 "EHLO szxga02-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233548AbhG2CXL (ORCPT ); Wed, 28 Jul 2021 22:23:11 -0400 Received: from dggemv711-chm.china.huawei.com (unknown [172.30.72.55]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4GZvNn3l4sz81ZP; Thu, 29 Jul 2021 10:19:21 +0800 (CST) Received: from dggpeml500017.china.huawei.com (7.185.36.243) by dggemv711-chm.china.huawei.com (10.1.198.66) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Thu, 29 Jul 2021 10:23:00 +0800 Received: from localhost.localdomain (10.67.165.24) by dggpeml500017.china.huawei.com (7.185.36.243) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Thu, 29 Jul 2021 10:23:00 +0800 From: Wenpeng Liang To: , CC: , , , , Xi Wang Subject: [PATCH v4 for-next 09/12] RDMA/hns: Add a shared memory to sync DCA status Date: Thu, 29 Jul 2021 10:19:20 +0800 Message-ID: <1627525163-1683-10-git-send-email-liangwenpeng@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1627525163-1683-1-git-send-email-liangwenpeng@huawei.com> References: <1627525163-1683-1-git-send-email-liangwenpeng@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.165.24] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To dggpeml500017.china.huawei.com (7.185.36.243) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Xi Wang The user DCA needs to check the QP's attaching state before filling the WQE buffer, based on the response of the uverbs method 'HNS_IB_METHOD_DCA_MEM_ATTACH', but doing this check through system calls wastes too much time. So add a table shared between the user driver and the kernel driver to sync the DCA status.
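Roughly, the fast path this enables on the user side looks like the sketch below: the status table is mmap()'ed once at context creation, and a per-QP bit is tested before each post instead of issuing a uverbs call every time. This is only an assumption-laden illustration: the one-bit-per-status layout mirrors HNS_DCA_BITS_PER_STATUS in this patch, but the qp_slot numbering and the exec_dca_attach() fallback are hypothetical names, not this patch's ABI:

/* Sketch: test a QP's attach bit in the mmap()'ed DCA status table. */
static int dca_qp_is_attached(const unsigned long *buf_status,
			      unsigned int qp_slot)
{
	unsigned int bits = 8 * sizeof(unsigned long);

	return (buf_status[qp_slot / bits] >> (qp_slot % bits)) & 1UL;
}

/*
 * Fast path before posting a WR: only fall back to the
 * HNS_IB_METHOD_DCA_MEM_ATTACH uverbs call when the kernel worker has
 * recycled this QP's WQE buffer in the meantime.
 */
static int ensure_dca_attached(const unsigned long *buf_status,
			       unsigned int qp_slot,
			       int (*exec_dca_attach)(unsigned int qp_slot))
{
	if (dca_qp_is_attached(buf_status, qp_slot))
		return 0;

	return exec_dca_attach(qp_slot);
}
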
Signed-off-by: Xi Wang Signed-off-by: Wenpeng Liang --- drivers/infiniband/hw/hns/hns_roce_dca.c | 40 ++++++++++- drivers/infiniband/hw/hns/hns_roce_dca.h | 2 +- drivers/infiniband/hw/hns/hns_roce_device.h | 7 ++ drivers/infiniband/hw/hns/hns_roce_main.c | 104 +++++++++++++++++++++++++--- include/uapi/rdma/hns-abi.h | 16 +++++ 5 files changed, 157 insertions(+), 12 deletions(-) diff --git a/drivers/infiniband/hw/hns/hns_roce_dca.c b/drivers/infiniband/hw/hns/hns_roce_dca.c index 7d59744..ffa6137 100644 --- a/drivers/infiniband/hw/hns/hns_roce_dca.c +++ b/drivers/infiniband/hw/hns/hns_roce_dca.c @@ -414,16 +414,50 @@ static void cleanup_dca_context(struct hns_roce_dev *hr_dev, spin_unlock_irqrestore(&ctx->pool_lock, flags); } -void hns_roce_register_udca(struct hns_roce_dev *hr_dev, +static void init_udca_status(struct hns_roce_dca_ctx *ctx, int udca_max_qps, + unsigned int dev_max_qps) +{ + const unsigned int bits_per_qp = 2 * HNS_DCA_BITS_PER_STATUS; + void *kaddr; + size_t size; + + size = BITS_TO_BYTES(udca_max_qps * bits_per_qp); + ctx->status_npage = DIV_ROUND_UP(size, PAGE_SIZE); + + size = ctx->status_npage * PAGE_SIZE; + ctx->max_qps = min_t(unsigned int, dev_max_qps, + size * BITS_PER_BYTE / bits_per_qp); + + kaddr = alloc_pages_exact(size, GFP_KERNEL | __GFP_ZERO); + if (!kaddr) + return; + + ctx->buf_status = (unsigned long *)kaddr; + ctx->sync_status = (unsigned long *)(kaddr + size / 2); +} + +void hns_roce_register_udca(struct hns_roce_dev *hr_dev, int max_qps, struct hns_roce_ucontext *uctx) { - init_dca_context(&uctx->dca_ctx); + struct hns_roce_dca_ctx *ctx = to_hr_dca_ctx(uctx); + + init_dca_context(ctx); + if (max_qps > 0) + init_udca_status(ctx, max_qps, hr_dev->caps.num_qps); } void hns_roce_unregister_udca(struct hns_roce_dev *hr_dev, struct hns_roce_ucontext *uctx) { - cleanup_dca_context(hr_dev, &uctx->dca_ctx); + struct hns_roce_dca_ctx *ctx = to_hr_dca_ctx(uctx); + + cleanup_dca_context(hr_dev, ctx); + + if (ctx->buf_status) { + free_pages_exact(ctx->buf_status, + ctx->status_npage * PAGE_SIZE); + ctx->buf_status = NULL; + } } static struct dca_mem *alloc_dca_mem(struct hns_roce_dca_ctx *ctx) diff --git a/drivers/infiniband/hw/hns/hns_roce_dca.h b/drivers/infiniband/hw/hns/hns_roce_dca.h index cb7bd6a..1f59a62 100644 --- a/drivers/infiniband/hw/hns/hns_roce_dca.h +++ b/drivers/infiniband/hw/hns/hns_roce_dca.h @@ -56,7 +56,7 @@ struct hns_dca_query_resp { u32 page_count; }; -void hns_roce_register_udca(struct hns_roce_dev *hr_dev, +void hns_roce_register_udca(struct hns_roce_dev *hr_dev, int max_qps, struct hns_roce_ucontext *uctx); void hns_roce_unregister_udca(struct hns_roce_dev *hr_dev, struct hns_roce_ucontext *uctx); diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h index fcaa004..ac53a44 100644 --- a/drivers/infiniband/hw/hns/hns_roce_device.h +++ b/drivers/infiniband/hw/hns/hns_roce_device.h @@ -232,6 +232,13 @@ struct hns_roce_dca_ctx { size_t free_size; /* free mem size in pool */ size_t total_size; /* total size in pool */ + unsigned int max_qps; + unsigned int status_npage; + +#define HNS_DCA_BITS_PER_STATUS 1 + unsigned long *buf_status; + unsigned long *sync_status; + bool exit_aging; struct list_head aging_proc_list; struct list_head aging_new_list; diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c index 3df95d4..e37ece8 100644 --- a/drivers/infiniband/hw/hns/hns_roce_main.c +++ b/drivers/infiniband/hw/hns/hns_roce_main.c @@ -293,20 +293,54 @@ static 
int hns_roce_modify_device(struct ib_device *ib_dev, int mask, return 0; } +static void ucontext_set_resp(struct ib_ucontext *uctx, + struct hns_roce_ib_alloc_ucontext_resp *resp) +{ + struct hns_roce_ucontext *context = to_hr_ucontext(uctx); + struct hns_roce_dev *hr_dev = to_hr_dev(uctx->device); + + resp->qp_tab_size = hr_dev->caps.num_qps; + resp->cap_flags = hr_dev->caps.flags; + resp->cqe_size = hr_dev->caps.cqe_sz; + resp->srq_tab_size = hr_dev->caps.num_srqs; + resp->dca_qps = context->dca_ctx.max_qps; + resp->dca_mmap_size = PAGE_SIZE * context->dca_ctx.status_npage; +} + +static u32 get_udca_max_qps(struct hns_roce_dev *hr_dev, + struct hns_roce_ib_alloc_ucontext *ucmd) +{ + u32 qp_num; + + if (ucmd->comp & HNS_ROCE_ALLOC_UCTX_COMP_DCA_MAX_QPS) { + qp_num = ucmd->dca_max_qps; + if (!qp_num) + qp_num = hr_dev->caps.num_qps; + } else { + qp_num = 0; + } + + return qp_num; +} + static int hns_roce_alloc_ucontext(struct ib_ucontext *uctx, struct ib_udata *udata) { struct hns_roce_ucontext *context = to_hr_ucontext(uctx); struct hns_roce_dev *hr_dev = to_hr_dev(uctx->device); struct hns_roce_ib_alloc_ucontext_resp resp = {}; + struct hns_roce_ib_alloc_ucontext ucmd = {}; int ret; if (!hr_dev->active) return -EAGAIN; - resp.qp_tab_size = hr_dev->caps.num_qps; - resp.srq_tab_size = hr_dev->caps.num_srqs; - resp.cap_flags = hr_dev->caps.flags; + if (udata->inlen == sizeof(struct hns_roce_ib_alloc_ucontext)) { + ret = ib_copy_from_udata(&ucmd, udata, + min(udata->inlen, sizeof(ucmd))); + if (ret) + return ret; + } ret = hns_roce_uar_alloc(hr_dev, &context->uar); if (ret) @@ -319,9 +353,10 @@ static int hns_roce_alloc_ucontext(struct ib_ucontext *uctx, } if (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_DCA_MODE) - hns_roce_register_udca(hr_dev, context); + hns_roce_register_udca(hr_dev, get_udca_max_qps(hr_dev, &ucmd), + context); - resp.cqe_size = hr_dev->caps.cqe_sz; + ucontext_set_resp(uctx, &resp); ret = ib_copy_to_udata(udata, &resp, min(udata->outlen, sizeof(resp))); @@ -340,6 +375,17 @@ static int hns_roce_alloc_ucontext(struct ib_ucontext *uctx, return ret; } +/* command value is offset[15:8] */ +static int hns_roce_mmap_get_command(unsigned long offset) +{ + return (offset >> 8) & 0xff; +} + +/* index value is offset[63:16] | offset[7:0] */ +static unsigned long hns_roce_mmap_get_index(unsigned long offset) +{ + return ((offset >> 16) << 8) | (offset & 0xff); +} static void hns_roce_dealloc_ucontext(struct ib_ucontext *ibcontext) { struct hns_roce_ucontext *context = to_hr_ucontext(ibcontext); @@ -351,12 +397,11 @@ static void hns_roce_dealloc_ucontext(struct ib_ucontext *ibcontext) hns_roce_unregister_udca(hr_dev, context); } -static int hns_roce_mmap(struct ib_ucontext *context, - struct vm_area_struct *vma) +static int mmap_uar(struct ib_ucontext *context, struct vm_area_struct *vma) { struct hns_roce_dev *hr_dev = to_hr_dev(context->device); - switch (vma->vm_pgoff) { + switch (hns_roce_mmap_get_index(vma->vm_pgoff)) { case 0: return rdma_user_mmap_io(context, vma, to_hr_ucontext(context)->uar.pfn, @@ -383,6 +428,49 @@ static int hns_roce_mmap(struct ib_ucontext *context, } } +static int mmap_dca(struct ib_ucontext *context, struct vm_area_struct *vma) +{ + struct hns_roce_ucontext *uctx = to_hr_ucontext(context); + struct hns_roce_dca_ctx *ctx = &uctx->dca_ctx; + struct page **pages; + unsigned long num; + int ret; + + if ((vma->vm_end - vma->vm_start != (ctx->status_npage * PAGE_SIZE) || + !(vma->vm_flags & VM_SHARED))) + return -EINVAL; + + if (!(vma->vm_flags & VM_WRITE) || 
(vma->vm_flags & VM_EXEC)) + return -EPERM; + + if (!ctx->buf_status) + return -EOPNOTSUPP; + + pages = kcalloc(ctx->status_npage, sizeof(struct page *), GFP_KERNEL); + if (!pages) + return -ENOMEM; + + for (num = 0; num < ctx->status_npage; num++) + pages[num] = virt_to_page((void *)ctx->buf_status + + num * PAGE_SIZE); + + ret = vm_insert_pages(vma, vma->vm_start, pages, &num); + kfree(pages); + + return ret; +} + +static int hns_roce_mmap(struct ib_ucontext *uctx, struct vm_area_struct *vma) +{ + switch (hns_roce_mmap_get_command(vma->vm_pgoff)) { + case HNS_ROCE_MMAP_REGULAR_PAGE: + return mmap_uar(uctx, vma); + case HNS_ROCE_MMAP_DCA_PAGE: + return mmap_dca(uctx, vma); + default: + return -EINVAL; + } +} + static int hns_roce_port_immutable(struct ib_device *ib_dev, u32 port_num, struct ib_port_immutable *immutable) { diff --git a/include/uapi/rdma/hns-abi.h b/include/uapi/rdma/hns-abi.h index 7f5d2d5..476ea81 100644 --- a/include/uapi/rdma/hns-abi.h +++ b/include/uapi/rdma/hns-abi.h @@ -86,6 +86,15 @@ struct hns_roce_ib_create_qp_resp { }; enum { + HNS_ROCE_ALLOC_UCTX_COMP_DCA_MAX_QPS = 1 << 0, +}; + +struct hns_roce_ib_alloc_ucontext { + __u32 comp; + __u32 dca_max_qps; +}; + +enum { HNS_ROCE_CAP_FLAG_DCA_MODE = 1 << 15, }; @@ -95,12 +104,19 @@ struct hns_roce_ib_alloc_ucontext_resp { __u32 srq_tab_size; __u32 reserved; __aligned_u64 cap_flags; + __u32 dca_qps; + __u32 dca_mmap_size; }; struct hns_roce_ib_alloc_pd_resp { __u32 pdn; }; +enum { + HNS_ROCE_MMAP_REGULAR_PAGE, + HNS_ROCE_MMAP_DCA_PAGE, +}; + #define UVERBS_ID_NS_MASK 0xF000 #define UVERBS_ID_NS_SHIFT 12
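The matching rdma-core changes are not part of this series, so as a rough sketch of how a userspace provider might drive the new mmap ABI: the command is encoded in bits [15:8] of vm_pgoff, and the DCA status area advertised by dca_mmap_size in the alloc_ucontext response is mapped shared and writable. The helper names hns_mmap_pgoff() and map_dca_status(), and the literal command value 1 (mirroring HNS_ROCE_MMAP_DCA_PAGE above), are illustrative assumptions, not code from this series:

#include <stdint.h>
#include <unistd.h>
#include <sys/mman.h>

/* Mirrors hns_roce_mmap_get_command()/hns_roce_mmap_get_index(): in
 * vm_pgoff, the command occupies bits [15:8] and the index occupies
 * bits [63:16] | [7:0]. */
static uint64_t hns_mmap_pgoff(uint64_t command, uint64_t index)
{
	return ((index >> 8) << 16) | ((command & 0xff) << 8) | (index & 0xff);
}

/* Sketch: map the shared DCA status area. 'cmd_fd' is the uverbs
 * context fd; 'dca_mmap_size' comes from struct
 * hns_roce_ib_alloc_ucontext_resp. mmap_dca() requires a shared,
 * writable, non-executable mapping of exactly that size. */
static void *map_dca_status(int cmd_fd, uint32_t dca_mmap_size)
{
	off_t offset = (off_t)hns_mmap_pgoff(1 /* HNS_ROCE_MMAP_DCA_PAGE */, 0) *
		       sysconf(_SC_PAGESIZE);

	return mmap(NULL, dca_mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED,
		    cmd_fd, offset);
}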
From patchwork Thu Jul 29 02:19:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wenpeng Liang X-Patchwork-Id: 12407481 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 01BC8C19F35 for ; Thu, 29 Jul 2021 02:23:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DF51E6103C for ; Thu, 29 Jul 2021 02:23:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233444AbhG2CXI (ORCPT ); Wed, 28 Jul 2021 22:23:08 -0400 Received: from szxga01-in.huawei.com ([45.249.212.187]:16022 "EHLO szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233513AbhG2CXH (ORCPT ); Wed, 28 Jul 2021 22:23:07 -0400 Received: from dggemv704-chm.china.huawei.com (unknown [172.30.72.53]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4GZvP21BsZzZv7k; Thu, 29 Jul 2021 10:19:34 +0800 (CST) Received: from dggpeml500017.china.huawei.com (7.185.36.243) by dggemv704-chm.china.huawei.com (10.3.19.47) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Thu, 29 Jul 2021 10:23:00 +0800 Received: from localhost.localdomain (10.67.165.24) by dggpeml500017.china.huawei.com (7.185.36.243) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Thu, 29 Jul 2021 10:23:00 +0800 From: Wenpeng Liang To: , CC: , , , , Xi Wang Subject: [PATCH v4 for-next 10/12] RDMA/hns: Sync DCA status by the shared memory Date: Thu, 29 Jul 2021 10:19:21 +0800 Message-ID: <1627525163-1683-11-git-send-email-liangwenpeng@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1627525163-1683-1-git-send-email-liangwenpeng@huawei.com> References: <1627525163-1683-1-git-send-email-liangwenpeng@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.165.24] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To dggpeml500017.china.huawei.com (7.185.36.243) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Xi Wang Allocate a DCA number for each QP to indicate the position of its DCA status bits in the shared memory. If the number is valid, the userspace DCA component can get a QP's DCA status by testing the corresponding bit in the shared memory; otherwise, it needs to invoke the verb 'HNS_IB_METHOD_DCA_MEM_ATTACH' to check the DCA status, as sketched below.
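As an illustration only (dca_buf_attached() is a hypothetical helper, not code from this series), with HNS_DCA_BITS_PER_STATUS being 1 the userspace test of a QP's buffer status bit boils down to:

#include <stdbool.h>
#include <stdint.h>

#define ULONG_BITS (8 * sizeof(unsigned long))

/* Sketch: test a QP's buffer status bit in the mmap()ed shared memory.
 * 'status' points at the buffer-status bitmap and 'dcan' is the number
 * returned in struct hns_roce_ib_modify_qp_resp. */
static bool dca_buf_attached(const volatile unsigned long *status,
			     uint32_t dcan, uint32_t max_qps)
{
	if (dcan >= max_qps)
		return false; /* fall back to HNS_IB_METHOD_DCA_MEM_ATTACH */

	return !!(status[dcan / ULONG_BITS] & (1UL << (dcan % ULONG_BITS)));
}

A real provider would additionally take the corresponding sync bit in the second half of the mapping to serialize against the kernel's aging path; this sketch omits that locking.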
Signed-off-by: Xi Wang Signed-off-by: Wenpeng Liang --- drivers/infiniband/hw/hns/hns_roce_dca.c | 81 +++++++++++++++++++++++++++-- drivers/infiniband/hw/hns/hns_roce_dca.h | 2 + drivers/infiniband/hw/hns/hns_roce_device.h | 6 ++- drivers/infiniband/hw/hns/hns_roce_qp.c | 11 ++++ include/uapi/rdma/hns-abi.h | 5 ++ 5 files changed, 99 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/hw/hns/hns_roce_dca.c b/drivers/infiniband/hw/hns/hns_roce_dca.c index ffa6137..67b29e4 100644 --- a/drivers/infiniband/hw/hns/hns_roce_dca.c +++ b/drivers/infiniband/hw/hns/hns_roce_dca.c @@ -298,6 +298,39 @@ static int shrink_dca_mem(struct hns_roce_dev *hr_dev, return 0; } +#define DCAN_TO_SYNC_BIT(n) ((n) * HNS_DCA_BITS_PER_STATUS) +#define DCAN_TO_STAT_BIT(n) DCAN_TO_SYNC_BIT(n) +static bool start_free_dca_buf(struct hns_roce_dca_ctx *ctx, u32 dcan) +{ + unsigned long *st = ctx->sync_status; + + if (st && dcan < ctx->max_qps) + return !test_and_set_bit_lock(DCAN_TO_SYNC_BIT(dcan), st); + + return true; +} + +static void stop_free_dca_buf(struct hns_roce_dca_ctx *ctx, u32 dcan) +{ + unsigned long *st = ctx->sync_status; + + if (st && dcan < ctx->max_qps) + clear_bit_unlock(DCAN_TO_SYNC_BIT(dcan), st); +} + +static void update_dca_buf_status(struct hns_roce_dca_ctx *ctx, u32 dcan, + bool en) +{ + unsigned long *st = ctx->buf_status; + + if (st && dcan < ctx->max_qps) { + if (en) + set_bit(DCAN_TO_STAT_BIT(dcan), st); + else + clear_bit(DCAN_TO_STAT_BIT(dcan), st); + } +} + static void restart_aging_dca_mem(struct hns_roce_dev *hr_dev, struct hns_roce_dca_ctx *ctx) { @@ -347,8 +380,12 @@ static void process_aging_dca_mem(struct hns_roce_dev *hr_dev, hr_qp = container_of(cfg, struct hns_roce_qp, dca_cfg); spin_unlock(&ctx->aging_lock); - if (hr_dev->hw->chk_dca_buf_inactive(hr_dev, hr_qp)) - free_buf_from_dca_mem(ctx, cfg); + if (start_free_dca_buf(ctx, cfg->dcan)) { + if (hr_dev->hw->chk_dca_buf_inactive(hr_dev, hr_qp)) + free_buf_from_dca_mem(ctx, cfg); + + stop_free_dca_buf(ctx, cfg->dcan); + } spin_lock(&ctx->aging_lock); @@ -381,6 +418,8 @@ static void init_dca_context(struct hns_roce_dca_ctx *ctx) INIT_LIST_HEAD(&ctx->pool); spin_lock_init(&ctx->pool_lock); ctx->total_size = 0; + + ida_init(&ctx->ida); INIT_LIST_HEAD(&ctx->aging_new_list); INIT_LIST_HEAD(&ctx->aging_proc_list); spin_lock_init(&ctx->aging_lock); @@ -458,6 +497,8 @@ void hns_roce_unregister_udca(struct hns_roce_dev *hr_dev, ctx->status_npage * PAGE_SIZE); ctx->buf_status = NULL; } + + ida_destroy(&ctx->ida); } static struct dca_mem *alloc_dca_mem(struct hns_roce_dca_ctx *ctx) @@ -904,6 +945,7 @@ static int attach_dca_mem(struct hns_roce_dev *hr_dev, resp->alloc_flags |= HNS_IB_ATTACH_FLAGS_NEW_BUFFER; resp->alloc_pages = cfg->npages; + update_dca_buf_status(ctx, cfg->dcan, true); return 0; } @@ -1014,6 +1056,7 @@ static void free_buf_from_dca_mem(struct hns_roce_dca_ctx *ctx, unsigned long flags; u32 buf_id; + update_dca_buf_status(ctx, cfg->dcan, false); spin_lock(&cfg->lock); buf_id = cfg->buf_id; cfg->buf_id = HNS_DCA_INVALID_BUF_ID; @@ -1060,6 +1103,27 @@ static void kick_dca_buf(struct hns_roce_dev *hr_dev, restart_aging_dca_mem(hr_dev, ctx); } +static u32 alloc_dca_num(struct hns_roce_dca_ctx *ctx) +{ + int ret; + + ret = ida_alloc_max(&ctx->ida, ctx->max_qps - 1, GFP_KERNEL); + if (ret < 0) + return HNS_DCA_INVALID_DCA_NUM; + + stop_free_dca_buf(ctx, ret); + update_dca_buf_status(ctx, ret, false); + return ret; +} + +static void free_dca_num(u32 dcan, struct hns_roce_dca_ctx *ctx) +{ + if (dcan == HNS_DCA_INVALID_DCA_NUM) + return; + + ida_free(&ctx->ida, dcan); +} + void hns_roce_enable_dca(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp) { struct hns_roce_dca_cfg *cfg = &hr_qp->dca_cfg; @@ -1068,6 +1132,7 @@ void hns_roce_enable_dca(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp) INIT_LIST_HEAD(&cfg->aging_node); cfg->buf_id = HNS_DCA_INVALID_BUF_ID; cfg->npages = hr_qp->buff_size >> HNS_HW_PAGE_SHIFT; + cfg->dcan = HNS_DCA_INVALID_DCA_NUM; /* Support dynamic detach when rq is empty */ if (!hr_qp->rq.wqe_cnt) hr_qp->en_flags |= HNS_ROCE_QP_CAP_DYNAMIC_CTX_DETACH; @@ -1083,6 +1148,9 @@ void hns_roce_disable_dca(struct hns_roce_dev *hr_dev, kick_dca_buf(hr_dev, cfg, ctx); cfg->buf_id = HNS_DCA_INVALID_BUF_ID; + + free_dca_num(cfg->dcan, ctx); + cfg->dcan = HNS_DCA_INVALID_DCA_NUM; } void hns_roce_modify_dca(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp, @@ -1093,9 +1161,14 @@ void hns_roce_modify_dca(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp, struct hns_roce_dca_ctx *ctx = to_hr_dca_ctx(uctx); struct hns_roce_dca_cfg *cfg = &hr_qp->dca_cfg; - if (hr_qp->state == IB_QPS_RESET || hr_qp->state == IB_QPS_ERR) + if (hr_qp->state == IB_QPS_RESET || hr_qp->state == IB_QPS_ERR) { kick_dca_buf(hr_dev, cfg, ctx); - + free_dca_num(cfg->dcan, ctx); + cfg->dcan = HNS_DCA_INVALID_DCA_NUM; + } else if (hr_qp->state == IB_QPS_RTR) { + free_dca_num(cfg->dcan, ctx); + cfg->dcan = alloc_dca_num(ctx); + } } static struct hns_roce_ucontext * diff --git a/drivers/infiniband/hw/hns/hns_roce_dca.h b/drivers/infiniband/hw/hns/hns_roce_dca.h index 1f59a62..0004ee4 100644 --- a/drivers/infiniband/hw/hns/hns_roce_dca.h +++ b/drivers/infiniband/hw/hns/hns_roce_dca.h @@ -15,6 +15,8 @@ struct hns_dca_page_state { }; #define HNS_DCA_INVALID_BUF_ID 0UL +#define HNS_DCA_INVALID_DCA_NUM ~0U + struct hns_dca_shrink_resp { u64 free_key; /* free buffer's key which registered by the user */ u32 free_mems; /* free buffer count which no any QP be using */ diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h index ac53a44..bef418d 100644 --- a/drivers/infiniband/hw/hns/hns_roce_device.h +++ b/drivers/infiniband/hw/hns/hns_roce_device.h @@ -234,6 +234,7 @@ struct hns_roce_dca_ctx { unsigned int max_qps; unsigned int status_npage; + struct ida ida; #define HNS_DCA_BITS_PER_STATUS 1 unsigned long *buf_status; @@ -348,10 +349,11 @@ struct hns_roce_dca_cfg { spinlock_t lock; u32 buf_id; u16 attach_count; + u32 dcan; u32 npages; u32 sq_idx; - bool aging_enable; - struct list_head aging_node; + bool aging_enable; + struct list_head aging_node;
}; struct hns_roce_mw { diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c index 0918c97..8476000 100644 --- a/drivers/infiniband/hw/hns/hns_roce_qp.c +++ b/drivers/infiniband/hw/hns/hns_roce_qp.c @@ -1389,6 +1389,7 @@ int hns_roce_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, { struct hns_roce_dev *hr_dev = to_hr_dev(ibqp->device); struct hns_roce_qp *hr_qp = to_hr_qp(ibqp); + struct hns_roce_ib_modify_qp_resp resp = {}; enum ib_qp_state cur_state, new_state; int ret; @@ -1415,7 +1416,17 @@ int hns_roce_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, ret = hr_dev->hw->modify_qp(ibqp, attr, attr_mask, cur_state, new_state, udata); + if (ret) + goto out; + if (udata && udata->outlen) { + resp.dcan = hr_qp->dca_cfg.dcan; + ret = ib_copy_to_udata(udata, &resp, + min(udata->outlen, sizeof(resp))); + if (ret) + ibdev_err(&hr_dev->ib_dev, + "failed to copy modify qp resp.\n"); + } out: mutex_unlock(&hr_qp->mutex); diff --git a/include/uapi/rdma/hns-abi.h b/include/uapi/rdma/hns-abi.h index 476ea81..40ac2c3 100644 --- a/include/uapi/rdma/hns-abi.h +++ b/include/uapi/rdma/hns-abi.h @@ -117,6 +117,11 @@ enum { HNS_ROCE_MMAP_DCA_PAGE, }; +struct hns_roce_ib_modify_qp_resp { + __u32 dcan; + __u32 reserved; +}; + #define UVERBS_ID_NS_MASK 0xF000 #define UVERBS_ID_NS_SHIFT 12 From patchwork Thu Jul 29 02:19:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wenpeng Liang X-Patchwork-Id: 12407479 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 51CEDC19F38 for ; Thu, 29 Jul 2021 02:23:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 39EE76103C for ; Thu, 29 Jul 2021 02:23:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233513AbhG2CXI (ORCPT ); Wed, 28 Jul 2021 22:23:08 -0400 Received: from szxga01-in.huawei.com ([45.249.212.187]:16021 "EHLO szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233384AbhG2CXH (ORCPT ); Wed, 28 Jul 2021 22:23:07 -0400 Received: from dggemv704-chm.china.huawei.com (unknown [172.30.72.53]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4GZvP21SJHzZv7x; Thu, 29 Jul 2021 10:19:34 +0800 (CST) Received: from dggpeml500017.china.huawei.com (7.185.36.243) by dggemv704-chm.china.huawei.com (10.3.19.47) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Thu, 29 Jul 2021 10:23:01 +0800 Received: from localhost.localdomain (10.67.165.24) by dggpeml500017.china.huawei.com (7.185.36.243) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Thu, 29 Jul 2021 10:23:00 +0800 From: Wenpeng Liang To: , CC: , , , , Xi Wang Subject: [PATCH v4 for-next 11/12] RDMA/nldev: Add detailed CTX information support Date: Thu, 29 Jul 2021 10:19:22 +0800 Message-ID: <1627525163-1683-12-git-send-email-liangwenpeng@huawei.com> X-Mailer: git-send-email 2.8.1 
In-Reply-To: <1627525163-1683-1-git-send-email-liangwenpeng@huawei.com> References: <1627525163-1683-1-git-send-email-liangwenpeng@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.165.24] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To dggpeml500017.china.huawei.com (7.185.36.243) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Xi Wang Implement the RDMA nldev netlink interface for dumping detailed CTX information. Signed-off-by: Xi Wang Signed-off-by: Wenpeng Liang --- drivers/infiniband/core/device.c | 1 + drivers/infiniband/core/nldev.c | 8 +++++++- include/rdma/ib_verbs.h | 1 + 3 files changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c index 9056f48..195986f 100644 --- a/drivers/infiniband/core/device.c +++ b/drivers/infiniband/core/device.c @@ -2641,6 +2641,7 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops) SET_DEVICE_OP(dev_ops, drain_rq); SET_DEVICE_OP(dev_ops, drain_sq); SET_DEVICE_OP(dev_ops, enable_driver); + SET_DEVICE_OP(dev_ops, fill_res_ctx_entry); SET_DEVICE_OP(dev_ops, fill_res_cm_id_entry); SET_DEVICE_OP(dev_ops, fill_res_cq_entry); SET_DEVICE_OP(dev_ops, fill_res_cq_entry_raw); diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c index e9b4b2c..e8c99b7 100644 --- a/drivers/infiniband/core/nldev.c +++ b/drivers/infiniband/core/nldev.c @@ -716,6 +716,7 @@ static int fill_res_ctx_entry(struct sk_buff *msg, bool has_cap_net_admin, struct rdma_restrack_entry *res, uint32_t port) { struct ib_ucontext *ctx = container_of(res, struct ib_ucontext, res); + struct ib_device *dev = ctx->device; if (rdma_is_kernel_res(res)) return 0; @@ -723,7 +724,12 @@ static int fill_res_ctx_entry(struct sk_buff *msg, bool has_cap_net_admin, if (nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_CTXN, ctx->res.id)) return -EMSGSIZE; - return fill_res_name_pid(msg, res); + if (fill_res_name_pid(msg, res)) + return -EMSGSIZE; + + return (dev->ops.fill_res_ctx_entry) ? + dev->ops.fill_res_ctx_entry(msg, ctx) : + 0; } static int fill_res_range_qp_entry(struct sk_buff *msg, uint32_t min_range, diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 371df1c..b0277c5 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -2568,6 +2568,7 @@ struct ib_device_ops { /** * Allows rdma drivers to add their own restrack attributes. 
*/ + int (*fill_res_ctx_entry)(struct sk_buff *msg, struct ib_ucontext *ctx); int (*fill_res_mr_entry)(struct sk_buff *msg, struct ib_mr *ibmr); int (*fill_res_mr_entry_raw)(struct sk_buff *msg, struct ib_mr *ibmr); int (*fill_res_cq_entry)(struct sk_buff *msg, struct ib_cq *ibcq); From patchwork Thu Jul 29 02:19:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wenpeng Liang X-Patchwork-Id: 12407465 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EF50CC4320A for ; Thu, 29 Jul 2021 02:23:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CD12A61050 for ; Thu, 29 Jul 2021 02:23:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233324AbhG2CXG (ORCPT ); Wed, 28 Jul 2021 22:23:06 -0400 Received: from szxga08-in.huawei.com ([45.249.212.255]:12277 "EHLO szxga08-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233341AbhG2CXF (ORCPT ); Wed, 28 Jul 2021 22:23:05 -0400 Received: from dggemv703-chm.china.huawei.com (unknown [172.30.72.57]) by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4GZvL84bCYz1CNPX; Thu, 29 Jul 2021 10:17:04 +0800 (CST) Received: from dggpeml500017.china.huawei.com (7.185.36.243) by dggemv703-chm.china.huawei.com (10.3.19.46) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Thu, 29 Jul 2021 10:23:01 +0800 Received: from localhost.localdomain (10.67.165.24) by dggpeml500017.china.huawei.com (7.185.36.243) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2176.2; Thu, 29 Jul 2021 10:23:01 +0800 From: Wenpeng Liang To: , CC: , , , , Xi Wang Subject: [PATCH v4 for-next 12/12] RDMA/hns: Dump detailed driver-specific UCTX Date: Thu, 29 Jul 2021 10:19:23 +0800 Message-ID: <1627525163-1683-13-git-send-email-liangwenpeng@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1627525163-1683-1-git-send-email-liangwenpeng@huawei.com> References: <1627525163-1683-1-git-send-email-liangwenpeng@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.165.24] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To dggpeml500017.china.huawei.com (7.185.36.243) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Xi Wang Dump DCA mem pool status in UCTX restrack. 
Sample output: $ rdma res show ctx dev hns_0 -dd dev hns_0 ctxn 7 pid 1410 comm python3 drv_dca-total 65536 drv_dca-free 40960 dev hns_0 ctxn 8 pid 1410 comm python3 drv_dca-total 0 drv_dca-free 0 Signed-off-by: Xi Wang Signed-off-by: Wenpeng Liang --- drivers/infiniband/hw/hns/hns_roce_device.h | 2 ++ drivers/infiniband/hw/hns/hns_roce_main.c | 1 + drivers/infiniband/hw/hns/hns_roce_restrack.c | 48 +++++++++++++++++++++++++++ 3 files changed, 51 insertions(+) diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h index bef418d..0dfaca4 100644 --- a/drivers/infiniband/hw/hns/hns_roce_device.h +++ b/drivers/infiniband/hw/hns/hns_roce_device.h @@ -1304,4 +1304,6 @@ int hns_roce_init(struct hns_roce_dev *hr_dev); void hns_roce_exit(struct hns_roce_dev *hr_dev); int hns_roce_fill_res_cq_entry(struct sk_buff *msg, struct ib_cq *ib_cq); +int hns_roce_fill_res_ctx_entry(struct sk_buff *msg, struct ib_ucontext *ctx); + #endif /* _HNS_ROCE_DEVICE_H */ diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c index e37ece8..4f30c29 100644 --- a/drivers/infiniband/hw/hns/hns_roce_main.c +++ b/drivers/infiniband/hw/hns/hns_roce_main.c @@ -546,6 +546,7 @@ static const struct ib_device_ops hns_roce_dev_ops = { .destroy_cq = hns_roce_destroy_cq, .disassociate_ucontext = hns_roce_disassociate_ucontext, .fill_res_cq_entry = hns_roce_fill_res_cq_entry, + .fill_res_ctx_entry = hns_roce_fill_res_ctx_entry, .get_dma_mr = hns_roce_get_dma_mr, .get_link_layer = hns_roce_get_link_layer, .get_port_immutable = hns_roce_port_immutable, diff --git a/drivers/infiniband/hw/hns/hns_roce_restrack.c b/drivers/infiniband/hw/hns/hns_roce_restrack.c index 259444c..18521a4 100644 --- a/drivers/infiniband/hw/hns/hns_roce_restrack.c +++ b/drivers/infiniband/hw/hns/hns_roce_restrack.c @@ -118,3 +118,51 @@ int hns_roce_fill_res_cq_entry(struct sk_buff *msg, kfree(context); return ret; } + +static int hns_roce_fill_dca_uctx(struct hns_roce_dca_ctx *ctx, + struct sk_buff *msg) +{ + unsigned long flags; + u64 total, free; + + spin_lock_irqsave(&ctx->pool_lock, flags); + total = ctx->total_size; + free = ctx->free_size; + spin_unlock_irqrestore(&ctx->pool_lock, flags); + + if (rdma_nl_put_driver_u64(msg, "dca-total", total)) + goto err; + + if (rdma_nl_put_driver_u64(msg, "dca-free", free)) + goto err; + + return 0; + +err: + return -EMSGSIZE; +} + +int hns_roce_fill_res_ctx_entry(struct sk_buff *msg, struct ib_ucontext *ctx) +{ + struct hns_roce_dev *hr_dev = to_hr_dev(ctx->device); + struct hns_roce_ucontext *uctx = to_hr_ucontext(ctx); + struct nlattr *table_attr; + + table_attr = nla_nest_start(msg, RDMA_NLDEV_ATTR_DRIVER); + if (!table_attr) + goto err; + + if (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_DCA_MODE) { + if (hns_roce_fill_dca_uctx(&uctx->dca_ctx, msg)) + goto err_cancel_table; + } + + nla_nest_end(msg, table_attr); + + return 0; + +err_cancel_table: + nla_nest_cancel(msg, table_attr); +err: + return -EMSGSIZE; +}
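Note that the "drv_" prefix in the sample output above is added by the rdma tool when it prints driver-specific attributes, so the values registered as "dca-total" and "dca-free" by hns_roce_fill_dca_uctx() appear as "drv_dca-total" and "drv_dca-free"; both are byte counts snapshotted from the context's DCA memory pool under pool_lock.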