From patchwork Sun May 5 14:07:11 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 10930189 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 537E792A for ; Sun, 5 May 2019 14:07:27 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3FB50285CC for ; Sun, 5 May 2019 14:07:27 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 33E26285E5; Sun, 5 May 2019 14:07:27 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 27C1D285CC for ; Sun, 5 May 2019 14:07:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727034AbfEEOHZ (ORCPT ); Sun, 5 May 2019 10:07:25 -0400 Received: from mail.kernel.org ([198.145.29.99]:42564 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726524AbfEEOHZ (ORCPT ); Sun, 5 May 2019 10:07:25 -0400 Received: from localhost (unknown [193.47.165.251]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id DDBAE204FD; Sun, 5 May 2019 14:07:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1557065243; bh=S5tG9irxYyXP2f8+SbgzXsZulQnxSPQ4sWdGxSFUGC4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=goaNNotW6I/7kXyRw8oXvL8m/eidNZ8+pUmlIuDmI464ddpvfU48iERitdtf2nRXP Xf7PsPwnSG+UYj2WS6a2ZBQi0IImpbI6j4RVJwC2U5em6be8tb7WMyio81jZY1VXKN FyH3g0t/kfxBBQGYnpJj3oNo0wGMTAU8dAyi5N9Q= From: Leon Romanovsky To: Doug Ledford , Jason Gunthorpe Cc: Leon Romanovsky , RDMA mailing list , Ariel Levkovich , Eli Cohen , Mark Bloch Subject: [PATCH rdma-next 1/4] IB/mlx5: Support device memory type attribute Date: Sun, 5 May 2019 17:07:11 +0300 Message-Id: <20190505140714.8741-2-leon@kernel.org> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190505140714.8741-1-leon@kernel.org> References: <20190505140714.8741-1-leon@kernel.org> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Ariel Levkovich This patch intoruduces a new mlx5_ib driver attribute to the DM allocation method - the DM type. In order to allow addition of new types in downstream patches this patch also refactors the allocation, deallocation and registration handlers to consider the requested type and perform the necessary actions according to it. Since not all future device memory types will be such that are mapped to user memory, the mandatory page index output attribute is modified to be optional. Signed-off-by: Ariel Levkovich Reviewed-by: Eli Cohen Reviewed-by: Mark Bloch Signed-off-by: Leon Romanovsky --- drivers/infiniband/hw/mlx5/cmd.c | 30 ++--- drivers/infiniband/hw/mlx5/cmd.h | 4 +- drivers/infiniband/hw/mlx5/main.c | 135 ++++++++++++++-------- drivers/infiniband/hw/mlx5/mlx5_ib.h | 23 ++-- drivers/infiniband/hw/mlx5/mr.c | 32 +++-- include/uapi/rdma/mlx5_user_ioctl_cmds.h | 1 + include/uapi/rdma/mlx5_user_ioctl_verbs.h | 4 + 7 files changed, 145 insertions(+), 84 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/cmd.c b/drivers/infiniband/hw/mlx5/cmd.c index be95ac5aeb30..f0e9c7609083 100644 --- a/drivers/infiniband/hw/mlx5/cmd.c +++ b/drivers/infiniband/hw/mlx5/cmd.c @@ -82,10 +82,10 @@ int mlx5_cmd_modify_cong_params(struct mlx5_core_dev *dev, return mlx5_cmd_exec(dev, in, in_size, out, sizeof(out)); } -int mlx5_cmd_alloc_memic(struct mlx5_memic *memic, phys_addr_t *addr, - u64 length, u32 alignment) +int mlx5_cmd_alloc_memic(struct mlx5_dm *dm, phys_addr_t *addr, + u64 length, u32 alignment) { - struct mlx5_core_dev *dev = memic->dev; + struct mlx5_core_dev *dev = dm->dev; u64 num_memic_hw_pages = MLX5_CAP_DEV_MEM(dev, memic_bar_size) >> PAGE_SHIFT; u64 hw_start_addr = MLX5_CAP64_DEV_MEM(dev, memic_bar_start_addr); @@ -115,17 +115,17 @@ int mlx5_cmd_alloc_memic(struct mlx5_memic *memic, phys_addr_t *addr, mlx5_alignment); while (page_idx < num_memic_hw_pages) { - spin_lock(&memic->memic_lock); - page_idx = bitmap_find_next_zero_area(memic->memic_alloc_pages, + spin_lock(&dm->lock); + page_idx = bitmap_find_next_zero_area(dm->memic_alloc_pages, num_memic_hw_pages, page_idx, num_pages, 0); if (page_idx < num_memic_hw_pages) - bitmap_set(memic->memic_alloc_pages, + bitmap_set(dm->memic_alloc_pages, page_idx, num_pages); - spin_unlock(&memic->memic_lock); + spin_unlock(&dm->lock); if (page_idx >= num_memic_hw_pages) break; @@ -135,10 +135,10 @@ int mlx5_cmd_alloc_memic(struct mlx5_memic *memic, phys_addr_t *addr, ret = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out)); if (ret) { - spin_lock(&memic->memic_lock); - bitmap_clear(memic->memic_alloc_pages, + spin_lock(&dm->lock); + bitmap_clear(dm->memic_alloc_pages, page_idx, num_pages); - spin_unlock(&memic->memic_lock); + spin_unlock(&dm->lock); if (ret == -EAGAIN) { page_idx++; @@ -157,9 +157,9 @@ int mlx5_cmd_alloc_memic(struct mlx5_memic *memic, phys_addr_t *addr, return -ENOMEM; } -int mlx5_cmd_dealloc_memic(struct mlx5_memic *memic, u64 addr, u64 length) +int mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, u64 addr, u64 length) { - struct mlx5_core_dev *dev = memic->dev; + struct mlx5_core_dev *dev = dm->dev; u64 hw_start_addr = MLX5_CAP64_DEV_MEM(dev, memic_bar_start_addr); u32 num_pages = DIV_ROUND_UP(length, PAGE_SIZE); u32 out[MLX5_ST_SZ_DW(dealloc_memic_out)] = {0}; @@ -177,10 +177,10 @@ int mlx5_cmd_dealloc_memic(struct mlx5_memic *memic, u64 addr, u64 length) err = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out)); if (!err) { - spin_lock(&memic->memic_lock); - bitmap_clear(memic->memic_alloc_pages, + spin_lock(&dm->lock); + bitmap_clear(dm->memic_alloc_pages, start_page_idx, num_pages); - spin_unlock(&memic->memic_lock); + spin_unlock(&dm->lock); } return err; diff --git a/drivers/infiniband/hw/mlx5/cmd.h b/drivers/infiniband/hw/mlx5/cmd.h index 923a7b93f507..80a644bea6c7 100644 --- a/drivers/infiniband/hw/mlx5/cmd.h +++ b/drivers/infiniband/hw/mlx5/cmd.h @@ -44,9 +44,9 @@ int mlx5_cmd_query_cong_params(struct mlx5_core_dev *dev, int cong_point, int mlx5_cmd_query_ext_ppcnt_counters(struct mlx5_core_dev *dev, void *out); int mlx5_cmd_modify_cong_params(struct mlx5_core_dev *mdev, void *in, int in_size); -int mlx5_cmd_alloc_memic(struct mlx5_memic *memic, phys_addr_t *addr, +int mlx5_cmd_alloc_memic(struct mlx5_dm *dm, phys_addr_t *addr, u64 length, u32 alignment); -int mlx5_cmd_dealloc_memic(struct mlx5_memic *memic, u64 addr, u64 length); +int mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, u64 addr, u64 length); void mlx5_cmd_dealloc_pd(struct mlx5_core_dev *dev, u32 pdn, u16 uid); void mlx5_cmd_destroy_tir(struct mlx5_core_dev *dev, u32 tirn, u16 uid); void mlx5_cmd_destroy_tis(struct mlx5_core_dev *dev, u32 tisn, u16 uid); diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index d69cfe511c37..06c4bb9e13db 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -2294,58 +2294,90 @@ static int mlx5_ib_mmap(struct ib_ucontext *ibcontext, struct vm_area_struct *vm return 0; } -struct ib_dm *mlx5_ib_alloc_dm(struct ib_device *ibdev, - struct ib_ucontext *context, - struct ib_dm_alloc_attr *attr, - struct uverbs_attr_bundle *attrs) +static int handle_alloc_dm_memic(struct ib_ucontext *ctx, + struct mlx5_ib_dm *dm, + struct ib_dm_alloc_attr *attr, + struct uverbs_attr_bundle *attrs) { - u64 act_size = roundup(attr->length, MLX5_MEMIC_BASE_SIZE); - struct mlx5_memic *memic = &to_mdev(ibdev)->memic; - phys_addr_t memic_addr; - struct mlx5_ib_dm *dm; + struct mlx5_dm *dm_db = &to_mdev(ctx->device)->dm; u64 start_offset; u32 page_idx; int err; - dm = kzalloc(sizeof(*dm), GFP_KERNEL); - if (!dm) - return ERR_PTR(-ENOMEM); - - mlx5_ib_dbg(to_mdev(ibdev), "alloc_memic req: user_length=0x%llx act_length=0x%llx log_alignment=%d\n", - attr->length, act_size, attr->alignment); + dm->size = roundup(attr->length, MLX5_MEMIC_BASE_SIZE); - err = mlx5_cmd_alloc_memic(memic, &memic_addr, - act_size, attr->alignment); + err = mlx5_cmd_alloc_memic(dm_db, &dm->dev_addr, + dm->size, attr->alignment); if (err) - goto err_free; + return err; - start_offset = memic_addr & ~PAGE_MASK; - page_idx = (memic_addr - memic->dev->bar_addr - - MLX5_CAP64_DEV_MEM(memic->dev, memic_bar_start_addr)) >> + page_idx = (dm->dev_addr - pci_resource_start(dm_db->dev->pdev, 0) - + MLX5_CAP64_DEV_MEM(dm_db->dev, memic_bar_start_addr)) >> PAGE_SHIFT; err = uverbs_copy_to(attrs, - MLX5_IB_ATTR_ALLOC_DM_RESP_START_OFFSET, - &start_offset, sizeof(start_offset)); + MLX5_IB_ATTR_ALLOC_DM_RESP_PAGE_INDEX, + &page_idx, sizeof(page_idx)); if (err) goto err_dealloc; + start_offset = dm->dev_addr & ~PAGE_MASK; err = uverbs_copy_to(attrs, - MLX5_IB_ATTR_ALLOC_DM_RESP_PAGE_INDEX, - &page_idx, sizeof(page_idx)); + MLX5_IB_ATTR_ALLOC_DM_RESP_START_OFFSET, + &start_offset, sizeof(start_offset)); if (err) goto err_dealloc; - bitmap_set(to_mucontext(context)->dm_pages, page_idx, - DIV_ROUND_UP(act_size, PAGE_SIZE)); + bitmap_set(to_mucontext(ctx)->dm_pages, page_idx, + DIV_ROUND_UP(dm->size, PAGE_SIZE)); + + return 0; + +err_dealloc: + mlx5_cmd_dealloc_memic(dm_db, dm->dev_addr, dm->size); + + return err; +} + +struct ib_dm *mlx5_ib_alloc_dm(struct ib_device *ibdev, + struct ib_ucontext *context, + struct ib_dm_alloc_attr *attr, + struct uverbs_attr_bundle *attrs) +{ + struct mlx5_ib_dm *dm; + enum mlx5_ib_uapi_dm_type type; + int err; + + err = uverbs_get_const_default(&type, attrs, + MLX5_IB_ATTR_ALLOC_DM_REQ_TYPE, + MLX5_IB_UAPI_DM_TYPE_MEMIC); + if (err) + return ERR_PTR(err); + + mlx5_ib_dbg(to_mdev(ibdev), "alloc_dm req: dm_type=%d user_length=0x%llx log_alignment=%d\n", + type, attr->length, attr->alignment); + + dm = kzalloc(sizeof(*dm), GFP_KERNEL); + if (!dm) + return ERR_PTR(-ENOMEM); + + dm->type = type; + + switch (type) { + case MLX5_IB_UAPI_DM_TYPE_MEMIC: + err = handle_alloc_dm_memic(context, dm, + attr, + attrs); + break; + default: + err = -EOPNOTSUPP; + } - dm->dev_addr = memic_addr; + if (err) + goto err_free; return &dm->ibdm; -err_dealloc: - mlx5_cmd_dealloc_memic(memic, memic_addr, - act_size); err_free: kfree(dm); return ERR_PTR(err); @@ -2353,25 +2385,31 @@ struct ib_dm *mlx5_ib_alloc_dm(struct ib_device *ibdev, int mlx5_ib_dealloc_dm(struct ib_dm *ibdm, struct uverbs_attr_bundle *attrs) { - struct mlx5_memic *memic = &to_mdev(ibdm->device)->memic; + struct mlx5_dm *dm_db = &to_mdev(ibdm->device)->dm; struct mlx5_ib_dm *dm = to_mdm(ibdm); - u64 act_size = roundup(dm->ibdm.length, MLX5_MEMIC_BASE_SIZE); u32 page_idx; int ret; - ret = mlx5_cmd_dealloc_memic(memic, dm->dev_addr, act_size); - if (ret) - return ret; + switch (dm->type) { + case MLX5_IB_UAPI_DM_TYPE_MEMIC: + ret = mlx5_cmd_dealloc_memic(dm_db, dm->dev_addr, dm->size); + if (ret) + return ret; - page_idx = (dm->dev_addr - memic->dev->bar_addr - - MLX5_CAP64_DEV_MEM(memic->dev, memic_bar_start_addr)) >> - PAGE_SHIFT; - bitmap_clear(rdma_udata_to_drv_context( - &attrs->driver_udata, - struct mlx5_ib_ucontext, - ibucontext)->dm_pages, - page_idx, - DIV_ROUND_UP(act_size, PAGE_SIZE)); + page_idx = (dm->dev_addr - + pci_resource_start(dm_db->dev->pdev, 0) - + MLX5_CAP64_DEV_MEM(dm_db->dev, + memic_bar_start_addr)) >> + PAGE_SHIFT; + bitmap_clear(rdma_udata_to_drv_context(&attrs->driver_udata, + struct mlx5_ib_ucontext, + ibucontext) + ->dm_pages, + page_idx, DIV_ROUND_UP(dm->size, PAGE_SIZE)); + break; + default: + return -EOPNOTSUPP; + } kfree(dm); @@ -5872,7 +5910,10 @@ ADD_UVERBS_ATTRIBUTES_SIMPLE( UA_MANDATORY), UVERBS_ATTR_PTR_OUT(MLX5_IB_ATTR_ALLOC_DM_RESP_PAGE_INDEX, UVERBS_ATTR_TYPE(u16), - UA_MANDATORY)); + UA_OPTIONAL), + UVERBS_ATTR_CONST_IN(MLX5_IB_ATTR_ALLOC_DM_REQ_TYPE, + enum mlx5_ib_uapi_dm_type, + UA_OPTIONAL)); ADD_UVERBS_ATTRIBUTES_SIMPLE( mlx5_ib_flow_action, @@ -6020,8 +6061,8 @@ static int mlx5_ib_stage_init_init(struct mlx5_ib_dev *dev) INIT_LIST_HEAD(&dev->qp_list); spin_lock_init(&dev->reset_flow_resource_lock); - spin_lock_init(&dev->memic.memic_lock); - dev->memic.dev = mdev; + spin_lock_init(&dev->dm.lock); + dev->dm.dev = mdev; if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) { err = init_srcu_struct(&dev->mr_srcu); diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index 447f8ad5abbd..86f8110bf3ab 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -48,6 +48,7 @@ #include #include #include +#include #include "srq.h" @@ -558,15 +559,17 @@ enum mlx5_ib_mtt_access_flags { struct mlx5_ib_dm { struct ib_dm ibdm; phys_addr_t dev_addr; + u32 type; + size_t size; }; #define MLX5_IB_MTT_PRESENT (MLX5_IB_MTT_READ | MLX5_IB_MTT_WRITE) -#define MLX5_IB_DM_ALLOWED_ACCESS (IB_ACCESS_LOCAL_WRITE |\ - IB_ACCESS_REMOTE_WRITE |\ - IB_ACCESS_REMOTE_READ |\ - IB_ACCESS_REMOTE_ATOMIC |\ - IB_ZERO_BASED) +#define MLX5_IB_DM_MEMIC_ALLOWED_ACCESS (IB_ACCESS_LOCAL_WRITE |\ + IB_ACCESS_REMOTE_WRITE |\ + IB_ACCESS_REMOTE_READ |\ + IB_ACCESS_REMOTE_ATOMIC |\ + IB_ZERO_BASED) struct mlx5_ib_mr { struct ib_mr ibmr; @@ -847,9 +850,13 @@ struct mlx5_ib_flow_action { }; }; -struct mlx5_memic { +struct mlx5_dm { struct mlx5_core_dev *dev; - spinlock_t memic_lock; + /* This lock is used to protect the access to the shared + * allocation map when concurrent requests by different + * processes are handled. + */ + spinlock_t lock; DECLARE_BITMAP(memic_alloc_pages, MLX5_MAX_MEMIC_PAGES); }; @@ -953,7 +960,7 @@ struct mlx5_ib_dev { u8 umr_fence; struct list_head ib_dev_list; u64 sys_image_guid; - struct mlx5_memic memic; + struct mlx5_dm dm; u16 devx_whitelist_uid; struct mlx5_srq_table srq_table; struct mlx5_async_ctx async_ctx; diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c index 4381cddab97b..ba35d68e7499 100644 --- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -1159,8 +1159,8 @@ static void set_mr_fields(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr, mr->access_flags = access_flags; } -static struct ib_mr *mlx5_ib_get_memic_mr(struct ib_pd *pd, u64 memic_addr, - u64 length, int acc) +static struct ib_mr *mlx5_ib_get_dm_mr(struct ib_pd *pd, u64 start_addr, + u64 length, int acc, int mode) { struct mlx5_ib_dev *dev = to_mdev(pd->device); int inlen = MLX5_ST_SZ_BYTES(create_mkey_in); @@ -1182,9 +1182,8 @@ static struct ib_mr *mlx5_ib_get_memic_mr(struct ib_pd *pd, u64 memic_addr, mkc = MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry); - MLX5_SET(mkc, mkc, access_mode_1_0, MLX5_MKC_ACCESS_MODE_MEMIC & 0x3); - MLX5_SET(mkc, mkc, access_mode_4_2, - (MLX5_MKC_ACCESS_MODE_MEMIC >> 2) & 0x7); + MLX5_SET(mkc, mkc, access_mode_1_0, mode & 0x3); + MLX5_SET(mkc, mkc, access_mode_4_2, (mode >> 2) & 0x7); MLX5_SET(mkc, mkc, a, !!(acc & IB_ACCESS_REMOTE_ATOMIC)); MLX5_SET(mkc, mkc, rw, !!(acc & IB_ACCESS_REMOTE_WRITE)); MLX5_SET(mkc, mkc, rr, !!(acc & IB_ACCESS_REMOTE_READ)); @@ -1194,7 +1193,7 @@ static struct ib_mr *mlx5_ib_get_memic_mr(struct ib_pd *pd, u64 memic_addr, MLX5_SET64(mkc, mkc, len, length); MLX5_SET(mkc, mkc, pd, to_mpd(pd)->pdn); MLX5_SET(mkc, mkc, qpn, 0xffffff); - MLX5_SET64(mkc, mkc, start_addr, memic_addr - dev->mdev->bar_addr); + MLX5_SET64(mkc, mkc, start_addr, start_addr); err = mlx5_core_create_mkey(mdev, &mr->mmkey, in, inlen); if (err) @@ -1236,15 +1235,24 @@ struct ib_mr *mlx5_ib_reg_dm_mr(struct ib_pd *pd, struct ib_dm *dm, struct uverbs_attr_bundle *attrs) { struct mlx5_ib_dm *mdm = to_mdm(dm); - u64 memic_addr; + struct mlx5_core_dev *dev = to_mdev(dm->device)->mdev; + u64 start_addr = mdm->dev_addr + attr->offset; + int mode; - if (attr->access_flags & ~MLX5_IB_DM_ALLOWED_ACCESS) - return ERR_PTR(-EINVAL); + switch (mdm->type) { + case MLX5_IB_UAPI_DM_TYPE_MEMIC: + if (attr->access_flags & ~MLX5_IB_DM_MEMIC_ALLOWED_ACCESS) + return ERR_PTR(-EINVAL); - memic_addr = mdm->dev_addr + attr->offset; + mode = MLX5_MKC_ACCESS_MODE_MEMIC; + start_addr -= pci_resource_start(dev->pdev, 0); + break; + default: + return ERR_PTR(-EINVAL); + } - return mlx5_ib_get_memic_mr(pd, memic_addr, attr->length, - attr->access_flags); + return mlx5_ib_get_dm_mr(pd, start_addr, attr->length, + attr->access_flags, mode); } struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, diff --git a/include/uapi/rdma/mlx5_user_ioctl_cmds.h b/include/uapi/rdma/mlx5_user_ioctl_cmds.h index 0d8f564ce60b..d404c951954c 100644 --- a/include/uapi/rdma/mlx5_user_ioctl_cmds.h +++ b/include/uapi/rdma/mlx5_user_ioctl_cmds.h @@ -44,6 +44,7 @@ enum mlx5_ib_create_flow_action_attrs { enum mlx5_ib_alloc_dm_attrs { MLX5_IB_ATTR_ALLOC_DM_RESP_START_OFFSET = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_ALLOC_DM_RESP_PAGE_INDEX, + MLX5_IB_ATTR_ALLOC_DM_REQ_TYPE, }; enum mlx5_ib_devx_methods { diff --git a/include/uapi/rdma/mlx5_user_ioctl_verbs.h b/include/uapi/rdma/mlx5_user_ioctl_verbs.h index 0a126a6b9337..c291fb2f8446 100644 --- a/include/uapi/rdma/mlx5_user_ioctl_verbs.h +++ b/include/uapi/rdma/mlx5_user_ioctl_verbs.h @@ -57,5 +57,9 @@ struct mlx5_ib_uapi_devx_async_cmd_hdr { __u8 out_data[]; }; +enum mlx5_ib_uapi_dm_type { + MLX5_IB_UAPI_DM_TYPE_MEMIC, +}; + #endif From patchwork Sun May 5 14:07:12 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 10930191 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7F38B92A for ; Sun, 5 May 2019 14:07:28 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6E195285CC for ; Sun, 5 May 2019 14:07:28 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6275A285E5; Sun, 5 May 2019 14:07:28 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 12962285CC for ; Sun, 5 May 2019 14:07:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727232AbfEEOH1 (ORCPT ); Sun, 5 May 2019 10:07:27 -0400 Received: from mail.kernel.org ([198.145.29.99]:42608 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726524AbfEEOH1 (ORCPT ); Sun, 5 May 2019 10:07:27 -0400 Received: from localhost (unknown [193.47.165.251]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 2AB64204FD; Sun, 5 May 2019 14:07:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1557065246; bh=4LRa9zBkDkvPZce6iG5806680XX2e3mnOEbRqqX0bao=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Pc/sWKmUAnQYDqOZLtqXaFV0o/KTxkSEmNUWHd+/DsHA5LA85aIO/g118qjirXeiX PjS0XzeuyxcC5Bt3FYItuAbi8aOUEg2wC9OQ1G/jPrjuGmEVBQBGKkUQvDN7Me9uBZ CoOUjgFiAKJ/J/SsoqfvfoBew/0n28/zuh6JygEw= From: Leon Romanovsky To: Doug Ledford , Jason Gunthorpe Cc: Leon Romanovsky , RDMA mailing list , Ariel Levkovich , Eli Cohen , Mark Bloch Subject: [PATCH rdma-next 2/4] IB/mlx5: Warn on allocated MEMIC buffers during cleanup Date: Sun, 5 May 2019 17:07:12 +0300 Message-Id: <20190505140714.8741-3-leon@kernel.org> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190505140714.8741-1-leon@kernel.org> References: <20190505140714.8741-1-leon@kernel.org> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Ariel Levkovich Adding a warning on allocated MEMIC buffers that weren't freed prior to driver tear down. Signed-off-by: Ariel Levkovich Reviewed-by: Eli Cohen Reviewed-by: Mark Bloch Signed-off-by: Leon Romanovsky --- drivers/infiniband/hw/mlx5/main.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 06c4bb9e13db..2c03ea1a9dc3 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -6011,6 +6011,8 @@ static void mlx5_ib_stage_init_cleanup(struct mlx5_ib_dev *dev) srcu_barrier(&dev->mr_srcu); cleanup_srcu_struct(&dev->mr_srcu); } + + WARN_ON(!bitmap_empty(dev->dm.memic_alloc_pages, MLX5_MAX_MEMIC_PAGES)); } static int mlx5_ib_stage_init_init(struct mlx5_ib_dev *dev) From patchwork Sun May 5 14:07:13 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 10930193 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5537D1398 for ; Sun, 5 May 2019 14:07:33 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 41EC5285CC for ; Sun, 5 May 2019 14:07:33 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 364EC285E5; Sun, 5 May 2019 14:07:33 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 29C4F285CC for ; Sun, 5 May 2019 14:07:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727259AbfEEOHb (ORCPT ); Sun, 5 May 2019 10:07:31 -0400 Received: from mail.kernel.org ([198.145.29.99]:42630 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726524AbfEEOHb (ORCPT ); Sun, 5 May 2019 10:07:31 -0400 Received: from localhost (unknown [193.47.165.251]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 73DF0204FD; Sun, 5 May 2019 14:07:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1557065250; bh=BARksFkLpirZ6EhDEnjdrJxyuISDoWKw3yUwWjgrIak=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=VOHJ+zla5CsUYYQQF9+1OXn0fnJ/HFMHGUSYH7z9fi982R2m/G1RVgqPcJYao3T38 jp+sXzDn2rqZigMLQ3iWsBqV7KEKEMITUR9ErJG0ucCByqNnuPg+dAFZgJM1XdjzPh 5aHJGRpg0tKnTZ4UwktF1y7/9cH9Yu1UzB1G1ZFU= From: Leon Romanovsky To: Doug Ledford , Jason Gunthorpe Cc: Leon Romanovsky , RDMA mailing list , Ariel Levkovich , Eli Cohen , Mark Bloch Subject: [PATCH rdma-next 3/4] IB/mlx5: Add steering SW ICM device memory type Date: Sun, 5 May 2019 17:07:13 +0300 Message-Id: <20190505140714.8741-4-leon@kernel.org> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190505140714.8741-1-leon@kernel.org> References: <20190505140714.8741-1-leon@kernel.org> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Ariel Levkovich This patch adds support for allocating, deallocating and registering a new device memory type, STEERING_SW_ICM. This memory can be allocated and used by a privileged user for direct rule insertion and management of the device's steering tables. The type is provided by the user via the dedicated attribute in the alloc_dm ioctl command. Signed-off-by: Ariel Levkovich Reviewed-by: Eli Cohen Reviewed-by: Mark Bloch Signed-off-by: Leon Romanovsky --- drivers/infiniband/hw/mlx5/cmd.c | 127 ++++++++++++++++++- drivers/infiniband/hw/mlx5/cmd.h | 6 +- drivers/infiniband/hw/mlx5/main.c | 142 ++++++++++++++++++++-- drivers/infiniband/hw/mlx5/mlx5_ib.h | 17 +++ drivers/infiniband/hw/mlx5/mr.c | 7 ++ include/uapi/rdma/mlx5_user_ioctl_verbs.h | 2 + 6 files changed, 292 insertions(+), 9 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/cmd.c b/drivers/infiniband/hw/mlx5/cmd.c index f0e9c7609083..e3ec79b8f7f5 100644 --- a/drivers/infiniband/hw/mlx5/cmd.c +++ b/drivers/infiniband/hw/mlx5/cmd.c @@ -157,7 +157,7 @@ int mlx5_cmd_alloc_memic(struct mlx5_dm *dm, phys_addr_t *addr, return -ENOMEM; } -int mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, u64 addr, u64 length) +int mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, phys_addr_t addr, u64 length) { struct mlx5_core_dev *dev = dm->dev; u64 hw_start_addr = MLX5_CAP64_DEV_MEM(dev, memic_bar_start_addr); @@ -186,6 +186,131 @@ int mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, u64 addr, u64 length) return err; } +int mlx5_cmd_alloc_sw_icm(struct mlx5_dm *dm, int type, u64 length, + u16 uid, phys_addr_t *addr, u32 *obj_id) +{ + struct mlx5_core_dev *dev = dm->dev; + u32 num_blocks = DIV_ROUND_UP(length, MLX5_SW_ICM_BLOCK_SIZE(dev)); + u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {}; + u32 in[MLX5_ST_SZ_DW(create_sw_icm_in)] = {}; + unsigned long *block_map; + u64 icm_start_addr; + u32 log_icm_size; + u32 max_blocks; + u64 block_idx; + void *sw_icm; + int ret; + + MLX5_SET(general_obj_in_cmd_hdr, in, opcode, + MLX5_CMD_OP_CREATE_GENERAL_OBJECT); + MLX5_SET(general_obj_in_cmd_hdr, in, obj_type, MLX5_OBJ_TYPE_SW_ICM); + MLX5_SET(general_obj_in_cmd_hdr, in, uid, uid); + + switch (type) { + case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM: + icm_start_addr = MLX5_CAP64_DEV_MEM(dev, + steering_sw_icm_start_address); + log_icm_size = MLX5_CAP_DEV_MEM(dev, log_steering_sw_icm_size); + block_map = dm->steering_sw_icm_alloc_blocks; + break; + case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM: + icm_start_addr = MLX5_CAP64_DEV_MEM(dev, + header_modify_sw_icm_start_address); + log_icm_size = MLX5_CAP_DEV_MEM(dev, + log_header_modify_sw_icm_size); + block_map = dm->header_modify_sw_icm_alloc_blocks; + break; + default: + return -EINVAL; + } + + max_blocks = BIT(log_icm_size - MLX5_LOG_SW_ICM_BLOCK_SIZE(dev)); + spin_lock(&dm->lock); + block_idx = bitmap_find_next_zero_area(block_map, + max_blocks, + 0, + num_blocks, 0); + + if (block_idx < max_blocks) + bitmap_set(block_map, + block_idx, num_blocks); + + spin_unlock(&dm->lock); + + if (block_idx >= max_blocks) + return -ENOMEM; + + sw_icm = MLX5_ADDR_OF(create_sw_icm_in, in, sw_icm); + icm_start_addr += block_idx << MLX5_LOG_SW_ICM_BLOCK_SIZE(dev); + MLX5_SET64(sw_icm, sw_icm, sw_icm_start_addr, + icm_start_addr); + MLX5_SET(sw_icm, sw_icm, log_sw_icm_size, ilog2(length)); + + ret = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out)); + if (ret) { + spin_lock(&dm->lock); + bitmap_clear(block_map, + block_idx, num_blocks); + spin_unlock(&dm->lock); + + return ret; + } + + *addr = icm_start_addr; + *obj_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id); + + return 0; +} + +int mlx5_cmd_dealloc_sw_icm(struct mlx5_dm *dm, int type, u64 length, + u16 uid, phys_addr_t addr, u32 obj_id) +{ + struct mlx5_core_dev *dev = dm->dev; + u32 num_blocks = DIV_ROUND_UP(length, MLX5_SW_ICM_BLOCK_SIZE(dev)); + u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {}; + u32 in[MLX5_ST_SZ_DW(general_obj_in_cmd_hdr)] = {}; + unsigned long *block_map; + u64 start_idx; + int err; + + switch (type) { + case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM: + start_idx = + (addr - MLX5_CAP64_DEV_MEM( + dev, steering_sw_icm_start_address)) >> + MLX5_LOG_SW_ICM_BLOCK_SIZE(dev); + block_map = dm->steering_sw_icm_alloc_blocks; + break; + case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM: + start_idx = + (addr - + MLX5_CAP64_DEV_MEM( + dev, header_modify_sw_icm_start_address)) >> + MLX5_LOG_SW_ICM_BLOCK_SIZE(dev); + block_map = dm->header_modify_sw_icm_alloc_blocks; + break; + default: + return -EINVAL; + } + + MLX5_SET(general_obj_in_cmd_hdr, in, opcode, + MLX5_CMD_OP_DESTROY_GENERAL_OBJECT); + MLX5_SET(general_obj_in_cmd_hdr, in, obj_type, MLX5_OBJ_TYPE_SW_ICM); + MLX5_SET(general_obj_in_cmd_hdr, in, obj_id, obj_id); + MLX5_SET(general_obj_in_cmd_hdr, in, uid, uid); + + err = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out)); + if (err) + return err; + + spin_lock(&dm->lock); + bitmap_clear(block_map, + start_idx, num_blocks); + spin_unlock(&dm->lock); + + return 0; +} + int mlx5_cmd_query_ext_ppcnt_counters(struct mlx5_core_dev *dev, void *out) { u32 in[MLX5_ST_SZ_DW(ppcnt_reg)] = {}; diff --git a/drivers/infiniband/hw/mlx5/cmd.h b/drivers/infiniband/hw/mlx5/cmd.h index 80a644bea6c7..0572dcba6eae 100644 --- a/drivers/infiniband/hw/mlx5/cmd.h +++ b/drivers/infiniband/hw/mlx5/cmd.h @@ -46,7 +46,7 @@ int mlx5_cmd_modify_cong_params(struct mlx5_core_dev *mdev, void *in, int in_size); int mlx5_cmd_alloc_memic(struct mlx5_dm *dm, phys_addr_t *addr, u64 length, u32 alignment); -int mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, u64 addr, u64 length); +int mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, phys_addr_t addr, u64 length); void mlx5_cmd_dealloc_pd(struct mlx5_core_dev *dev, u32 pdn, u16 uid); void mlx5_cmd_destroy_tir(struct mlx5_core_dev *dev, u32 tirn, u16 uid); void mlx5_cmd_destroy_tis(struct mlx5_core_dev *dev, u32 tisn, u16 uid); @@ -65,4 +65,8 @@ int mlx5_cmd_alloc_q_counter(struct mlx5_core_dev *dev, u16 *counter_id, u16 uid); int mlx5_cmd_mad_ifc(struct mlx5_core_dev *dev, const void *inb, void *outb, u16 opmod, u8 port); +int mlx5_cmd_alloc_sw_icm(struct mlx5_dm *dm, int type, u64 length, + u16 uid, phys_addr_t *addr, u32 *obj_id); +int mlx5_cmd_dealloc_sw_icm(struct mlx5_dm *dm, int type, u64 length, + u16 uid, phys_addr_t addr, u32 obj_id); #endif /* MLX5_IB_CMD_H */ diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 2c03ea1a9dc3..b49929682dc0 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -2294,6 +2294,28 @@ static int mlx5_ib_mmap(struct ib_ucontext *ibcontext, struct vm_area_struct *vm return 0; } +static inline int check_dm_type_support(struct mlx5_ib_dev *dev, + u32 type) +{ + switch (type) { + case MLX5_IB_UAPI_DM_TYPE_MEMIC: + if (!MLX5_CAP_DEV_MEM(dev->mdev, memic)) + return -EOPNOTSUPP; + break; + case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM: + if (!capable(CAP_SYS_RAWIO) || + !capable(CAP_NET_RAW)) + return -EPERM; + + if (!(MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev, sw_owner) || + MLX5_CAP_FLOWTABLE_NIC_TX(dev->mdev, sw_owner))) + return -EOPNOTSUPP; + break; + } + + return 0; +} + static int handle_alloc_dm_memic(struct ib_ucontext *ctx, struct mlx5_ib_dm *dm, struct ib_dm_alloc_attr *attr, @@ -2339,6 +2361,40 @@ static int handle_alloc_dm_memic(struct ib_ucontext *ctx, return err; } +static int handle_alloc_dm_sw_icm(struct ib_ucontext *ctx, + struct mlx5_ib_dm *dm, + struct ib_dm_alloc_attr *attr, + struct uverbs_attr_bundle *attrs, + int type) +{ + struct mlx5_dm *dm_db = &to_mdev(ctx->device)->dm; + u64 act_size; + int err; + + /* Allocation size must a multiple of the basic block size + * and a power of 2. + */ + act_size = roundup(attr->length, MLX5_SW_ICM_BLOCK_SIZE(dm_db->dev)); + act_size = roundup_pow_of_two(act_size); + + dm->size = act_size; + err = mlx5_cmd_alloc_sw_icm(dm_db, type, act_size, + to_mucontext(ctx)->devx_uid, &dm->dev_addr, + &dm->icm_dm.obj_id); + if (err) + return err; + + err = uverbs_copy_to(attrs, + MLX5_IB_ATTR_ALLOC_DM_RESP_START_OFFSET, + &dm->dev_addr, sizeof(dm->dev_addr)); + if (err) + mlx5_cmd_dealloc_sw_icm(dm_db, type, dm->size, + to_mucontext(ctx)->devx_uid, + dm->dev_addr, dm->icm_dm.obj_id); + + return err; +} + struct ib_dm *mlx5_ib_alloc_dm(struct ib_device *ibdev, struct ib_ucontext *context, struct ib_dm_alloc_attr *attr, @@ -2357,6 +2413,10 @@ struct ib_dm *mlx5_ib_alloc_dm(struct ib_device *ibdev, mlx5_ib_dbg(to_mdev(ibdev), "alloc_dm req: dm_type=%d user_length=0x%llx log_alignment=%d\n", type, attr->length, attr->alignment); + err = check_dm_type_support(to_mdev(ibdev), type); + if (err) + return ERR_PTR(err); + dm = kzalloc(sizeof(*dm), GFP_KERNEL); if (!dm) return ERR_PTR(-ENOMEM); @@ -2369,6 +2429,10 @@ struct ib_dm *mlx5_ib_alloc_dm(struct ib_device *ibdev, attr, attrs); break; + case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM: + case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM: + err = handle_alloc_dm_sw_icm(context, dm, attr, attrs, type); + break; default: err = -EOPNOTSUPP; } @@ -2385,6 +2449,8 @@ struct ib_dm *mlx5_ib_alloc_dm(struct ib_device *ibdev, int mlx5_ib_dealloc_dm(struct ib_dm *ibdm, struct uverbs_attr_bundle *attrs) { + struct mlx5_ib_ucontext *ctx = rdma_udata_to_drv_context( + &attrs->driver_udata, struct mlx5_ib_ucontext, ibucontext); struct mlx5_dm *dm_db = &to_mdev(ibdm->device)->dm; struct mlx5_ib_dm *dm = to_mdm(ibdm); u32 page_idx; @@ -2401,11 +2467,16 @@ int mlx5_ib_dealloc_dm(struct ib_dm *ibdm, struct uverbs_attr_bundle *attrs) MLX5_CAP64_DEV_MEM(dm_db->dev, memic_bar_start_addr)) >> PAGE_SHIFT; - bitmap_clear(rdma_udata_to_drv_context(&attrs->driver_udata, - struct mlx5_ib_ucontext, - ibucontext) - ->dm_pages, - page_idx, DIV_ROUND_UP(dm->size, PAGE_SIZE)); + bitmap_clear(ctx->dm_pages, page_idx, + DIV_ROUND_UP(dm->size, PAGE_SIZE)); + break; + case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM: + case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM: + ret = mlx5_cmd_dealloc_sw_icm(dm_db, dm->type, dm->size, + ctx->devx_uid, dm->dev_addr, + dm->icm_dm.obj_id); + if (ret) + return ret; break; default: return -EOPNOTSUPP; @@ -6006,6 +6077,8 @@ static struct ib_counters *mlx5_ib_create_counters(struct ib_device *device, static void mlx5_ib_stage_init_cleanup(struct mlx5_ib_dev *dev) { + struct mlx5_core_dev *mdev = dev->mdev; + mlx5_ib_cleanup_multiport_master(dev); if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) { srcu_barrier(&dev->mr_srcu); @@ -6013,11 +6086,29 @@ static void mlx5_ib_stage_init_cleanup(struct mlx5_ib_dev *dev) } WARN_ON(!bitmap_empty(dev->dm.memic_alloc_pages, MLX5_MAX_MEMIC_PAGES)); + + WARN_ON(dev->dm.steering_sw_icm_alloc_blocks && + !bitmap_empty( + dev->dm.steering_sw_icm_alloc_blocks, + BIT(MLX5_CAP_DEV_MEM(mdev, log_steering_sw_icm_size) - + MLX5_LOG_SW_ICM_BLOCK_SIZE(mdev)))); + + kfree(dev->dm.steering_sw_icm_alloc_blocks); + + WARN_ON(dev->dm.header_modify_sw_icm_alloc_blocks && + !bitmap_empty(dev->dm.header_modify_sw_icm_alloc_blocks, + BIT(MLX5_CAP_DEV_MEM( + mdev, log_header_modify_sw_icm_size) - + MLX5_LOG_SW_ICM_BLOCK_SIZE(mdev)))); + + kfree(dev->dm.header_modify_sw_icm_alloc_blocks); } static int mlx5_ib_stage_init_init(struct mlx5_ib_dev *dev) { struct mlx5_core_dev *mdev = dev->mdev; + u64 header_modify_icm_blocks = 0; + u64 steering_icm_blocks = 0; int err; int i; @@ -6063,16 +6154,51 @@ static int mlx5_ib_stage_init_init(struct mlx5_ib_dev *dev) INIT_LIST_HEAD(&dev->qp_list); spin_lock_init(&dev->reset_flow_resource_lock); + if (MLX5_CAP_GEN_64(mdev, general_obj_types) & + MLX5_GENERAL_OBJ_TYPES_CAP_SW_ICM) { + if (MLX5_CAP64_DEV_MEM(mdev, steering_sw_icm_start_address)) { + steering_icm_blocks = + BIT(MLX5_CAP_DEV_MEM(mdev, + log_steering_sw_icm_size) - + MLX5_LOG_SW_ICM_BLOCK_SIZE(mdev)); + + dev->dm.steering_sw_icm_alloc_blocks = + kcalloc(BITS_TO_LONGS(steering_icm_blocks), + sizeof(unsigned long), GFP_KERNEL); + if (!dev->dm.steering_sw_icm_alloc_blocks) + goto err_mp; + } + + if (MLX5_CAP64_DEV_MEM(mdev, + header_modify_sw_icm_start_address)) { + header_modify_icm_blocks = BIT( + MLX5_CAP_DEV_MEM( + mdev, log_header_modify_sw_icm_size) - + MLX5_LOG_SW_ICM_BLOCK_SIZE(mdev)); + + dev->dm.header_modify_sw_icm_alloc_blocks = + kcalloc(BITS_TO_LONGS(header_modify_icm_blocks), + sizeof(unsigned long), GFP_KERNEL); + if (!dev->dm.header_modify_sw_icm_alloc_blocks) + goto err_dm; + } + } + spin_lock_init(&dev->dm.lock); dev->dm.dev = mdev; if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) { err = init_srcu_struct(&dev->mr_srcu); if (err) - goto err_mp; + goto err_dm; } return 0; + +err_dm: + kfree(dev->dm.steering_sw_icm_alloc_blocks); + kfree(dev->dm.header_modify_sw_icm_alloc_blocks); + err_mp: mlx5_ib_cleanup_multiport_master(dev); @@ -6255,7 +6381,9 @@ static int mlx5_ib_stage_caps_init(struct mlx5_ib_dev *dev) ib_set_device_ops(&dev->ib_dev, &mlx5_ib_dev_xrc_ops); } - if (MLX5_CAP_DEV_MEM(mdev, memic)) + if (MLX5_CAP_DEV_MEM(mdev, memic) || + MLX5_CAP_GEN_64(dev->mdev, general_obj_types) & + MLX5_GENERAL_OBJ_TYPES_CAP_SW_ICM) ib_set_device_ops(&dev->ib_dev, &mlx5_ib_dev_dm_ops); if (mlx5_accel_ipsec_device_caps(dev->mdev) & diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index 86f8110bf3ab..228ceac13d4d 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -118,6 +118,10 @@ enum { MLX5_MEMIC_BASE_SIZE = 1 << MLX5_MEMIC_BASE_ALIGN, }; +#define MLX5_LOG_SW_ICM_BLOCK_SIZE(dev) \ + (MLX5_CAP_DEV_MEM(dev, log_sw_icm_alloc_granularity)) +#define MLX5_SW_ICM_BLOCK_SIZE(dev) (1 << MLX5_LOG_SW_ICM_BLOCK_SIZE(dev)) + struct mlx5_ib_ucontext { struct ib_ucontext ibucontext; struct list_head db_page_list; @@ -561,6 +565,12 @@ struct mlx5_ib_dm { phys_addr_t dev_addr; u32 type; size_t size; + union { + struct { + u32 obj_id; + } icm_dm; + /* other dm types specific params should be added here */ + }; }; #define MLX5_IB_MTT_PRESENT (MLX5_IB_MTT_READ | MLX5_IB_MTT_WRITE) @@ -571,6 +581,11 @@ struct mlx5_ib_dm { IB_ACCESS_REMOTE_ATOMIC |\ IB_ZERO_BASED) +#define MLX5_IB_DM_SW_ICM_ALLOWED_ACCESS (IB_ACCESS_LOCAL_WRITE |\ + IB_ACCESS_REMOTE_WRITE |\ + IB_ACCESS_REMOTE_READ |\ + IB_ZERO_BASED) + struct mlx5_ib_mr { struct ib_mr ibmr; void *descs; @@ -858,6 +873,8 @@ struct mlx5_dm { */ spinlock_t lock; DECLARE_BITMAP(memic_alloc_pages, MLX5_MAX_MEMIC_PAGES); + unsigned long *steering_sw_icm_alloc_blocks; + unsigned long *header_modify_sw_icm_alloc_blocks; }; struct mlx5_read_counters_attr { diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c index ba35d68e7499..5f09699fab98 100644 --- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -1247,6 +1247,13 @@ struct ib_mr *mlx5_ib_reg_dm_mr(struct ib_pd *pd, struct ib_dm *dm, mode = MLX5_MKC_ACCESS_MODE_MEMIC; start_addr -= pci_resource_start(dev->pdev, 0); break; + case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM: + case MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM: + if (attr->access_flags & ~MLX5_IB_DM_SW_ICM_ALLOWED_ACCESS) + return ERR_PTR(-EINVAL); + + mode = MLX5_MKC_ACCESS_MODE_SW_ICM; + break; default: return ERR_PTR(-EINVAL); } diff --git a/include/uapi/rdma/mlx5_user_ioctl_verbs.h b/include/uapi/rdma/mlx5_user_ioctl_verbs.h index c291fb2f8446..a8f34c237458 100644 --- a/include/uapi/rdma/mlx5_user_ioctl_verbs.h +++ b/include/uapi/rdma/mlx5_user_ioctl_verbs.h @@ -59,6 +59,8 @@ struct mlx5_ib_uapi_devx_async_cmd_hdr { enum mlx5_ib_uapi_dm_type { MLX5_IB_UAPI_DM_TYPE_MEMIC, + MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM, + MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM, }; #endif From patchwork Sun May 5 14:07:14 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 10930195 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 163B01398 for ; Sun, 5 May 2019 14:07:35 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 04DE4285CC for ; Sun, 5 May 2019 14:07:35 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id ED9B2285E5; Sun, 5 May 2019 14:07:34 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9EDA4285CC for ; Sun, 5 May 2019 14:07:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727295AbfEEOHe (ORCPT ); Sun, 5 May 2019 10:07:34 -0400 Received: from mail.kernel.org ([198.145.29.99]:42702 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726524AbfEEOHd (ORCPT ); Sun, 5 May 2019 10:07:33 -0400 Received: from localhost (unknown [193.47.165.251]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id D17AE204FD; Sun, 5 May 2019 14:07:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1557065253; bh=bOYd9856chqHXKjUAaqoAzaLyWngOUBL8k+EVXsoUJY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=vuGxWokidfrOwEZnQ9UzEu4XogXJwbvMdwd2FPVVUQgziq3vFd6N4a/B16VP2YMH8 kKkMJcor6HYzzzqYO6c3xTaaBS2tsTKkBx2ERtxY5DBqeOxtpmbcqhrWanHZv9k51S FVvWPH/yn9HrLuJ/r3bBelIIwnzsPPgrZYYSDBfI= From: Leon Romanovsky To: Doug Ledford , Jason Gunthorpe Cc: Leon Romanovsky , RDMA mailing list , Ariel Levkovich , Eli Cohen , Mark Bloch Subject: [PATCH rdma-next 4/4] IB/mlx5: Device resource control for privileged DEVX user Date: Sun, 5 May 2019 17:07:14 +0300 Message-Id: <20190505140714.8741-5-leon@kernel.org> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190505140714.8741-1-leon@kernel.org> References: <20190505140714.8741-1-leon@kernel.org> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Ariel Levkovich For DEVX users who have SYS_RAWIO capability, we set the internal device resources capability when creating the UCTX. This will allow the device to restrict the allocation of internal device resources such as SW ICM memory to privileged DEVX users only. Signed-off-by: Ariel Levkovich Reviewed-by: Eli Cohen Reviewed-by: Mark Bloch Signed-off-by: Leon Romanovsky --- drivers/infiniband/hw/mlx5/devx.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/infiniband/hw/mlx5/devx.c b/drivers/infiniband/hw/mlx5/devx.c index d627f44bc84d..169ffffcf5ed 100644 --- a/drivers/infiniband/hw/mlx5/devx.c +++ b/drivers/infiniband/hw/mlx5/devx.c @@ -85,6 +85,10 @@ int mlx5_ib_devx_create(struct mlx5_ib_dev *dev, bool is_user) if (is_user && capable(CAP_NET_RAW) && (MLX5_CAP_GEN(dev->mdev, uctx_cap) & MLX5_UCTX_CAP_RAW_TX)) cap |= MLX5_UCTX_CAP_RAW_TX; + if (is_user && capable(CAP_SYS_RAWIO) && + (MLX5_CAP_GEN(dev->mdev, uctx_cap) & + MLX5_UCTX_CAP_INTERNAL_DEV_RES)) + cap |= MLX5_UCTX_CAP_INTERNAL_DEV_RES; MLX5_SET(create_uctx_in, in, opcode, MLX5_CMD_OP_CREATE_UCTX); MLX5_SET(uctx, uctx, cap, cap);