From patchwork Wed Apr 27 13:49:50 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matan Barak X-Patchwork-Id: 8957641 Return-Path: X-Original-To: patchwork-linux-rdma@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id DC633BF29F for ; Wed, 27 Apr 2016 13:50:33 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id BED2020253 for ; Wed, 27 Apr 2016 13:50:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id A2116201EF for ; Wed, 27 Apr 2016 13:50:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752492AbcD0Nu2 (ORCPT ); Wed, 27 Apr 2016 09:50:28 -0400 Received: from [193.47.165.129] ([193.47.165.129]:51670 "EHLO mellanox.co.il" rhost-flags-FAIL-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S1752205AbcD0Nu0 (ORCPT ); Wed, 27 Apr 2016 09:50:26 -0400 Received: from Internal Mail-Server by MTLPINE1 (envelope-from matanb@mellanox.com) with ESMTPS (AES256-SHA encrypted); 27 Apr 2016 16:49:58 +0300 Received: from rsws33.mtr.labs.mlnx (dev-r-vrt-064.mtr.labs.mlnx [10.212.64.1]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id u3RDnwXA025720; Wed, 27 Apr 2016 16:49:58 +0300 From: Matan Barak To: Doug Ledford Cc: linux-rdma@vger.kernel.org, Majd Dibbiny , Matan Barak , Guy Levi , Moshe Lazer Subject: [PATCH V1 for-next] IB/mlx5: Add UARs write-combining and non-cached mapping Date: Wed, 27 Apr 2016 16:49:50 +0300 Message-Id: <1461764990-549-1-git-send-email-matanb@mellanox.com> X-Mailer: git-send-email 2.5.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Spam-Status: No, score=-7.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Guy Levi By this patch, the user space library will be able to improve performance using appropriate ringing DoorBell method according to the memory type it asked for. Currently only one mapping command is allowed for UARs: MLX5_IB_MMAP_REGULAR_PAGE. Using this mapping, the kernel maps the UARs to write-combining (WC) if the system supports it. If the system is not supporting WC the UARs are mapped to non-cached(NC). In this case the user space library can't tell which mapping is applied. This patch adds 2 new mapping commands: MLX5_IB_MMAP_WC_PAGE and MLX5_IB_MMAP_NC_PAGE. For these commands the kernel maps exactly as requested and fails if it can't. Since there is no generic way to check if the requested memory region can be mapped as WC, driver enables conclusive WC mapping only for x86, PowerPC and ARM which support WC for the device's memory region. Signed-off-by: Guy Levy Signed-off-by: Moshe Lazer Signed-off-by: Matan Barak --- Hi Doug, Currently the kernel UAR mapping interface provides a best effort mechanism in which the caller simply invokes one mapping command which silently maps to one of the memory types (write-combining or non-cached). However, the user space library will be able to improve the performance using the appropriate ringing DoorBell method according to the memory type. This patch adds two new mapping commands which enable the library to get an explicit mapping type. Regards, Guy and Matan Changes from V0: * Fixed typos. drivers/infiniband/hw/mlx5/main.c | 96 ++++++++++++++++++++++++++++-------- drivers/infiniband/hw/mlx5/mlx5_ib.h | 2 + 2 files changed, 77 insertions(+), 21 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 5acf346..95bfdc1 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -38,6 +38,9 @@ #include #include #include +#if defined(CONFIG_X86) +#include +#endif #include #include #include @@ -1068,38 +1071,89 @@ static int get_index(unsigned long offset) return get_arg(offset); } +static inline char *mmap_cmd2str(enum mlx5_ib_mmap_cmd cmd) +{ + switch (cmd) { + case MLX5_IB_MMAP_WC_PAGE: + return "WC"; + case MLX5_IB_MMAP_REGULAR_PAGE: + return "best effort WC"; + case MLX5_IB_MMAP_NC_PAGE: + return "NC"; + default: + return NULL; + } +} + +static int uar_mmap(struct mlx5_ib_dev *dev, enum mlx5_ib_mmap_cmd cmd, + struct vm_area_struct *vma, struct mlx5_uuar_info *uuari) +{ + int err; + unsigned long idx; + phys_addr_t pfn, pa; + pgprot_t prot; + + switch (cmd) { + case MLX5_IB_MMAP_WC_PAGE: +/* Some architectures don't support WC memory */ +#if defined(CONFIG_X86) + if (!pat_enabled()) + return -EPERM; +#elif !(defined(CONFIG_PPC) || (defined(CONFIG_ARM) && defined(CONFIG_MMU))) + return -EPERM; +#endif + /* fall through */ + case MLX5_IB_MMAP_REGULAR_PAGE: + /* For MLX5_IB_MMAP_REGULAR_PAGE do the best effort to get WC */ + prot = pgprot_writecombine(vma->vm_page_prot); + break; + case MLX5_IB_MMAP_NC_PAGE: + prot = pgprot_noncached(vma->vm_page_prot); + break; + default: + return -EINVAL; + } + + if (vma->vm_end - vma->vm_start != PAGE_SIZE) + return -EINVAL; + + idx = get_index(vma->vm_pgoff); + if (idx >= uuari->num_uars) + return -EINVAL; + + pfn = uar_index2pfn(dev, uuari->uars[idx].index); + mlx5_ib_dbg(dev, "uar idx 0x%lx, pfn %pa\n", idx, &pfn); + + vma->vm_page_prot = prot; + err = io_remap_pfn_range(vma, vma->vm_start, pfn, + PAGE_SIZE, vma->vm_page_prot); + if (err) { + mlx5_ib_err(dev, "io_remap_pfn_range failed with error=%d, vm_start=0x%lx, pfn=%pa, mmap_cmd=%s\n", + err, vma->vm_start, &pfn, mmap_cmd2str(cmd)); + return -EAGAIN; + } + + pa = pfn << PAGE_SHIFT; + mlx5_ib_dbg(dev, "mapped %s at 0x%lx, PA %pa\n", mmap_cmd2str(cmd), + vma->vm_start, &pa); + + return 0; +} + static int mlx5_ib_mmap(struct ib_ucontext *ibcontext, struct vm_area_struct *vma) { struct mlx5_ib_ucontext *context = to_mucontext(ibcontext); struct mlx5_ib_dev *dev = to_mdev(ibcontext->device); struct mlx5_uuar_info *uuari = &context->uuari; unsigned long command; - unsigned long idx; phys_addr_t pfn; command = get_command(vma->vm_pgoff); switch (command) { + case MLX5_IB_MMAP_WC_PAGE: + case MLX5_IB_MMAP_NC_PAGE: case MLX5_IB_MMAP_REGULAR_PAGE: - if (vma->vm_end - vma->vm_start != PAGE_SIZE) - return -EINVAL; - - idx = get_index(vma->vm_pgoff); - if (idx >= uuari->num_uars) - return -EINVAL; - - pfn = uar_index2pfn(dev, uuari->uars[idx].index); - mlx5_ib_dbg(dev, "uar idx 0x%lx, pfn 0x%llx\n", idx, - (unsigned long long)pfn); - - vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); - if (io_remap_pfn_range(vma, vma->vm_start, pfn, - PAGE_SIZE, vma->vm_page_prot)) - return -EAGAIN; - - mlx5_ib_dbg(dev, "mapped WC at 0x%lx, PA 0x%llx\n", - vma->vm_start, - (unsigned long long)pfn << PAGE_SHIFT); - break; + return uar_mmap(dev, command, vma, uuari); case MLX5_IB_MMAP_GET_CONTIGUOUS_PAGES: return -ENOSYS; diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index f16c818..7520661 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -70,6 +70,8 @@ enum { enum mlx5_ib_mmap_cmd { MLX5_IB_MMAP_REGULAR_PAGE = 0, MLX5_IB_MMAP_GET_CONTIGUOUS_PAGES = 1, + MLX5_IB_MMAP_WC_PAGE = 2, + MLX5_IB_MMAP_NC_PAGE = 3, /* 5 is chosen in order to be compatible with old versions of libmlx5 */ MLX5_IB_MMAP_CORE_CLOCK = 5, };