From patchwork Mon Aug 19 11:16:59 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 11100661
X-Patchwork-Delegate: jgg@ziepe.ca
From: Leon Romanovsky
To: Doug Ledford, Jason Gunthorpe
Cc: Leon Romanovsky, RDMA mailing list, Guy Levi, Moni Shoua
Subject: [PATCH rdma-next 01/12] RDMA/odp: Use the common interval tree library instead of generic
Date: Mon, 19 Aug 2019 14:16:59 +0300
Message-Id: <20190819111710.18440-2-leon@kernel.org>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20190819111710.18440-1-leon@kernel.org>
References: <20190819111710.18440-1-leon@kernel.org>

From: Jason Gunthorpe

ODP is working with userspace VA's in the interval tree which always fit
into an unsigned long, so we can use the common code.

This comes at a cost of a 16 byte increase in ib_umem_odp struct size due
to storing the interval tree start/last in addition to the umem
addr/length. However these values were computed and are performance
critical for the interval lookup, so this seems like a worthwhile trade
off.

Removes 2k of .text from the kernel.

Signed-off-by: Jason Gunthorpe
Signed-off-by: Leon Romanovsky
---
 drivers/infiniband/Kconfig         |  1 +
 drivers/infiniband/core/umem_odp.c | 72 ++++++++----------------------
 include/rdma/ib_umem_odp.h         | 20 +++++----
 3 files changed, 31 insertions(+), 62 deletions(-)

diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
index 85e103b147cc..b44b1c322ec8 100644
--- a/drivers/infiniband/Kconfig
+++ b/drivers/infiniband/Kconfig
@@ -55,6 +55,7 @@ config INFINIBAND_ON_DEMAND_PAGING
 	bool "InfiniBand on-demand paging support"
 	depends on INFINIBAND_USER_MEM
 	select MMU_NOTIFIER
+	select INTERVAL_TREE
 	default y
 	---help---
 	  On demand paging support for the InfiniBand subsystem.
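
Editorial sketch (not part of the patch): the commit message above mentions
storing start/last in addition to addr/length. The standalone C fragment
below illustrates that conversion from a half-open [addr, addr + length)
region to the closed [start, last] form that an inclusive interval tree
expects. The names closed_range, to_closed_range and ranges_overlap are
hypothetical and exist only for this illustration; they are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the start/last pair kept in a tree node. */
struct closed_range {
	unsigned long start;	/* first address covered */
	unsigned long last;	/* last address covered, inclusive */
};

/*
 * Convert a half-open region [addr, addr + length) to closed form.
 * Assumption: length is non-zero, otherwise last would wrap below start.
 */
static struct closed_range to_closed_range(unsigned long addr,
					   unsigned long length)
{
	struct closed_range r = { .start = addr, .last = addr + length - 1 };

	return r;
}

/* Two closed ranges overlap when each starts no later than the other ends. */
static bool ranges_overlap(struct closed_range a, struct closed_range b)
{
	return a.start <= b.last && b.start <= a.last;
}
```

With this representation two adjacent 4 KiB regions do not overlap, while a
one byte region placed on the last byte of the first region does.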
diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c index 2a75c6f8d827..8358eb8e3a26 100644 --- a/drivers/infiniband/core/umem_odp.c +++ b/drivers/infiniband/core/umem_odp.c @@ -39,45 +39,13 @@ #include #include #include -#include +#include #include #include #include #include -/* - * The ib_umem list keeps track of memory regions for which the HW - * device request to receive notification when the related memory - * mapping is changed. - * - * ib_umem_lock protects the list. - */ - -static u64 node_start(struct umem_odp_node *n) -{ - struct ib_umem_odp *umem_odp = - container_of(n, struct ib_umem_odp, interval_tree); - - return ib_umem_start(umem_odp); -} - -/* Note that the representation of the intervals in the interval tree - * considers the ending point as contained in the interval, while the - * function ib_umem_end returns the first address which is not contained - * in the umem. - */ -static u64 node_last(struct umem_odp_node *n) -{ - struct ib_umem_odp *umem_odp = - container_of(n, struct ib_umem_odp, interval_tree); - - return ib_umem_end(umem_odp) - 1; -} - -INTERVAL_TREE_DEFINE(struct umem_odp_node, rb, u64, __subtree_last, - node_start, node_last, static, rbt_ib_umem) - static void ib_umem_notifier_start_account(struct ib_umem_odp *umem_odp) { mutex_lock(&umem_odp->umem_mutex); @@ -209,9 +177,18 @@ static void add_umem_to_per_mm(struct ib_umem_odp *umem_odp) struct ib_ucontext_per_mm *per_mm = umem_odp->per_mm; down_write(&per_mm->umem_rwsem); - if (likely(ib_umem_start(umem_odp) != ib_umem_end(umem_odp))) - rbt_ib_umem_insert(&umem_odp->interval_tree, - &per_mm->umem_tree); + if (likely(ib_umem_start(umem_odp) != ib_umem_end(umem_odp))) { + /* + * Note that the representation of the intervals in the + * interval tree considers the ending point as contained in + * the interval, while the function ib_umem_end returns the + * first address which is not contained in the umem. 
+ */ + umem_odp->interval_tree.start = ib_umem_start(umem_odp); + umem_odp->interval_tree.last = ib_umem_end(umem_odp) - 1; + interval_tree_insert(&umem_odp->interval_tree, + &per_mm->umem_tree); + } up_write(&per_mm->umem_rwsem); } @@ -221,8 +198,8 @@ static void remove_umem_from_per_mm(struct ib_umem_odp *umem_odp) down_write(&per_mm->umem_rwsem); if (likely(ib_umem_start(umem_odp) != ib_umem_end(umem_odp))) - rbt_ib_umem_remove(&umem_odp->interval_tree, - &per_mm->umem_tree); + interval_tree_remove(&umem_odp->interval_tree, + &per_mm->umem_tree); complete_all(&umem_odp->notifier_completion); up_write(&per_mm->umem_rwsem); @@ -765,18 +742,18 @@ int rbt_ib_umem_for_each_in_range(struct rb_root_cached *root, void *cookie) { int ret_val = 0; - struct umem_odp_node *node, *next; + struct interval_tree_node *node, *next; struct ib_umem_odp *umem; if (unlikely(start == last)) return ret_val; - for (node = rbt_ib_umem_iter_first(root, start, last - 1); + for (node = interval_tree_iter_first(root, start, last - 1); node; node = next) { /* TODO move the blockable decision up to the callback */ if (!blockable) return -EAGAIN; - next = rbt_ib_umem_iter_next(node, start, last - 1); + next = interval_tree_iter_next(node, start, last - 1); umem = container_of(node, struct ib_umem_odp, interval_tree); ret_val = cb(umem, start, last, cookie) || ret_val; } @@ -784,16 +761,3 @@ int rbt_ib_umem_for_each_in_range(struct rb_root_cached *root, return ret_val; } EXPORT_SYMBOL(rbt_ib_umem_for_each_in_range); - -struct ib_umem_odp *rbt_ib_umem_lookup(struct rb_root_cached *root, - u64 addr, u64 length) -{ - struct umem_odp_node *node; - - node = rbt_ib_umem_iter_first(root, addr, addr + length - 1); - if (node) - return container_of(node, struct ib_umem_odp, interval_tree); - return NULL; - -} -EXPORT_SYMBOL(rbt_ib_umem_lookup); diff --git a/include/rdma/ib_umem_odp.h b/include/rdma/ib_umem_odp.h index 479db5c98ff6..030d5cbad02c 100644 --- a/include/rdma/ib_umem_odp.h +++ 
b/include/rdma/ib_umem_odp.h
@@ -37,11 +37,6 @@
 #include
 #include
 
-struct umem_odp_node {
-	u64 __subtree_last;
-	struct rb_node rb;
-};
-
 struct ib_umem_odp {
 	struct ib_umem umem;
 	struct ib_ucontext_per_mm *per_mm;
@@ -72,7 +67,7 @@ struct ib_umem_odp {
 	int npages;
 
 	/* Tree tracking */
-	struct umem_odp_node interval_tree;
+	struct interval_tree_node interval_tree;
 
 	struct completion notifier_completion;
 	int dying;
@@ -163,8 +158,17 @@ int rbt_ib_umem_for_each_in_range(struct rb_root_cached *root,
  * Find first region intersecting with address range.
  * Return NULL if not found
  */
-struct ib_umem_odp *rbt_ib_umem_lookup(struct rb_root_cached *root,
-				       u64 addr, u64 length);
+static inline struct ib_umem_odp *
+rbt_ib_umem_lookup(struct rb_root_cached *root, u64 addr, u64 length)
+{
+	struct interval_tree_node *node;
+
+	node = interval_tree_iter_first(root, addr, addr + length - 1);
+	if (!node)
+		return NULL;
+	return container_of(node, struct ib_umem_odp, interval_tree);
+
+}
 
 static inline int ib_umem_mmu_notifier_retry(struct ib_umem_odp *umem_odp,
 					     unsigned long mmu_seq)

From patchwork Mon Aug 19 11:17:00 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 11100663
X-Patchwork-Delegate: jgg@ziepe.ca
From: Leon Romanovsky
To: Doug Ledford, Jason Gunthorpe
Cc: Leon Romanovsky, RDMA mailing list, Guy Levi, Moni Shoua
Subject: [PATCH rdma-next 02/12] RDMA/odp: Iterate over the whole rbtree directly
Date: Mon, 19 Aug 2019 14:17:00 +0300
Message-Id: <20190819111710.18440-3-leon@kernel.org>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20190819111710.18440-1-leon@kernel.org>
References: <20190819111710.18440-1-leon@kernel.org>

From: Jason Gunthorpe

Instead of intersecting a full interval, just iterate over every element
directly. This is faster and clearer.
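
Editorial sketch (not part of the patch): the point made above can be shown
on a simplified model. A range query over [0, ULONG_MAX] matches every
element, so it visits exactly what a plain walk visits while still paying
for an overlap test per element. The code below uses a flat array instead
of an rbtree purely for brevity; struct range, for_each_in_range, for_each
and count_visit are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

struct range {
	unsigned long start;
	unsigned long last;	/* inclusive */
};

typedef void (*visit_fn)(const struct range *r, void *cookie);

/* Range query: visit only entries that overlap [start, last]. */
static void for_each_in_range(const struct range *v, size_t n,
			      unsigned long start, unsigned long last,
			      visit_fn fn, void *cookie)
{
	for (size_t i = 0; i < n; i++)
		if (v[i].start <= last && start <= v[i].last)
			fn(&v[i], cookie);
}

/* Direct walk: visit every entry, no overlap test required. */
static void for_each(const struct range *v, size_t n,
		     visit_fn fn, void *cookie)
{
	for (size_t i = 0; i < n; i++)
		fn(&v[i], cookie);
}

/* Example visitor: count how many entries were seen. */
static void count_visit(const struct range *r, void *cookie)
{
	(void)r;
	++*(size_t *)cookie;
}
```

Calling for_each_in_range() with 0 and ULONG_MAX and calling for_each() on
the same array should produce the same visit count, which is why the
callback-based range walk can be replaced by a direct loop.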
Signed-off-by: Jason Gunthorpe Signed-off-by: Leon Romanovsky --- drivers/infiniband/core/umem_odp.c | 51 ++++++++++++++++-------------- drivers/infiniband/hw/mlx5/odp.c | 41 +++++++++++------------- 2 files changed, 47 insertions(+), 45 deletions(-) diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c index 8358eb8e3a26..b9bebef00a33 100644 --- a/drivers/infiniband/core/umem_odp.c +++ b/drivers/infiniband/core/umem_odp.c @@ -72,35 +72,41 @@ static void ib_umem_notifier_end_account(struct ib_umem_odp *umem_odp) mutex_unlock(&umem_odp->umem_mutex); } -static int ib_umem_notifier_release_trampoline(struct ib_umem_odp *umem_odp, - u64 start, u64 end, void *cookie) -{ - /* - * Increase the number of notifiers running, to - * prevent any further fault handling on this MR. - */ - ib_umem_notifier_start_account(umem_odp); - umem_odp->dying = 1; - /* Make sure that the fact the umem is dying is out before we release - * all pending page faults. */ - smp_wmb(); - complete_all(&umem_odp->notifier_completion); - umem_odp->umem.context->invalidate_range( - umem_odp, ib_umem_start(umem_odp), ib_umem_end(umem_odp)); - return 0; -} - static void ib_umem_notifier_release(struct mmu_notifier *mn, struct mm_struct *mm) { struct ib_ucontext_per_mm *per_mm = container_of(mn, struct ib_ucontext_per_mm, mn); + struct rb_node *node; down_read(&per_mm->umem_rwsem); - if (per_mm->active) - rbt_ib_umem_for_each_in_range( - &per_mm->umem_tree, 0, ULLONG_MAX, - ib_umem_notifier_release_trampoline, true, NULL); + if (!per_mm->active) + goto out; + + for (node = rb_first_cached(&per_mm->umem_tree); node; + node = rb_next(node)) { + struct ib_umem_odp *umem_odp = + rb_entry(node, struct ib_umem_odp, interval_tree.rb); + + /* + * Increase the number of notifiers running, to prevent any + * further fault handling on this MR. 
+ */ + ib_umem_notifier_start_account(umem_odp); + + umem_odp->dying = 1; + /* + * Make sure that the fact the umem is dying is out before we + * release all pending page faults. + */ + smp_wmb(); + complete_all(&umem_odp->notifier_completion); + umem_odp->umem.context->invalidate_range( + umem_odp, ib_umem_start(umem_odp), + ib_umem_end(umem_odp)); + } + +out: up_read(&per_mm->umem_rwsem); } @@ -760,4 +766,3 @@ int rbt_ib_umem_for_each_in_range(struct rb_root_cached *root, return ret_val; } -EXPORT_SYMBOL(rbt_ib_umem_for_each_in_range); diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index b0c5de39d186..3922fced41ec 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -539,34 +539,31 @@ struct mlx5_ib_mr *mlx5_ib_alloc_implicit_mr(struct mlx5_ib_pd *pd, return imr; } -static int mr_leaf_free(struct ib_umem_odp *umem_odp, u64 start, u64 end, - void *cookie) +void mlx5_ib_free_implicit_mr(struct mlx5_ib_mr *imr) { - struct mlx5_ib_mr *mr = umem_odp->private, *imr = cookie; - - if (mr->parent != imr) - return 0; - - ib_umem_odp_unmap_dma_pages(umem_odp, ib_umem_start(umem_odp), - ib_umem_end(umem_odp)); + struct ib_ucontext_per_mm *per_mm = mr_to_per_mm(imr); + struct rb_node *node; - if (umem_odp->dying) - return 0; + down_read(&per_mm->umem_rwsem); + for (node = rb_first_cached(&per_mm->umem_tree); node; + node = rb_next(node)) { + struct ib_umem_odp *umem_odp = + rb_entry(node, struct ib_umem_odp, interval_tree.rb); + struct mlx5_ib_mr *mr = umem_odp->private; - WRITE_ONCE(umem_odp->dying, 1); - atomic_inc(&imr->num_leaf_free); - schedule_work(&umem_odp->work); + if (mr->parent != imr) + continue; - return 0; -} + ib_umem_odp_unmap_dma_pages(umem_odp, ib_umem_start(umem_odp), + ib_umem_end(umem_odp)); -void mlx5_ib_free_implicit_mr(struct mlx5_ib_mr *imr) -{ - struct ib_ucontext_per_mm *per_mm = mr_to_per_mm(imr); + if (umem_odp->dying) + continue; - down_read(&per_mm->umem_rwsem); - 
rbt_ib_umem_for_each_in_range(&per_mm->umem_tree, 0, ULLONG_MAX,
-				      mr_leaf_free, true, imr);
+		WRITE_ONCE(umem_odp->dying, 1);
+		atomic_inc(&imr->num_leaf_free);
+		schedule_work(&umem_odp->work);
+	}
 	up_read(&per_mm->umem_rwsem);
 
 	wait_event(imr->q_leaf_free, !atomic_read(&imr->num_leaf_free));

From patchwork Mon Aug 19 11:17:01 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 11100665
X-Patchwork-Delegate: jgg@ziepe.ca
From: Leon Romanovsky
To: Doug Ledford, Jason Gunthorpe
Cc: Leon Romanovsky, RDMA mailing list, Guy Levi, Moni Shoua
Subject: [PATCH rdma-next 03/12] RDMA/odp: Make it clearer when a umem is an implicit ODP umem
Date: Mon, 19 Aug 2019 14:17:01 +0300
Message-Id: <20190819111710.18440-4-leon@kernel.org>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20190819111710.18440-1-leon@kernel.org>
References: <20190819111710.18440-1-leon@kernel.org>

From: Jason Gunthorpe

Implicit ODP umems are special, they don't have any page lists, they don't
exist in the interval tree and they are never DMA mapped.

Instead of trying to guess this based on a zero length use an explicit
flag.

Further, do not allow non-implicit umems to be 0 size.
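
Editorial sketch (not part of the patch): the idea of an explicit flag
instead of a length-based guess can be illustrated with a simplified,
hypothetical structure. struct region, region_init and
region_needs_tree_entry are invented for this example and do not mirror the
kernel structures; in particular the flag is passed in directly here, which
is a simplification of how the patch derives it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a umem; not the kernel structure. */
struct region {
	unsigned long address;
	size_t length;
	bool is_implicit;	/* decided once at creation, never inferred */
};

static int region_init(struct region *r, unsigned long address,
		       size_t length, bool implicit)
{
	/* Only an implicit region may be empty. */
	if (!implicit && length == 0)
		return -EINVAL;

	r->address = address;
	r->length = length;
	r->is_implicit = implicit;
	return 0;
}

/* Callers test the flag instead of guessing from a zero length. */
static bool region_needs_tree_entry(const struct region *r)
{
	return !r->is_implicit;
}
```

The benefit of this shape is that a zero length on a normal region becomes
an error at creation time rather than a silent change of behaviour later.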
Signed-off-by: Jason Gunthorpe Signed-off-by: Leon Romanovsky --- drivers/infiniband/core/umem_odp.c | 54 +++++++++++++++++------------- drivers/infiniband/hw/mlx5/mr.c | 2 +- drivers/infiniband/hw/mlx5/odp.c | 2 +- include/rdma/ib_umem_odp.h | 8 +++++ 4 files changed, 40 insertions(+), 26 deletions(-) diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c index b9bebef00a33..2eb184a5374a 100644 --- a/drivers/infiniband/core/umem_odp.c +++ b/drivers/infiniband/core/umem_odp.c @@ -183,18 +183,15 @@ static void add_umem_to_per_mm(struct ib_umem_odp *umem_odp) struct ib_ucontext_per_mm *per_mm = umem_odp->per_mm; down_write(&per_mm->umem_rwsem); - if (likely(ib_umem_start(umem_odp) != ib_umem_end(umem_odp))) { - /* - * Note that the representation of the intervals in the - * interval tree considers the ending point as contained in - * the interval, while the function ib_umem_end returns the - * first address which is not contained in the umem. - */ - umem_odp->interval_tree.start = ib_umem_start(umem_odp); - umem_odp->interval_tree.last = ib_umem_end(umem_odp) - 1; - interval_tree_insert(&umem_odp->interval_tree, - &per_mm->umem_tree); - } + /* + * Note that the representation of the intervals in the interval tree + * considers the ending point as contained in the interval, while the + * function ib_umem_end returns the first address which is not + * contained in the umem. 
+ */ + umem_odp->interval_tree.start = ib_umem_start(umem_odp); + umem_odp->interval_tree.last = ib_umem_end(umem_odp) - 1; + interval_tree_insert(&umem_odp->interval_tree, &per_mm->umem_tree); up_write(&per_mm->umem_rwsem); } @@ -203,11 +200,8 @@ static void remove_umem_from_per_mm(struct ib_umem_odp *umem_odp) struct ib_ucontext_per_mm *per_mm = umem_odp->per_mm; down_write(&per_mm->umem_rwsem); - if (likely(ib_umem_start(umem_odp) != ib_umem_end(umem_odp))) - interval_tree_remove(&umem_odp->interval_tree, - &per_mm->umem_tree); + interval_tree_remove(&umem_odp->interval_tree, &per_mm->umem_tree); complete_all(&umem_odp->notifier_completion); - up_write(&per_mm->umem_rwsem); } @@ -327,6 +321,9 @@ struct ib_umem_odp *ib_alloc_odp_umem(struct ib_umem_odp *root, int pages = size >> PAGE_SHIFT; int ret; + if (!size) + return ERR_PTR(-EINVAL); + odp_data = kzalloc(sizeof(*odp_data), GFP_KERNEL); if (!odp_data) return ERR_PTR(-ENOMEM); @@ -388,6 +385,9 @@ int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access) struct mm_struct *mm = umem->owning_mm; int ret_val; + if (umem_odp->umem.address == 0 && umem_odp->umem.length == 0) + umem_odp->is_implicit_odp = 1; + umem_odp->page_shift = PAGE_SHIFT; if (access & IB_ACCESS_HUGETLB) { struct vm_area_struct *vma; @@ -408,7 +408,10 @@ int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access) init_completion(&umem_odp->notifier_completion); - if (ib_umem_odp_num_pages(umem_odp)) { + if (!umem_odp->is_implicit_odp) { + if (!ib_umem_odp_num_pages(umem_odp)) + return -EINVAL; + umem_odp->page_list = vzalloc(array_size(sizeof(*umem_odp->page_list), ib_umem_odp_num_pages(umem_odp))); @@ -427,7 +430,9 @@ int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access) ret_val = get_per_mm(umem_odp); if (ret_val) goto out_dma_list; - add_umem_to_per_mm(umem_odp); + + if (!umem_odp->is_implicit_odp) + add_umem_to_per_mm(umem_odp); return 0; @@ -446,13 +451,14 @@ void ib_umem_odp_release(struct ib_umem_odp *umem_odp) * It is the 
driver's responsibility to ensure, before calling us, * that the hardware will not attempt to access the MR any more. */ - ib_umem_odp_unmap_dma_pages(umem_odp, ib_umem_start(umem_odp), - ib_umem_end(umem_odp)); - - remove_umem_from_per_mm(umem_odp); + if (!umem_odp->is_implicit_odp) { + ib_umem_odp_unmap_dma_pages(umem_odp, ib_umem_start(umem_odp), + ib_umem_end(umem_odp)); + remove_umem_from_per_mm(umem_odp); + vfree(umem_odp->dma_list); + vfree(umem_odp->page_list); + } put_per_mm(umem_odp); - vfree(umem_odp->dma_list); - vfree(umem_odp->page_list); } /* diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c index 2c77456f359f..e0015b612ffd 100644 --- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -1609,7 +1609,7 @@ static void dereg_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr) /* Wait for all running page-fault handlers to finish. */ synchronize_srcu(&dev->mr_srcu); /* Destroy all page mappings */ - if (umem_odp->page_list) + if (!umem_odp->is_implicit_odp) mlx5_ib_invalidate_range(umem_odp, ib_umem_start(umem_odp), ib_umem_end(umem_odp)); diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index 3922fced41ec..5b6b2afa26a6 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -585,7 +585,7 @@ static int pagefault_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr, struct ib_umem_odp *odp; size_t size; - if (!odp_mr->page_list) { + if (odp_mr->is_implicit_odp) { odp = implicit_mr_get_data(mr, io_virt, bcnt); if (IS_ERR(odp)) diff --git a/include/rdma/ib_umem_odp.h b/include/rdma/ib_umem_odp.h index 030d5cbad02c..14b38b4459c5 100644 --- a/include/rdma/ib_umem_odp.h +++ b/include/rdma/ib_umem_odp.h @@ -69,6 +69,14 @@ struct ib_umem_odp { /* Tree tracking */ struct interval_tree_node interval_tree; + /* + * An implicit odp umem cannot be DMA mapped, has 0 length, and serves + * only as an anchor for the driver to hold onto the per_mm. 
FIXME:
+	 * This should be removed and drivers should work with the per_mm
+	 * directly.
+	 */
+	bool is_implicit_odp;
+
 	struct completion notifier_completion;
 	int dying;
 	unsigned int page_shift;

From patchwork Mon Aug 19 11:17:02 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 11100667
X-Patchwork-Delegate: jgg@ziepe.ca
From: Leon Romanovsky
To: Doug Ledford, Jason Gunthorpe
Cc: Leon Romanovsky, RDMA mailing list, Guy Levi, Moni Shoua
Subject: [PATCH rdma-next 04/12] RMDA/odp: Consolidate umem_odp initialization
Date: Mon, 19 Aug 2019 14:17:02 +0300
Message-Id: <20190819111710.18440-5-leon@kernel.org>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20190819111710.18440-1-leon@kernel.org>
References: <20190819111710.18440-1-leon@kernel.org>

From: Jason Gunthorpe

This is done in two different places, consolidate all the post-allocation
initialization into a single function.
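
Editorial sketch (not part of the patch): the consolidation pattern
described above, one post-allocation init routine shared by two creation
paths with goto-style unwinding, might look like the userspace fragment
below. struct obj, obj_init, obj_get, obj_alloc and obj_release are
hypothetical names, and calloc/free stand in for the kernel allocators.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical object with two arrays that must be allocated together. */
struct obj {
	size_t pages;
	void **page_list;
	unsigned long *dma_list;
};

/* Single post-allocation init shared by every creation path. */
static int obj_init(struct obj *o)
{
	if (!o->pages)
		return -EINVAL;

	o->page_list = calloc(o->pages, sizeof(*o->page_list));
	if (!o->page_list)
		return -ENOMEM;

	o->dma_list = calloc(o->pages, sizeof(*o->dma_list));
	if (!o->dma_list)
		goto out_page_list;

	return 0;

out_page_list:
	free(o->page_list);
	o->page_list = NULL;
	return -ENOMEM;
}

/* Path 1: the caller already owns the object. */
static int obj_get(struct obj *o, size_t pages)
{
	o->pages = pages;
	return obj_init(o);
}

/* Path 2: allocate a new object, then reuse the same init. */
static struct obj *obj_alloc(size_t pages)
{
	struct obj *o = calloc(1, sizeof(*o));

	if (!o)
		return NULL;
	o->pages = pages;
	if (obj_init(o)) {
		free(o);
		return NULL;
	}
	return o;
}

/* Frees only the arrays; the caller frees the object itself if needed. */
static void obj_release(struct obj *o)
{
	free(o->dma_list);
	free(o->page_list);
}
```

Keeping validation, allocation and unwinding in one function means the two
entry points cannot drift apart in their error handling.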
Signed-off-by: Jason Gunthorpe Signed-off-by: Leon Romanovsky --- drivers/infiniband/core/umem_odp.c | 200 +++++++++++++---------------- 1 file changed, 86 insertions(+), 114 deletions(-) diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c index 2eb184a5374a..487a6371a053 100644 --- a/drivers/infiniband/core/umem_odp.c +++ b/drivers/infiniband/core/umem_odp.c @@ -178,23 +178,6 @@ static const struct mmu_notifier_ops ib_umem_notifiers = { .invalidate_range_end = ib_umem_notifier_invalidate_range_end, }; -static void add_umem_to_per_mm(struct ib_umem_odp *umem_odp) -{ - struct ib_ucontext_per_mm *per_mm = umem_odp->per_mm; - - down_write(&per_mm->umem_rwsem); - /* - * Note that the representation of the intervals in the interval tree - * considers the ending point as contained in the interval, while the - * function ib_umem_end returns the first address which is not - * contained in the umem. - */ - umem_odp->interval_tree.start = ib_umem_start(umem_odp); - umem_odp->interval_tree.last = ib_umem_end(umem_odp) - 1; - interval_tree_insert(&umem_odp->interval_tree, &per_mm->umem_tree); - up_write(&per_mm->umem_rwsem); -} - static void remove_umem_from_per_mm(struct ib_umem_odp *umem_odp) { struct ib_ucontext_per_mm *per_mm = umem_odp->per_mm; @@ -244,33 +227,23 @@ static struct ib_ucontext_per_mm *alloc_per_mm(struct ib_ucontext *ctx, return ERR_PTR(ret); } -static int get_per_mm(struct ib_umem_odp *umem_odp) +static struct ib_ucontext_per_mm *get_per_mm(struct ib_umem_odp *umem_odp) { struct ib_ucontext *ctx = umem_odp->umem.context; struct ib_ucontext_per_mm *per_mm; + lockdep_assert_held(&ctx->per_mm_list_lock); + /* * Generally speaking we expect only one or two per_mm in this list, * so no reason to optimize this search today. 
*/ - mutex_lock(&ctx->per_mm_list_lock); list_for_each_entry(per_mm, &ctx->per_mm_list, ucontext_list) { if (per_mm->mm == umem_odp->umem.owning_mm) - goto found; - } - - per_mm = alloc_per_mm(ctx, umem_odp->umem.owning_mm); - if (IS_ERR(per_mm)) { - mutex_unlock(&ctx->per_mm_list_lock); - return PTR_ERR(per_mm); + return per_mm; } -found: - umem_odp->per_mm = per_mm; - per_mm->odp_mrs_count++; - mutex_unlock(&ctx->per_mm_list_lock); - - return 0; + return alloc_per_mm(ctx, umem_odp->umem.owning_mm); } static void free_per_mm(struct rcu_head *rcu) @@ -311,79 +284,114 @@ static void put_per_mm(struct ib_umem_odp *umem_odp) mmu_notifier_call_srcu(&per_mm->rcu, free_per_mm); } +static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp, + struct ib_ucontext_per_mm *per_mm) +{ + struct ib_ucontext *ctx = umem_odp->umem.context; + int ret; + + umem_odp->umem.is_odp = 1; + if (!umem_odp->is_implicit_odp) { + size_t pages = ib_umem_odp_num_pages(umem_odp); + + if (!pages) + return -EINVAL; + + /* + * Note that the representation of the intervals in the + * interval tree considers the ending point as contained in + * the interval, while the function ib_umem_end returns the + * first address which is not contained in the umem. 
+ */ + umem_odp->interval_tree.start = ib_umem_start(umem_odp); + umem_odp->interval_tree.last = ib_umem_end(umem_odp) - 1; + + umem_odp->page_list = vzalloc( + array_size(sizeof(*umem_odp->page_list), pages)); + if (!umem_odp->page_list) + return -ENOMEM; + + umem_odp->dma_list = + vzalloc(array_size(sizeof(*umem_odp->dma_list), pages)); + if (!umem_odp->dma_list) { + ret = -ENOMEM; + goto out_page_list; + } + } + + mutex_lock(&ctx->per_mm_list_lock); + if (!per_mm) { + per_mm = get_per_mm(umem_odp); + if (IS_ERR(per_mm)) { + ret = PTR_ERR(per_mm); + goto out_unlock; + } + } + umem_odp->per_mm = per_mm; + per_mm->odp_mrs_count++; + mutex_unlock(&ctx->per_mm_list_lock); + + mutex_init(&umem_odp->umem_mutex); + init_completion(&umem_odp->notifier_completion); + + if (!umem_odp->is_implicit_odp) { + down_write(&per_mm->umem_rwsem); + interval_tree_insert(&umem_odp->interval_tree, + &per_mm->umem_tree); + up_write(&per_mm->umem_rwsem); + } + + return 0; + +out_unlock: + mutex_unlock(&ctx->per_mm_list_lock); + vfree(umem_odp->dma_list); +out_page_list: + vfree(umem_odp->page_list); + return ret; +} + struct ib_umem_odp *ib_alloc_odp_umem(struct ib_umem_odp *root, unsigned long addr, size_t size) { - struct ib_ucontext_per_mm *per_mm = root->per_mm; - struct ib_ucontext *ctx = per_mm->context; + /* + * Caller must ensure that root cannot be freed during the call to + * ib_alloc_odp_umem. 
+ */ struct ib_umem_odp *odp_data; struct ib_umem *umem; - int pages = size >> PAGE_SHIFT; int ret; - if (!size) - return ERR_PTR(-EINVAL); - odp_data = kzalloc(sizeof(*odp_data), GFP_KERNEL); if (!odp_data) return ERR_PTR(-ENOMEM); umem = &odp_data->umem; - umem->context = ctx; + umem->context = root->umem.context; umem->length = size; umem->address = addr; - odp_data->page_shift = PAGE_SHIFT; umem->writable = root->umem.writable; - umem->is_odp = 1; - odp_data->per_mm = per_mm; - umem->owning_mm = per_mm->mm; - mmgrab(umem->owning_mm); - - mutex_init(&odp_data->umem_mutex); - init_completion(&odp_data->notifier_completion); - - odp_data->page_list = - vzalloc(array_size(pages, sizeof(*odp_data->page_list))); - if (!odp_data->page_list) { - ret = -ENOMEM; - goto out_odp_data; - } + umem->owning_mm = root->umem.owning_mm; + odp_data->page_shift = PAGE_SHIFT; - odp_data->dma_list = - vzalloc(array_size(pages, sizeof(*odp_data->dma_list))); - if (!odp_data->dma_list) { - ret = -ENOMEM; - goto out_page_list; + ret = ib_init_umem_odp(odp_data, root->per_mm); + if (ret) { + kfree(odp_data); + return ERR_PTR(ret); } - /* - * Caller must ensure that the umem_odp that the per_mm came from - * cannot be freed during the call to ib_alloc_odp_umem. 
- */ - mutex_lock(&ctx->per_mm_list_lock); - per_mm->odp_mrs_count++; - mutex_unlock(&ctx->per_mm_list_lock); - add_umem_to_per_mm(odp_data); + mmgrab(umem->owning_mm); return odp_data; - -out_page_list: - vfree(odp_data->page_list); -out_odp_data: - mmdrop(umem->owning_mm); - kfree(odp_data); - return ERR_PTR(ret); } EXPORT_SYMBOL(ib_alloc_odp_umem); int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access) { - struct ib_umem *umem = &umem_odp->umem; /* * NOTE: This must called in a process context where umem->owning_mm * == current->mm */ - struct mm_struct *mm = umem->owning_mm; - int ret_val; + struct mm_struct *mm = umem_odp->umem.owning_mm; if (umem_odp->umem.address == 0 && umem_odp->umem.length == 0) umem_odp->is_implicit_odp = 1; @@ -404,43 +412,7 @@ int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access) up_read(&mm->mmap_sem); } - mutex_init(&umem_odp->umem_mutex); - - init_completion(&umem_odp->notifier_completion); - - if (!umem_odp->is_implicit_odp) { - if (!ib_umem_odp_num_pages(umem_odp)) - return -EINVAL; - - umem_odp->page_list = - vzalloc(array_size(sizeof(*umem_odp->page_list), - ib_umem_odp_num_pages(umem_odp))); - if (!umem_odp->page_list) - return -ENOMEM; - - umem_odp->dma_list = - vzalloc(array_size(sizeof(*umem_odp->dma_list), - ib_umem_odp_num_pages(umem_odp))); - if (!umem_odp->dma_list) { - ret_val = -ENOMEM; - goto out_page_list; - } - } - - ret_val = get_per_mm(umem_odp); - if (ret_val) - goto out_dma_list; - - if (!umem_odp->is_implicit_odp) - add_umem_to_per_mm(umem_odp); - - return 0; - -out_dma_list: - vfree(umem_odp->dma_list); -out_page_list: - vfree(umem_odp->page_list); - return ret_val; + return ib_init_umem_odp(umem_odp, NULL); } void ib_umem_odp_release(struct ib_umem_odp *umem_odp) From patchwork Mon Aug 19 11:17:03 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 11100677 X-Patchwork-Delegate: jgg@ziepe.ca 
From: Leon Romanovsky To: Doug Ledford , Jason Gunthorpe Cc: Leon Romanovsky ,
RDMA mailing list , Guy Levi , Moni Shoua Subject: [PATCH rdma-next 05/12] RDMA/odp: Make the three ways to create a umem_odp clear Date: Mon, 19 Aug 2019 14:17:03 +0300 Message-Id: <20190819111710.18440-6-leon@kernel.org> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190819111710.18440-1-leon@kernel.org> References: <20190819111710.18440-1-leon@kernel.org> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Jason Gunthorpe The three paths to build the umem_odps are kind of muddled, they are: - As a normal ib_mr umem - As a child in an implicit ODP umem tree - As the root of an implicit ODP umem tree Only the first two are actually umem's, the last is an abuse. The implicit case can only be triggered by explicit driver request, it should never be co-mingled with the normal case. While we are here, make sensible function names and add some comments to make this clearer. 
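As a rough illustration of the three creation paths listed above, the sketch below models them in plain userspace C. Every name in it (`struct odp_model`, `model_get`, `model_alloc_implicit`, `model_alloc_child`) is hypothetical and is not part of the kernel API. It only mirrors two rules stated in this patch: the implicit root carries no VA range, and a child may only be created under an implicit root.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct odp_model {
	bool is_implicit;		/* root of an implicit ODP tree */
	const struct odp_model *root;	/* set only for children */
	unsigned long addr;		/* userspace VA, 0 for an implicit root */
	size_t size;			/* VA length, 0 for an implicit root */
};

/* Path 1: a normal MR umem that covers an explicit VA range. */
static int model_get(struct odp_model *m, unsigned long addr, size_t size)
{
	if (!size)
		return -EINVAL;
	*m = (struct odp_model){ .addr = addr, .size = size };
	return 0;
}

/* Path 3: the implicit root has no VA range and no page lists. */
static int model_alloc_implicit(struct odp_model *m)
{
	*m = (struct odp_model){ .is_implicit = true };
	return 0;
}

/* Path 2: a child is only valid underneath an implicit root. */
static int model_alloc_child(struct odp_model *m, const struct odp_model *root,
			     unsigned long addr, size_t size)
{
	if (!root->is_implicit || !size)
		return -EINVAL;
	*m = (struct odp_model){ .root = root, .addr = addr, .size = size };
	return 0;
}
```

Keeping one constructor per path is what lets the child constructor reject a non-implicit parent up front, instead of inferring the implicit case from a zero address and length.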
Signed-off-by: Jason Gunthorpe Signed-off-by: Leon Romanovsky --- drivers/infiniband/core/umem_odp.c | 80 +++++++++++++++++++++++++++--- drivers/infiniband/hw/mlx5/odp.c | 23 ++++----- include/rdma/ib_umem_odp.h | 6 ++- 3 files changed, 89 insertions(+), 20 deletions(-) diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c index 487a6371a053..9b1f779493e9 100644 --- a/drivers/infiniband/core/umem_odp.c +++ b/drivers/infiniband/core/umem_odp.c @@ -46,6 +46,8 @@ #include #include +#include "uverbs.h" + static void ib_umem_notifier_start_account(struct ib_umem_odp *umem_odp) { mutex_lock(&umem_odp->umem_mutex); @@ -351,8 +353,67 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp, return ret; } -struct ib_umem_odp *ib_alloc_odp_umem(struct ib_umem_odp *root, - unsigned long addr, size_t size) +/** + * ib_umem_odp_alloc_implicit - Allocate a parent implicit ODP umem + * + * Implicit ODP umems do not have a VA range and do not have any page lists. + * They exist only to hold the per_mm reference to help the driver create + * children umems. 
+ * + * @udata: udata from the syscall being used to create the umem + * @access: ib_reg_mr access flags + */ +struct ib_umem_odp *ib_umem_odp_alloc_implicit(struct ib_udata *udata, + int access) +{ + struct ib_ucontext *context = + container_of(udata, struct uverbs_attr_bundle, driver_udata) + ->context; + struct ib_umem *umem; + struct ib_umem_odp *umem_odp; + int ret; + + if (access & IB_ACCESS_HUGETLB) + return ERR_PTR(-EINVAL); + + if (!context) + return ERR_PTR(-EIO); + if (WARN_ON_ONCE(!context->invalidate_range)) + return ERR_PTR(-EINVAL); + + umem_odp = kzalloc(sizeof(*umem_odp), GFP_KERNEL); + if (!umem_odp) + return ERR_PTR(-ENOMEM); + umem = &umem_odp->umem; + umem->context = context; + umem->writable = ib_access_writable(access); + umem->owning_mm = current->mm; + umem_odp->is_implicit_odp = 1; + umem_odp->page_shift = PAGE_SHIFT; + + ret = ib_init_umem_odp(umem_odp, NULL); + if (ret) { + kfree(umem_odp); + return ERR_PTR(ret); + } + + mmgrab(umem->owning_mm); + + return umem_odp; +} +EXPORT_SYMBOL(ib_umem_odp_alloc_implicit); + +/** + * ib_umem_odp_alloc_child - Allocate a child ODP umem under an implicit + * parent ODP umem + * + * @root: The parent umem enclosing the child. 
This must be allocated using + * ib_alloc_implicit_odp_umem() + * @addr: The starting userspace VA + * @size: The length of the userspace VA + */ +struct ib_umem_odp *ib_umem_odp_alloc_child(struct ib_umem_odp *root, + unsigned long addr, size_t size) { /* * Caller must ensure that root cannot be freed during the call to @@ -362,6 +423,9 @@ struct ib_umem_odp *ib_alloc_odp_umem(struct ib_umem_odp *root, struct ib_umem *umem; int ret; + if (WARN_ON(!root->is_implicit_odp)) + return ERR_PTR(-EINVAL); + odp_data = kzalloc(sizeof(*odp_data), GFP_KERNEL); if (!odp_data) return ERR_PTR(-ENOMEM); @@ -383,8 +447,15 @@ struct ib_umem_odp *ib_alloc_odp_umem(struct ib_umem_odp *root, return odp_data; } -EXPORT_SYMBOL(ib_alloc_odp_umem); +EXPORT_SYMBOL(ib_umem_odp_alloc_child); +/** + * ib_umem_odp_get - Complete ib_umem_get() + * + * @umem_odp: The partially configured umem from ib_umem_get() + * @addr: The starting userspace VA + * @access: ib_reg_mr access flags + */ int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access) { /* @@ -393,9 +464,6 @@ int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access) */ struct mm_struct *mm = umem_odp->umem.owning_mm; - if (umem_odp->umem.address == 0 && umem_odp->umem.length == 0) - umem_odp->is_implicit_odp = 1; - umem_odp->page_shift = PAGE_SHIFT; if (access & IB_ACCESS_HUGETLB) { struct vm_area_struct *vma; diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index 5b6b2afa26a6..4371fc759c23 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -384,7 +384,7 @@ static void mlx5_ib_page_fault_resume(struct mlx5_ib_dev *dev, } static struct mlx5_ib_mr *implicit_mr_alloc(struct ib_pd *pd, - struct ib_umem *umem, + struct ib_umem_odp *umem_odp, bool ksm, int access_flags) { struct mlx5_ib_dev *dev = to_mdev(pd->device); @@ -402,7 +402,7 @@ static struct mlx5_ib_mr *implicit_mr_alloc(struct ib_pd *pd, mr->dev = dev; mr->access_flags = access_flags; mr->mmkey.iova = 0; 
- mr->umem = umem; + mr->umem = &umem_odp->umem; if (ksm) { err = mlx5_ib_update_xlt(mr, 0, @@ -462,14 +462,13 @@ static struct ib_umem_odp *implicit_mr_get_data(struct mlx5_ib_mr *mr, if (nentries) nentries++; } else { - odp = ib_alloc_odp_umem(odp_mr, addr, - MLX5_IMR_MTT_SIZE); + odp = ib_umem_odp_alloc_child(odp_mr, addr, MLX5_IMR_MTT_SIZE); if (IS_ERR(odp)) { mutex_unlock(&odp_mr->umem_mutex); return ERR_CAST(odp); } - mtt = implicit_mr_alloc(mr->ibmr.pd, &odp->umem, 0, + mtt = implicit_mr_alloc(mr->ibmr.pd, odp, 0, mr->access_flags); if (IS_ERR(mtt)) { mutex_unlock(&odp_mr->umem_mutex); @@ -519,19 +518,19 @@ struct mlx5_ib_mr *mlx5_ib_alloc_implicit_mr(struct mlx5_ib_pd *pd, int access_flags) { struct mlx5_ib_mr *imr; - struct ib_umem *umem; + struct ib_umem_odp *umem_odp; - umem = ib_umem_get(udata, 0, 0, access_flags, 0); - if (IS_ERR(umem)) - return ERR_CAST(umem); + umem_odp = ib_umem_odp_alloc_implicit(udata, access_flags); + if (IS_ERR(umem_odp)) + return ERR_CAST(umem_odp); - imr = implicit_mr_alloc(&pd->ibpd, umem, 1, access_flags); + imr = implicit_mr_alloc(&pd->ibpd, umem_odp, 1, access_flags); if (IS_ERR(imr)) { - ib_umem_release(umem); + ib_umem_release(&umem_odp->umem); return ERR_CAST(imr); } - imr->umem = umem; + imr->umem = &umem_odp->umem; init_waitqueue_head(&imr->q_leaf_free); atomic_set(&imr->num_leaf_free, 0); atomic_set(&imr->num_pending_prefetch, 0); diff --git a/include/rdma/ib_umem_odp.h b/include/rdma/ib_umem_odp.h index 14b38b4459c5..219fe7015e7d 100644 --- a/include/rdma/ib_umem_odp.h +++ b/include/rdma/ib_umem_odp.h @@ -140,8 +140,10 @@ struct ib_ucontext_per_mm { }; int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access); -struct ib_umem_odp *ib_alloc_odp_umem(struct ib_umem_odp *root_umem, - unsigned long addr, size_t size); +struct ib_umem_odp *ib_umem_odp_alloc_implicit(struct ib_udata *udata, + int access); +struct ib_umem_odp *ib_umem_odp_alloc_child(struct ib_umem_odp *root_umem, + unsigned long addr, size_t size); 
void ib_umem_odp_release(struct ib_umem_odp *umem_odp); int ib_umem_odp_map_dma_pages(struct ib_umem_odp *umem_odp, u64 start_offset, From patchwork Mon Aug 19 11:17:04 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 11100669 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3D4DA17E2 for ; Mon, 19 Aug 2019 11:17:37 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2A0631FEBA for ; Mon, 19 Aug 2019 11:17:37 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1DE082623C; Mon, 19 Aug 2019 11:17:37 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 60FB51FEBA for ; Mon, 19 Aug 2019 11:17:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727443AbfHSLRf (ORCPT ); Mon, 19 Aug 2019 07:17:35 -0400 Received: from mail.kernel.org ([198.145.29.99]:33052 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727424AbfHSLRf (ORCPT ); Mon, 19 Aug 2019 07:17:35 -0400 Received: from localhost (unknown [77.137.115.125]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 69EC82085A; Mon, 19 Aug 2019 11:17:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; 
t=1566213454; bh=iCQI4x4JIdcl6S5yE/FVzBg1bu/HtlL84De0JbvijlY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=i4O5AWZBFFVyCopYqqAcpjcVJoNs7Es6kTLyucoNFotAp9Q1VRpA4+MaqLO3RSdnL NdaXMXs2x9mFlNuHqZAP7LT2b4KWIHi+bTios/kKreW/Y99Z9Chimp7tSe8D3NhSEO jNXmCaxR+QYv1kiOxJ9eQDiUXx86s6EX/vjhfz9o= From: Leon Romanovsky To: Doug Ledford , Jason Gunthorpe Cc: Leon Romanovsky , RDMA mailing list , Guy Levi , Moni Shoua Subject: [PATCH rdma-next 06/12] RDMA/odp: Split creating a umem_odp from ib_umem_get Date: Mon, 19 Aug 2019 14:17:04 +0300 Message-Id: <20190819111710.18440-7-leon@kernel.org> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190819111710.18440-1-leon@kernel.org> References: <20190819111710.18440-1-leon@kernel.org> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Jason Gunthorpe This is the last creation API that is overloaded for both, there is very little code sharing and a driver has to be specifically ready for a umem_odp to be created to use the odp version. Signed-off-by: Jason Gunthorpe Signed-off-by: Leon Romanovsky --- drivers/infiniband/core/umem.c | 30 +++---------- drivers/infiniband/core/umem_odp.c | 67 ++++++++++++++++++++++-------- drivers/infiniband/hw/mlx5/mem.c | 13 ------ drivers/infiniband/hw/mlx5/mr.c | 34 +++++++++++---- include/rdma/ib_umem_odp.h | 9 ++-- 5 files changed, 86 insertions(+), 67 deletions(-) diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c index f3bedbb7c4ab..ac7376401965 100644 --- a/drivers/infiniband/core/umem.c +++ b/drivers/infiniband/core/umem.c @@ -184,9 +184,6 @@ EXPORT_SYMBOL(ib_umem_find_best_pgsz); /** * ib_umem_get - Pin and DMA map userspace memory. * - * If access flags indicate ODP memory, avoid pinning. Instead, stores - * the mm for future page fault handling in conjunction with MMU notifiers. 
- * * @udata: userspace context to pin memory for * @addr: userspace virtual address to start at * @size: length of region to pin @@ -231,17 +228,12 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr, if (!can_do_mlock()) return ERR_PTR(-EPERM); - if (access & IB_ACCESS_ON_DEMAND) { - umem = kzalloc(sizeof(struct ib_umem_odp), GFP_KERNEL); - if (!umem) - return ERR_PTR(-ENOMEM); - umem->is_odp = 1; - } else { - umem = kzalloc(sizeof(*umem), GFP_KERNEL); - if (!umem) - return ERR_PTR(-ENOMEM); - } + if (access & IB_ACCESS_ON_DEMAND) + return ERR_PTR(-EOPNOTSUPP); + umem = kzalloc(sizeof(*umem), GFP_KERNEL); + if (!umem) + return ERR_PTR(-ENOMEM); umem->context = context; umem->length = size; umem->address = addr; @@ -249,18 +241,6 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr, umem->owning_mm = mm = current->mm; mmgrab(mm); - if (access & IB_ACCESS_ON_DEMAND) { - if (WARN_ON_ONCE(!context->invalidate_range)) { - ret = -EINVAL; - goto umem_kfree; - } - - ret = ib_umem_odp_get(to_ib_umem_odp(umem), access); - if (ret) - goto umem_kfree; - return umem; - } - page_list = (struct page **) __get_free_page(GFP_KERNEL); if (!page_list) { ret = -ENOMEM; diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c index 9b1f779493e9..79995766316a 100644 --- a/drivers/infiniband/core/umem_odp.c +++ b/drivers/infiniband/core/umem_odp.c @@ -342,6 +342,7 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp, &per_mm->umem_tree); up_write(&per_mm->umem_rwsem); } + mmgrab(umem_odp->umem.owning_mm); return 0; @@ -396,9 +397,6 @@ struct ib_umem_odp *ib_umem_odp_alloc_implicit(struct ib_udata *udata, kfree(umem_odp); return ERR_PTR(ret); } - - mmgrab(umem->owning_mm); - return umem_odp; } EXPORT_SYMBOL(ib_umem_odp_alloc_implicit); @@ -442,27 +440,51 @@ struct ib_umem_odp *ib_umem_odp_alloc_child(struct ib_umem_odp *root, kfree(odp_data); return ERR_PTR(ret); } - - mmgrab(umem->owning_mm); - return 
odp_data; } EXPORT_SYMBOL(ib_umem_odp_alloc_child); /** - * ib_umem_odp_get - Complete ib_umem_get() + * ib_umem_odp_get - Create a umem_odp for a userspace va * - * @umem_odp: The partially configured umem from ib_umem_get() - * @addr: The starting userspace VA - * @access: ib_reg_mr access flags + * @udata: userspace context to pin memory for + * @addr: userspace virtual address to start at + * @size: length of region to pin + * @access: IB_ACCESS_xxx flags for memory being pinned + * + * The driver should use when the access flags indicate ODP memory. It avoids + * pinning, instead, stores the mm for future page fault handling in + * conjunction with MMU notifiers. */ -int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access) +struct ib_umem_odp *ib_umem_odp_get(struct ib_udata *udata, unsigned long addr, + size_t size, int access) { - /* - * NOTE: This must called in a process context where umem->owning_mm - * == current->mm - */ - struct mm_struct *mm = umem_odp->umem.owning_mm; + struct ib_umem_odp *umem_odp; + struct ib_ucontext *context; + struct mm_struct *mm; + int ret; + + if (!udata) + return ERR_PTR(-EIO); + + context = container_of(udata, struct uverbs_attr_bundle, driver_udata) + ->context; + if (!context) + return ERR_PTR(-EIO); + + if (WARN_ON_ONCE(!(access & IB_ACCESS_ON_DEMAND)) || + WARN_ON_ONCE(!context->invalidate_range)) + return ERR_PTR(-EINVAL); + + umem_odp = kzalloc(sizeof(struct ib_umem_odp), GFP_KERNEL); + if (!umem_odp) + return ERR_PTR(-ENOMEM); + + umem_odp->umem.context = context; + umem_odp->umem.length = size; + umem_odp->umem.address = addr; + umem_odp->umem.writable = ib_access_writable(access); + umem_odp->umem.owning_mm = mm = current->mm; umem_odp->page_shift = PAGE_SHIFT; if (access & IB_ACCESS_HUGETLB) { @@ -473,15 +495,24 @@ int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access) vma = find_vma(mm, ib_umem_start(umem_odp)); if (!vma || !is_vm_hugetlb_page(vma)) { up_read(&mm->mmap_sem); - return -EINVAL; + ret = 
-EINVAL; + goto err_free; } h = hstate_vma(vma); umem_odp->page_shift = huge_page_shift(h); up_read(&mm->mmap_sem); } - return ib_init_umem_odp(umem_odp, NULL); + ret = ib_init_umem_odp(umem_odp, NULL); + if (ret) + goto err_free; + return umem_odp; + +err_free: + kfree(umem_odp); + return ERR_PTR(ret); } +EXPORT_SYMBOL(ib_umem_odp_get); void ib_umem_odp_release(struct ib_umem_odp *umem_odp) { diff --git a/drivers/infiniband/hw/mlx5/mem.c b/drivers/infiniband/hw/mlx5/mem.c index a40e0abf2338..b5aece786b36 100644 --- a/drivers/infiniband/hw/mlx5/mem.c +++ b/drivers/infiniband/hw/mlx5/mem.c @@ -56,19 +56,6 @@ void mlx5_ib_cont_pages(struct ib_umem *umem, u64 addr, struct scatterlist *sg; int entry; - if (umem->is_odp) { - struct ib_umem_odp *odp = to_ib_umem_odp(umem); - unsigned int page_shift = odp->page_shift; - - *ncont = ib_umem_odp_num_pages(odp); - *count = *ncont << (page_shift - PAGE_SHIFT); - *shift = page_shift; - if (order) - *order = ilog2(roundup_pow_of_two(*ncont)); - - return; - } - addr = addr >> PAGE_SHIFT; tmp = (unsigned long)addr; m = find_first_bit(&tmp, BITS_PER_LONG); diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c index e0015b612ffd..c9690d3cfb5c 100644 --- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -794,19 +794,37 @@ static int mr_umem_get(struct mlx5_ib_dev *dev, struct ib_udata *udata, int *ncont, int *order) { struct ib_umem *u; - int err; *umem = NULL; - u = ib_umem_get(udata, start, length, access_flags, 0); - err = PTR_ERR_OR_ZERO(u); - if (err) { - mlx5_ib_dbg(dev, "umem get failed (%d)\n", err); - return err; + if (access_flags & IB_ACCESS_ON_DEMAND) { + struct ib_umem_odp *odp; + + odp = ib_umem_odp_get(udata, start, length, access_flags); + if (IS_ERR(odp)) { + mlx5_ib_dbg(dev, "umem get failed (%ld)\n", + PTR_ERR(odp)); + return PTR_ERR(odp); + } + + u = &odp->umem; + + *page_shift = odp->page_shift; + *ncont = ib_umem_odp_num_pages(odp); + *npages = *ncont << 
(*page_shift - PAGE_SHIFT); + if (order) + *order = ilog2(roundup_pow_of_two(*ncont)); + } else { + u = ib_umem_get(udata, start, length, access_flags, 0); + if (IS_ERR(u)) { + mlx5_ib_dbg(dev, "umem get failed (%ld)\n", PTR_ERR(u)); + return PTR_ERR(u); + } + + mlx5_ib_cont_pages(u, start, MLX5_MKEY_PAGE_SHIFT_MASK, npages, + page_shift, ncont, order); } - mlx5_ib_cont_pages(u, start, MLX5_MKEY_PAGE_SHIFT_MASK, npages, - page_shift, ncont, order); if (!*npages) { mlx5_ib_warn(dev, "avoid zero region\n"); ib_umem_release(u); diff --git a/include/rdma/ib_umem_odp.h b/include/rdma/ib_umem_odp.h index 219fe7015e7d..5efb67f97b0a 100644 --- a/include/rdma/ib_umem_odp.h +++ b/include/rdma/ib_umem_odp.h @@ -139,7 +139,8 @@ struct ib_ucontext_per_mm { struct rcu_head rcu; }; -int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access); +struct ib_umem_odp *ib_umem_odp_get(struct ib_udata *udata, unsigned long addr, + size_t size, int access); struct ib_umem_odp *ib_umem_odp_alloc_implicit(struct ib_udata *udata, int access); struct ib_umem_odp *ib_umem_odp_alloc_child(struct ib_umem_odp *root_umem, @@ -199,9 +200,11 @@ static inline int ib_umem_mmu_notifier_retry(struct ib_umem_odp *umem_odp, #else /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */ -static inline int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access) +static inline struct ib_umem_odp *ib_umem_odp_get(struct ib_udata *udata, + unsigned long addr, + size_t size, int access) { - return -EINVAL; + return ERR_PTR(-EINVAL); } static inline void ib_umem_odp_release(struct ib_umem_odp *umem_odp) {} From patchwork Mon Aug 19 11:17:05 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 11100671 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A91BB1398 for ; Mon, 
19 Aug 2019 11:17:39 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 98DA41FEBA for ; Mon, 19 Aug 2019 11:17:39 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8CC462623C; Mon, 19 Aug 2019 11:17:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 207651FEBA for ; Mon, 19 Aug 2019 11:17:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727111AbfHSLRi (ORCPT ); Mon, 19 Aug 2019 07:17:38 -0400 Received: from mail.kernel.org ([198.145.29.99]:33078 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727424AbfHSLRi (ORCPT ); Mon, 19 Aug 2019 07:17:38 -0400 Received: from localhost (unknown [77.137.115.125]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 075D92086C; Mon, 19 Aug 2019 11:17:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1566213457; bh=lUsjta7Efh5fZbdhlHOVWXZIQxyhiQZTzrFYfpVf+B4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=DWqUheM7pTOEs3qIpGD4WCIViMipe6B8lBfRZoxj73oEaR+Eii68uqMznNCDumiTY Cpv8V2svhph721csyYxcIR0S0vPt/BP9vgOZSG8uUbpIN8JuVTF5Jd5gopLhR99JDd XUK87zCn7dKtTpZ0zSt27Isoo/wzE20TjTuYne0Y= From: Leon Romanovsky To: Doug Ledford , Jason Gunthorpe Cc: Leon Romanovsky , RDMA mailing list , Guy Levi , Moni Shoua Subject: [PATCH rdma-next 07/12] RDMA/odp: Provide ib_umem_odp_release() to undo the allocs Date: Mon, 19 Aug 2019 14:17:05 +0300 Message-Id: 
<20190819111710.18440-8-leon@kernel.org> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190819111710.18440-1-leon@kernel.org> References: <20190819111710.18440-1-leon@kernel.org> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Jason Gunthorpe Now that there are allocator APIs that return the ib_umem_odp directly it should be freed through a umem_odp free'er as well. Signed-off-by: Jason Gunthorpe Signed-off-by: Leon Romanovsky --- drivers/infiniband/core/umem.c | 20 ++++---------------- drivers/infiniband/core/umem_odp.c | 3 +++ drivers/infiniband/hw/mlx5/mr.c | 2 +- drivers/infiniband/hw/mlx5/odp.c | 6 +++--- 4 files changed, 11 insertions(+), 20 deletions(-) diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c index ac7376401965..312289f84987 100644 --- a/drivers/infiniband/core/umem.c +++ b/drivers/infiniband/core/umem.c @@ -326,15 +326,6 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr, } EXPORT_SYMBOL(ib_umem_get); -static void __ib_umem_release_tail(struct ib_umem *umem) -{ - mmdrop(umem->owning_mm); - if (umem->is_odp) - kfree(to_ib_umem_odp(umem)); - else - kfree(umem); -} - /** * ib_umem_release - release memory pinned with ib_umem_get * @umem: umem struct to release @@ -343,17 +334,14 @@ void ib_umem_release(struct ib_umem *umem) { if (!umem) return; - - if (umem->is_odp) { - ib_umem_odp_release(to_ib_umem_odp(umem)); - __ib_umem_release_tail(umem); - return; - } + if (umem->is_odp) + return ib_umem_odp_release(to_ib_umem_odp(umem)); __ib_umem_release(umem->context->device, umem, 1); atomic64_sub(ib_umem_num_pages(umem), &umem->owning_mm->pinned_vm); - __ib_umem_release_tail(umem); + mmdrop(umem->owning_mm); + kfree(umem); } EXPORT_SYMBOL(ib_umem_release); diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c index 79995766316a..2575dd783196 100644 --- 
a/drivers/infiniband/core/umem_odp.c +++ b/drivers/infiniband/core/umem_odp.c @@ -530,7 +530,10 @@ void ib_umem_odp_release(struct ib_umem_odp *umem_odp) vfree(umem_odp->page_list); } put_per_mm(umem_odp); + mmdrop(umem_odp->umem.owning_mm); + kfree(umem_odp); } +EXPORT_SYMBOL(ib_umem_odp_release); /* * Map for DMA and insert a single page into the on-demand paging page tables. diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c index c9690d3cfb5c..aa0299662c05 100644 --- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -1638,7 +1638,7 @@ static void dereg_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr) * so that there will not be any invalidations in * flight, looking at the *mr struct. */ - ib_umem_release(umem); + ib_umem_odp_release(umem_odp); atomic_sub(npages, &dev->mdev->priv.reg_pages); /* Avoid double-freeing the umem. */ diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index 4371fc759c23..ad5d5f2c8509 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -206,7 +206,7 @@ static void mr_leaf_free_action(struct work_struct *work) mr->parent = NULL; synchronize_srcu(&mr->dev->mr_srcu); - ib_umem_release(&odp->umem); + ib_umem_odp_release(odp); if (imr->live) mlx5_ib_update_xlt(imr, idx, 1, 0, MLX5_IB_UPD_XLT_INDIRECT | @@ -472,7 +472,7 @@ static struct ib_umem_odp *implicit_mr_get_data(struct mlx5_ib_mr *mr, mr->access_flags); if (IS_ERR(mtt)) { mutex_unlock(&odp_mr->umem_mutex); - ib_umem_release(&odp->umem); + ib_umem_odp_release(odp); return ERR_CAST(mtt); } @@ -526,7 +526,7 @@ struct mlx5_ib_mr *mlx5_ib_alloc_implicit_mr(struct mlx5_ib_pd *pd, imr = implicit_mr_alloc(&pd->ibpd, umem_odp, 1, access_flags); if (IS_ERR(imr)) { - ib_umem_release(&umem_odp->umem); + ib_umem_odp_release(umem_odp); return ERR_CAST(imr); } From patchwork Mon Aug 19 11:17:06 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 11100673 X-Patchwork-Delegate: jgg@ziepe.ca
From: Leon Romanovsky To: Doug Ledford , Jason Gunthorpe Cc: Leon Romanovsky , RDMA mailing list , Guy Levi , Moni Shoua Subject: [PATCH rdma-next 08/12] RDMA/odp: Check for overflow when computing the umem_odp end Date: Mon, 19 Aug 2019 14:17:06 +0300 Message-Id: <20190819111710.18440-9-leon@kernel.org> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190819111710.18440-1-leon@kernel.org> References: <20190819111710.18440-1-leon@kernel.org> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Jason Gunthorpe Since the page size can be extended in the ODP case by IB_ACCESS_HUGETLB the existing overflow checks done by ib_umem_get() are not sufficient. Check for overflow again. Further, remove the unchecked math from the inlines and just use the precomputed value stored in the interval_tree_node.
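The overflow handling described above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct odp_range` and `odp_range_init()` are hypothetical names, and `__builtin_add_overflow` (a GCC/Clang builtin) stands in for the kernel's `check_add_overflow()`. The point it tries to show is that both the `address + length` sum and the subsequent round-up to the (possibly huge) page size can wrap, so each needs its own check before the inclusive `last` value is stored.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace mirror of the range math; not the kernel code. */
struct odp_range {
	unsigned long start;	/* first byte, aligned down to the page size */
	unsigned long last;	/* last byte covered, inclusive */
	size_t pages;		/* number of pages of size 1 << page_shift */
};

static int odp_range_init(struct odp_range *r, unsigned long addr,
			  unsigned long length, unsigned int page_shift)
{
	unsigned long page_size = 1UL << page_shift;
	unsigned long end;

	r->start = addr & ~(page_size - 1);

	/* addr + length itself must not wrap around. */
	if (__builtin_add_overflow(addr, length, &end))
		return -EOVERFLOW;

	/* Rounding up to the page size can wrap to 0; catch that too. */
	end = (end + page_size - 1) & ~(page_size - 1);
	if (end < page_size)
		return -EOVERFLOW;

	r->pages = (end - r->start) >> page_shift;
	if (!r->pages)
		return -EINVAL;

	/* The interval tree treats the end point as part of the interval. */
	r->last = end - 1;
	return 0;
}
```

With the inclusive `last` stored once, helpers in the style of `ib_umem_start()` and `ib_umem_end()` can return `start` and `last + 1` without repeating any unchecked arithmetic.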
Signed-off-by: Jason Gunthorpe Signed-off-by: Leon Romanovsky --- drivers/infiniband/core/umem_odp.c | 25 +++++++++++++++++++------ include/rdma/ib_umem_odp.h | 5 ++--- 2 files changed, 21 insertions(+), 9 deletions(-) diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c index 2575dd783196..46ae9962fae3 100644 --- a/drivers/infiniband/core/umem_odp.c +++ b/drivers/infiniband/core/umem_odp.c @@ -294,19 +294,32 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp, umem_odp->umem.is_odp = 1; if (!umem_odp->is_implicit_odp) { - size_t pages = ib_umem_odp_num_pages(umem_odp); - + size_t page_size = 1UL << umem_odp->page_shift; + size_t pages; + + umem_odp->interval_tree.start = + ALIGN_DOWN(umem_odp->umem.address, page_size); + if (check_add_overflow(umem_odp->umem.address, + umem_odp->umem.length, + &umem_odp->interval_tree.last)) + return -EOVERFLOW; + umem_odp->interval_tree.last = + ALIGN(umem_odp->interval_tree.last, page_size); + if (unlikely(umem_odp->interval_tree.last < page_size)) + return -EOVERFLOW; + + pages = (umem_odp->interval_tree.last - + umem_odp->interval_tree.start) >> + umem_odp->page_shift; if (!pages) return -EINVAL; /* * Note that the representation of the intervals in the * interval tree considers the ending point as contained in - * the interval, while the function ib_umem_end returns the - * first address which is not contained in the umem. + * the interval. */ - umem_odp->interval_tree.start = ib_umem_start(umem_odp); - umem_odp->interval_tree.last = ib_umem_end(umem_odp) - 1; + umem_odp->interval_tree.last--; umem_odp->page_list = vzalloc( array_size(sizeof(*umem_odp->page_list), pages)); diff --git a/include/rdma/ib_umem_odp.h b/include/rdma/ib_umem_odp.h index 5efb67f97b0a..b37c674b7fe6 100644 --- a/include/rdma/ib_umem_odp.h +++ b/include/rdma/ib_umem_odp.h @@ -91,14 +91,13 @@ static inline struct ib_umem_odp *to_ib_umem_odp(struct ib_umem *umem) /* Returns the first page of an ODP umem. 
*/ static inline unsigned long ib_umem_start(struct ib_umem_odp *umem_odp) { - return ALIGN_DOWN(umem_odp->umem.address, 1UL << umem_odp->page_shift); + return umem_odp->interval_tree.start; } /* Returns the address of the page after the last one of an ODP umem. */ static inline unsigned long ib_umem_end(struct ib_umem_odp *umem_odp) { - return ALIGN(umem_odp->umem.address + umem_odp->umem.length, - 1UL << umem_odp->page_shift); + return umem_odp->interval_tree.last + 1; } static inline size_t ib_umem_odp_num_pages(struct ib_umem_odp *umem_odp) From patchwork Mon Aug 19 11:17:07 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 11100675 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E32811864 for ; Mon, 19 Aug 2019 11:17:46 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D16631FEBA for ; Mon, 19 Aug 2019 11:17:46 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C60562623C; Mon, 19 Aug 2019 11:17:46 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 755B91FEBA for ; Mon, 19 Aug 2019 11:17:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727283AbfHSLRp (ORCPT ); Mon, 19 Aug 2019 07:17:45 -0400 Received: from mail.kernel.org ([198.145.29.99]:33130 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by 
vger.kernel.org with ESMTP id S1726776AbfHSLRp (ORCPT ); Mon, 19 Aug 2019 07:17:45 -0400 Received: from localhost (unknown [77.137.115.125]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 2389E20989; Mon, 19 Aug 2019 11:17:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1566213464; bh=jnhFEIv0WTsidHF5RwVsbhsevggyj9R2LaHndN85PkQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=K7Pk1XCbR2g7prH/a/YutATkTQZEkMIFCygzcBSZv1K2gbFNka0LwIjKP2Zj8m+r5 QuSnp3aRdR3GZ6AD72DctExQsgnZHvpE8ayE05zlEQYJfg4FzX/39BI5EX8A0hrZgY PE/W4LyrbvT8Ju3aXfkssJQrwOTqF8pAHfhWdjeA= From: Leon Romanovsky To: Doug Ledford , Jason Gunthorpe Cc: Leon Romanovsky , RDMA mailing list , Guy Levi , Moni Shoua Subject: [PATCH rdma-next 09/12] RDMA/odp: Use kvcalloc for the dma_list and page_list Date: Mon, 19 Aug 2019 14:17:07 +0300 Message-Id: <20190819111710.18440-10-leon@kernel.org> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190819111710.18440-1-leon@kernel.org> References: <20190819111710.18440-1-leon@kernel.org> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Jason Gunthorpe There is no specific need for these to be in the vmalloc space, so let the system decide automatically how to do the allocation.
Signed-off-by: Jason Gunthorpe Signed-off-by: Leon Romanovsky --- drivers/infiniband/core/umem_odp.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c index 46ae9962fae3..f1b298575b4c 100644 --- a/drivers/infiniband/core/umem_odp.c +++ b/drivers/infiniband/core/umem_odp.c @@ -321,13 +321,13 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp, */ umem_odp->interval_tree.last--; - umem_odp->page_list = vzalloc( - array_size(sizeof(*umem_odp->page_list), pages)); + umem_odp->page_list = kvcalloc( + pages, sizeof(*umem_odp->page_list), GFP_KERNEL); if (!umem_odp->page_list) return -ENOMEM; - umem_odp->dma_list = - vzalloc(array_size(sizeof(*umem_odp->dma_list), pages)); + umem_odp->dma_list = kvcalloc( + pages, sizeof(*umem_odp->dma_list), GFP_KERNEL); if (!umem_odp->dma_list) { ret = -ENOMEM; goto out_page_list; @@ -361,9 +361,9 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp, out_unlock: mutex_unlock(&ctx->per_mm_list_lock); - vfree(umem_odp->dma_list); + kvfree(umem_odp->dma_list); out_page_list: - vfree(umem_odp->page_list); + kvfree(umem_odp->page_list); return ret; } @@ -539,8 +539,8 @@ void ib_umem_odp_release(struct ib_umem_odp *umem_odp) ib_umem_odp_unmap_dma_pages(umem_odp, ib_umem_start(umem_odp), ib_umem_end(umem_odp)); remove_umem_from_per_mm(umem_odp); - vfree(umem_odp->dma_list); - vfree(umem_odp->page_list); + kvfree(umem_odp->dma_list); + kvfree(umem_odp->page_list); } put_per_mm(umem_odp); mmdrop(umem_odp->umem.owning_mm); From patchwork Mon Aug 19 11:17:08 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 11100683 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with 
ESMTP id 122A914DE for ; Mon, 19 Aug 2019 11:18:01 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 004F91FEBA for ; Mon, 19 Aug 2019 11:18:01 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E7DAC26242; Mon, 19 Aug 2019 11:18:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 54EE91FEBA for ; Mon, 19 Aug 2019 11:18:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727215AbfHSLR7 (ORCPT ); Mon, 19 Aug 2019 07:17:59 -0400 Received: from mail.kernel.org ([198.145.29.99]:33238 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726776AbfHSLR7 (ORCPT ); Mon, 19 Aug 2019 07:17:59 -0400 Received: from localhost (unknown [77.137.115.125]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 568F22087B; Mon, 19 Aug 2019 11:17:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1566213479; bh=kfH6SGTaSxNfI2d4IeDbLimhCVud6dopTWhxTGTvFkQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=LENo9HGoTDVSpo+NgMc7AKUO/fe+5MgOu+lC/nfcLwQgiJ3WMjmz0Q1R0G/T3VHEV p9ZGvmBRVGmA+rQZyX6DViJqBhdEjqTS/p33JSKoh0OLgD6FD3PY5cHTfDWmUe2oiz x/XpAPRo0vqrSbaDuZXl+GoqeMS/7t6oto47tvnY= From: Leon Romanovsky To: Doug Ledford , Jason Gunthorpe Cc: Leon Romanovsky , RDMA mailing list , Guy Levi , Moni Shoua Subject: [PATCH rdma-next 10/12] RDMA/core: Make invalidate_range a device operation Date: Mon, 19 Aug 2019 14:17:08 +0300 
Message-Id: <20190819111710.18440-11-leon@kernel.org> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190819111710.18440-1-leon@kernel.org> References: <20190819111710.18440-1-leon@kernel.org> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Moni Shoua The callback function 'invalidate_range' is implemented in a driver so the place for it is in the ib_device_ops structure and not in ib_ucontext. Signed-off-by: Moni Shoua Reviewed-by: Guy Levi Reviewed-by: Jason Gunthorpe Signed-off-by: Jason Gunthorpe Signed-off-by: Leon Romanovsky --- drivers/infiniband/core/device.c | 1 + drivers/infiniband/core/umem_odp.c | 10 +++++----- drivers/infiniband/core/uverbs_cmd.c | 2 -- drivers/infiniband/hw/mlx5/main.c | 4 ---- drivers/infiniband/hw/mlx5/odp.c | 1 + include/rdma/ib_verbs.h | 4 ++-- 6 files changed, 9 insertions(+), 13 deletions(-) diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c index 8892862fb759..6e284963741e 100644 --- a/drivers/infiniband/core/device.c +++ b/drivers/infiniband/core/device.c @@ -2582,6 +2582,7 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops) SET_DEVICE_OP(dev_ops, get_vf_config); SET_DEVICE_OP(dev_ops, get_vf_stats); SET_DEVICE_OP(dev_ops, init_port); + SET_DEVICE_OP(dev_ops, invalidate_range); SET_DEVICE_OP(dev_ops, iw_accept); SET_DEVICE_OP(dev_ops, iw_add_ref); SET_DEVICE_OP(dev_ops, iw_connect); diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c index f1b298575b4c..09c0c585b2e7 100644 --- a/drivers/infiniband/core/umem_odp.c +++ b/drivers/infiniband/core/umem_odp.c @@ -103,7 +103,7 @@ static void ib_umem_notifier_release(struct mmu_notifier *mn, */ smp_wmb(); complete_all(&umem_odp->notifier_completion); - umem_odp->umem.context->invalidate_range( + umem_odp->umem.context->device->ops.invalidate_range( umem_odp, 
ib_umem_start(umem_odp), ib_umem_end(umem_odp)); } @@ -116,7 +116,7 @@ static int invalidate_range_start_trampoline(struct ib_umem_odp *item, u64 start, u64 end, void *cookie) { ib_umem_notifier_start_account(item); - item->umem.context->invalidate_range(item, start, end); + item->umem.context->device->ops.invalidate_range(item, start, end); return 0; } @@ -392,7 +392,7 @@ struct ib_umem_odp *ib_umem_odp_alloc_implicit(struct ib_udata *udata, if (!context) return ERR_PTR(-EIO); - if (WARN_ON_ONCE(!context->invalidate_range)) + if (WARN_ON_ONCE(!context->device->ops.invalidate_range)) return ERR_PTR(-EINVAL); umem_odp = kzalloc(sizeof(*umem_odp), GFP_KERNEL); @@ -486,7 +486,7 @@ struct ib_umem_odp *ib_umem_odp_get(struct ib_udata *udata, unsigned long addr, return ERR_PTR(-EIO); if (WARN_ON_ONCE(!(access & IB_ACCESS_ON_DEMAND)) || - WARN_ON_ONCE(!context->invalidate_range)) + WARN_ON_ONCE(!context->device->ops.invalidate_range)) return ERR_PTR(-EINVAL); umem_odp = kzalloc(sizeof(struct ib_umem_odp), GFP_KERNEL); @@ -614,7 +614,7 @@ static int ib_umem_odp_map_dma_single_page( if (remove_existing_mapping) { ib_umem_notifier_start_account(umem_odp); - context->invalidate_range( + dev->ops.invalidate_range( umem_odp, ib_umem_start(umem_odp) + (page_index << umem_odp->page_shift), diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c index 7ddd0e5bc6b3..8f4fd4fac159 100644 --- a/drivers/infiniband/core/uverbs_cmd.c +++ b/drivers/infiniband/core/uverbs_cmd.c @@ -275,8 +275,6 @@ static int ib_uverbs_get_context(struct uverbs_attr_bundle *attrs) ret = ib_dev->ops.alloc_ucontext(ucontext, &attrs->driver_udata); if (ret) goto err_file; - if (!(ib_dev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING)) - ucontext->invalidate_range = NULL; rdma_restrack_uadd(&ucontext->res); diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 98e566acb746..08020affdc17 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ 
b/drivers/infiniband/hw/mlx5/main.c @@ -1867,10 +1867,6 @@ static int mlx5_ib_alloc_ucontext(struct ib_ucontext *uctx, if (err) goto out_sys_pages; - if (ibdev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING) - context->ibucontext.invalidate_range = - &mlx5_ib_invalidate_range; - if (req.flags & MLX5_IB_ALLOC_UCTX_DEVX) { err = mlx5_ib_devx_create(dev, true); if (err < 0) diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index ad5d5f2c8509..c755c76729bc 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -1594,6 +1594,7 @@ void mlx5_odp_init_mr_cache_entry(struct mlx5_cache_ent *ent) static const struct ib_device_ops mlx5_ib_dev_odp_ops = { .advise_mr = mlx5_ib_advise_mr, + .invalidate_range = mlx5_ib_invalidate_range, }; int mlx5_ib_odp_init_one(struct mlx5_ib_dev *dev) diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 391499008a22..18a34888bbca 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -1469,8 +1469,6 @@ struct ib_ucontext { bool cleanup_retryable; - void (*invalidate_range)(struct ib_umem_odp *umem_odp, - unsigned long start, unsigned long end); struct mutex per_mm_list_lock; struct list_head per_mm_list; @@ -2430,6 +2428,8 @@ struct ib_device_ops { u64 iova); int (*unmap_fmr)(struct list_head *fmr_list); int (*dealloc_fmr)(struct ib_fmr *fmr); + void (*invalidate_range)(struct ib_umem_odp *umem_odp, + unsigned long start, unsigned long end); int (*attach_mcast)(struct ib_qp *qp, union ib_gid *gid, u16 lid); int (*detach_mcast)(struct ib_qp *qp, union ib_gid *gid, u16 lid); struct ib_xrcd *(*alloc_xrcd)(struct ib_device *device, From patchwork Mon Aug 19 11:17:09 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 11100679 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org 
[172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E7FAB112C for ; Mon, 19 Aug 2019 11:17:53 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D71CC1FEBA for ; Mon, 19 Aug 2019 11:17:53 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CBB832623C; Mon, 19 Aug 2019 11:17:53 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 78E561FEBA for ; Mon, 19 Aug 2019 11:17:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726852AbfHSLRx (ORCPT ); Mon, 19 Aug 2019 07:17:53 -0400 Received: from mail.kernel.org ([198.145.29.99]:33182 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727339AbfHSLRw (ORCPT ); Mon, 19 Aug 2019 07:17:52 -0400 Received: from localhost (unknown [77.137.115.125]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 412732085A; Mon, 19 Aug 2019 11:17:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1566213471; bh=RqqwixJFdLXzbyX/SBdbIcbTYnHS72+OZtDom+QJt4U=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=MzjohZYlR06ewAUQXiTTt/uL0uZR0rSSP5EG+BbNeQjvINBBUfwPd6aK2YNsbEcL1 t3phfFlP/7iTnRhMgivfDv+axI0bhoIys+Va1CL0DOISu/9GkVeL9o7t2emaRxcp6a GqUa0M2QK/BUBK9PXZZWJoksN1DqM3Eb86NU89Ew= From: Leon Romanovsky To: Doug Ledford , Jason Gunthorpe Cc: Leon Romanovsky , RDMA mailing list , Guy Levi , Moni Shoua Subject: [PATCH rdma-next 11/12] RDMA/mlx5: Use 
ib_umem_start instead of umem.address Date: Mon, 19 Aug 2019 14:17:09 +0300 Message-Id: <20190819111710.18440-12-leon@kernel.org> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190819111710.18440-1-leon@kernel.org> References: <20190819111710.18440-1-leon@kernel.org> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Jason Gunthorpe These are subtly different, the address is the original VA requested during umem_get, while ib_umem_start() is the version that is rounded to the proper page size, ie is the true start of the umem's dma map. Signed-off-by: Jason Gunthorpe Signed-off-by: Leon Romanovsky --- drivers/infiniband/hw/mlx5/odp.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index c755c76729bc..70e0a3555f11 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -184,7 +184,7 @@ void mlx5_odp_populate_klm(struct mlx5_klm *pklm, size_t offset, for (i = 0; i < nentries; i++, pklm++) { pklm->bcount = cpu_to_be32(MLX5_IMR_MTT_SIZE); va = (offset + i) * MLX5_IMR_MTT_SIZE; - if (odp && odp->umem.address == va) { + if (odp && ib_umem_start(odp) == va) { struct mlx5_ib_mr *mtt = odp->private; pklm->key = cpu_to_be32(mtt->ibmr.lkey); @@ -494,7 +494,7 @@ static struct ib_umem_odp *implicit_mr_get_data(struct mlx5_ib_mr *mr, addr += MLX5_IMR_MTT_SIZE; if (unlikely(addr < io_virt + bcnt)) { odp = odp_next(odp); - if (odp && odp->umem.address != addr) + if (odp && ib_umem_start(odp) != addr) odp = NULL; goto next_mr; } @@ -664,7 +664,7 @@ static int pagefault_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr, io_virt += size; next = odp_next(odp); - if (unlikely(!next || next->umem.address != io_virt)) { + if (unlikely(!next || ib_umem_start(next) != io_virt)) { mlx5_ib_dbg(dev, "next implicit leaf removed at 0x%llx. 
got %p\n", io_virt, next); return -EAGAIN; From patchwork Mon Aug 19 11:17:10 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 11100681 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8C5AC14DE for ; Mon, 19 Aug 2019 11:17:57 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7AE121FEBA for ; Mon, 19 Aug 2019 11:17:57 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6D3152623C; Mon, 19 Aug 2019 11:17:57 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 107191FEBA for ; Mon, 19 Aug 2019 11:17:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726550AbfHSLR4 (ORCPT ); Mon, 19 Aug 2019 07:17:56 -0400 Received: from mail.kernel.org ([198.145.29.99]:33212 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726776AbfHSLR4 (ORCPT ); Mon, 19 Aug 2019 07:17:56 -0400 Received: from localhost (unknown [77.137.115.125]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id CD5292085A; Mon, 19 Aug 2019 11:17:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1566213475; bh=5p/sOk0LjOpZBCV1zE/qeXyKsy+9zRkTskYbo786Njg=; 
h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=oBM1z88n0ulqbO1jqKa45Tf4M27eoLZnBVdWYI2/vUKwiwI9PPL/PBnZe7uY/A0tP zDD8TFZ6FNFCCkyOzVVWfDk84KFLiPko+5R5rUZcG2i/X1OHc4q55mSz6Nfcb6JIk/ HItcaJuEsGhWiGOpcoch/lgjNxwA0D5+YVdBYu9A= From: Leon Romanovsky To: Doug Ledford , Jason Gunthorpe Cc: Leon Romanovsky , RDMA mailing list , Guy Levi , Moni Shoua Subject: [PATCH rdma-next 12/12] RDMA/mlx5: Use odp instead of mr->umem in pagefault_mr Date: Mon, 19 Aug 2019 14:17:10 +0300 Message-Id: <20190819111710.18440-13-leon@kernel.org> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190819111710.18440-1-leon@kernel.org> References: <20190819111710.18440-1-leon@kernel.org> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Jason Gunthorpe These are the same thing since mr always comes from odp->private. It is confusing to reference the same memory via two names. Signed-off-by: Jason Gunthorpe Signed-off-by: Leon Romanovsky --- drivers/infiniband/hw/mlx5/odp.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index 70e0a3555f11..8b155a1f0b38 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -603,7 +603,7 @@ static int pagefault_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr, start_idx = (io_virt - (mr->mmkey.iova & page_mask)) >> page_shift; access_mask = ODP_READ_ALLOWED_BIT; - if (prefetch && !downgrade && !mr->umem->writable) { + if (prefetch && !downgrade && !odp->umem.writable) { /* prefetch with write-access must * be supported by the MR */ @@ -611,7 +611,7 @@ static int pagefault_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr, goto out; } - if (mr->umem->writable && !downgrade) + if (odp->umem.writable && !downgrade) access_mask |= ODP_WRITE_ALLOWED_BIT; current_seq = READ_ONCE(odp->notifiers_seq); @@ -621,8 
+621,8 @@ static int pagefault_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr, */ smp_rmb(); - ret = ib_umem_odp_map_dma_pages(to_ib_umem_odp(mr->umem), io_virt, size, - access_mask, current_seq); + ret = ib_umem_odp_map_dma_pages(odp, io_virt, size, access_mask, + current_seq); if (ret < 0) goto out; @@ -630,8 +630,7 @@ static int pagefault_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr, np = ret; mutex_lock(&odp->umem_mutex); - if (!ib_umem_mmu_notifier_retry(to_ib_umem_odp(mr->umem), - current_seq)) { + if (!ib_umem_mmu_notifier_retry(odp, current_seq)) { /* * No need to check whether the MTTs really belong to * this MR, since ib_umem_odp_map_dma_pages already