From patchwork Wed Feb 6 17:59:15 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Davidlohr Bueso X-Patchwork-Id: 10799803 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C22B91390 for ; Wed, 6 Feb 2019 18:00:21 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B6C8F2BDDC for ; Wed, 6 Feb 2019 18:00:21 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id AAF982CC32; Wed, 6 Feb 2019 18:00:21 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C1C872BFD6 for ; Wed, 6 Feb 2019 18:00:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729526AbfBFSAE (ORCPT ); Wed, 6 Feb 2019 13:00:04 -0500 Received: from smtp.nue.novell.com ([195.135.221.5]:55363 "EHLO smtp.nue.novell.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727212AbfBFSAC (ORCPT ); Wed, 6 Feb 2019 13:00:02 -0500 Received: from emea4-mta.ukb.novell.com ([10.120.13.87]) by smtp.nue.novell.com with ESMTP (TLS encrypted); Wed, 06 Feb 2019 19:00:00 +0100 Received: from linux-r8p5.suse.de (nwb-a10-snat.microfocus.com [10.120.13.202]) by emea4-mta.ukb.novell.com with ESMTP (TLS encrypted); Wed, 06 Feb 2019 17:59:38 +0000 From: Davidlohr Bueso To: jgg@ziepe.ca, akpm@linux-foundation.org Cc: dledford@redhat.com, jgg@mellanox.com, jack@suse.cz, willy@infradead.org, ira.weiny@intel.com, linux-rdma@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, dave@stgolabs.net, Davidlohr Bueso Subject: [PATCH 1/6] mm: make mm->pinned_vm an atomic64 counter Date: Wed, 6 Feb 2019 09:59:15 -0800 Message-Id: <20190206175920.31082-2-dave@stgolabs.net> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190206175920.31082-1-dave@stgolabs.net> References: <20190206175920.31082-1-dave@stgolabs.net> Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Taking a sleeping lock to _only_ increment a variable is quite the overkill, and pretty much all users do this. Furthermore, some drivers (ie: infiniband and scif) that need pinned semantics can go to quite some trouble to actually delay via workqueue (un)accounting for pinned pages when not possible to acquire it. By making the counter atomic we no longer need to hold the mmap_sem and can simply some code around it for pinned_vm users. The counter is 64-bit such that we need not worry about overflows such as rdma user input controlled from userspace. Reviewed-by: Ira Weiny Reviewed-by: Christoph Lameter Reviewed-by: Daniel Jordan Reviewed-by: Jan Kara Signed-off-by: Davidlohr Bueso --- drivers/infiniband/core/umem.c | 12 ++++++------ drivers/infiniband/hw/hfi1/user_pages.c | 6 +++--- drivers/infiniband/hw/qib/qib_user_pages.c | 4 ++-- drivers/infiniband/hw/usnic/usnic_uiom.c | 8 ++++---- drivers/misc/mic/scif/scif_rma.c | 6 +++--- fs/proc/task_mmu.c | 2 +- include/linux/mm_types.h | 2 +- kernel/events/core.c | 8 ++++---- kernel/fork.c | 2 +- mm/debug.c | 5 +++-- 10 files changed, 28 insertions(+), 27 deletions(-) diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c index 1efe0a74e06b..678abe1afcba 100644 --- a/drivers/infiniband/core/umem.c +++ b/drivers/infiniband/core/umem.c @@ -166,13 +166,13 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr, lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT; down_write(&mm->mmap_sem); - if (check_add_overflow(mm->pinned_vm, npages, &new_pinned) || - (new_pinned > lock_limit && !capable(CAP_IPC_LOCK))) { + new_pinned = atomic64_read(&mm->pinned_vm) + npages; + if (new_pinned > lock_limit && !capable(CAP_IPC_LOCK)) { up_write(&mm->mmap_sem); ret = -ENOMEM; goto out; } - mm->pinned_vm = new_pinned; + atomic64_set(&mm->pinned_vm, new_pinned); up_write(&mm->mmap_sem); cur_base = addr & PAGE_MASK; @@ -234,7 +234,7 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr, __ib_umem_release(context->device, umem, 0); vma: down_write(&mm->mmap_sem); - mm->pinned_vm -= ib_umem_num_pages(umem); + atomic64_sub(ib_umem_num_pages(umem), &mm->pinned_vm); up_write(&mm->mmap_sem); out: if (vma_list) @@ -263,7 +263,7 @@ static void ib_umem_release_defer(struct work_struct *work) struct ib_umem *umem = container_of(work, struct ib_umem, work); down_write(&umem->owning_mm->mmap_sem); - umem->owning_mm->pinned_vm -= ib_umem_num_pages(umem); + atomic64_sub(ib_umem_num_pages(umem), &umem->owning_mm->pinned_vm); up_write(&umem->owning_mm->mmap_sem); __ib_umem_release_tail(umem); @@ -302,7 +302,7 @@ void ib_umem_release(struct ib_umem *umem) } else { down_write(&umem->owning_mm->mmap_sem); } - umem->owning_mm->pinned_vm -= ib_umem_num_pages(umem); + atomic64_sub(ib_umem_num_pages(umem), &umem->owning_mm->pinned_vm); up_write(&umem->owning_mm->mmap_sem); __ib_umem_release_tail(umem); diff --git a/drivers/infiniband/hw/hfi1/user_pages.c b/drivers/infiniband/hw/hfi1/user_pages.c index e341e6dcc388..40a6e434190f 100644 --- a/drivers/infiniband/hw/hfi1/user_pages.c +++ b/drivers/infiniband/hw/hfi1/user_pages.c @@ -92,7 +92,7 @@ bool hfi1_can_pin_pages(struct hfi1_devdata *dd, struct mm_struct *mm, size = DIV_ROUND_UP(size, PAGE_SIZE); down_read(&mm->mmap_sem); - pinned = mm->pinned_vm; + pinned = atomic64_read(&mm->pinned_vm); up_read(&mm->mmap_sem); /* First, check the absolute limit against all pinned pages. */ @@ -112,7 +112,7 @@ int hfi1_acquire_user_pages(struct mm_struct *mm, unsigned long vaddr, size_t np return ret; down_write(&mm->mmap_sem); - mm->pinned_vm += ret; + atomic64_add(ret, &mm->pinned_vm); up_write(&mm->mmap_sem); return ret; @@ -131,7 +131,7 @@ void hfi1_release_user_pages(struct mm_struct *mm, struct page **p, if (mm) { /* during close after signal, mm can be NULL */ down_write(&mm->mmap_sem); - mm->pinned_vm -= npages; + atomic64_sub(npages, &mm->pinned_vm); up_write(&mm->mmap_sem); } } diff --git a/drivers/infiniband/hw/qib/qib_user_pages.c b/drivers/infiniband/hw/qib/qib_user_pages.c index 075f09fb7ce3..c6c81022d313 100644 --- a/drivers/infiniband/hw/qib/qib_user_pages.c +++ b/drivers/infiniband/hw/qib/qib_user_pages.c @@ -75,7 +75,7 @@ static int __qib_get_user_pages(unsigned long start_page, size_t num_pages, goto bail_release; } - current->mm->pinned_vm += num_pages; + atomic64_add(num_pages, ¤t->mm->pinned_vm); ret = 0; goto bail; @@ -156,7 +156,7 @@ void qib_release_user_pages(struct page **p, size_t num_pages) __qib_release_user_pages(p, num_pages, 1); if (current->mm) { - current->mm->pinned_vm -= num_pages; + atomic64_sub(num_pages, ¤t->mm->pinned_vm); up_write(¤t->mm->mmap_sem); } } diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c index ce01a59fccc4..854436a2b437 100644 --- a/drivers/infiniband/hw/usnic/usnic_uiom.c +++ b/drivers/infiniband/hw/usnic/usnic_uiom.c @@ -129,7 +129,7 @@ static int usnic_uiom_get_pages(unsigned long addr, size_t size, int writable, uiomr->owning_mm = mm = current->mm; down_write(&mm->mmap_sem); - locked = npages + current->mm->pinned_vm; + locked = npages + atomic64_read(¤t->mm->pinned_vm); lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT; if ((locked > lock_limit) && !capable(CAP_IPC_LOCK)) { @@ -187,7 +187,7 @@ static int usnic_uiom_get_pages(unsigned long addr, size_t size, int writable, if (ret < 0) usnic_uiom_put_pages(chunk_list, 0); else { - mm->pinned_vm = locked; + atomic64_set(&mm->pinned_vm, locked); mmgrab(uiomr->owning_mm); } @@ -441,7 +441,7 @@ static void usnic_uiom_release_defer(struct work_struct *work) container_of(work, struct usnic_uiom_reg, work); down_write(&uiomr->owning_mm->mmap_sem); - uiomr->owning_mm->pinned_vm -= usnic_uiom_num_pages(uiomr); + atomic64_sub(usnic_uiom_num_pages(uiomr), &uiomr->owning_mm->pinned_vm); up_write(&uiomr->owning_mm->mmap_sem); __usnic_uiom_release_tail(uiomr); @@ -469,7 +469,7 @@ void usnic_uiom_reg_release(struct usnic_uiom_reg *uiomr, } else { down_write(&uiomr->owning_mm->mmap_sem); } - uiomr->owning_mm->pinned_vm -= usnic_uiom_num_pages(uiomr); + atomic64_sub(usnic_uiom_num_pages(uiomr), &uiomr->owning_mm->pinned_vm); up_write(&uiomr->owning_mm->mmap_sem); __usnic_uiom_release_tail(uiomr); diff --git a/drivers/misc/mic/scif/scif_rma.c b/drivers/misc/mic/scif/scif_rma.c index 749321eb91ae..2448368f181e 100644 --- a/drivers/misc/mic/scif/scif_rma.c +++ b/drivers/misc/mic/scif/scif_rma.c @@ -285,7 +285,7 @@ __scif_dec_pinned_vm_lock(struct mm_struct *mm, } else { down_write(&mm->mmap_sem); } - mm->pinned_vm -= nr_pages; + atomic64_sub(nr_pages, &mm->pinned_vm); up_write(&mm->mmap_sem); return 0; } @@ -299,7 +299,7 @@ static inline int __scif_check_inc_pinned_vm(struct mm_struct *mm, return 0; locked = nr_pages; - locked += mm->pinned_vm; + locked += atomic64_read(&mm->pinned_vm); lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT; if ((locked > lock_limit) && !capable(CAP_IPC_LOCK)) { dev_err(scif_info.mdev.this_device, @@ -307,7 +307,7 @@ static inline int __scif_check_inc_pinned_vm(struct mm_struct *mm, locked, lock_limit); return -ENOMEM; } - mm->pinned_vm = locked; + atomic64_set(&mm->pinned_vm, locked); return 0; } diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index f0ec9edab2f3..d2902962244d 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -59,7 +59,7 @@ void task_mem(struct seq_file *m, struct mm_struct *mm) SEQ_PUT_DEC("VmPeak:\t", hiwater_vm); SEQ_PUT_DEC(" kB\nVmSize:\t", total_vm); SEQ_PUT_DEC(" kB\nVmLck:\t", mm->locked_vm); - SEQ_PUT_DEC(" kB\nVmPin:\t", mm->pinned_vm); + SEQ_PUT_DEC(" kB\nVmPin:\t", atomic64_read(&mm->pinned_vm)); SEQ_PUT_DEC(" kB\nVmHWM:\t", hiwater_rss); SEQ_PUT_DEC(" kB\nVmRSS:\t", total_rss); SEQ_PUT_DEC(" kB\nRssAnon:\t", anon); diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 2c471a2c43fa..acea2ea2d6c4 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -405,7 +405,7 @@ struct mm_struct { unsigned long total_vm; /* Total pages mapped */ unsigned long locked_vm; /* Pages that have PG_mlocked set */ - unsigned long pinned_vm; /* Refcount permanently increased */ + atomic64_t pinned_vm; /* Refcount permanently increased */ unsigned long data_vm; /* VM_WRITE & ~VM_SHARED & ~VM_STACK */ unsigned long exec_vm; /* VM_EXEC & ~VM_WRITE & ~VM_STACK */ unsigned long stack_vm; /* VM_STACK */ diff --git a/kernel/events/core.c b/kernel/events/core.c index 3cd13a30f732..8df0b77a4687 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -5459,7 +5459,7 @@ static void perf_mmap_close(struct vm_area_struct *vma) /* now it's safe to free the pages */ atomic_long_sub(rb->aux_nr_pages, &mmap_user->locked_vm); - vma->vm_mm->pinned_vm -= rb->aux_mmap_locked; + atomic64_sub(rb->aux_mmap_locked, &vma->vm_mm->pinned_vm); /* this has to be the last one */ rb_free_aux(rb); @@ -5532,7 +5532,7 @@ static void perf_mmap_close(struct vm_area_struct *vma) */ atomic_long_sub((size >> PAGE_SHIFT) + 1, &mmap_user->locked_vm); - vma->vm_mm->pinned_vm -= mmap_locked; + atomic64_sub(mmap_locked, &vma->vm_mm->pinned_vm); free_uid(mmap_user); out_put: @@ -5680,7 +5680,7 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma) lock_limit = rlimit(RLIMIT_MEMLOCK); lock_limit >>= PAGE_SHIFT; - locked = vma->vm_mm->pinned_vm + extra; + locked = atomic64_read(&vma->vm_mm->pinned_vm) + extra; if ((locked > lock_limit) && perf_paranoid_tracepoint_raw() && !capable(CAP_IPC_LOCK)) { @@ -5721,7 +5721,7 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma) unlock: if (!ret) { atomic_long_add(user_extra, &user->locked_vm); - vma->vm_mm->pinned_vm += extra; + atomic64_add(extra, &vma->vm_mm->pinned_vm); atomic_inc(&event->mmap_count); } else if (rb) { diff --git a/kernel/fork.c b/kernel/fork.c index a60459947f18..fe4a051a8d15 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -980,7 +980,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p, mm_pgtables_bytes_init(mm); mm->map_count = 0; mm->locked_vm = 0; - mm->pinned_vm = 0; + atomic64_set(&mm->pinned_vm, 0); memset(&mm->rss_stat, 0, sizeof(mm->rss_stat)); spin_lock_init(&mm->page_table_lock); spin_lock_init(&mm->arg_lock); diff --git a/mm/debug.c b/mm/debug.c index 0abb987dad9b..7d13941a72f9 100644 --- a/mm/debug.c +++ b/mm/debug.c @@ -135,7 +135,7 @@ void dump_mm(const struct mm_struct *mm) "mmap_base %lu mmap_legacy_base %lu highest_vm_end %lu\n" "pgd %px mm_users %d mm_count %d pgtables_bytes %lu map_count %d\n" "hiwater_rss %lx hiwater_vm %lx total_vm %lx locked_vm %lx\n" - "pinned_vm %lx data_vm %lx exec_vm %lx stack_vm %lx\n" + "pinned_vm %llx data_vm %lx exec_vm %lx stack_vm %lx\n" "start_code %lx end_code %lx start_data %lx end_data %lx\n" "start_brk %lx brk %lx start_stack %lx\n" "arg_start %lx arg_end %lx env_start %lx env_end %lx\n" @@ -166,7 +166,8 @@ void dump_mm(const struct mm_struct *mm) mm_pgtables_bytes(mm), mm->map_count, mm->hiwater_rss, mm->hiwater_vm, mm->total_vm, mm->locked_vm, - mm->pinned_vm, mm->data_vm, mm->exec_vm, mm->stack_vm, + atomic64_read(&mm->pinned_vm), + mm->data_vm, mm->exec_vm, mm->stack_vm, mm->start_code, mm->end_code, mm->start_data, mm->end_data, mm->start_brk, mm->brk, mm->start_stack, mm->arg_start, mm->arg_end, mm->env_start, mm->env_end, From patchwork Wed Feb 6 17:59:16 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Davidlohr Bueso X-Patchwork-Id: 10799805 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 304101390 for ; Wed, 6 Feb 2019 18:00:28 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 25EB82CC41 for ; Wed, 6 Feb 2019 18:00:28 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2449A2CDB5; Wed, 6 Feb 2019 18:00:28 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B82182CD72 for ; Wed, 6 Feb 2019 18:00:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728861AbfBFSAU (ORCPT ); Wed, 6 Feb 2019 13:00:20 -0500 Received: from smtp.nue.novell.com ([195.135.221.5]:39422 "EHLO smtp.nue.novell.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729173AbfBFSAE (ORCPT ); Wed, 6 Feb 2019 13:00:04 -0500 Received: from emea4-mta.ukb.novell.com ([10.120.13.87]) by smtp.nue.novell.com with ESMTP (TLS encrypted); Wed, 06 Feb 2019 19:00:02 +0100 Received: from linux-r8p5.suse.de (nwb-a10-snat.microfocus.com [10.120.13.202]) by emea4-mta.ukb.novell.com with ESMTP (TLS encrypted); Wed, 06 Feb 2019 17:59:41 +0000 From: Davidlohr Bueso To: jgg@ziepe.ca, akpm@linux-foundation.org Cc: dledford@redhat.com, jgg@mellanox.com, jack@suse.cz, willy@infradead.org, ira.weiny@intel.com, linux-rdma@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, dave@stgolabs.net, sudeep.dutt@intel.com, ashutosh.dixit@intel.com, Davidlohr Bueso Subject: [PATCH 2/6] drivers/mic/scif: do not use mmap_sem Date: Wed, 6 Feb 2019 09:59:16 -0800 Message-Id: <20190206175920.31082-3-dave@stgolabs.net> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190206175920.31082-1-dave@stgolabs.net> References: <20190206175920.31082-1-dave@stgolabs.net> Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The driver uses mmap_sem for both pinned_vm accounting and get_user_pages(). By using gup_fast() and letting the mm handle the lock if needed, we can no longer rely on the semaphore and simplify the whole thing. Cc: sudeep.dutt@intel.com Cc: ashutosh.dixit@intel.com Reviewed-by: Ira Weiny Signed-off-by: Davidlohr Bueso --- drivers/misc/mic/scif/scif_rma.c | 36 +++++++++++------------------------- 1 file changed, 11 insertions(+), 25 deletions(-) diff --git a/drivers/misc/mic/scif/scif_rma.c b/drivers/misc/mic/scif/scif_rma.c index 2448368f181e..263b8ad507ea 100644 --- a/drivers/misc/mic/scif/scif_rma.c +++ b/drivers/misc/mic/scif/scif_rma.c @@ -272,21 +272,12 @@ static inline void __scif_release_mm(struct mm_struct *mm) static inline int __scif_dec_pinned_vm_lock(struct mm_struct *mm, - int nr_pages, bool try_lock) + int nr_pages) { if (!mm || !nr_pages || !scif_ulimit_check) return 0; - if (try_lock) { - if (!down_write_trylock(&mm->mmap_sem)) { - dev_err(scif_info.mdev.this_device, - "%s %d err\n", __func__, __LINE__); - return -1; - } - } else { - down_write(&mm->mmap_sem); - } + atomic64_sub(nr_pages, &mm->pinned_vm); - up_write(&mm->mmap_sem); return 0; } @@ -298,16 +289,16 @@ static inline int __scif_check_inc_pinned_vm(struct mm_struct *mm, if (!mm || !nr_pages || !scif_ulimit_check) return 0; - locked = nr_pages; - locked += atomic64_read(&mm->pinned_vm); lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT; + locked = atomic64_add_return(nr_pages, &mm->pinned_vm); + if ((locked > lock_limit) && !capable(CAP_IPC_LOCK)) { + atomic64_sub(nr_pages, &mm->pinned_vm); dev_err(scif_info.mdev.this_device, "locked(%lu) > lock_limit(%lu)\n", locked, lock_limit); return -ENOMEM; } - atomic64_set(&mm->pinned_vm, locked); return 0; } @@ -326,7 +317,7 @@ int scif_destroy_window(struct scif_endpt *ep, struct scif_window *window) might_sleep(); if (!window->temp && window->mm) { - __scif_dec_pinned_vm_lock(window->mm, window->nr_pages, 0); + __scif_dec_pinned_vm_lock(window->mm, window->nr_pages); __scif_release_mm(window->mm); window->mm = NULL; } @@ -737,7 +728,7 @@ int scif_unregister_window(struct scif_window *window) ep->rma_info.dma_chan); } else { if (!__scif_dec_pinned_vm_lock(window->mm, - window->nr_pages, 1)) { + window->nr_pages)) { __scif_release_mm(window->mm); window->mm = NULL; } @@ -1385,28 +1376,23 @@ int __scif_pin_pages(void *addr, size_t len, int *out_prot, prot |= SCIF_PROT_WRITE; retry: mm = current->mm; - down_write(&mm->mmap_sem); if (ulimit) { err = __scif_check_inc_pinned_vm(mm, nr_pages); if (err) { - up_write(&mm->mmap_sem); pinned_pages->nr_pages = 0; goto error_unmap; } } - pinned_pages->nr_pages = get_user_pages( + pinned_pages->nr_pages = get_user_pages_fast( (u64)addr, nr_pages, (prot & SCIF_PROT_WRITE) ? FOLL_WRITE : 0, - pinned_pages->pages, - NULL); - up_write(&mm->mmap_sem); + pinned_pages->pages); if (nr_pages != pinned_pages->nr_pages) { if (try_upgrade) { if (ulimit) - __scif_dec_pinned_vm_lock(mm, - nr_pages, 0); + __scif_dec_pinned_vm_lock(mm, nr_pages); /* Roll back any pinned pages */ for (i = 0; i < pinned_pages->nr_pages; i++) { if (pinned_pages->pages[i]) @@ -1433,7 +1419,7 @@ int __scif_pin_pages(void *addr, size_t len, int *out_prot, return err; dec_pinned: if (ulimit) - __scif_dec_pinned_vm_lock(mm, nr_pages, 0); + __scif_dec_pinned_vm_lock(mm, nr_pages); /* Something went wrong! Rollback */ error_unmap: pinned_pages->nr_pages = nr_pages; From patchwork Wed Feb 6 17:59:17 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Davidlohr Bueso X-Patchwork-Id: 10799801 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9552A746 for ; Wed, 6 Feb 2019 18:00:17 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8B0312BDAE for ; Wed, 6 Feb 2019 18:00:17 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 892162CD72; Wed, 6 Feb 2019 18:00:17 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 159052C0A1 for ; Wed, 6 Feb 2019 18:00:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727212AbfBFSAF (ORCPT ); Wed, 6 Feb 2019 13:00:05 -0500 Received: from smtp.nue.novell.com ([195.135.221.5]:54017 "EHLO smtp.nue.novell.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730952AbfBFSAF (ORCPT ); Wed, 6 Feb 2019 13:00:05 -0500 Received: from emea4-mta.ukb.novell.com ([10.120.13.87]) by smtp.nue.novell.com with ESMTP (TLS encrypted); Wed, 06 Feb 2019 19:00:03 +0100 Received: from linux-r8p5.suse.de (nwb-a10-snat.microfocus.com [10.120.13.202]) by emea4-mta.ukb.novell.com with ESMTP (TLS encrypted); Wed, 06 Feb 2019 17:59:46 +0000 From: Davidlohr Bueso To: jgg@ziepe.ca, akpm@linux-foundation.org Cc: dledford@redhat.com, jgg@mellanox.com, jack@suse.cz, willy@infradead.org, ira.weiny@intel.com, linux-rdma@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, dave@stgolabs.net, dennis.dalessandro@intel.com, mike.marciniszyn@intel.com, Davidlohr Bueso Subject: [PATCH 3/6] drivers/IB,qib: optimize mmap_sem usage Date: Wed, 6 Feb 2019 09:59:17 -0800 Message-Id: <20190206175920.31082-4-dave@stgolabs.net> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190206175920.31082-1-dave@stgolabs.net> References: <20190206175920.31082-1-dave@stgolabs.net> Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The driver uses mmap_sem for both pinned_vm accounting and get_user_pages(). Because rdma drivers might want to use gup_longterm() in the future we still need some sort of mmap_sem serialization (as opposed to removing it entirely by using gup_fast()). Now that pinned_vm is atomic the writer lock can therefore be converted to reader. This also fixes a bug that __qib_get_user_pages was not taking into account the current value of pinned_vm. Cc: dennis.dalessandro@intel.com Cc: mike.marciniszyn@intel.com Reviewed-by: Ira Weiny Signed-off-by: Davidlohr Bueso --- drivers/infiniband/hw/qib/qib_user_pages.c | 73 +++++++++++------------------- 1 file changed, 27 insertions(+), 46 deletions(-) diff --git a/drivers/infiniband/hw/qib/qib_user_pages.c b/drivers/infiniband/hw/qib/qib_user_pages.c index c6c81022d313..ef8bcf366ddc 100644 --- a/drivers/infiniband/hw/qib/qib_user_pages.c +++ b/drivers/infiniband/hw/qib/qib_user_pages.c @@ -49,43 +49,6 @@ static void __qib_release_user_pages(struct page **p, size_t num_pages, } } -/* - * Call with current->mm->mmap_sem held. - */ -static int __qib_get_user_pages(unsigned long start_page, size_t num_pages, - struct page **p) -{ - unsigned long lock_limit; - size_t got; - int ret; - - lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT; - - if (num_pages > lock_limit && !capable(CAP_IPC_LOCK)) { - ret = -ENOMEM; - goto bail; - } - - for (got = 0; got < num_pages; got += ret) { - ret = get_user_pages_longterm(start_page + got * PAGE_SIZE, - num_pages - got, - FOLL_WRITE | FOLL_FORCE, - p + got, NULL); - if (ret < 0) - goto bail_release; - } - - atomic64_add(num_pages, ¤t->mm->pinned_vm); - - ret = 0; - goto bail; - -bail_release: - __qib_release_user_pages(p, got, 0); -bail: - return ret; -} - /** * qib_map_page - a safety wrapper around pci_map_page() * @@ -137,26 +100,44 @@ int qib_map_page(struct pci_dev *hwdev, struct page *page, dma_addr_t *daddr) int qib_get_user_pages(unsigned long start_page, size_t num_pages, struct page **p) { + unsigned long locked, lock_limit; + size_t got; int ret; - down_write(¤t->mm->mmap_sem); + lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT; + locked = atomic64_add_return(num_pages, ¤t->mm->pinned_vm); - ret = __qib_get_user_pages(start_page, num_pages, p); + if (num_pages > lock_limit && !capable(CAP_IPC_LOCK)) { + ret = -ENOMEM; + goto bail; + } - up_write(¤t->mm->mmap_sem); + down_read(¤t->mm->mmap_sem); + for (got = 0; got < num_pages; got += ret) { + ret = get_user_pages_longterm(start_page + got * PAGE_SIZE, + num_pages - got, + FOLL_WRITE | FOLL_FORCE, + p + got, NULL); + if (ret < 0) { + up_read(¤t->mm->mmap_sem); + goto bail_release; + } + } + up_read(¤t->mm->mmap_sem); + return 0; +bail_release: + __qib_release_user_pages(p, got, 0); +bail: + atomic64_sub(num_pages, ¤t->mm->pinned_vm); return ret; } void qib_release_user_pages(struct page **p, size_t num_pages) { - if (current->mm) /* during close after signal, mm can be NULL */ - down_write(¤t->mm->mmap_sem); - __qib_release_user_pages(p, num_pages, 1); - if (current->mm) { + /* during close after signal, mm can be NULL */ + if (current->mm) atomic64_sub(num_pages, ¤t->mm->pinned_vm); - up_write(¤t->mm->mmap_sem); - } } From patchwork Wed Feb 6 17:59:18 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Davidlohr Bueso X-Patchwork-Id: 10799799 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BEF5A1669 for ; Wed, 6 Feb 2019 18:00:14 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B48A82CC05 for ; Wed, 6 Feb 2019 18:00:14 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B27042CE73; Wed, 6 Feb 2019 18:00:14 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 698442CD72 for ; Wed, 6 Feb 2019 18:00:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731019AbfBFSAH (ORCPT ); Wed, 6 Feb 2019 13:00:07 -0500 Received: from smtp.nue.novell.com ([195.135.221.5]:58924 "EHLO smtp.nue.novell.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730127AbfBFSAG (ORCPT ); Wed, 6 Feb 2019 13:00:06 -0500 Received: from emea4-mta.ukb.novell.com ([10.120.13.87]) by smtp.nue.novell.com with ESMTP (TLS encrypted); Wed, 06 Feb 2019 19:00:05 +0100 Received: from linux-r8p5.suse.de (nwb-a10-snat.microfocus.com [10.120.13.202]) by emea4-mta.ukb.novell.com with ESMTP (TLS encrypted); Wed, 06 Feb 2019 17:59:50 +0000 From: Davidlohr Bueso To: jgg@ziepe.ca, akpm@linux-foundation.org Cc: dledford@redhat.com, jgg@mellanox.com, jack@suse.cz, willy@infradead.org, ira.weiny@intel.com, linux-rdma@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, dave@stgolabs.net, Davidlohr Bueso Subject: [PATCH 4/6] drivers/IB,hfi1: do not se mmap_sem Date: Wed, 6 Feb 2019 09:59:18 -0800 Message-Id: <20190206175920.31082-5-dave@stgolabs.net> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190206175920.31082-1-dave@stgolabs.net> References: <20190206175920.31082-1-dave@stgolabs.net> Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This driver already uses gup_fast() and thus we can just drop the mmap_sem protection around the pinned_vm counter. Note that the window between when hfi1_can_pin_pages() is called and the actual counter is incremented remains the same as mmap_sem was _only_ used for when ->pinned_vm was touched. Reviewed-by: Ira Weiny Signed-off-by: Davidlohr Bueso --- drivers/infiniband/hw/hfi1/user_pages.c | 6 ------ 1 file changed, 6 deletions(-) diff --git a/drivers/infiniband/hw/hfi1/user_pages.c b/drivers/infiniband/hw/hfi1/user_pages.c index 40a6e434190f..24b592c6522e 100644 --- a/drivers/infiniband/hw/hfi1/user_pages.c +++ b/drivers/infiniband/hw/hfi1/user_pages.c @@ -91,9 +91,7 @@ bool hfi1_can_pin_pages(struct hfi1_devdata *dd, struct mm_struct *mm, /* Convert to number of pages */ size = DIV_ROUND_UP(size, PAGE_SIZE); - down_read(&mm->mmap_sem); pinned = atomic64_read(&mm->pinned_vm); - up_read(&mm->mmap_sem); /* First, check the absolute limit against all pinned pages. */ if (pinned + npages >= ulimit && !can_lock) @@ -111,9 +109,7 @@ int hfi1_acquire_user_pages(struct mm_struct *mm, unsigned long vaddr, size_t np if (ret < 0) return ret; - down_write(&mm->mmap_sem); atomic64_add(ret, &mm->pinned_vm); - up_write(&mm->mmap_sem); return ret; } @@ -130,8 +126,6 @@ void hfi1_release_user_pages(struct mm_struct *mm, struct page **p, } if (mm) { /* during close after signal, mm can be NULL */ - down_write(&mm->mmap_sem); atomic64_sub(npages, &mm->pinned_vm); - up_write(&mm->mmap_sem); } } From patchwork Wed Feb 6 17:59:19 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Davidlohr Bueso X-Patchwork-Id: 10799813 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EEECB746 for ; Wed, 6 Feb 2019 18:00:42 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E30792BDAE for ; Wed, 6 Feb 2019 18:00:42 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E0FBB2CF4C; Wed, 6 Feb 2019 18:00:42 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6BC212BDAE for ; Wed, 6 Feb 2019 18:00:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731053AbfBFSAa (ORCPT ); Wed, 6 Feb 2019 13:00:30 -0500 Received: from smtp.nue.novell.com ([195.135.221.5]:35597 "EHLO smtp.nue.novell.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729173AbfBFSAa (ORCPT ); Wed, 6 Feb 2019 13:00:30 -0500 Received: from emea4-mta.ukb.novell.com ([10.120.13.87]) by smtp.nue.novell.com with ESMTP (TLS encrypted); Wed, 06 Feb 2019 19:00:28 +0100 Received: from linux-r8p5.suse.de (nwb-a10-snat.microfocus.com [10.120.13.202]) by emea4-mta.ukb.novell.com with ESMTP (TLS encrypted); Wed, 06 Feb 2019 17:59:54 +0000 From: Davidlohr Bueso To: jgg@ziepe.ca, akpm@linux-foundation.org Cc: dledford@redhat.com, jgg@mellanox.com, jack@suse.cz, willy@infradead.org, ira.weiny@intel.com, linux-rdma@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, dave@stgolabs.net, benve@cisco.com, neescoba@cisco.com, Davidlohr Bueso Subject: [PATCH 5/6] drivers/IB,usnic: reduce scope of mmap_sem Date: Wed, 6 Feb 2019 09:59:19 -0800 Message-Id: <20190206175920.31082-6-dave@stgolabs.net> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190206175920.31082-1-dave@stgolabs.net> References: <20190206175920.31082-1-dave@stgolabs.net> Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP usnic_uiom_get_pages() uses gup_longterm() so we cannot really get rid of mmap_sem altogether in the driver, but we can get rid of some complexity that mmap_sem brings with only pinned_vm. We can get rid of the wq altogether as we no longer need to defer work to unpin pages as the counter is now atomic. We also share the lock. Cc: benve@cisco.com Cc: neescoba@cisco.com Acked-by: Parvi Kaustubhi Reviewed-by: Ira Weiny Signed-off-by: Davidlohr Bueso --- drivers/infiniband/hw/usnic/usnic_ib_main.c | 2 - drivers/infiniband/hw/usnic/usnic_uiom.c | 58 +++-------------------------- drivers/infiniband/hw/usnic/usnic_uiom.h | 1 - 3 files changed, 6 insertions(+), 55 deletions(-) diff --git a/drivers/infiniband/hw/usnic/usnic_ib_main.c b/drivers/infiniband/hw/usnic/usnic_ib_main.c index 3201dd1899c7..1d363b706314 100644 --- a/drivers/infiniband/hw/usnic/usnic_ib_main.c +++ b/drivers/infiniband/hw/usnic/usnic_ib_main.c @@ -691,7 +691,6 @@ static int __init usnic_ib_init(void) out_pci_unreg: pci_unregister_driver(&usnic_ib_pci_driver); out_umem_fini: - usnic_uiom_fini(); return err; } @@ -704,7 +703,6 @@ static void __exit usnic_ib_destroy(void) unregister_inetaddr_notifier(&usnic_ib_inetaddr_notifier); unregister_netdevice_notifier(&usnic_ib_netdevice_notifier); pci_unregister_driver(&usnic_ib_pci_driver); - usnic_uiom_fini(); } MODULE_DESCRIPTION("Cisco VIC (usNIC) Verbs Driver"); diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c index 854436a2b437..06862a6af185 100644 --- a/drivers/infiniband/hw/usnic/usnic_uiom.c +++ b/drivers/infiniband/hw/usnic/usnic_uiom.c @@ -47,8 +47,6 @@ #include "usnic_uiom.h" #include "usnic_uiom_interval_tree.h" -static struct workqueue_struct *usnic_uiom_wq; - #define USNIC_UIOM_PAGE_CHUNK \ ((PAGE_SIZE - offsetof(struct usnic_uiom_chunk, page_list)) /\ ((void *) &((struct usnic_uiom_chunk *) 0)->page_list[1] - \ @@ -127,9 +125,9 @@ static int usnic_uiom_get_pages(unsigned long addr, size_t size, int writable, npages = PAGE_ALIGN(size + (addr & ~PAGE_MASK)) >> PAGE_SHIFT; uiomr->owning_mm = mm = current->mm; - down_write(&mm->mmap_sem); + down_read(&mm->mmap_sem); - locked = npages + atomic64_read(¤t->mm->pinned_vm); + locked = atomic64_add_return(npages, ¤t->mm->pinned_vm); lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT; if ((locked > lock_limit) && !capable(CAP_IPC_LOCK)) { @@ -184,14 +182,13 @@ static int usnic_uiom_get_pages(unsigned long addr, size_t size, int writable, } out: - if (ret < 0) + if (ret < 0) { usnic_uiom_put_pages(chunk_list, 0); - else { - atomic64_set(&mm->pinned_vm, locked); + atomic64_sub(npages, ¤t->mm->pinned_vm); + } else mmgrab(uiomr->owning_mm); - } - up_write(&mm->mmap_sem); + up_read(&mm->mmap_sem); free_page((unsigned long) page_list); return ret; } @@ -435,43 +432,12 @@ static inline size_t usnic_uiom_num_pages(struct usnic_uiom_reg *uiomr) return PAGE_ALIGN(uiomr->length + uiomr->offset) >> PAGE_SHIFT; } -static void usnic_uiom_release_defer(struct work_struct *work) -{ - struct usnic_uiom_reg *uiomr = - container_of(work, struct usnic_uiom_reg, work); - - down_write(&uiomr->owning_mm->mmap_sem); - atomic64_sub(usnic_uiom_num_pages(uiomr), &uiomr->owning_mm->pinned_vm); - up_write(&uiomr->owning_mm->mmap_sem); - - __usnic_uiom_release_tail(uiomr); -} - void usnic_uiom_reg_release(struct usnic_uiom_reg *uiomr, struct ib_ucontext *context) { __usnic_uiom_reg_release(uiomr->pd, uiomr, 1); - /* - * We may be called with the mm's mmap_sem already held. This - * can happen when a userspace munmap() is the call that drops - * the last reference to our file and calls our release - * method. If there are memory regions to destroy, we'll end - * up here and not be able to take the mmap_sem. In that case - * we defer the vm_locked accounting to a workqueue. - */ - if (context->closing) { - if (!down_write_trylock(&uiomr->owning_mm->mmap_sem)) { - INIT_WORK(&uiomr->work, usnic_uiom_release_defer); - queue_work(usnic_uiom_wq, &uiomr->work); - return; - } - } else { - down_write(&uiomr->owning_mm->mmap_sem); - } atomic64_sub(usnic_uiom_num_pages(uiomr), &uiomr->owning_mm->pinned_vm); - up_write(&uiomr->owning_mm->mmap_sem); - __usnic_uiom_release_tail(uiomr); } @@ -600,17 +566,5 @@ int usnic_uiom_init(char *drv_name) return -EPERM; } - usnic_uiom_wq = create_workqueue(drv_name); - if (!usnic_uiom_wq) { - usnic_err("Unable to alloc wq for drv %s\n", drv_name); - return -ENOMEM; - } - return 0; } - -void usnic_uiom_fini(void) -{ - flush_workqueue(usnic_uiom_wq); - destroy_workqueue(usnic_uiom_wq); -} diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.h b/drivers/infiniband/hw/usnic/usnic_uiom.h index b86a9731071b..c88cfa087e3a 100644 --- a/drivers/infiniband/hw/usnic/usnic_uiom.h +++ b/drivers/infiniband/hw/usnic/usnic_uiom.h @@ -93,5 +93,4 @@ struct usnic_uiom_reg *usnic_uiom_reg_get(struct usnic_uiom_pd *pd, void usnic_uiom_reg_release(struct usnic_uiom_reg *uiomr, struct ib_ucontext *ucontext); int usnic_uiom_init(char *drv_name); -void usnic_uiom_fini(void); #endif /* USNIC_UIOM_H_ */ From patchwork Wed Feb 6 17:59:20 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Davidlohr Bueso X-Patchwork-Id: 10799811 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0B9DB746 for ; Wed, 6 Feb 2019 18:00:41 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 01F322BD96 for ; Wed, 6 Feb 2019 18:00:41 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 003DE2BFC1; Wed, 6 Feb 2019 18:00:40 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 976102BD96 for ; Wed, 6 Feb 2019 18:00:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731074AbfBFSAc (ORCPT ); Wed, 6 Feb 2019 13:00:32 -0500 Received: from smtp.nue.novell.com ([195.135.221.5]:36715 "EHLO smtp.nue.novell.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729173AbfBFSAc (ORCPT ); Wed, 6 Feb 2019 13:00:32 -0500 Received: from emea4-mta.ukb.novell.com ([10.120.13.87]) by smtp.nue.novell.com with ESMTP (TLS encrypted); Wed, 06 Feb 2019 19:00:30 +0100 Received: from linux-r8p5.suse.de (nwb-a10-snat.microfocus.com [10.120.13.202]) by emea4-mta.ukb.novell.com with ESMTP (TLS encrypted); Wed, 06 Feb 2019 17:59:58 +0000 From: Davidlohr Bueso To: jgg@ziepe.ca, akpm@linux-foundation.org Cc: dledford@redhat.com, jgg@mellanox.com, jack@suse.cz, willy@infradead.org, ira.weiny@intel.com, linux-rdma@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, dave@stgolabs.net, Davidlohr Bueso Subject: [PATCH 6/6] drivers/IB,core: reduce scope of mmap_sem Date: Wed, 6 Feb 2019 09:59:20 -0800 Message-Id: <20190206175920.31082-7-dave@stgolabs.net> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20190206175920.31082-1-dave@stgolabs.net> References: <20190206175920.31082-1-dave@stgolabs.net> Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP ib_umem_get() uses gup_longterm() and relies on the lock to stabilze the vma_list, so we cannot really get rid of mmap_sem altogether, but now that the counter is atomic, we can get of some complexity that mmap_sem brings with only pinned_vm. Reviewed-by: Ira Weiny Signed-off-by: Davidlohr Bueso --- drivers/infiniband/core/umem.c | 41 ++--------------------------------------- 1 file changed, 2 insertions(+), 39 deletions(-) diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c index 678abe1afcba..b69d3efa8712 100644 --- a/drivers/infiniband/core/umem.c +++ b/drivers/infiniband/core/umem.c @@ -165,15 +165,12 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr, lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT; - down_write(&mm->mmap_sem); - new_pinned = atomic64_read(&mm->pinned_vm) + npages; + new_pinned = atomic64_add_return(npages, &mm->pinned_vm); if (new_pinned > lock_limit && !capable(CAP_IPC_LOCK)) { - up_write(&mm->mmap_sem); + atomic64_sub(npages, &mm->pinned_vm); ret = -ENOMEM; goto out; } - atomic64_set(&mm->pinned_vm, new_pinned); - up_write(&mm->mmap_sem); cur_base = addr & PAGE_MASK; @@ -233,9 +230,7 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr, umem_release: __ib_umem_release(context->device, umem, 0); vma: - down_write(&mm->mmap_sem); atomic64_sub(ib_umem_num_pages(umem), &mm->pinned_vm); - up_write(&mm->mmap_sem); out: if (vma_list) free_page((unsigned long) vma_list); @@ -258,25 +253,12 @@ static void __ib_umem_release_tail(struct ib_umem *umem) kfree(umem); } -static void ib_umem_release_defer(struct work_struct *work) -{ - struct ib_umem *umem = container_of(work, struct ib_umem, work); - - down_write(&umem->owning_mm->mmap_sem); - atomic64_sub(ib_umem_num_pages(umem), &umem->owning_mm->pinned_vm); - up_write(&umem->owning_mm->mmap_sem); - - __ib_umem_release_tail(umem); -} - /** * ib_umem_release - release memory pinned with ib_umem_get * @umem: umem struct to release */ void ib_umem_release(struct ib_umem *umem) { - struct ib_ucontext *context = umem->context; - if (umem->is_odp) { ib_umem_odp_release(to_ib_umem_odp(umem)); __ib_umem_release_tail(umem); @@ -285,26 +267,7 @@ void ib_umem_release(struct ib_umem *umem) __ib_umem_release(umem->context->device, umem, 1); - /* - * We may be called with the mm's mmap_sem already held. This - * can happen when a userspace munmap() is the call that drops - * the last reference to our file and calls our release - * method. If there are memory regions to destroy, we'll end - * up here and not be able to take the mmap_sem. In that case - * we defer the vm_locked accounting a workqueue. - */ - if (context->closing) { - if (!down_write_trylock(&umem->owning_mm->mmap_sem)) { - INIT_WORK(&umem->work, ib_umem_release_defer); - queue_work(ib_wq, &umem->work); - return; - } - } else { - down_write(&umem->owning_mm->mmap_sem); - } atomic64_sub(ib_umem_num_pages(umem), &umem->owning_mm->pinned_vm); - up_write(&umem->owning_mm->mmap_sem); - __ib_umem_release_tail(umem); } EXPORT_SYMBOL(ib_umem_release); From patchwork Thu Feb 7 01:31:55 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Davidlohr Bueso X-Patchwork-Id: 10800339 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3966A13B5 for ; Thu, 7 Feb 2019 01:32:12 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 217352D4F8 for ; Thu, 7 Feb 2019 01:32:12 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1424E2D54D; Thu, 7 Feb 2019 01:32:12 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 59D0B2D4F8 for ; Thu, 7 Feb 2019 01:32:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726245AbfBGBcK (ORCPT ); Wed, 6 Feb 2019 20:32:10 -0500 Received: from mx2.suse.de ([195.135.220.15]:39684 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726161AbfBGBcJ (ORCPT ); Wed, 6 Feb 2019 20:32:09 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 2EEAAAFA8; Thu, 7 Feb 2019 01:32:06 +0000 (UTC) Date: Wed, 6 Feb 2019 17:31:55 -0800 From: Davidlohr Bueso To: jgg@ziepe.ca, akpm@linux-foundation.org Cc: dledford@redhat.com, jgg@mellanox.com, jack@suse.cz, willy@infradead.org, ira.weiny@intel.com, linux-rdma@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH 7/6] Documentation/infiniband: update from locked to pinned_vm Message-ID: <20190207013155.lq5diwqc2svyt3t3@linux-r8p5> Mail-Followup-To: jgg@ziepe.ca, akpm@linux-foundation.org, dledford@redhat.com, jgg@mellanox.com, jack@suse.cz, willy@infradead.org, ira.weiny@intel.com, linux-rdma@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org References: <20190206175920.31082-1-dave@stgolabs.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20190206175920.31082-1-dave@stgolabs.net> User-Agent: NeoMutt/20180323 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP We are really talking about pinned_vm here. Signed-off-by: Davidlohr Bueso Reviewed-by: Ira Weiny --- Documentation/infiniband/user_verbs.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/infiniband/user_verbs.txt b/Documentation/infiniband/user_verbs.txt index df049b9f5b6e..47ebf2f80b2b 100644 --- a/Documentation/infiniband/user_verbs.txt +++ b/Documentation/infiniband/user_verbs.txt @@ -46,11 +46,11 @@ Memory pinning I/O targets be kept resident at the same physical address. The ib_uverbs module manages pinning and unpinning memory regions via get_user_pages() and put_page() calls. It also accounts for the - amount of memory pinned in the process's locked_vm, and checks that + amount of memory pinned in the process's pinned_vm, and checks that unprivileged processes do not exceed their RLIMIT_MEMLOCK limit. Pages that are pinned multiple times are counted each time they are - pinned, so the value of locked_vm may be an overestimate of the + pinned, so the value of pinned_vm may be an overestimate of the number of pages pinned by a process. /dev files