From patchwork Mon Aug 12 01:50:43 2019
From: John Hubbard <jhubbard@nvidia.com>
To: Andrew Morton
Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Ira Weiny, Jan Kara,
    Jason Gunthorpe, Jérôme Glisse, LKML, linux-mm@kvack.org,
    linux-fsdevel@vger.kernel.org, linux-rdma@vger.kernel.org,
    John Hubbard, Michal Hocko
Subject: [RFC PATCH 1/2] mm/gup: introduce FOLL_PIN flag for get_user_pages()
Date: Sun, 11 Aug 2019 18:50:43 -0700
Message-Id: <20190812015044.26176-2-jhubbard@nvidia.com>
In-Reply-To: <20190812015044.26176-1-jhubbard@nvidia.com>
References: <20190812015044.26176-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

FOLL_PIN is set by vaddr_pin_pages(). This is different from FOLL_LONGTERM,
because even short-term page pins need a new kind of tracking if the pinned
pages' data is going to potentially be modified. This situation is described
in more detail in commit fc1d8e7cca2d ("mm: introduce put_user_page*(),
placeholder versions").

FOLL_PIN is added now, rather than waiting until there is code that takes
action based on FOLL_PIN. That's because having FOLL_PIN in the code helps
to highlight the differences between:

a) get_user_pages(): soon to be deprecated. Used to pin pages, but without
   awareness of file systems that might use those pages,

b) The original vaddr_pin_pages(): intended only for FOLL_LONGTERM and DAX
   use cases.
   This assumes direct IO and is therefore not applicable to most of the
   other callers of get_user_pages(), and

c) The new vaddr_pin_pages(), which provides the correct get_user_pages()
   flags for all cases, by setting FOLL_PIN.

Cc: Ira Weiny
Cc: Jan Kara
Cc: Michal Hocko
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 include/linux/mm.h | 1 +
 mm/gup.c           | 5 ++++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 90c5802866df..61b616cd9243 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2663,6 +2663,7 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
 #define FOLL_ANON	0x8000	/* don't do file mappings */
 #define FOLL_LONGTERM	0x10000	/* mapping lifetime is indefinite: see below */
 #define FOLL_SPLIT_PMD	0x20000	/* split huge pmd before returning */
+#define FOLL_PIN	0x40000	/* pages must be released via put_user_page() */
 
 /*
  * NOTE on FOLL_LONGTERM:

diff --git a/mm/gup.c b/mm/gup.c
index 58f008a3c153..85f09958fbdc 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2494,6 +2494,9 @@ EXPORT_SYMBOL_GPL(get_user_pages_fast);
  * being made against. Usually "current->mm".
  *
  * Expects mmap_sem to be read locked.
+ *
+ * Implementation note: this sets FOLL_PIN, which means that the pages must
+ * ultimately be released by put_user_page().
  */
 long vaddr_pin_pages(unsigned long addr, unsigned long nr_pages,
 		     unsigned int gup_flags, struct page **pages,
@@ -2501,7 +2504,7 @@ long vaddr_pin_pages(unsigned long addr, unsigned long nr_pages,
 {
 	long ret;
 
-	gup_flags |= FOLL_LONGTERM;
+	gup_flags |= FOLL_LONGTERM | FOLL_PIN;
 
 	if (!vaddr_pin || (!vaddr_pin->mm && !vaddr_pin->f_owner))
 		return -EINVAL;

From patchwork Mon Aug 12 01:50:44 2019
From: John Hubbard <jhubbard@nvidia.com>
To: Andrew Morton
Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Ira Weiny, Jan Kara,
    Jason Gunthorpe, Jérôme Glisse, LKML, linux-mm@kvack.org,
    linux-fsdevel@vger.kernel.org, linux-rdma@vger.kernel.org, John Hubbard
Subject: [RFC PATCH 2/2] mm/gup: introduce vaddr_pin_pages_remote()
Date: Sun, 11 Aug 2019 18:50:44 -0700
Message-Id: <20190812015044.26176-3-jhubbard@nvidia.com>
In-Reply-To: <20190812015044.26176-1-jhubbard@nvidia.com>
References: <20190812015044.26176-1-jhubbard@nvidia.com>

From: John Hubbard <jhubbard@nvidia.com>

This is the vaddr_pin_pages() variant that corresponds to
get_user_pages_remote(), but with FOLL_PIN semantics: the implementation
sets FOLL_PIN. That, in turn, means that the pages must ultimately be
released by put_user_page*()--typically, via vaddr_unpin_pages*().

Note that the put_user_page*() requirement won't be truly required until
all of the call sites have been converted, and the tracking of pages is
actually activated.

Also introduce vaddr_unpin_pages(), in order to have a simpler call for
the error handling cases.

Use both of these new calls in the InfiniBand driver, replacing
get_user_pages_remote() and put_user_pages().
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/infiniband/core/umem_odp.c | 15 +++++----
 include/linux/mm.h                 |  7 +++++
 mm/gup.c                           | 50 ++++++++++++++++++++++++++++++
 3 files changed, 66 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index 53085896d718..fdff034a8a30 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -534,7 +534,7 @@ static int ib_umem_odp_map_dma_single_page(
 	}
 out:
-	put_user_page(page);
+	vaddr_unpin_pages(&page, 1, &umem_odp->umem.vaddr_pin);
 
 	if (remove_existing_mapping) {
 		ib_umem_notifier_start_account(umem_odp);
@@ -635,9 +635,10 @@ int ib_umem_odp_map_dma_pages(struct ib_umem_odp *umem_odp, u64 user_virt,
 		 * complex (and doesn't gain us much performance in most use
 		 * cases).
 		 */
-		npages = get_user_pages_remote(owning_process, owning_mm,
+		npages = vaddr_pin_pages_remote(owning_process, owning_mm,
 				user_virt, gup_num_pages,
-				flags, local_page_list, NULL, NULL);
+				flags, local_page_list, NULL, NULL,
+				&umem_odp->umem.vaddr_pin);
 		up_read(&owning_mm->mmap_sem);
 
 		if (npages < 0) {
@@ -657,7 +658,8 @@ int ib_umem_odp_map_dma_pages(struct ib_umem_odp *umem_odp, u64 user_virt,
 				ret = -EFAULT;
 				break;
 			}
-			put_user_page(local_page_list[j]);
+			vaddr_unpin_pages(&local_page_list[j], 1,
+					  &umem_odp->umem.vaddr_pin);
 			continue;
 		}
 
@@ -684,8 +686,9 @@ int ib_umem_odp_map_dma_pages(struct ib_umem_odp *umem_odp, u64 user_virt,
 			 * ib_umem_odp_map_dma_single_page().
 			 */
 			if (npages - (j + 1) > 0)
-				put_user_pages(&local_page_list[j+1],
-					       npages - (j + 1));
+				vaddr_unpin_pages(&local_page_list[j+1],
+						  npages - (j + 1),
+						  &umem_odp->umem.vaddr_pin);
 			break;
 		}
 	}

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 61b616cd9243..2bd76ad8787e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1606,6 +1606,13 @@ int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc,
 long vaddr_pin_pages(unsigned long addr, unsigned long nr_pages,
 		     unsigned int gup_flags, struct page **pages,
 		     struct vaddr_pin *vaddr_pin);
+long vaddr_pin_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
+			    unsigned long start, unsigned long nr_pages,
+			    unsigned int gup_flags, struct page **pages,
+			    struct vm_area_struct **vmas, int *locked,
+			    struct vaddr_pin *vaddr_pin);
+void vaddr_unpin_pages(struct page **pages, unsigned long nr_pages,
+		       struct vaddr_pin *vaddr_pin);
 void vaddr_unpin_pages_dirty_lock(struct page **pages, unsigned long nr_pages,
 				  struct vaddr_pin *vaddr_pin, bool make_dirty);
 bool mapping_inode_has_layout(struct vaddr_pin *vaddr_pin, struct page *page);

diff --git a/mm/gup.c b/mm/gup.c
index 85f09958fbdc..bb95adfaf9b6 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2518,6 +2518,38 @@ long vaddr_pin_pages(unsigned long addr, unsigned long nr_pages,
 }
 EXPORT_SYMBOL(vaddr_pin_pages);
 
+/**
+ * vaddr_pin_pages_remote - pin pages by virtual address and return the pages
+ * to the user.
+ *
+ * @tsk:	the task_struct to use for page fault accounting, or
+ *		NULL if faults are not to be recorded.
+ * @mm:		mm_struct of target mm
+ * @start:	start address
+ * @nr_pages:	number of pages to pin
+ * @gup_flags:	flags to use for the pin
+ * @pages:	array of pages returned
+ * @vaddr_pin:	initialized meta information this pin is to be associated
+ *		with.
+ *
+ * This is the vaddr_pin_pages() variant that corresponds to
+ * get_user_pages_remote(), but with FOLL_PIN semantics: the implementation
+ * sets FOLL_PIN. That, in turn, means that the pages must ultimately be
+ * released by put_user_page().
+ */
+long vaddr_pin_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
+			    unsigned long start, unsigned long nr_pages,
+			    unsigned int gup_flags, struct page **pages,
+			    struct vm_area_struct **vmas, int *locked,
+			    struct vaddr_pin *vaddr_pin)
+{
+	gup_flags |= FOLL_TOUCH | FOLL_REMOTE | FOLL_PIN;
+
+	return __get_user_pages_locked(tsk, mm, start, nr_pages, pages, vmas,
+				       locked, gup_flags, vaddr_pin);
+}
+EXPORT_SYMBOL(vaddr_pin_pages_remote);
+
 /**
  * vaddr_unpin_pages_dirty_lock - counterpart to vaddr_pin_pages
  *
@@ -2536,3 +2568,21 @@ void vaddr_unpin_pages_dirty_lock(struct page **pages, unsigned long nr_pages,
 	__put_user_pages_dirty_lock(vaddr_pin, pages, nr_pages, make_dirty);
 }
 EXPORT_SYMBOL(vaddr_unpin_pages_dirty_lock);
+
+/**
+ * vaddr_unpin_pages - simple, non-dirtying counterpart to vaddr_pin_pages
+ *
+ * @pages:	array of pages returned
+ * @nr_pages:	number of pages in pages
+ * @vaddr_pin:	same information passed to vaddr_pin_pages
+ *
+ * Like vaddr_unpin_pages_dirty_lock(), but for non-dirty pages. Useful for
+ * putting back pages in an error case: they were never made dirty.
+ */
+void vaddr_unpin_pages(struct page **pages, unsigned long nr_pages,
+		       struct vaddr_pin *vaddr_pin)
+{
+	__put_user_pages_dirty_lock(vaddr_pin, pages, nr_pages, false);
+}
+EXPORT_SYMBOL(vaddr_unpin_pages);