From patchwork Fri Dec 4 19:39:52 2020
X-Patchwork-Submitter: Jason Gunthorpe
X-Patchwork-Id: 11952211
From: Jason Gunthorpe
To: Andrew Morton, linux-mm
CC: Dan Williams, Ira Weiny, John Hubbard, Pavel Tatashin
Subject: [PATCH] mm/gup: remove the vma allocation from gup_longterm_locked()
Date: Fri, 4 Dec 2020 15:39:52 -0400
Message-ID: <0-v1-5551df3ed12e+b8-gup_dax_speedup_jgg@nvidia.com>

Long ago there wasn't a FOLL_LONGTERM flag, so this DAX check was done
by post-processing the VMA list. These days it is trivial to check each
VMA for DAX inside __get_user_pages() before processing it, and to
return failure if a DAX VMA is encountered with FOLL_LONGTERM.

Removing the allocation of the VMA list is a significant speedup for
many call sites.

Add an IS_ENABLED() check to vma_is_fsdax() so that code generation is
unchanged when DAX is compiled out.

Remove the dummy version of __gup_longterm_locked(), as !CONFIG_CMA
already makes memalloc_nocma_save(), check_and_migrate_cma_pages(), and
memalloc_nocma_restore() into NOPs.

Cc: Dan Williams
Cc: Ira Weiny
Cc: John Hubbard
Cc: Pavel Tatashin
Signed-off-by: Jason Gunthorpe
Reviewed-by: Ira Weiny
---
 include/linux/fs.h |  2 +-
 mm/gup.c           | 83 +++++++++-------------------------------------
 2 files changed, 16 insertions(+), 69 deletions(-)
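A note on the IS_ENABLED() trick, since it carries the "code generation
is unchanged" claim above: CONFIG_* symbols are compile-time constants,
so when CONFIG_FS_DAX is off the compiler folds vma_is_fsdax() down to
"return false" and the new check in check_vma_flags() is eliminated as
dead code. A minimal sketch of the pattern (illustrative only, with the
real helper's remaining tests elided; the function name here is made
up):

#include <linux/kconfig.h>	/* IS_ENABLED() */
#include <linux/mm.h>		/* struct vm_area_struct, vma_is_dax() */

/*
 * Simplified sketch of the pattern used in vma_is_fsdax(): with
 * CONFIG_FS_DAX=n, IS_ENABLED(CONFIG_FS_DAX) is the integer constant
 * 0, the short-circuit makes the whole body constant-false, and any
 * caller's "if (... && fsdax_check_sketch(vma))" branch disappears.
 */
static inline bool fsdax_check_sketch(struct vm_area_struct *vma)
{
	if (!IS_ENABLED(CONFIG_FS_DAX) || !vma->vm_file)
		return false;
	/* ... the rest of the vma_is_fsdax() tests run only with DAX on */
	return vma_is_dax(vma);
}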
This was tested using the fake nvdimm stuff, and RDMA's FOLL_LONGTERM
pin continues to correctly reject DAX vmas, returning EOPNOTSUPP.

Pavel, this accomplishes the same #ifdef clean-up as your patch series
for CMA by just deleting all the code that justified the ifdefs.

FWIW, this is probably going to be the start of a longer trickle of
patches to make pin_user_pages()/unpin_user_pages() faster. This flow
is offensively slow right now.

Ira, I investigated streamlining the callers from here, and you are
right. The distinction that FOLL_LONGTERM means locked == NULL is no
longer required now that the vma list isn't used, and with some
adjusting of the CMA path we can purge a lot of other complexity too.
I have some drafts, but I want to tackle this separately.
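To make the EOPNOTSUPP behaviour from the testing note concrete, here
is a hypothetical caller-side sketch (test-style code, not part of this
patch; the function name is invented). A long-term pin against an fsdax
mapping now fails inside __get_user_pages() via check_vma_flags()
instead of being caught by post-processing a kcalloc()'d vma list:

#include <linux/mm.h>	/* pin_user_pages_fast(), unpin_user_page() */

/*
 * Hypothetical illustration of a FOLL_LONGTERM pin site.
 * pin_user_pages_fast() returns the number of pages pinned or a
 * negative errno; with this patch a DAX VMA produces -EOPNOTSUPP.
 */
static int longterm_pin_one_page(unsigned long addr)
{
	struct page *page;
	int rc;

	rc = pin_user_pages_fast(addr, 1, FOLL_WRITE | FOLL_LONGTERM,
				 &page);
	if (rc == -EOPNOTSUPP)
		return rc;	/* fsdax mapping: long-term pin refused */
	if (rc != 1)
		return rc < 0 ? rc : -EFAULT;

	/* ... the page is safe to hold long term here ... */
	unpin_user_page(page);
	return 0;
}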
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8667d0cdc71e76..1fcc2b00582b22 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3230,7 +3230,7 @@ static inline bool vma_is_fsdax(struct vm_area_struct *vma)
 {
 	struct inode *inode;
 
-	if (!vma->vm_file)
+	if (!IS_ENABLED(CONFIG_FS_DAX) || !vma->vm_file)
 		return false;
 	if (!vma_is_dax(vma))
 		return false;
diff --git a/mm/gup.c b/mm/gup.c
index 9c6a2f5001c5c2..311a44ff41ff42 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -923,6 +923,9 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
 	if (gup_flags & FOLL_ANON && !vma_is_anonymous(vma))
 		return -EFAULT;
 
+	if ((gup_flags & FOLL_LONGTERM) && vma_is_fsdax(vma))
+		return -EOPNOTSUPP;
+
 	if (write) {
 		if (!(vm_flags & VM_WRITE)) {
 			if (!(gup_flags & FOLL_FORCE))
@@ -1060,10 +1063,14 @@ static long __get_user_pages(struct mm_struct *mm,
 				goto next_page;
 		}
 
-		if (!vma || check_vma_flags(vma, gup_flags)) {
+		if (!vma) {
 			ret = -EFAULT;
 			goto out;
 		}
+		ret = check_vma_flags(vma, gup_flags);
+		if (ret)
+			goto out;
+
 		if (is_vm_hugetlb_page(vma)) {
 			i = follow_hugetlb_page(mm, vma, pages, vmas,
 					&start, &nr_pages, i,
@@ -1567,26 +1574,6 @@ struct page *get_dump_page(unsigned long addr)
 }
 #endif /* CONFIG_ELF_CORE */
 
-#if defined(CONFIG_FS_DAX) || defined (CONFIG_CMA)
-static bool check_dax_vmas(struct vm_area_struct **vmas, long nr_pages)
-{
-	long i;
-	struct vm_area_struct *vma_prev = NULL;
-
-	for (i = 0; i < nr_pages; i++) {
-		struct vm_area_struct *vma = vmas[i];
-
-		if (vma == vma_prev)
-			continue;
-
-		vma_prev = vma;
-
-		if (vma_is_fsdax(vma))
-			return true;
-	}
-	return false;
-}
-
 #ifdef CONFIG_CMA
 static long check_and_migrate_cma_pages(struct mm_struct *mm,
 					unsigned long start,
@@ -1705,63 +1692,23 @@ static long __gup_longterm_locked(struct mm_struct *mm,
 				  struct vm_area_struct **vmas,
 				  unsigned int gup_flags)
 {
-	struct vm_area_struct **vmas_tmp = vmas;
 	unsigned long flags = 0;
-	long rc, i;
+	long rc;
 
-	if (gup_flags & FOLL_LONGTERM) {
-		if (!pages)
-			return -EINVAL;
-
-		if (!vmas_tmp) {
-			vmas_tmp = kcalloc(nr_pages,
-					   sizeof(struct vm_area_struct *),
-					   GFP_KERNEL);
-			if (!vmas_tmp)
-				return -ENOMEM;
-		}
+	if (gup_flags & FOLL_LONGTERM)
 		flags = memalloc_nocma_save();
-	}
 
-	rc = __get_user_pages_locked(mm, start, nr_pages, pages,
-				     vmas_tmp, NULL, gup_flags);
+	rc = __get_user_pages_locked(mm, start, nr_pages, pages, vmas, NULL,
+				     gup_flags);
 
 	if (gup_flags & FOLL_LONGTERM) {
-		if (rc < 0)
-			goto out;
-
-		if (check_dax_vmas(vmas_tmp, rc)) {
-			if (gup_flags & FOLL_PIN)
-				unpin_user_pages(pages, rc);
-			else
-				for (i = 0; i < rc; i++)
-					put_page(pages[i]);
-			rc = -EOPNOTSUPP;
-			goto out;
-		}
-
-		rc = check_and_migrate_cma_pages(mm, start, rc, pages,
-						 vmas_tmp, gup_flags);
-out:
+		if (rc > 0)
+			rc = check_and_migrate_cma_pages(mm, start, rc, pages,
+							 vmas, gup_flags);
 		memalloc_nocma_restore(flags);
 	}
-
-	if (vmas_tmp != vmas)
-		kfree(vmas_tmp);
 	return rc;
 }
-#else /* !CONFIG_FS_DAX && !CONFIG_CMA */
-static __always_inline long __gup_longterm_locked(struct mm_struct *mm,
-						  unsigned long start,
-						  unsigned long nr_pages,
-						  struct page **pages,
-						  struct vm_area_struct **vmas,
-						  unsigned int flags)
-{
-	return __get_user_pages_locked(mm, start, nr_pages, pages, vmas,
-				       NULL, flags);
-}
-#endif /* CONFIG_FS_DAX || CONFIG_CMA */
 
 static bool is_valid_gup_flags(unsigned int gup_flags)
 {
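For reference on why the #else stub of __gup_longterm_locked() can
simply be deleted: with !CONFIG_CMA the helpers the FOLL_LONGTERM
branch calls are already no-op inlines, so the unified function
compiles down to a plain __get_user_pages_locked() call. A sketch of
those stubs (an assumption based on the include/linux/sched/mm.h and
mm/gup.c fallbacks of this era, not part of this patch):

/*
 * !CONFIG_CMA stubs (sketch): the memalloc_nocma_* helpers compile to
 * nothing, and the migration pass reports all pages as fine, so no
 * #ifdef'd dummy __gup_longterm_locked() is needed.
 */
static inline unsigned int memalloc_nocma_save(void)
{
	return 0;
}

static inline void memalloc_nocma_restore(unsigned int flags)
{
}

static long check_and_migrate_cma_pages(struct mm_struct *mm,
					unsigned long start,
					unsigned long nr_pages,
					struct page **pages,
					struct vm_area_struct **vmas,
					unsigned int gup_flags)
{
	return nr_pages;
}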