From patchwork Thu Mar 21 22:08:02 2024
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13599415
From: peterx@redhat.com
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org, Michael Ellerman, Christophe Leroy,
    Matthew Wilcox, Rik van Riel, Lorenzo Stoakes, Axel Rasmussen,
    peterx@redhat.com, Yang Shi, John Hubbard,
    linux-arm-kernel@lists.infradead.org, "Kirill A. Shutemov",
    Andrew Jones, Vlastimil Babka, Mike Rapoport, Andrew Morton,
    Muchun Song, Christoph Hellwig, linux-riscv@lists.infradead.org,
    James Houghton, David Hildenbrand, Jason Gunthorpe, Andrea Arcangeli,
    "Aneesh Kumar K.V", Mike Kravetz
Subject: [PATCH v3 12/12] mm/gup: Handle hugetlb in the generic follow_page_mask code
Date: Thu, 21 Mar 2024 18:08:02 -0400
Message-ID: <20240321220802.679544-13-peterx@redhat.com>
X-Mailer: git-send-email 2.44.0
In-Reply-To: <20240321220802.679544-1-peterx@redhat.com>
References: <20240321220802.679544-1-peterx@redhat.com>

From: Peter Xu

Now that follow_page() is ready to handle hugetlb pages in whatever form,
and on all architectures, switch to the generic code path.

Time to retire hugetlb_follow_page_mask(), following the previous
retirement of follow_hugetlb_page() in 4849807114b8.

There may be a slight difference in how the loops run when processing slow
GUP over a large hugetlb range on archs that support cont_pte/cont_pmd:
with the patch applied, each loop of __get_user_pages() will resolve one
pgtable entry, rather than relying on the size of the hugetlb hstate,
which may cover multiple entries in a single loop.

A quick performance test on an aarch64 VM on an M1 chip shows a 15%
degradation over a tight loop of slow gup after the path switch.  That
shouldn't be a problem, because slow gup is not a hot path for GUP in
general: when a page is present, fast gup will already succeed; when the
page is indeed missing and requires a follow-up page fault, the slow-gup
degradation will probably be buried in the fault paths anyway.  It also
explains why slow gup for THP used to be very slow before 57edfcfd3419
("mm/gup: accelerate thp gup even for "pages != NULL"") landed; the latter
is not part of this performance analysis but a side benefit.
If the performance turns out to be a concern, we can consider handling
CONT_PTE in follow_page().  Before that is justified as necessary, keep
everything clean and simple.

Signed-off-by: Peter Xu
Reviewed-by: Jason Gunthorpe
---
 include/linux/hugetlb.h |  7 ----
 mm/gup.c                | 15 +++------
 mm/hugetlb.c            | 71 -----------------------------------------
 3 files changed, 5 insertions(+), 88 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 52d9efcf1edf..85e1c9931ae5 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -328,13 +328,6 @@ static inline void hugetlb_zap_end(
 {
 }
 
-static inline struct page *hugetlb_follow_page_mask(
-	struct vm_area_struct *vma, unsigned long address, unsigned int flags,
-	unsigned int *page_mask)
-{
-	BUILD_BUG(); /* should never be compiled in if !CONFIG_HUGETLB_PAGE*/
-}
-
 static inline int copy_hugetlb_page_range(struct mm_struct *dst,
 					  struct mm_struct *src,
 					  struct vm_area_struct *dst_vma,
diff --git a/mm/gup.c b/mm/gup.c
index 43a2e0a203cd..2eb5911ba849 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -997,18 +997,11 @@ static struct page *follow_page_mask(struct vm_area_struct *vma,
 {
 	pgd_t *pgd, pgdval;
 	struct mm_struct *mm = vma->vm_mm;
+	struct page *page;
 
-	ctx->page_mask = 0;
-
-	/*
-	 * Call hugetlb_follow_page_mask for hugetlb vmas as it will use
-	 * special hugetlb page table walking code.  This eliminates the
-	 * need to check for hugetlb entries in the general walking code.
-	 */
-	if (is_vm_hugetlb_page(vma))
-		return hugetlb_follow_page_mask(vma, address, flags,
-						&ctx->page_mask);
+	vma_pgtable_walk_begin(vma);
 
+	ctx->page_mask = 0;
 	pgd = pgd_offset(mm, address);
 	pgdval = *pgd;
@@ -1020,6 +1013,8 @@ static struct page *follow_page_mask(struct vm_area_struct *vma,
 	else
 		page = follow_p4d_mask(vma, address, pgd, flags, ctx);
 
+	vma_pgtable_walk_end(vma);
+
 	return page;
 }
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index abec04575c89..2e320757501b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6883,77 +6883,6 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
 }
 #endif /* CONFIG_USERFAULTFD */
 
-struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
-				      unsigned long address, unsigned int flags,
-				      unsigned int *page_mask)
-{
-	struct hstate *h = hstate_vma(vma);
-	struct mm_struct *mm = vma->vm_mm;
-	unsigned long haddr = address & huge_page_mask(h);
-	struct page *page = NULL;
-	spinlock_t *ptl;
-	pte_t *pte, entry;
-	int ret;
-
-	hugetlb_vma_lock_read(vma);
-	pte = hugetlb_walk(vma, haddr, huge_page_size(h));
-	if (!pte)
-		goto out_unlock;
-
-	ptl = huge_pte_lock(h, mm, pte);
-	entry = huge_ptep_get(pte);
-	if (pte_present(entry)) {
-		page = pte_page(entry);
-
-		if (!huge_pte_write(entry)) {
-			if (flags & FOLL_WRITE) {
-				page = NULL;
-				goto out;
-			}
-
-			if (gup_must_unshare(vma, flags, page)) {
-				/* Tell the caller to do unsharing */
-				page = ERR_PTR(-EMLINK);
-				goto out;
-			}
-		}
-
-		page = nth_page(page, ((address & ~huge_page_mask(h)) >> PAGE_SHIFT));
-
-		/*
-		 * Note that page may be a sub-page, and with vmemmap
-		 * optimizations the page struct may be read only.
-		 * try_grab_page() will increase the ref count on the
-		 * head page, so this will be OK.
-		 *
-		 * try_grab_page() should always be able to get the page here,
-		 * because we hold the ptl lock and have verified pte_present().
-		 */
-		ret = try_grab_page(page, flags);
-
-		if (WARN_ON_ONCE(ret)) {
-			page = ERR_PTR(ret);
-			goto out;
-		}
-
-		*page_mask = (1U << huge_page_order(h)) - 1;
-	}
-out:
-	spin_unlock(ptl);
-out_unlock:
-	hugetlb_vma_unlock_read(vma);
-
-	/*
-	 * Fixup retval for dump requests: if pagecache doesn't exist,
-	 * don't try to allocate a new page but just skip it.
-	 */
-	if (!page && (flags & FOLL_DUMP) &&
-	    !hugetlbfs_pagecache_present(h, vma, address))
-		page = ERR_PTR(-EFAULT);
-
-	return page;
-}
-
 long hugetlb_change_protection(struct vm_area_struct *vma,
 		unsigned long address, unsigned long end,
 		pgprot_t newprot, unsigned long cp_flags)
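
A side note on the *page_mask = (1U << huge_page_order(h)) - 1 assignment
removed above: page_mask is what lets one follow_page_mask() lookup satisfy
many pages in the slow-GUP loop.  Below is a hedged userspace model of that
consumption, patterned after the page_increm computation in
__get_user_pages(); treat the exact in-kernel expression and the example
numbers as assumptions for illustration only.

	/*
	 * Userspace model of how the slow-GUP loop consumes a page_mask --
	 * patterned after __get_user_pages()'s page_increm computation; the
	 * exact in-kernel expression and these example numbers are assumptions.
	 */
	#include <stdio.h>

	#define PAGE_SHIFT	12

	static unsigned long pages_covered(unsigned long start, unsigned int page_mask,
					   unsigned long nr_pages_left)
	{
		/* Pages from 'start' up to the end of the naturally aligned block. */
		unsigned long incr = 1 + (~(start >> PAGE_SHIFT) & page_mask);

		return incr < nr_pages_left ? incr : nr_pages_left;
	}

	int main(void)
	{
		/* 2MiB-aligned start, mask describing a 2MiB mapping (512 pages). */
		printf("%lu\n", pages_covered(0x200000UL, 511, 1024));  /* 512 */
		/* Same mapping, but starting 16 base pages into it. */
		printf("%lu\n", pages_covered(0x210000UL, 511, 1024));  /* 496 */
		/* page_mask == 0: one page per iteration, as for a single PTE entry. */
		printf("%lu\n", pages_covered(0x200000UL, 0, 1024));    /* 1 */
		return 0;
	}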