From patchwork Sat Mar 23 03:33:10 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13600524
From: peterx@redhat.com
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Andrew Morton, SeongJae Park, peterx@redhat.com
Subject: [PATCH 2/2] fixup! mm/gup: handle hugepd for follow_page()
Date: Fri, 22 Mar 2024 23:33:10 -0400
Message-ID: <20240323033310.971447-3-peterx@redhat.com>
X-Mailer: git-send-email 2.44.0
In-Reply-To: <20240323033310.971447-1-peterx@redhat.com>
References: <20240323033310.971447-1-peterx@redhat.com>

From: Peter Xu

The major issue is that slow gup now reuses some fast-gup functions to
parse hugepd entries, so hugepd and the relevant helpers need to move out
of the HAVE_FAST_GUP section while still staying under CONFIG_MMU.
Meanwhile, the helper record_subpages() can be used by either the hugepd
or the fast-gup section; to avoid "unused function" warnings it
unfortunately has to be wrapped in a macro guard covering both configs.

Signed-off-by: Peter Xu
---
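Note for context (not from the patch itself): the "unused function" fix is
the preprocessor guard visible in the diff below, which compiles
record_subpages() whenever at least one of its two consumers is
configured. A minimal, standalone sketch of the same idiom, with the two
defines hard-coded as stand-ins for the real Kconfig symbols:

/*
 * Illustrative only: in the kernel these CONFIG_* symbols come from
 * Kconfig, not from local #defines.
 */
#include <stdio.h>

#define CONFIG_ARCH_HAS_HUGEPD 1
#define CONFIG_HAVE_FAST_GUP 1

#if defined(CONFIG_ARCH_HAS_HUGEPD) || defined(CONFIG_HAVE_FAST_GUP)
/*
 * Compiled whenever at least one consumer exists, so the compiler's
 * "defined but not used" warning never fires for the shared helper.
 */
static int shared_helper(int x)
{
        return x * 2;
}
#endif /* CONFIG_ARCH_HAS_HUGEPD || CONFIG_HAVE_FAST_GUP */

int main(void)
{
#ifdef CONFIG_ARCH_HAS_HUGEPD
        printf("hugepd consumer: %d\n", shared_helper(1));
#endif
#ifdef CONFIG_HAVE_FAST_GUP
        printf("fast-gup consumer: %d\n", shared_helper(2));
#endif
        return 0;
}

Dropping either define (and with it that consumer's call site) still
builds warning-free, which is the point of guarding on the OR of both.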
 mm/gup.c | 287 +++++++++++++++++++++++++++----------------------------
 1 file changed, 143 insertions(+), 144 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 4cd349390477..fe9df268bef2 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -30,11 +30,6 @@ struct follow_page_context {
 	unsigned int page_mask;
 };
 
-static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
-				  unsigned long addr, unsigned int pdshift,
-				  unsigned int flags,
-				  struct follow_page_context *ctx);
-
 static inline void sanity_check_pinned_pages(struct page **pages,
 					     unsigned long npages)
 {
@@ -505,6 +500,149 @@ static inline void mm_set_has_pinned_flag(unsigned long *mm_flags)
 }
 
 #ifdef CONFIG_MMU
+
+#if defined(CONFIG_ARCH_HAS_HUGEPD) || defined(CONFIG_HAVE_FAST_GUP)
+static int record_subpages(struct page *page, unsigned long sz,
+			   unsigned long addr, unsigned long end,
+			   struct page **pages)
+{
+	struct page *start_page;
+	int nr;
+
+	start_page = nth_page(page, (addr & (sz - 1)) >> PAGE_SHIFT);
+	for (nr = 0; addr != end; nr++, addr += PAGE_SIZE)
+		pages[nr] = nth_page(start_page, nr);
+
+	return nr;
+}
+#endif /* CONFIG_ARCH_HAS_HUGEPD || CONFIG_HAVE_FAST_GUP */
+
+#ifdef CONFIG_ARCH_HAS_HUGEPD
+static unsigned long hugepte_addr_end(unsigned long addr, unsigned long end,
+				      unsigned long sz)
+{
+	unsigned long __boundary = (addr + sz) & ~(sz-1);
+	return (__boundary - 1 < end - 1) ? __boundary : end;
+}
+
+static int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
+		       unsigned long end, unsigned int flags,
+		       struct page **pages, int *nr)
+{
+	unsigned long pte_end;
+	struct page *page;
+	struct folio *folio;
+	pte_t pte;
+	int refs;
+
+	pte_end = (addr + sz) & ~(sz-1);
+	if (pte_end < end)
+		end = pte_end;
+
+	pte = huge_ptep_get(ptep);
+
+	if (!pte_access_permitted(pte, flags & FOLL_WRITE))
+		return 0;
+
+	/* hugepages are never "special" */
+	VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
+
+	page = pte_page(pte);
+	refs = record_subpages(page, sz, addr, end, pages + *nr);
+
+	folio = try_grab_folio(page, refs, flags);
+	if (!folio)
+		return 0;
+
+	if (unlikely(pte_val(pte) != pte_val(ptep_get(ptep)))) {
+		gup_put_folio(folio, refs, flags);
+		return 0;
+	}
+
+	if (!pte_write(pte) && gup_must_unshare(NULL, flags, &folio->page)) {
+		gup_put_folio(folio, refs, flags);
+		return 0;
+	}
+
+	*nr += refs;
+	folio_set_referenced(folio);
+	return 1;
+}
+
+/*
+ * NOTE: currently GUP for a hugepd is only possible on hugetlbfs file
+ * systems on Power, which does not have issue with folio writeback against
+ * GUP updates. When hugepd will be extended to support non-hugetlbfs or
+ * even anonymous memory, we need to do extra check as what we do with most
+ * of the other folios. See writable_file_mapping_allowed() and
+ * folio_fast_pin_allowed() for more information.
+ */
+static int gup_huge_pd(hugepd_t hugepd, unsigned long addr,
+		unsigned int pdshift, unsigned long end, unsigned int flags,
+		struct page **pages, int *nr)
+{
+	pte_t *ptep;
+	unsigned long sz = 1UL << hugepd_shift(hugepd);
+	unsigned long next;
+
+	ptep = hugepte_offset(hugepd, addr, pdshift);
+	do {
+		next = hugepte_addr_end(addr, end, sz);
+		if (!gup_hugepte(ptep, sz, addr, end, flags, pages, nr))
+			return 0;
+	} while (ptep++, addr = next, addr != end);
+
+	return 1;
+}
+
+static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
+				  unsigned long addr, unsigned int pdshift,
+				  unsigned int flags,
+				  struct follow_page_context *ctx)
+{
+	struct page *page;
+	struct hstate *h;
+	spinlock_t *ptl;
+	int nr = 0, ret;
+	pte_t *ptep;
+
+	/* Only hugetlb supports hugepd */
+	if (WARN_ON_ONCE(!is_vm_hugetlb_page(vma)))
+		return ERR_PTR(-EFAULT);
+
+	h = hstate_vma(vma);
+	ptep = hugepte_offset(hugepd, addr, pdshift);
+	ptl = huge_pte_lock(h, vma->vm_mm, ptep);
+	ret = gup_huge_pd(hugepd, addr, pdshift, addr + PAGE_SIZE,
+			  flags, &page, &nr);
+	spin_unlock(ptl);
+
+	if (ret) {
+		WARN_ON_ONCE(nr != 1);
+		ctx->page_mask = (1U << huge_page_order(h)) - 1;
+		return page;
+	}
+
+	return NULL;
+}
+#else /* CONFIG_ARCH_HAS_HUGEPD */
+static inline int gup_huge_pd(hugepd_t hugepd, unsigned long addr,
+		unsigned int pdshift, unsigned long end, unsigned int flags,
+		struct page **pages, int *nr)
+{
+	return 0;
+}
+
+static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
+				  unsigned long addr, unsigned int pdshift,
+				  unsigned int flags,
+				  struct follow_page_context *ctx)
+{
+	return NULL;
+}
+#endif /* CONFIG_ARCH_HAS_HUGEPD */
+
+
 static struct page *no_page_table(struct vm_area_struct *vma,
 				  unsigned int flags, unsigned long address)
 {
@@ -2962,145 +3100,6 @@ static int __gup_device_huge_pud(pud_t pud, pud_t *pudp, unsigned long addr,
 }
 #endif
 
-static int record_subpages(struct page *page, unsigned long sz,
-			   unsigned long addr, unsigned long end,
-			   struct page **pages)
-{
-	struct page *start_page;
-	int nr;
-
-	start_page = nth_page(page, (addr & (sz - 1)) >> PAGE_SHIFT);
-	for (nr = 0; addr != end; nr++, addr += PAGE_SIZE)
-		pages[nr] = nth_page(start_page, nr);
-
-	return nr;
-}
-
-#ifdef CONFIG_ARCH_HAS_HUGEPD
-static unsigned long hugepte_addr_end(unsigned long addr, unsigned long end,
-				      unsigned long sz)
-{
-	unsigned long __boundary = (addr + sz) & ~(sz-1);
-	return (__boundary - 1 < end - 1) ? __boundary : end;
-}
-
-static int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
-		       unsigned long end, unsigned int flags,
-		       struct page **pages, int *nr)
-{
-	unsigned long pte_end;
-	struct page *page;
-	struct folio *folio;
-	pte_t pte;
-	int refs;
-
-	pte_end = (addr + sz) & ~(sz-1);
-	if (pte_end < end)
-		end = pte_end;
-
-	pte = huge_ptep_get(ptep);
-
-	if (!pte_access_permitted(pte, flags & FOLL_WRITE))
-		return 0;
-
-	/* hugepages are never "special" */
-	VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
-
-	page = pte_page(pte);
-	refs = record_subpages(page, sz, addr, end, pages + *nr);
-
-	folio = try_grab_folio(page, refs, flags);
-	if (!folio)
-		return 0;
-
-	if (unlikely(pte_val(pte) != pte_val(ptep_get(ptep)))) {
-		gup_put_folio(folio, refs, flags);
-		return 0;
-	}
-
-	if (!pte_write(pte) && gup_must_unshare(NULL, flags, &folio->page)) {
-		gup_put_folio(folio, refs, flags);
-		return 0;
-	}
-
-	*nr += refs;
-	folio_set_referenced(folio);
-	return 1;
-}
-
-/*
- * NOTE: currently GUP for a hugepd is only possible on hugetlbfs file
- * systems on Power, which does not have issue with folio writeback against
- * GUP updates. When hugepd will be extended to support non-hugetlbfs or
- * even anonymous memory, we need to do extra check as what we do with most
- * of the other folios. See writable_file_mapping_allowed() and
- * folio_fast_pin_allowed() for more information.
- */
-static int gup_huge_pd(hugepd_t hugepd, unsigned long addr,
-		unsigned int pdshift, unsigned long end, unsigned int flags,
-		struct page **pages, int *nr)
-{
-	pte_t *ptep;
-	unsigned long sz = 1UL << hugepd_shift(hugepd);
-	unsigned long next;
-
-	ptep = hugepte_offset(hugepd, addr, pdshift);
-	do {
-		next = hugepte_addr_end(addr, end, sz);
-		if (!gup_hugepte(ptep, sz, addr, end, flags, pages, nr))
-			return 0;
-	} while (ptep++, addr = next, addr != end);
-
-	return 1;
-}
-
-static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
-				  unsigned long addr, unsigned int pdshift,
-				  unsigned int flags,
-				  struct follow_page_context *ctx)
-{
-	struct page *page;
-	struct hstate *h;
-	spinlock_t *ptl;
-	int nr = 0, ret;
-	pte_t *ptep;
-
-	/* Only hugetlb supports hugepd */
-	if (WARN_ON_ONCE(!is_vm_hugetlb_page(vma)))
-		return ERR_PTR(-EFAULT);
-
-	h = hstate_vma(vma);
-	ptep = hugepte_offset(hugepd, addr, pdshift);
-	ptl = huge_pte_lock(h, vma->vm_mm, ptep);
-	ret = gup_huge_pd(hugepd, addr, pdshift, addr + PAGE_SIZE,
-			  flags, &page, &nr);
-	spin_unlock(ptl);
-
-	if (ret) {
-		WARN_ON_ONCE(nr != 1);
-		ctx->page_mask = (1U << huge_page_order(h)) - 1;
-		return page;
-	}
-
-	return NULL;
-}
-#else
-static inline int gup_huge_pd(hugepd_t hugepd, unsigned long addr,
-		unsigned int pdshift, unsigned long end, unsigned int flags,
-		struct page **pages, int *nr)
-{
-	return 0;
-}
-
-static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
-				  unsigned long addr, unsigned int pdshift,
-				  unsigned int flags,
-				  struct follow_page_context *ctx)
-{
-	return NULL;
-}
-#endif /* CONFIG_ARCH_HAS_HUGEPD */
-
 static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
 			unsigned long end, unsigned int flags,
 			struct page **pages, int *nr)
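
Appendix (illustrative only): the bookkeeping done by the shared
record_subpages() helper can be modeled in plain userspace C. This is a
toy sketch, not kernel code: "struct page" is reduced to a page frame
number, nth_page() to integer addition, and all names and values below
are made-up stand-ins.

#include <stdio.h>

#define PAGE_SHIFT      12
#define PAGE_SIZE       (1UL << PAGE_SHIFT)

/* Toy stand-in: a "page" is just a page frame number. */
typedef unsigned long page_t;

static page_t nth_page(page_t page, unsigned long n)
{
        return page + n;
}

static int record_subpages(page_t page, unsigned long sz,
                           unsigned long addr, unsigned long end,
                           page_t *pages)
{
        page_t start_page;
        int nr;

        /* First subpage covering addr within the huge page of size sz. */
        start_page = nth_page(page, (addr & (sz - 1)) >> PAGE_SHIFT);
        for (nr = 0; addr != end; nr++, addr += PAGE_SIZE)
                pages[nr] = nth_page(start_page, nr);

        return nr;
}

int main(void)
{
        page_t pages[8];
        /* 16MB huge page at PFN 1000; record 4 subpages starting 8KB in. */
        unsigned long sz = 16UL << 20;
        unsigned long addr = 0x2000, end = addr + 4 * PAGE_SIZE;
        int nr = record_subpages(1000, sz, addr, end, pages);

        for (int i = 0; i < nr; i++)
                printf("pages[%d] = PFN %lu\n", i, pages[i]);
        return 0;
}

With these toy inputs the program prints PFNs 1002 through 1005: the four
subpages that a pin of that sub-range must record into the pages array.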