From patchwork Wed Jun 28 21:53:04 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13296390
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: David Hildenbrand, "Kirill A . Shutemov", Andrew Morton,
 Andrea Arcangeli, Mike Rapoport, John Hubbard, Matthew Wilcox,
 Mike Kravetz, Vlastimil Babka, Yang Shi, James Houghton,
 Jason Gunthorpe, Lorenzo Stoakes, Hugh Dickins, peterx@redhat.com
Subject: [PATCH v4 2/8] mm/hugetlb: Prepare hugetlb_follow_page_mask() for FOLL_PIN
Date: Wed, 28 Jun 2023 17:53:04 -0400
Message-ID: <20230628215310.73782-3-peterx@redhat.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230628215310.73782-1-peterx@redhat.com>
References: <20230628215310.73782-1-peterx@redhat.com>

follow_page() doesn't use FOLL_PIN, and hugetlb does not seem to be a
target of FOLL_WRITE either.  Add the checks anyway.

Namely, reject the follow-page request either when CoW is needed because
the write bit is missing, or when unsharing is needed for a R/O pin on a
!AnonExclusive page.  That brings this function closer to
follow_hugetlb_page().

We didn't care about this before, and we still don't for now.  But we
will care once slow-gup is switched over to use
hugetlb_follow_page_mask().  We will then also care about returning
-EMLINK properly, as that is the gup-internal API for "we should
unshare".  It is not really needed for the follow_page() path, though.

While at it, switch the try_grab_page() call to use WARN_ON_ONCE(), to
make clear that it should never fail.  When an error does happen, capture
the errno instead of setting page to NULL.

Reviewed-by: Mike Kravetz
Reviewed-by: David Hildenbrand
Signed-off-by: Peter Xu
---
 mm/hugetlb.c | 33 ++++++++++++++++++++++-----------
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d04ba5782fdd..4410139cf890 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6462,13 +6462,7 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 	struct page *page = NULL;
 	spinlock_t *ptl;
 	pte_t *pte, entry;
-
-	/*
-	 * FOLL_PIN is not supported for follow_page(). Ordinary GUP goes via
-	 * follow_hugetlb_page().
-	 */
-	if (WARN_ON_ONCE(flags & FOLL_PIN))
-		return NULL;
+	int ret;
 
 	hugetlb_vma_lock_read(vma);
 	pte = hugetlb_walk(vma, haddr, huge_page_size(h));
@@ -6478,8 +6472,23 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 	ptl = huge_pte_lock(h, mm, pte);
 	entry = huge_ptep_get(pte);
 	if (pte_present(entry)) {
-		page = pte_page(entry) +
-				((address & ~huge_page_mask(h)) >> PAGE_SHIFT);
+		page = pte_page(entry);
+
+		if (!huge_pte_write(entry)) {
+			if (flags & FOLL_WRITE) {
+				page = NULL;
+				goto out;
+			}
+
+			if (gup_must_unshare(vma, flags, page)) {
+				/* Tell the caller to do unsharing */
+				page = ERR_PTR(-EMLINK);
+				goto out;
+			}
+		}
+
+		page += ((address & ~huge_page_mask(h)) >> PAGE_SHIFT);
+
 		/*
 		 * Note that page may be a sub-page, and with vmemmap
 		 * optimizations the page struct may be read only.
@@ -6489,8 +6498,10 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 		 * try_grab_page() should always be able to get the page here,
 		 * because we hold the ptl lock and have verified pte_present().
 		 */
-		if (try_grab_page(page, flags)) {
-			page = NULL;
+		ret = try_grab_page(page, flags);
+
+		if (WARN_ON_ONCE(ret)) {
+			page = ERR_PTR(ret);
 			goto out;
 		}
 	}
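
Note for readers (not part of the patch): the order of the new checks on a
present PTE can be illustrated with a small user-space model. This is only
a sketch under stated assumptions. The MODEL_* names, the flag values and
both functions are hypothetical and do not exist in the kernel, and
model_must_unshare() covers only the anonymous-page case that the commit
message describes (a R/O pin on a !AnonExclusive page); the real
gup_must_unshare() handles more cases.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's FOLL_* flags; values are arbitrary. */
#define MODEL_FOLL_WRITE 0x1u
#define MODEL_FOLL_PIN   0x2u

/* Outcomes that are not errno values (names are illustrative only). */
#define MODEL_GRAB    0	/* continue on to try_grab_page() */
#define MODEL_NO_PAGE 1	/* page = NULL, no page is returned */

/*
 * Simplified model of the unshare decision for anonymous pages only:
 * unsharing is needed for a read-only pin of a page that is not
 * exclusive to the process.
 */
static bool model_must_unshare(unsigned int flags, bool anon_exclusive)
{
	if (flags & MODEL_FOLL_WRITE)
		return false;
	if (!(flags & MODEL_FOLL_PIN))
		return false;
	return !anon_exclusive;
}

/* Mirrors the order of the checks this patch adds for a present PTE. */
static int model_follow(unsigned int flags, bool pte_writable,
			bool anon_exclusive)
{
	if (!pte_writable) {
		/* Write access without the write bit: CoW is needed. */
		if (flags & MODEL_FOLL_WRITE)
			return MODEL_NO_PAGE;

		/* R/O pin on a shared anonymous page: ask for unsharing. */
		if (model_must_unshare(flags, anon_exclusive))
			return -EMLINK;
	}

	return MODEL_GRAB;
}
```

In this model, model_follow(MODEL_FOLL_PIN, false, false) yields -EMLINK,
while the same call with only MODEL_FOLL_WRITE set yields MODEL_NO_PAGE. A
writable PTE always falls through to MODEL_GRAB, because both checks sit
under the !pte_writable branch.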