From patchwork Fri Apr 4 21:06:59 2025
From: SeongJae Park <sj@kernel.org>
To: Andrew Morton
Cc: SeongJae Park, "Liam R. Howlett", David Hildenbrand, Lorenzo Stoakes,
    Rik van Riel, Shakeel Butt, Vlastimil Babka, kernel-team@meta.com,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v2 3/4] mm/memory: split non-tlb flushing part from zap_page_range_single()
Date: Fri, 4 Apr 2025 14:06:59 -0700
Message-Id: <20250404210700.2156-4-sj@kernel.org>
In-Reply-To: <20250404210700.2156-1-sj@kernel.org>
References: <20250404210700.2156-1-sj@kernel.org>

Some zap_page_range_single() callers, such as [process_]madvise() with
MADV_DONTNEED[_LOCKED], cannot batch TLB flushes, because
zap_page_range_single() flushes the TLB on every invocation.  To support
such batched TLB flushing usage, split out the body of
zap_page_range_single(), except for the mmu_gather object initialization
and the flushing of gathered TLB entries.

Note that, to avoid hugetlb page allocation failures from concurrent
page faults, the TLB flush should be done before the hugetlb fault lock
is released.  For the hugetlb VMA case, therefore, do the flush and the
unlock inside the split-out function, in that order.  Refer to commit
2820b0f09be9 ("hugetlbfs: close race between MADV_DONTNEED and page
fault") for more details on the page allocation failures under
concurrent faults.
Signed-off-by: SeongJae Park <sj@kernel.org>
---
 mm/memory.c | 49 +++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 39 insertions(+), 10 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 8669b2c981a5..8c9bbb1a008c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1989,36 +1989,65 @@ void unmap_vmas(struct mmu_gather *tlb, struct ma_state *mas,
 	mmu_notifier_invalidate_range_end(&range);
 }
 
-/**
- * zap_page_range_single - remove user pages in a given range
+/*
+ * notify_unmap_single_vma - remove user pages in a given range
+ * @tlb: pointer to the caller's struct mmu_gather
  * @vma: vm_area_struct holding the applicable pages
- * @address: starting address of pages to zap
- * @size: number of bytes to zap
+ * @address: starting address of pages to remove
+ * @size: number of bytes to remove
  * @details: details of shared cache invalidation
  *
- * The range must fit into one VMA.
+ * @tlb shouldn't be NULL.  The range must fit into one VMA.  If @vma is for
+ * hugetlb, @tlb is flushed and re-initialized by this function.
  */
-void zap_page_range_single(struct vm_area_struct *vma, unsigned long address,
+static void notify_unmap_single_vma(struct mmu_gather *tlb,
+		struct vm_area_struct *vma, unsigned long address,
 		unsigned long size, struct zap_details *details)
 {
 	const unsigned long end = address + size;
 	struct mmu_notifier_range range;
-	struct mmu_gather tlb;
+
+	VM_WARN_ON_ONCE(!tlb);
 
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma->vm_mm,
 				address, end);
 	hugetlb_zap_begin(vma, &range.start, &range.end);
-	tlb_gather_mmu(&tlb, vma->vm_mm);
 	update_hiwater_rss(vma->vm_mm);
 	mmu_notifier_invalidate_range_start(&range);
 	/*
 	 * unmap 'address-end' not 'range.start-range.end' as range
 	 * could have been expanded for hugetlb pmd sharing.
 	 */
-	unmap_single_vma(&tlb, vma, address, end, details, false);
+	unmap_single_vma(tlb, vma, address, end, details, false);
 	mmu_notifier_invalidate_range_end(&range);
+	if (is_vm_hugetlb_page(vma)) {
+		/*
+		 * flush tlb and free resources before hugetlb_zap_end(), to
+		 * avoid concurrent page faults' allocation failure
+		 */
+		tlb_finish_mmu(tlb);
+		hugetlb_zap_end(vma, details);
+		tlb_gather_mmu(tlb, vma->vm_mm);
+	}
+}
+
+/**
+ * zap_page_range_single - remove user pages in a given range
+ * @vma: vm_area_struct holding the applicable pages
+ * @address: starting address of pages to zap
+ * @size: number of bytes to zap
+ * @details: details of shared cache invalidation
+ *
+ * The range must fit into one VMA.
+ */
+void zap_page_range_single(struct vm_area_struct *vma, unsigned long address,
+		unsigned long size, struct zap_details *details)
+{
+	struct mmu_gather tlb;
+
+	tlb_gather_mmu(&tlb, vma->vm_mm);
+	notify_unmap_single_vma(&tlb, vma, address, size, details);
 	tlb_finish_mmu(&tlb);
-	hugetlb_zap_end(vma, details);
 }
 
 /**
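
For illustration only (not part of the patch): a minimal sketch of how a
batched caller could use the split-out function, reusing one mmu_gather
across several zap operations and flushing the TLB once at the end.  The
function name zap_ranges_batched() and its array-based interface are
hypothetical; notify_unmap_single_vma() is static to mm/memory.c in this
patch, so a real caller would live in that file, as the next patch in
the series does.

/* Hypothetical example caller; would live in mm/memory.c. */
#include <linux/mm.h>
#include <asm/tlb.h>

/*
 * Zap @nr ranges, all belonging to the same mm, with a single TLB flush
 * at the end, instead of one flush per range as calling
 * zap_page_range_single() in a loop would do.
 */
static void zap_ranges_batched(struct vm_area_struct **vmas,
		unsigned long *addrs, unsigned long *sizes, int nr)
{
	struct mmu_gather tlb;
	int i;

	tlb_gather_mmu(&tlb, vmas[0]->vm_mm);
	for (i = 0; i < nr; i++) {
		/*
		 * For hugetlb VMAs, notify_unmap_single_vma() flushes and
		 * re-initializes the mmu_gather internally, before
		 * hugetlb_zap_end(), to avoid allocation failures of
		 * concurrent page faults.
		 */
		notify_unmap_single_vma(&tlb, vmas[i], addrs[i], sizes[i],
				NULL);
	}
	/* a single flush for all the gathered non-hugetlb entries */
	tlb_finish_mmu(&tlb);
}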