From patchwork Wed Dec 4 11:09:41 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13893582
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com,
 willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org,
 peterx@redhat.com, akpm@linux-foundation.org
Cc: mgorman@suse.de, catalin.marinas@arm.com, will@kernel.org,
 dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
 x86@kernel.org, lorenzo.stoakes@oracle.com, zokeefe@google.com,
 rientjes@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Qi Zheng
Subject: [PATCH v4 01/11] mm: khugepaged: recheck pmd state in retract_page_tables()
Date: Wed, 4 Dec 2024 19:09:41 +0800
Message-Id: <70a51804cd19d44ccaf031825d9fb6eaf92f2bad.1733305182.git.zhengqi.arch@bytedance.com>
X-Mailer: git-send-email 2.24.3 (Apple Git-128)

In retract_page_tables(), the lock of new_folio is still held, so page
faults are blocked in the page fault path, which prevents the pte entries
from being set again. So even though the old empty PTE page may be
concurrently freed and a new PTE page is filled into the pmd entry, it is
still empty and can be removed.

So refactor retract_page_tables() slightly and recheck the pmd state
after taking the pmd lock.

Suggested-by: Jann Horn
Signed-off-by: Qi Zheng
---
 Documentation/mm/process_addrs.rst |  4 +++
 mm/khugepaged.c                    | 45 ++++++++++++++++++++----------
 2 files changed, 35 insertions(+), 14 deletions(-)

diff --git a/Documentation/mm/process_addrs.rst b/Documentation/mm/process_addrs.rst
index 1d416658d7f59..81417fa2ed20b 100644
--- a/Documentation/mm/process_addrs.rst
+++ b/Documentation/mm/process_addrs.rst
@@ -531,6 +531,10 @@ are extra requirements for accessing them:
   new page table has been installed in the same location and filled with
   entries. Writers normally need to take the PTE lock and revalidate that the
   PMD entry still refers to the same PTE-level page table.
+  If the writer does not care whether it is the same PTE-level page table, it
+  can take the PMD lock and revalidate that the contents of pmd entry still meet
+  the requirements. In particular, this also happens in :c:func:`!retract_page_tables`
+  when handling :c:macro:`!MADV_COLLAPSE`.
 
 To access PTE-level page tables, a helper like :c:func:`!pte_offset_map_lock` or
 :c:func:`!pte_offset_map` can be used depending on stability requirements.
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 6f8d46d107b4b..99dc995aac110 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -947,17 +947,10 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
 	return SCAN_SUCCEED;
 }
 
-static int find_pmd_or_thp_or_none(struct mm_struct *mm,
-				   unsigned long address,
-				   pmd_t **pmd)
+static inline int check_pmd_state(pmd_t *pmd)
 {
-	pmd_t pmde;
+	pmd_t pmde = pmdp_get_lockless(pmd);
 
-	*pmd = mm_find_pmd(mm, address);
-	if (!*pmd)
-		return SCAN_PMD_NULL;
-
-	pmde = pmdp_get_lockless(*pmd);
 	if (pmd_none(pmde))
 		return SCAN_PMD_NONE;
 	if (!pmd_present(pmde))
@@ -971,6 +964,17 @@ static int find_pmd_or_thp_or_none(struct mm_struct *mm,
 	return SCAN_SUCCEED;
 }
 
+static int find_pmd_or_thp_or_none(struct mm_struct *mm,
+				   unsigned long address,
+				   pmd_t **pmd)
+{
+	*pmd = mm_find_pmd(mm, address);
+	if (!*pmd)
+		return SCAN_PMD_NULL;
+
+	return check_pmd_state(*pmd);
+}
+
 static int check_pmd_still_valid(struct mm_struct *mm,
 				 unsigned long address,
 				 pmd_t *pmd)
@@ -1720,7 +1724,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 		pmd_t *pmd, pgt_pmd;
 		spinlock_t *pml;
 		spinlock_t *ptl;
-		bool skipped_uffd = false;
+		bool success = false;
 
 		/*
 		 * Check vma->anon_vma to exclude MAP_PRIVATE mappings that
@@ -1757,6 +1761,19 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 		mmu_notifier_invalidate_range_start(&range);
 
 		pml = pmd_lock(mm, pmd);
+		/*
+		 * The lock of new_folio is still held, we will be blocked in
+		 * the page fault path, which prevents the pte entries from
+		 * being set again. So even though the old empty PTE page may be
+		 * concurrently freed and a new PTE page is filled into the pmd
+		 * entry, it is still empty and can be removed.
+		 *
+		 * So here we only need to recheck if the state of pmd entry
+		 * still meets our requirements, rather than checking pmd_same()
+		 * like elsewhere.
+		 */
+		if (check_pmd_state(pmd) != SCAN_SUCCEED)
+			goto drop_pml;
 		ptl = pte_lockptr(mm, pmd);
 		if (ptl != pml)
 			spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
@@ -1770,20 +1787,20 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 		 * repeating the anon_vma check protects from one category,
 		 * and repeating the userfaultfd_wp() check from another.
 		 */
-		if (unlikely(vma->anon_vma || userfaultfd_wp(vma))) {
-			skipped_uffd = true;
-		} else {
+		if (likely(!vma->anon_vma && !userfaultfd_wp(vma))) {
 			pgt_pmd = pmdp_collapse_flush(vma, addr, pmd);
 			pmdp_get_lockless_sync();
+			success = true;
 		}
 
 		if (ptl != pml)
 			spin_unlock(ptl);
+drop_pml:
 		spin_unlock(pml);
 
 		mmu_notifier_invalidate_range_end(&range);
 
-		if (!skipped_uffd) {
+		if (success) {
 			mm_dec_nr_ptes(mm);
 			page_table_check_pte_clear_range(mm, addr, pgt_pmd);
 			pte_free_defer(mm, pmd_pgtable(pgt_pmd));

From patchwork Wed Dec 4 11:09:42 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13893583
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com,
 willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org,
 peterx@redhat.com, akpm@linux-foundation.org
Cc: mgorman@suse.de, catalin.marinas@arm.com, will@kernel.org,
 dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
 x86@kernel.org, lorenzo.stoakes@oracle.com, zokeefe@google.com,
 rientjes@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Qi Zheng
Subject: [PATCH v4 02/11] mm: userfaultfd: recheck dst_pmd entry in move_pages_pte()
Date: Wed, 4 Dec 2024 19:09:42 +0800
Message-Id: <8108c262757fc492626f3a2ffc44b775f2710e16.1733305182.git.zhengqi.arch@bytedance.com>
X-Mailer: git-send-email 2.24.3 (Apple Git-128)

In move_pages_pte(), since dst_pte needs to be none, the subsequent
pte_same() check cannot prevent the dst_pte page from being freed
concurrently, so we also need to obtain dst_pmdval and recheck pmd_same().
Otherwise, once we support empty PTE page reclamation for anonymous
pages, it may result in moving the src_pte page into the dst_pte page that
is about to be freed by RCU.
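
The race described above can be illustrated with a small userspace toy
model. This is only a sketch under stated assumptions, not kernel code:
toy_pte_t, toy_pmd_t, and every toy_* function are hypothetical names
invented for illustration, a PTE value of 0 stands in for pte_none(), and
the PMD value is just an integer identifying which PTE page is installed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; 0 plays the role of pte_none(). */
typedef uint64_t toy_pte_t;
typedef uint64_t toy_pmd_t;	/* identifies which PTE page the PMD entry points at */

/* Toy analogue of a bare pte_same() recheck against an earlier snapshot. */
bool toy_pte_unchanged(const toy_pte_t *ptep, toy_pte_t orig)
{
	return *ptep == orig;
}

/* Toy analogue of the dst half of the check: PTE recheck plus PMD recheck. */
bool toy_dst_stable(const toy_pte_t *dst_ptep, toy_pte_t orig_dst_pte,
		    const toy_pmd_t *dst_pmdp, toy_pmd_t dst_pmdval)
{
	return toy_pte_unchanged(dst_ptep, orig_dst_pte) &&
	       *dst_pmdp == dst_pmdval;
}

/*
 * Replay the race: snapshot an empty dst entry, then let a simulated
 * reclaimer detach the old PTE page and install a different one.
 * Returns true if the chosen recheck notices that the page went away.
 */
bool toy_race_detected(bool recheck_pmd)
{
	toy_pte_t old_page[4] = { 0 };	/* every entry is "none" */
	toy_pmd_t pmd = 1;		/* 1: old_page is installed */

	/* Snapshots taken before the locks are acquired. */
	toy_pte_t *dst_ptep = &old_page[2];
	toy_pte_t orig_dst_pte = *dst_ptep;
	toy_pmd_t dst_pmdval = pmd;

	/* Simulated concurrent reclaim: the PMD now refers to another page (2). */
	pmd = 2;

	if (recheck_pmd)
		return !toy_dst_stable(dst_ptep, orig_dst_pte, &pmd, dst_pmdval);

	/* The stale entry is still "none", so this recheck sees no change. */
	return !toy_pte_unchanged(dst_ptep, orig_dst_pte);
}
```

In this model toy_race_detected(false) reports no change because a none
entry in a detached page still compares equal to the none snapshot, while
toy_race_detected(true) catches the replacement through the PMD value.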

Signed-off-by: Qi Zheng
---
 mm/userfaultfd.c | 51 +++++++++++++++++++++++++++++++-----------------
 1 file changed, 33 insertions(+), 18 deletions(-)

diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 60a0be33766ff..8e16dc290ddf1 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -1020,6 +1020,14 @@ void double_pt_unlock(spinlock_t *ptl1,
 		__release(ptl2);
 }
 
+static inline bool is_pte_pages_stable(pte_t *dst_pte, pte_t *src_pte,
+				       pte_t orig_dst_pte, pte_t orig_src_pte,
+				       pmd_t *dst_pmd, pmd_t dst_pmdval)
+{
+	return pte_same(ptep_get(src_pte), orig_src_pte) &&
+	       pte_same(ptep_get(dst_pte), orig_dst_pte) &&
+	       pmd_same(dst_pmdval, pmdp_get_lockless(dst_pmd));
+}
 
 static int move_present_pte(struct mm_struct *mm,
 			    struct vm_area_struct *dst_vma,
@@ -1027,6 +1035,7 @@ static int move_present_pte(struct mm_struct *mm,
 			    unsigned long dst_addr, unsigned long src_addr,
 			    pte_t *dst_pte, pte_t *src_pte,
 			    pte_t orig_dst_pte, pte_t orig_src_pte,
+			    pmd_t *dst_pmd, pmd_t dst_pmdval,
 			    spinlock_t *dst_ptl, spinlock_t *src_ptl,
 			    struct folio *src_folio)
 {
@@ -1034,8 +1043,8 @@ static int move_present_pte(struct mm_struct *mm,
 
 	double_pt_lock(dst_ptl, src_ptl);
 
-	if (!pte_same(ptep_get(src_pte), orig_src_pte) ||
-	    !pte_same(ptep_get(dst_pte), orig_dst_pte)) {
+	if (!is_pte_pages_stable(dst_pte, src_pte, orig_dst_pte, orig_src_pte,
+				 dst_pmd, dst_pmdval)) {
 		err = -EAGAIN;
 		goto out;
 	}
@@ -1071,6 +1080,7 @@ static int move_swap_pte(struct mm_struct *mm,
 			 unsigned long dst_addr, unsigned long src_addr,
 			 pte_t *dst_pte, pte_t *src_pte,
 			 pte_t orig_dst_pte, pte_t orig_src_pte,
+			 pmd_t *dst_pmd, pmd_t dst_pmdval,
 			 spinlock_t *dst_ptl, spinlock_t *src_ptl)
 {
 	if (!pte_swp_exclusive(orig_src_pte))
@@ -1078,8 +1088,8 @@ static int move_swap_pte(struct mm_struct *mm,
 
 	double_pt_lock(dst_ptl, src_ptl);
 
-	if (!pte_same(ptep_get(src_pte), orig_src_pte) ||
-	    !pte_same(ptep_get(dst_pte), orig_dst_pte)) {
+	if (!is_pte_pages_stable(dst_pte, src_pte, orig_dst_pte, orig_src_pte,
+				 dst_pmd, dst_pmdval)) {
 		double_pt_unlock(dst_ptl, src_ptl);
 		return -EAGAIN;
 	}
@@ -1097,13 +1107,14 @@ static int move_zeropage_pte(struct mm_struct *mm,
 			     unsigned long dst_addr, unsigned long src_addr,
 			     pte_t *dst_pte, pte_t *src_pte,
 			     pte_t orig_dst_pte, pte_t orig_src_pte,
+			     pmd_t *dst_pmd, pmd_t dst_pmdval,
 			     spinlock_t *dst_ptl, spinlock_t *src_ptl)
 {
 	pte_t zero_pte;
 
 	double_pt_lock(dst_ptl, src_ptl);
-	if (!pte_same(ptep_get(src_pte), orig_src_pte) ||
-	    !pte_same(ptep_get(dst_pte), orig_dst_pte)) {
+	if (!is_pte_pages_stable(dst_pte, src_pte, orig_dst_pte, orig_src_pte,
+				 dst_pmd, dst_pmdval)) {
 		double_pt_unlock(dst_ptl, src_ptl);
 		return -EAGAIN;
 	}
@@ -1136,6 +1147,7 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 	pte_t *src_pte = NULL;
 	pte_t *dst_pte = NULL;
 	pmd_t dummy_pmdval;
+	pmd_t dst_pmdval;
 	struct folio *src_folio = NULL;
 	struct anon_vma *src_anon_vma = NULL;
 	struct mmu_notifier_range range;
@@ -1148,11 +1160,11 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 retry:
 	/*
 	 * Use the maywrite version to indicate that dst_pte will be modified,
-	 * but since we will use pte_same() to detect the change of the pte
-	 * entry, there is no need to get pmdval, so just pass a dummy variable
-	 * to it.
+	 * since dst_pte needs to be none, the subsequent pte_same() check
+	 * cannot prevent the dst_pte page from being freed concurrently, so we
+	 * also need to obtain dst_pmdval and recheck pmd_same() later.
 	 */
-	dst_pte = pte_offset_map_rw_nolock(mm, dst_pmd, dst_addr, &dummy_pmdval,
+	dst_pte = pte_offset_map_rw_nolock(mm, dst_pmd, dst_addr, &dst_pmdval,
 					   &dst_ptl);
 
 	/* Retry if a huge pmd materialized from under us */
@@ -1161,7 +1173,11 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 		goto out;
 	}
 
-	/* same as dst_pte */
+	/*
+	 * Unlike dst_pte, the subsequent pte_same() check can ensure the
+	 * stability of the src_pte page, so there is no need to get pmdval,
+	 * just pass a dummy variable to it.
+	 */
 	src_pte = pte_offset_map_rw_nolock(mm, src_pmd, src_addr, &dummy_pmdval,
 					   &src_ptl);
 
@@ -1213,7 +1229,7 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 		err = move_zeropage_pte(mm, dst_vma, src_vma,
 					dst_addr, src_addr, dst_pte, src_pte,
 					orig_dst_pte, orig_src_pte,
-					dst_ptl, src_ptl);
+					dst_pmd, dst_pmdval, dst_ptl, src_ptl);
 		goto out;
 	}
 
@@ -1303,8 +1319,8 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 
 		err = move_present_pte(mm, dst_vma, src_vma,
 				       dst_addr, src_addr, dst_pte, src_pte,
-				       orig_dst_pte, orig_src_pte,
-				       dst_ptl, src_ptl, src_folio);
+				       orig_dst_pte, orig_src_pte, dst_pmd,
+				       dst_pmdval, dst_ptl, src_ptl, src_folio);
 	} else {
 		entry = pte_to_swp_entry(orig_src_pte);
 		if (non_swap_entry(entry)) {
@@ -1319,10 +1335,9 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 			goto out;
 		}
 
-		err = move_swap_pte(mm, dst_addr, src_addr,
-				    dst_pte, src_pte,
-				    orig_dst_pte, orig_src_pte,
-				    dst_ptl, src_ptl);
+		err = move_swap_pte(mm, dst_addr, src_addr, dst_pte, src_pte,
+				    orig_dst_pte, orig_src_pte, dst_pmd,
+				    dst_pmdval, dst_ptl, src_ptl);
 	}
 
 out:

From patchwork Wed Dec 4 11:09:43 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13893584
X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 09F6A6B0092 for ; Wed, 4 Dec 2024 06:10:44 -0500 (EST) Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id 9ADF9A0F2B for ; Wed, 4 Dec 2024 11:10:43 +0000 (UTC) X-FDA: 82857008598.02.EE57CFB Received: from mail-pg1-f175.google.com (mail-pg1-f175.google.com [209.85.215.175]) by imf29.hostedemail.com (Postfix) with ESMTP id E77D3120024 for ; Wed, 4 Dec 2024 11:10:20 +0000 (UTC) Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=K9pfmLoN; spf=pass (imf29.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.215.175 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com; dmarc=pass (policy=quarantine) header.from=bytedance.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1733310635; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=uBIs+RyxQ8T92otn81RkrE5S0sfAaFlsvX0Ncx+wxck=; b=FS8u0uVxhXdS25V7hmtLU5j/c8qe28gzZqzvHy2S2HJsQ+I/a6SgrLeP41BzG1i/M3jh9B Hh3KJnFVl9YwWmxtHocUKntcHlu5hE/+uYtLXHqb5kM1CUwwWC6gpzNGQXbt6FMvAr2ZLB W6GzaJX1V1wPeGSzk3x7GtaZLLFTMOU= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1733310635; a=rsa-sha256; cv=none; b=gSS1s9/sO/8qIWOXgEDOe/N3xbNMRPbVDvikhJAvKBdC50okdZnWlenEfAYaxFklY7Rj7X fh7mbqUDwdOoyzvgYDNWi51v6RFPSQulObif0sS73e2ktd8uxcy3KYu1+gbbQqXCIIjYVI kU8i/Q9h6hQN9iOMJLvPmjRnpwqYhVY= ARC-Authentication-Results: i=1; imf29.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=K9pfmLoN; spf=pass (imf29.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 
209.85.215.175 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com; dmarc=pass (policy=quarantine) header.from=bytedance.com
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com,
	willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org,
	peterx@redhat.com, akpm@linux-foundation.org
Cc: mgorman@suse.de, catalin.marinas@arm.com, will@kernel.org,
	dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
	x86@kernel.org, lorenzo.stoakes@oracle.com, zokeefe@google.com,
	rientjes@google.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Qi Zheng
Subject: [PATCH v4 03/11] mm: introduce zap_nonpresent_ptes()
Date: Wed, 4 Dec 2024 19:09:43 +0800
Message-Id: <009ca882036d9c7a9f815489cfeafe0bdb79d62d.1733305182.git.zhengqi.arch@bytedance.com>

Similar to zap_present_ptes(), let's introduce zap_nonpresent_ptes() to
handle non-present ptes, which can improve code readability.

No functional change.

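
The shape of the new helper can be sketched in userspace as below. This
is only a toy model under stated assumptions: enum toy_pte and
toy_zap_nonpresent() are hypothetical names, the entry kinds are reduced
to two, and the batching loop is a simplification of what
swap_pte_batch() checks. It shows the convention the helper follows: the
return value is the number of slots the caller steps over, an early
"return 1" skips one slot without clearing it, and the clearing happens
once in a common tail.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for PTE states; these are NOT kernel types. */
enum toy_pte { TOY_NONE, TOY_SWAP, TOY_MARKER };

/*
 * Hypothetical model of a non-present handler. Returns how many slots
 * the caller must step over, whether or not they were cleared.
 */
static int toy_zap_nonpresent(enum toy_pte *pte, int max_nr,
			      bool drop_markers, int *swapents)
{
	int nr = 1;

	if (pte[0] == TOY_SWAP) {
		/* Batch the run of swap entries (simplified). */
		while (nr < max_nr && pte[nr] == TOY_SWAP)
			nr++;
		*swapents -= nr;
	} else if (pte[0] == TOY_MARKER) {
		/* Early return: step over one slot and leave it in place. */
		if (!drop_markers)
			return 1;
	}

	/* Common tail: clear everything that was accepted above. */
	for (int i = 0; i < nr; i++)
		pte[i] = TOY_NONE;

	return nr;
}
```

With a table of three swap slots followed by a marker, one call
processes the three swap slots as a batch, and a second call either
keeps or drops the marker depending on drop_markers.
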
Signed-off-by: Qi Zheng
Reviewed-by: Jann Horn
Acked-by: David Hildenbrand
---
 mm/memory.c | 136 ++++++++++++++++++++++++++++------------------------
 1 file changed, 73 insertions(+), 63 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index d5a1b0a6bf1fa..5624c22bb03cf 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1587,6 +1587,76 @@ static inline int zap_present_ptes(struct mmu_gather *tlb,
 	return 1;
 }
 
+static inline int zap_nonpresent_ptes(struct mmu_gather *tlb,
+		struct vm_area_struct *vma, pte_t *pte, pte_t ptent,
+		unsigned int max_nr, unsigned long addr,
+		struct zap_details *details, int *rss)
+{
+	swp_entry_t entry;
+	int nr = 1;
+
+	entry = pte_to_swp_entry(ptent);
+	if (is_device_private_entry(entry) ||
+	    is_device_exclusive_entry(entry)) {
+		struct page *page = pfn_swap_entry_to_page(entry);
+		struct folio *folio = page_folio(page);
+
+		if (unlikely(!should_zap_folio(details, folio)))
+			return 1;
+		/*
+		 * Both device private/exclusive mappings should only
+		 * work with anonymous page so far, so we don't need to
+		 * consider uffd-wp bit when zap. For more information,
+		 * see zap_install_uffd_wp_if_needed().
+		 */
+		WARN_ON_ONCE(!vma_is_anonymous(vma));
+		rss[mm_counter(folio)]--;
+		if (is_device_private_entry(entry))
+			folio_remove_rmap_pte(folio, page, vma);
+		folio_put(folio);
+	} else if (!non_swap_entry(entry)) {
+		/* Genuine swap entries, hence a private anon pages */
+		if (!should_zap_cows(details))
+			return 1;
+
+		nr = swap_pte_batch(pte, max_nr, ptent);
+		rss[MM_SWAPENTS] -= nr;
+		free_swap_and_cache_nr(entry, nr);
+	} else if (is_migration_entry(entry)) {
+		struct folio *folio = pfn_swap_entry_folio(entry);
+
+		if (!should_zap_folio(details, folio))
+			return 1;
+		rss[mm_counter(folio)]--;
+	} else if (pte_marker_entry_uffd_wp(entry)) {
+		/*
+		 * For anon: always drop the marker; for file: only
+		 * drop the marker if explicitly requested.
+		 */
+		if (!vma_is_anonymous(vma) && !zap_drop_markers(details))
+			return 1;
+	} else if (is_guard_swp_entry(entry)) {
+		/*
+		 * Ordinary zapping should not remove guard PTE
+		 * markers. Only do so if we should remove PTE markers
+		 * in general.
+		 */
+		if (!zap_drop_markers(details))
+			return 1;
+	} else if (is_hwpoison_entry(entry) || is_poisoned_swp_entry(entry)) {
+		if (!should_zap_cows(details))
+			return 1;
+	} else {
+		/* We should have covered all the swap entry types */
+		pr_alert("unrecognized swap entry 0x%lx\n", entry.val);
+		WARN_ON_ONCE(1);
+	}
+	clear_not_present_full_ptes(vma->vm_mm, addr, pte, nr, tlb->fullmm);
+	zap_install_uffd_wp_if_needed(vma, addr, pte, nr, details, ptent);
+
+	return nr;
+}
+
 static unsigned long zap_pte_range(struct mmu_gather *tlb,
 				struct vm_area_struct *vma, pmd_t *pmd,
 				unsigned long addr, unsigned long end,
@@ -1598,7 +1668,6 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	spinlock_t *ptl;
 	pte_t *start_pte;
 	pte_t *pte;
-	swp_entry_t entry;
 	int nr;
 
 	tlb_change_page_size(tlb, PAGE_SIZE);
@@ -1611,8 +1680,6 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	arch_enter_lazy_mmu_mode();
 	do {
 		pte_t ptent = ptep_get(pte);
-		struct folio *folio;
-		struct page *page;
 		int max_nr;
 
 		nr = 1;
@@ -1622,8 +1689,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 		if (need_resched())
 			break;
 
+		max_nr = (end - addr) / PAGE_SIZE;
 		if (pte_present(ptent)) {
-			max_nr = (end - addr) / PAGE_SIZE;
 			nr = zap_present_ptes(tlb, vma, pte, ptent, max_nr,
 					      addr, details, rss, &force_flush,
 					      &force_break);
@@ -1631,67 +1698,10 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 				addr += nr * PAGE_SIZE;
 				break;
 			}
-			continue;
-		}
-
-		entry = pte_to_swp_entry(ptent);
-		if (is_device_private_entry(entry) ||
-		    is_device_exclusive_entry(entry)) {
-			page = pfn_swap_entry_to_page(entry);
-			folio = page_folio(page);
-			if (unlikely(!should_zap_folio(details, folio)))
-				continue;
-			/*
-			 * Both device private/exclusive mappings should only
-			 * work with anonymous page so far, so we don't need to
-			 * consider uffd-wp bit when zap. For more information,
-			 * see zap_install_uffd_wp_if_needed().
-			 */
-			WARN_ON_ONCE(!vma_is_anonymous(vma));
-			rss[mm_counter(folio)]--;
-			if (is_device_private_entry(entry))
-				folio_remove_rmap_pte(folio, page, vma);
-			folio_put(folio);
-		} else if (!non_swap_entry(entry)) {
-			max_nr = (end - addr) / PAGE_SIZE;
-			nr = swap_pte_batch(pte, max_nr, ptent);
-			/* Genuine swap entries, hence a private anon pages */
-			if (!should_zap_cows(details))
-				continue;
-			rss[MM_SWAPENTS] -= nr;
-			free_swap_and_cache_nr(entry, nr);
-		} else if (is_migration_entry(entry)) {
-			folio = pfn_swap_entry_folio(entry);
-			if (!should_zap_folio(details, folio))
-				continue;
-			rss[mm_counter(folio)]--;
-		} else if (pte_marker_entry_uffd_wp(entry)) {
-			/*
-			 * For anon: always drop the marker; for file: only
-			 * drop the marker if explicitly requested.
-			 */
-			if (!vma_is_anonymous(vma) &&
-			    !zap_drop_markers(details))
-				continue;
-		} else if (is_guard_swp_entry(entry)) {
-			/*
-			 * Ordinary zapping should not remove guard PTE
-			 * markers. Only do so if we should remove PTE markers
-			 * in general.
-			 */
-			if (!zap_drop_markers(details))
-				continue;
-		} else if (is_hwpoison_entry(entry) ||
-			   is_poisoned_swp_entry(entry)) {
-			if (!should_zap_cows(details))
-				continue;
 		} else {
-			/* We should have covered all the swap entry types */
-			pr_alert("unrecognized swap entry 0x%lx\n", entry.val);
-			WARN_ON_ONCE(1);
+			nr = zap_nonpresent_ptes(tlb, vma, pte, ptent, max_nr,
+						 addr, details, rss);
 		}
-		clear_not_present_full_ptes(mm, addr, pte, nr, tlb->fullmm);
-		zap_install_uffd_wp_if_needed(vma, addr, pte, nr, details, ptent);
 	} while (pte += nr, addr += PAGE_SIZE * nr, addr != end);
 
 	add_mm_rss_vec(mm, rss);

From patchwork Wed Dec 4 11:09:44 2024
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13893585
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com,
	willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org,
	peterx@redhat.com, akpm@linux-foundation.org
Cc: mgorman@suse.de, catalin.marinas@arm.com, will@kernel.org,
	dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
	x86@kernel.org, lorenzo.stoakes@oracle.com, zokeefe@google.com,
	rientjes@google.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Qi Zheng
Subject: [PATCH v4 04/11] mm: introduce do_zap_pte_range()
Date: Wed, 4 Dec 2024 19:09:44 +0800

This commit introduces do_zap_pte_range() to actually zap the PTEs, which
will help improve code readability and facilitate secondary checking of
the processed PTEs in the future.

No functional change.

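
The control flow that results from this split can be modelled in
userspace as follows. This is a rough sketch, not kernel code:
enum toy_pte, toy_do_zap(), toy_zap_range() and the "budget" counter
(standing in for a full TLB gather batch) are all assumptions made for
illustration. It shows the per-step helper returning the number of
entries it handled, and the caller advancing by that count and stopping
early when force_break is raised.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for PTE states; these are NOT kernel types. */
enum toy_pte { TOY_NONE, TOY_PRESENT, TOY_NONPRESENT };

/* Hypothetical per-step helper in the shape of do_zap_pte_range(). */
static int toy_do_zap(enum toy_pte *pte, int *budget, bool *force_break)
{
	if (*pte == TOY_NONE)
		return 1;

	/* Batch full: this entry is still zapped, then the walk stops. */
	if (*pte == TOY_PRESENT && --(*budget) == 0)
		*force_break = true;

	*pte = TOY_NONE;
	return 1;
}

/*
 * Walk [start, end) with start < end. Returns the index at which the
 * walk stopped, which equals end when the whole range was processed.
 */
static int toy_zap_range(enum toy_pte *table, int start, int end, int budget)
{
	bool force_break = false;
	int i = start;
	int nr;

	do {
		nr = toy_do_zap(&table[i], &budget, &force_break);
		if (force_break) {
			i += nr;	/* account for what was just zapped */
			break;
		}
	} while (i += nr, i != end);

	return i;
}
```

A caller that gets back an index smaller than end can resume the walk
from that index, which mirrors how the early break hands back a partial
range.
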
Signed-off-by: Qi Zheng Reviewed-by: Jann Horn Acked-by: David Hildenbrand --- mm/memory.c | 45 ++++++++++++++++++++++++++------------------- 1 file changed, 26 insertions(+), 19 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 5624c22bb03cf..abe07e6bdd1bb 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1657,6 +1657,27 @@ static inline int zap_nonpresent_ptes(struct mmu_gather *tlb, return nr; } +static inline int do_zap_pte_range(struct mmu_gather *tlb, + struct vm_area_struct *vma, pte_t *pte, + unsigned long addr, unsigned long end, + struct zap_details *details, int *rss, + bool *force_flush, bool *force_break) +{ + pte_t ptent = ptep_get(pte); + int max_nr = (end - addr) / PAGE_SIZE; + + if (pte_none(ptent)) + return 1; + + if (pte_present(ptent)) + return zap_present_ptes(tlb, vma, pte, ptent, max_nr, + addr, details, rss, force_flush, + force_break); + + return zap_nonpresent_ptes(tlb, vma, pte, ptent, max_nr, addr, + details, rss); +} + static unsigned long zap_pte_range(struct mmu_gather *tlb, struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, unsigned long end, @@ -1679,28 +1700,14 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, flush_tlb_batched_pending(mm); arch_enter_lazy_mmu_mode(); do { - pte_t ptent = ptep_get(pte); - int max_nr; - - nr = 1; - if (pte_none(ptent)) - continue; - if (need_resched()) break; - max_nr = (end - addr) / PAGE_SIZE; - if (pte_present(ptent)) { - nr = zap_present_ptes(tlb, vma, pte, ptent, max_nr, - addr, details, rss, &force_flush, - &force_break); - if (unlikely(force_break)) { - addr += nr * PAGE_SIZE; - break; - } - } else { - nr = zap_nonpresent_ptes(tlb, vma, pte, ptent, max_nr, - addr, details, rss); + nr = do_zap_pte_range(tlb, vma, pte, addr, end, details, rss, + &force_flush, &force_break); + if (unlikely(force_break)) { + addr += nr * PAGE_SIZE; + break; } } while (pte += nr, addr += PAGE_SIZE * nr, addr != end); From patchwork Wed Dec 4 11:09:45 2024 Content-Type: text/plain; 
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13893586 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id DFFF1E7716D for ; Wed, 4 Dec 2024 11:11:00 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 76C1B6B0099; Wed, 4 Dec 2024 06:11:00 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 719CF6B009A; Wed, 4 Dec 2024 06:11:00 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5BA736B009B; Wed, 4 Dec 2024 06:11:00 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 3C78A6B0099 for ; Wed, 4 Dec 2024 06:11:00 -0500 (EST) Received: from smtpin19.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id E8209160F78 for ; Wed, 4 Dec 2024 11:10:59 +0000 (UTC) X-FDA: 82857008808.19.07B6345 Received: from mail-pl1-f182.google.com (mail-pl1-f182.google.com [209.85.214.182]) by imf30.hostedemail.com (Postfix) with ESMTP id C1A618000D for ; Wed, 4 Dec 2024 11:10:30 +0000 (UTC) Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=Tm2TFRIG; spf=pass (imf30.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.214.182 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com; dmarc=pass (policy=quarantine) header.from=bytedance.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1733310643; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: 
content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=QkuROlRprAHSTBsp8RZw3I2mnycKFd5DDkO/yWw4n1g=; b=fjflhPyegNw/MLGRc8TiVosmqZ9piCVCuOHFn+v365wk5ucyYelvegQHvIMTmFWLYnkS7k ISm9hWpQSEmBF/EDDHKhiqMSdJoRiDJdoBXpRmU76WWADrh5Tm71wLNEBq5z78cDIqOZmd Ov6OGVoVmJL44FMPwLbCR5qMzEEt5Nc= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1733310643; a=rsa-sha256; cv=none; b=b2HGNiGrtoAV9fJQpWzaH/BtwuJsuiCliie5Aimkx7AnQ2wii2mRVsJtQJPTVfh1ni96Sp 8WBOGHTZyDDagA7VVYfYCIWD4oEVlxxC82trorJEGrt36yoWyO+iiZtzzS+tjth8aNeMfE gf4Xvq6wkcQ6TCPfXHzlsvflZgxCQMo= ARC-Authentication-Results: i=1; imf30.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=Tm2TFRIG; spf=pass (imf30.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.214.182 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com; dmarc=pass (policy=quarantine) header.from=bytedance.com Received: by mail-pl1-f182.google.com with SMTP id d9443c01a7336-2155312884fso50283375ad.0 for ; Wed, 04 Dec 2024 03:10:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1733310656; x=1733915456; darn=kvack.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=QkuROlRprAHSTBsp8RZw3I2mnycKFd5DDkO/yWw4n1g=; b=Tm2TFRIGYdw91S11nahO+qQd3aE1uTYqrpl2OegHszHwzXpMWZVsTwHg9Y+64kR+jy swUCcS8U+DyqI2S0bIiXKyrUxKyL08dTgWdsHaUWYlIjVKuW+gma02FonEkPZxxqKaFX qLXpDq4arLuqgryB9W0OrTyIiXtylua9Sk5A79fnncJpEXScXHtUvyZmpYHFxzzoDRSK Vdy5ODhOIiGCTK73aM2MX9aAsI/KHALcVGeadSvGiv7IxdFDgvzb5Fl5Cdu3I7dJknRM Hc5wr0DXNj+DojUt0QnXcLfhp545m6Ux22dB8CLfawfC8xvOT4BXsdLXgUVVMcmv6FU1 0C1w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1733310656; x=1733915456; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=QkuROlRprAHSTBsp8RZw3I2mnycKFd5DDkO/yWw4n1g=; b=EKyQ56QfpTwEk1mVc5IumUuhipKqHGIiE9LdYf0h2Co800cqAHNQUHnvWU0Fs7Vkco O0kXxyAwj9XQipj75UKoONhczdku0tUIwpg3o9jpSGsaGLy105ivI1yriNi2TI23Y4og Cl9//NtFIGN8qBOYvNUxk/doP2Pn0TiQsJDFvouHR18N2t2A1Q3FKTWgCUAmTnFnEZOK 7GCKcvgBymk/iCRS0QS6GlujEy2XlA3jbrcZpmdrVhQxkh1ad9KoUVtGf/1VR/RHBei1 3rg9BGC5IOxW+UAADPrJfXgV4SfB98bjp15OkZuPvdxSpgXdskFTSYfryrSd6W4NCxDM M9NQ== X-Forwarded-Encrypted: i=1; AJvYcCVQzMOnee/m/weA83OJrWC0asErk+BrHcnD6bwdhC3h6QrKbmGumOk9OosorGLLc0GUIYdQekgMlg==@kvack.org X-Gm-Message-State: AOJu0YzOVfn/tz/RyJK/l3vD7Zjjy+bV/suNlytdawjv63xf/LjD9x4B /hwzSy+w6wJLxPbF4KYjccaRFFd+eo3cpSpRuW/UEjJc5aMQ7PlOHF8IdWjDEHQ= X-Gm-Gg: ASbGncuBJXlw3i3Kbt/9a2Ox8rUqwroxa/eIYLCCAbD3OCkzjERW2vGWxvFwhYHIUBk kXyjE64L+by9nh+/sMjWiEirFw7S29Gm0rH5BkMLNHErMvL3VdMpgTpK0+Y22H2NSk5fnO0BguH DBmCYsyAlkp44E1WUk5u0b13L/cUZYqe9F0Oytz9HLPP2BRJ3IqzNNThkUA8utAJa5rNAYr9l9K +AJ4NUi+iRd650pi2V7mzw6KCAqmnEJKP4S2U0UY8L0Wo5P9xno1Ar6Opi5iMxycf+hIeELkhWj BfDtL9gS+8dnVbM= X-Google-Smtp-Source: AGHT+IEu/5U9u+MTuBZhWK2lULLFn2ChhXdgNwcwoaBBrqVb6T6ZJFfs9ziJlEdo4kgFVcC+MHm1Dw== X-Received: by 2002:a17:902:dac7:b0:215:72aa:693f with SMTP id d9443c01a7336-215bd1b476emr75201455ad.9.1733310656657; Wed, 04 Dec 2024 03:10:56 -0800 (PST) Received: from C02DW0BEMD6R.bytedance.net ([203.208.167.148]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-21527515731sm107447495ad.192.2024.12.04.03.10.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Dec 2024 03:10:55 -0800 (PST) From: Qi Zheng To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org, peterx@redhat.com, akpm@linux-foundation.org Cc: mgorman@suse.de, catalin.marinas@arm.com, will@kernel.org, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, x86@kernel.org, lorenzo.stoakes@oracle.com, zokeefe@google.com, 
rientjes@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Qi Zheng Subject: [PATCH v4 05/11] mm: skip over all consecutive none ptes in do_zap_pte_range() Date: Wed, 4 Dec 2024 19:09:45 +0800 Message-Id: <8ecffbf990afd1c8ccc195a2ec321d55f0923908.1733305182.git.zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: References: MIME-Version: 1.0 X-Rspamd-Queue-Id: C1A618000D X-Rspam-User: X-Rspamd-Server: rspam07 X-Stat-Signature: ccq348toa7gna48ku7mbmgefcydc8owy X-HE-Tag: 1733310630-950930 X-HE-Meta: U2FsdGVkX19hQdEY1y5KVDFribXgd/UxsEWMGOseYFuDRvDIZFhkgxp35qvBWakm09uYfomOZa+WvVgWX8fxoZL2PzhKeOmumgg4zixPescBT+yny32pFuvbMmeQZ2Wnd1nsPEcEXFD3ePPix7evax8nviaGAxTIQ3GqGDQJSdI/DzGo1j7DKvdMRNKFj81LFLSKY1a55dobJ5IVcQ7b0rhMkIr7YmvpYVG++uzvh00vUG6LZQ/PUDkIcNxhL6vzZ22rrcgAPvAvokPcWks5VSBB4O014TexKawupGKkZw8ywNMqlLSpCaiQR3hqloU4llD4gYksv96UBVrgaLcJH5AxV2wgYxab9uhCcPQsZgGsXbW53rP9UuMAZPWrFlsD6QCPb9PsJOucN6WNrLF8CaXyP0W6Uoz/EUF7Sh4lfBVEvOOLF1nlYt3nYE2Q9zwK2kDbMi3646b443oTVgwOAI75ddzo74WyEDiGPwJNZDag6n6o3Sxnq77WTOdGPf8EIp/VYTiEJu7/W+fyP9MP6D0ep/5SUOG4xbo4I0x3GF+kH2YBserozl79sV8wgxMxyqC2eC/Ny1T3d97y5ZuB/Hl0pA6KiWVbYB/CBpkjjMSv6nJBY7vFjCv+EiNvYShVVLaw/GhKEPBt/i6WIJOSfNnXtx0f8qM+/tbk8hLp4uNt09YIG4BhSnetE2Dh4vsWjUT0Z9tqgsThoVA59g+6gfwOlctbG0q/v+2v7qWSjYkYkoAf3o4z37bOVO9plW9lrO8JqUFKQtCstUJjgM25fDdY46Hx9dzRhkMBKYYdejlNWmBcHyuLTibNnvU4ip9zQgkD4Lc4RnaPoD3VsOLIT3F+1fnPp5KHQW2LSDBOpr4myL0Ow8KMuPC2UkuryG0aSMyKaPNPShZfv9hG7RVgoTIpoYR/X0rd41R6fRZ0D6uSkINhcYkYgDBgFGh09EqTV95zaAXYPTWZnE+XgIZ H4gCiVX7 iaRsxxMX11881pG96794Ef2s3bD1KPOiuhlMB4uMD02IBbnGjJT4TTkkIfUKTtS4o4efCeIXeWHQSRWTxCg70OWzntvNdfrimOLPcdOM2wuOsXQC34dsXEgGGjUhu33xZHPpIalwLGx4nRuRtzMlyG2TLJyCd7jCAoMFUSX+T6saZl1yKvPCyHfxLFrcpQTTvPdK8uKywBYPXJ3FfqiiUb6/8U9ffTFXMRn5DxnWRwnIgWFfDFPESpHelUzhn2B26YFbBHJgfR87WxJtOL5oXULiooAf0y1l387yFMADfJg1sEmMRdk7qT587WxP5q2Vgdu+8tDsqyd/bQQU7uEkG534FMtWH5dmzyiQWwp0hY38LjqMSiwmetIPakn1aBH43AbpL X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 
Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Skip over all consecutive none ptes in do_zap_pte_range(), which helps optimize away need_resched() + force_break + incremental pte/addr increments etc. Suggested-by: David Hildenbrand Signed-off-by: Qi Zheng --- mm/memory.c | 27 ++++++++++++++++++++------- 1 file changed, 20 insertions(+), 7 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index abe07e6bdd1bb..7f8869a22b57c 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1665,17 +1665,30 @@ static inline int do_zap_pte_range(struct mmu_gather *tlb, { pte_t ptent = ptep_get(pte); int max_nr = (end - addr) / PAGE_SIZE; + int nr = 0; - if (pte_none(ptent)) - return 1; + /* Skip all consecutive none ptes */ + if (pte_none(ptent)) { + for (nr = 1; nr < max_nr; nr++) { + ptent = ptep_get(pte + nr); + if (!pte_none(ptent)) + break; + } + max_nr -= nr; + if (!max_nr) + return nr; + pte += nr; + addr += nr * PAGE_SIZE; + } if (pte_present(ptent)) - return zap_present_ptes(tlb, vma, pte, ptent, max_nr, - addr, details, rss, force_flush, - force_break); + nr += zap_present_ptes(tlb, vma, pte, ptent, max_nr, addr, + details, rss, force_flush, force_break); + else + nr += zap_nonpresent_ptes(tlb, vma, pte, ptent, max_nr, addr, + details, rss); - return zap_nonpresent_ptes(tlb, vma, pte, ptent, max_nr, addr, - details, rss); + return nr; } static unsigned long zap_pte_range(struct mmu_gather *tlb, From patchwork Wed Dec 4 11:09:46 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13893587 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B18CBE7716B for ; Wed, 4 Dec 2024 11:11:08 +0000 (UTC) Received: by kanga.kvack.org (Postfix) 
id 4F84B6B009B; Wed, 4 Dec 2024 06:11:08 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 4A8BA6B009C; Wed, 4 Dec 2024 06:11:08 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 349746B009D; Wed, 4 Dec 2024 06:11:08 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 1250F6B009B for ; Wed, 4 Dec 2024 06:11:08 -0500 (EST) Received: from smtpin03.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 9CE79C0EBF for ; Wed, 4 Dec 2024 11:11:07 +0000 (UTC) X-FDA: 82857009144.03.32F548E Received: from mail-pl1-f178.google.com (mail-pl1-f178.google.com [209.85.214.178]) by imf23.hostedemail.com (Postfix) with ESMTP id 30B11140018 for ; Wed, 4 Dec 2024 11:10:55 +0000 (UTC) Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=P6+l3lQO; spf=pass (imf23.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.214.178 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com; dmarc=pass (policy=quarantine) header.from=bytedance.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1733310654; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=YBcCGE4XlG5U77pGfU9yv3mqGKsFB2BwbFt8OgUUtNY=; b=sAlZYxYcN+rDv9U0kK1XZaOZ2voVWOCqb796aXwmsXdtlo4CO4biBPMyy+ETWPLbagfH2y KgYhvNIsDrdj7HPEnxgJM5MbROmDqLrsBxH0PlijWtuT5Z8AIdz5vB6JVNlhhXKGzLwkVH 4+FZYUvFESIFhpUvxAuMR7+kiaC2760= ARC-Authentication-Results: i=1; imf23.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=P6+l3lQO; spf=pass 
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org, peterx@redhat.com, akpm@linux-foundation.org
Cc: mgorman@suse.de, catalin.marinas@arm.com, will@kernel.org, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, x86@kernel.org, lorenzo.stoakes@oracle.com, zokeefe@google.com, rientjes@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Qi Zheng
Subject: [PATCH v4 06/11] mm: zap_install_uffd_wp_if_needed: return whether uffd-wp pte has been re-installed
Date: Wed, 4 Dec 2024 19:09:46 +0800
Message-Id: <9d4516554724eda87d6576468042a1741c475413.1733305182.git.zhengqi.arch@bytedance.com>

In some cases, we'll replace the none pte with a uffd-wp swap special pte
marker. Let's expose this information to the caller through the return
value, so that subsequent commits can use it to detect whether the PTE
page is empty.
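
The idea can be sketched in user space. This is a minimal model under stated assumptions, not kernel code: install_marker_if_needed() and install_markers() are hypothetical stand-ins for pte_install_uffd_wp_if_needed() and zap_install_uffd_wp_if_needed(), and a plain int array stands in for the PTE page. It only shows how a per-entry boolean result is folded into a single return value for the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry states; the real kernel works on pte_t values. */
enum { ENTRY_NONE = 0, ENTRY_MARKER = 1 };

/*
 * Hypothetical stand-in for pte_install_uffd_wp_if_needed(): write a marker
 * into *slot when the old value was write-protected, and report whether
 * that happened.
 */
static bool install_marker_if_needed(int *slot, bool old_was_wp)
{
	if (!old_was_wp)
		return false;
	*slot = ENTRY_MARKER;
	return true;
}

/*
 * Hypothetical stand-in for zap_install_uffd_wp_if_needed(): apply the same
 * old value to nr consecutive slots and return true if any slot received a
 * marker.
 */
static bool install_markers(int *slots, size_t nr, bool old_was_wp)
{
	bool was_installed = false;

	assert(nr > 0);
	for (size_t i = 0; i < nr; i++) {
		if (install_marker_if_needed(&slots[i], old_was_wp))
			was_installed = true;
	}
	return was_installed;
}
```

A caller that wants to know whether the modelled page is still empty can check the return value of install_markers() instead of rescanning the slots.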

Signed-off-by: Qi Zheng
---
 include/linux/mm_inline.h | 11 +++++++----
 mm/memory.c               | 16 ++++++++++++----
 2 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 1b6a917fffa4b..34e5097182a02 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -564,9 +564,9 @@ static inline pte_marker copy_pte_marker(
  * Must be called with pgtable lock held so that no thread will see the none
  * pte, and if they see it, they'll fault and serialize at the pgtable lock.
  *
- * This function is a no-op if PTE_MARKER_UFFD_WP is not enabled.
+ * Returns true if an uffd-wp pte was installed, false otherwise.
  */
-static inline void
+static inline bool
 pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr,
 			      pte_t *pte, pte_t pteval)
 {
@@ -583,7 +583,7 @@ pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr,
 	 * with a swap pte. There's no way of leaking the bit.
 	 */
 	if (vma_is_anonymous(vma) || !userfaultfd_wp(vma))
-		return;
+		return false;

 	/* A uffd-wp wr-protected normal pte */
 	if (unlikely(pte_present(pteval) && pte_uffd_wp(pteval)))
@@ -596,10 +596,13 @@ pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr,
 	if (unlikely(pte_swp_uffd_wp_any(pteval)))
 		arm_uffd_pte = true;

-	if (unlikely(arm_uffd_pte))
+	if (unlikely(arm_uffd_pte)) {
 		set_pte_at(vma->vm_mm, addr, pte,
 			   make_pte_marker(PTE_MARKER_UFFD_WP));
+		return true;
+	}
 #endif
+	return false;
 }

 static inline bool vma_has_recency(struct vm_area_struct *vma)
diff --git a/mm/memory.c b/mm/memory.c
index 7f8869a22b57c..1f149bc2c0586 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1466,27 +1466,35 @@ static inline bool zap_drop_markers(struct zap_details *details)
 /*
  * This function makes sure that we'll replace the none pte with an uffd-wp
  * swap special pte marker when necessary. Must be with the pgtable lock held.
+ *
+ * Returns true if uffd-wp ptes was installed, false otherwise.
  */
-static inline void
+static inline bool
 zap_install_uffd_wp_if_needed(struct vm_area_struct *vma,
 			      unsigned long addr, pte_t *pte, int nr,
 			      struct zap_details *details, pte_t pteval)
 {
+	bool was_installed = false;
+
+#ifdef CONFIG_PTE_MARKER_UFFD_WP
 	/* Zap on anonymous always means dropping everything */
 	if (vma_is_anonymous(vma))
-		return;
+		return false;

 	if (zap_drop_markers(details))
-		return;
+		return false;

 	for (;;) {
 		/* the PFN in the PTE is irrelevant. */
-		pte_install_uffd_wp_if_needed(vma, addr, pte, pteval);
+		if (pte_install_uffd_wp_if_needed(vma, addr, pte, pteval))
+			was_installed = true;
 		if (--nr == 0)
 			break;
 		pte++;
 		addr += PAGE_SIZE;
 	}
+#endif
+	return was_installed;
 }

 static __always_inline void zap_present_folio_ptes(struct mmu_gather *tlb,

From patchwork Wed Dec 4 11:09:47 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13893588
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org, peterx@redhat.com, akpm@linux-foundation.org
Cc: mgorman@suse.de, catalin.marinas@arm.com, will@kernel.org, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, x86@kernel.org, lorenzo.stoakes@oracle.com, zokeefe@google.com, rientjes@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Qi Zheng
Subject: [PATCH v4 07/11] mm: do_zap_pte_range: return any_skipped information to the caller
Date: Wed, 4 Dec 2024 19:09:47 +0800
Message-Id: <59f33ec9f74e9f058ed319b0bfadd76b0f7adf9b.1733305182.git.zhengqi.arch@bytedance.com>

Let the caller of do_zap_pte_range() know, through the any_skipped
parameter, whether we skipped zapping ptes or reinstalled uffd-wp ptes,
so that subsequent commits can use this information in zap_pte_range()
to detect whether the PTE page can be reclaimed.

Signed-off-by: Qi Zheng
---
 mm/memory.c | 36 +++++++++++++++++++++---------------
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 1f149bc2c0586..fdefa551d1250 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1501,7 +1501,7 @@ static __always_inline void zap_present_folio_ptes(struct mmu_gather *tlb,
 		struct vm_area_struct *vma, struct folio *folio,
 		struct page *page, pte_t *pte, pte_t ptent, unsigned int nr,
 		unsigned long addr, struct zap_details *details, int *rss,
-		bool *force_flush, bool *force_break)
+		bool *force_flush, bool *force_break, bool *any_skipped)
 {
 	struct mm_struct *mm = tlb->mm;
 	bool delay_rmap = false;
@@ -1527,8 +1527,8 @@ static __always_inline void zap_present_folio_ptes(struct mmu_gather *tlb,
 	arch_check_zapped_pte(vma, ptent);
 	tlb_remove_tlb_entries(tlb, pte, nr, addr);
 	if (unlikely(userfaultfd_pte_wp(vma, ptent)))
-		zap_install_uffd_wp_if_needed(vma, addr, pte, nr, details,
-					      ptent);
+		*any_skipped = zap_install_uffd_wp_if_needed(vma, addr, pte,
+							     nr, details, ptent);

 	if (!delay_rmap) {
 		folio_remove_rmap_ptes(folio, page, nr, vma);
@@ -1552,7 +1552,7 @@ static inline int zap_present_ptes(struct mmu_gather *tlb,
 		struct vm_area_struct *vma, pte_t *pte, pte_t ptent,
 		unsigned int max_nr, unsigned long addr,
 		struct zap_details *details, int *rss, bool *force_flush,
-		bool *force_break)
+		bool *force_break, bool *any_skipped)
 {
 	const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
 	struct mm_struct *mm = tlb->mm;
@@ -1567,15 +1567,17 @@ static inline int zap_present_ptes(struct mmu_gather *tlb,
 		arch_check_zapped_pte(vma, ptent);
 		tlb_remove_tlb_entry(tlb, pte, addr);
 		if (userfaultfd_pte_wp(vma, ptent))
-			zap_install_uffd_wp_if_needed(vma, addr, pte, 1,
-						      details, ptent);
+			*any_skipped = zap_install_uffd_wp_if_needed(vma, addr,
+						pte, 1, details, ptent);
 		ksm_might_unmap_zero_page(mm, ptent);
 		return 1;
 	}

 	folio = page_folio(page);
-	if (unlikely(!should_zap_folio(details, folio)))
+	if (unlikely(!should_zap_folio(details, folio))) {
+		*any_skipped = true;
 		return 1;
+	}

 	/*
 	 * Make sure that the common "small folio" case is as fast as possible
@@ -1587,22 +1589,23 @@ static inline int zap_present_ptes(struct mmu_gather *tlb,

 		zap_present_folio_ptes(tlb, vma, folio, page, pte, ptent, nr,
 				       addr, details, rss, force_flush,
-				       force_break);
+				       force_break, any_skipped);
 		return nr;
 	}
 	zap_present_folio_ptes(tlb, vma, folio, page, pte, ptent, 1, addr,
-			       details, rss, force_flush, force_break);
+			       details, rss, force_flush, force_break, any_skipped);
 	return 1;
 }

 static inline int zap_nonpresent_ptes(struct mmu_gather *tlb,
 		struct vm_area_struct *vma, pte_t *pte, pte_t ptent,
 		unsigned int max_nr, unsigned long addr,
-		struct zap_details *details, int *rss)
+		struct zap_details *details, int *rss, bool *any_skipped)
 {
 	swp_entry_t entry;
 	int nr = 1;

+	*any_skipped = true;
 	entry = pte_to_swp_entry(ptent);
 	if (is_device_private_entry(entry) ||
 	    is_device_exclusive_entry(entry)) {
@@ -1660,7 +1663,7 @@ static inline int zap_nonpresent_ptes(struct mmu_gather *tlb,
 		WARN_ON_ONCE(1);
 	}
 	clear_not_present_full_ptes(vma->vm_mm, addr, pte, nr, tlb->fullmm);
-	zap_install_uffd_wp_if_needed(vma, addr, pte, nr, details, ptent);
+	*any_skipped = zap_install_uffd_wp_if_needed(vma, addr, pte, nr, details, ptent);

 	return nr;
 }
@@ -1669,7 +1672,8 @@ static inline int do_zap_pte_range(struct mmu_gather *tlb,
 				   struct vm_area_struct *vma, pte_t *pte,
 				   unsigned long addr, unsigned long end,
 				   struct zap_details *details, int *rss,
-				   bool *force_flush, bool *force_break)
+				   bool *force_flush, bool *force_break,
+				   bool *any_skipped)
 {
 	pte_t ptent = ptep_get(pte);
 	int max_nr = (end - addr) / PAGE_SIZE;
@@ -1691,10 +1695,11 @@ static inline int do_zap_pte_range(struct mmu_gather *tlb,

 	if (pte_present(ptent))
 		nr += zap_present_ptes(tlb, vma, pte, ptent, max_nr, addr,
-				       details, rss, force_flush, force_break);
+				       details, rss, force_flush, force_break,
+				       any_skipped);
 	else
 		nr += zap_nonpresent_ptes(tlb, vma, pte, ptent, max_nr, addr,
-					  details, rss);
+					  details, rss, any_skipped);

 	return nr;
 }
@@ -1705,6 +1710,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 				struct zap_details *details)
 {
 	bool force_flush = false, force_break = false;
+	bool any_skipped = false;
 	struct mm_struct *mm = tlb->mm;
 	int rss[NR_MM_COUNTERS];
 	spinlock_t *ptl;
@@ -1725,7 +1731,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 			break;

 		nr = do_zap_pte_range(tlb, vma, pte, addr, end, details, rss,
-				      &force_flush, &force_break);
+				      &force_flush, &force_break, &any_skipped);
 		if (unlikely(force_break)) {
 			addr += nr * PAGE_SIZE;
 			break;

From patchwork Wed Dec 4 11:09:48 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13893589
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org, peterx@redhat.com, akpm@linux-foundation.org
Cc: mgorman@suse.de, catalin.marinas@arm.com, will@kernel.org, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, x86@kernel.org, lorenzo.stoakes@oracle.com, zokeefe@google.com, rientjes@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Qi Zheng
Subject: [PATCH v4 08/11] mm: make zap_pte_range() handle full within-PMD range
Date: Wed, 4 Dec 2024 19:09:48 +0800
Message-Id: <76c95ee641da7808cd66d642ab95841df4048295.1733305182.git.zhengqi.arch@bytedance.com>

In preparation for reclaiming empty PTE pages, this commit first makes
zap_pte_range() handle the full within-PMD range, so that subsequent
commits can more easily detect and free PTE pages in this function.
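
The control flow this commit describes can be modelled in user space. The sketch below rests on assumptions and is not the kernel implementation: process_batch(), process_full_range() and BATCH_MAX are hypothetical names, with BATCH_MAX playing the role of a forced early break. It shows a function that keeps retrying until the whole range has been handled, so its caller never sees a partially processed range.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical batch limit that forces an early break. */
#define BATCH_MAX 8

/*
 * Process entries in [addr, end) but stop after BATCH_MAX entries.
 * Returns the index at which processing stopped.
 */
static size_t process_batch(int *entries, size_t addr, size_t end)
{
	size_t done = 0;

	while (addr < end && done < BATCH_MAX) {
		entries[addr++] = 0;	/* "zap" the entry */
		done++;
	}
	return addr;
}

/*
 * Keep calling process_batch() until the whole range is handled.
 * *passes counts how many rounds were needed.
 */
static size_t process_full_range(int *entries, size_t addr, size_t end,
				 unsigned int *passes)
{
	assert(addr <= end);
	*passes = 0;
retry:
	addr = process_batch(entries, addr, end);
	(*passes)++;
	if (addr != end)
		goto retry;	/* the patch also calls cond_resched() here */
	return addr;
}
```

With 20 entries and a batch limit of 8, the model needs three passes (8, 8 and 4 entries) before it returns.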
Signed-off-by: Qi Zheng Reviewed-by: Jann Horn --- mm/memory.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/mm/memory.c b/mm/memory.c index fdefa551d1250..36a59bea289d1 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1718,6 +1718,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, pte_t *pte; int nr; +retry: tlb_change_page_size(tlb, PAGE_SIZE); init_rss_vec(rss); start_pte = pte = pte_offset_map_lock(mm, pmd, addr, &ptl); @@ -1757,6 +1758,13 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, if (force_flush) tlb_flush_mmu(tlb); + if (addr != end) { + cond_resched(); + force_flush = false; + force_break = false; + goto retry; + } + return addr; } From patchwork Wed Dec 4 11:09:49 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13893590 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 07DCDE7716B for ; Wed, 4 Dec 2024 11:11:32 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 9B45C6B00A1; Wed, 4 Dec 2024 06:11:31 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 965FA6B00A2; Wed, 4 Dec 2024 06:11:31 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 805766B00A3; Wed, 4 Dec 2024 06:11:31 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 628866B00A1 for ; Wed, 4 Dec 2024 06:11:31 -0500 (EST) Received: from smtpin17.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 27CAEC0EC3 for ; Wed, 4 Dec 2024 11:11:31 +0000 (UTC) X-FDA: 82857009732.17.8B38D3F Received: from mail-pl1-f180.google.com 
(mail-pl1-f180.google.com [209.85.214.180]) by imf05.hostedemail.com (Postfix) with ESMTP id F406B10000E for ; Wed, 4 Dec 2024 11:11:01 +0000 (UTC) Authentication-Results: imf05.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=jrUgwzBW; spf=pass (imf05.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.214.180 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com; dmarc=pass (policy=quarantine) header.from=bytedance.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1733310679; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=qoapVpNXl28ZG5UuJ7jnDD6Hy269eKhZbzIlX0dbMls=; b=scWJfeqjishgYCdb+9xDAzpkbxzwL3IyFrVGWCKSUcSiiCIdZEnBxk729CspQ+UFn1mGs5 m1EZ/Ay+jhEYKGSai6YdRKb0uxXOa/mMMQYELEChaQDbu/NLlJBwTsxTsI1DNQysdhp+V5 vMG/EbyUR18fXzt98dpVT73g97LsbR8= ARC-Authentication-Results: i=1; imf05.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=jrUgwzBW; spf=pass (imf05.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.214.180 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com; dmarc=pass (policy=quarantine) header.from=bytedance.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1733310679; a=rsa-sha256; cv=none; b=Z2mH4zrmEcoHU8n+xVyzXsWKXcbNHsmduHLIo0/l6C7OXNr6FE/8iHtYshAhGM719mF3ZG 7xI/4OQDtq4FJGFP0+9pDmEhdCHjy27C8DTUPGsfs3DdwPJxN1Va2Km6n4ex0eeWFvGFkH /BAGiVGtz+tbzvcE++ocxJTrJ+kNhng= Received: by mail-pl1-f180.google.com with SMTP id d9443c01a7336-215c54e5f24so15640635ad.2 for ; Wed, 04 Dec 2024 03:11:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1733310688; x=1733915488; darn=kvack.org; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=qoapVpNXl28ZG5UuJ7jnDD6Hy269eKhZbzIlX0dbMls=; b=jrUgwzBWuMfHupOvhZTL33IuK+Q3H/A4cAOSGazhPgT92JSKznGrWwCUTtVeyaZx9D IZs3Uddmsc/DPbtLv34auN+Eha2AKoJQqkXS3AMjzOTV4bZcNQtOFV01MUFGfSFL2WLe hnrxu4qkkcnYEWF3h487ovV5lfKuCjeAdEKGxxs+fCIApmvbjMQSwyc8Fcgv/dQclEC8 Hpd8vfWqFqTaP8f+bwWmRQ4wzMW+G3KQT8i4quB/20wPewxXfynTDGGnVGNrs6gxstSQ Hj3GcghsSGrNJufnhK1kcIFwfF/f2iHy1+dGxVgrdK+tOAZ5difFeA4mOdiHhKWcdSzs sUyA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1733310688; x=1733915488; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=qoapVpNXl28ZG5UuJ7jnDD6Hy269eKhZbzIlX0dbMls=; b=f08ugvgQ5YAC5lPoQbNj5N6b4C1rutTkKO6GAxpJOKHHe3Pat1CM27O2rrSaM0lMPA XHtKSRJO3EyNXJEuF/Ltfe+Hsew0pq2yX0W3UaB6UCs6TxD1fTy0E+gowQkYIjnoxjLL xa1SFdaGGgfmKwjGZk6QGvHtjbK6jt12JqJAqyxgNWwx42yOrgP9QsHnkCRuKZhYU1J0 vALcRdY5Oang/WPe0XwxP5Rloece5aci3DVQrlk3VfW2foc1f4zSUq5+bkvRlEo5tK2i KoF130G48afgd+o1PX6LcHLy1pCddXhUgdtNCON9yo6deM68opewLkNFvLfGxBGjuxqS 7hAQ== X-Forwarded-Encrypted: i=1; AJvYcCU7ZY6rnBPAJDAwx+iozVCd84Kf62WXYLhwuMmyAFJtLwXLaI2+OzmGUAE5FiGXRQrfd6SERLwAAQ==@kvack.org X-Gm-Message-State: AOJu0YxGN6Tve73011az9HvgDY3WZc694g0BirVMqholJC3WUadVGjBm ky65iSiTmrdBhV1XbAnpPpHQJLB+OFeqWDsjpHWF/6GCIzwQzjU/4icTLDOQ7rkWhiThB40mZ43 S X-Gm-Gg: ASbGnctvqS/w8k2nNTluuQszp0l5sBcpiEWHqdi++eBj0Ae6R+Eq1S73Kl7akVHEIUe +ZjoAJji+1tlf3mO056U/oNvsnowMosVKwChkOqbsBCoIY/2DkdPu/adc3HyYFgoouAbMyn+XQN d3gkmHCLkrs8xoIZii5h0ca1UA475Gzg1WyLYnjJBsaDxOA7ed82x6d0qUfADVxhTPuifb8lqUd eRajA7LAeG9tibfQz/1QUlrLfNvZpjp18SfZ+1M3T8OXLadC7Dp6zj4wgfTBAa4raPIYfwNCVG/ RHkUFQ7LWehV2SE= X-Google-Smtp-Source: AGHT+IEfEQGVP8Gt1483n1zkgrWVjfWQwHKtLl/KnEOEh9tvsaCxkNv7R/GEU8g81pmpLtGbqUrK9w== X-Received: by 2002:a17:903:2442:b0:215:a98c:aabb with SMTP id d9443c01a7336-215bd0e66a5mr90103075ad.24.1733310688006; Wed, 04 Dec 
2024 03:11:28 -0800 (PST) Received: from C02DW0BEMD6R.bytedance.net ([203.208.167.148]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-21527515731sm107447495ad.192.2024.12.04.03.11.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Dec 2024 03:11:27 -0800 (PST) From: Qi Zheng To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org, peterx@redhat.com, akpm@linux-foundation.org Cc: mgorman@suse.de, catalin.marinas@arm.com, will@kernel.org, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, x86@kernel.org, lorenzo.stoakes@oracle.com, zokeefe@google.com, rientjes@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Qi Zheng Subject: [PATCH v4 09/11] mm: pgtable: reclaim empty PTE page in madvise(MADV_DONTNEED) Date: Wed, 4 Dec 2024 19:09:49 +0800 Message-Id: <92aba2b319a734913f18ba41e7d86a265f0b84e2.1733305182.git.zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: References: MIME-Version: 1.0 X-Rspamd-Server: rspam05 X-Stat-Signature: f77zhymmhsmfgbzgfi4mz1nia1yxigxc X-Rspamd-Queue-Id: F406B10000E X-Rspam-User: X-HE-Tag: 1733310661-228700 X-HE-Meta: 
U2FsdGVkX18epXr/oQ/okWtnZw8msq6PYZFp79vrb0A6cjz121F/uff8bZTy/p37uOzwih9w3UaxGpKl314c29aet5+ewvbcZj+OKjVi9RUiplrJXe46JIAp5E3QVAUevEyTBdta6Fd2fhbHcRVfk0QnvsQ2M/0AOOECapob5dBVizaiTuvnQK37HhM85ZVZe1URZdhHet9tJU8bNRQ7gRlqE/P+IJtsH1mEwpSNYXNVDqj8rOicEK96v8UbGsDYrTFiF5BI/RayuI2pZvE/bPkLgupAdu4PCzPWm5zW99ftzi6IAsNE16Z8zZSCZQrw/aXaYYUeJgmF9AIHZvSK1ot+SYzVb/USoGCgvGrGcx8J8TKw+M7PziPpQvx7RLw8mQt/y4qduEu5XNV9FBO/L/A+Ua7laHWy06uvvfQv1oHkqQ7WiLHzia0yOk14kbrtLNvziJIr18edFRSF5X8g9cykDEv8Cy7wReqjoN5ZcDIUpb2fvwYB0Xrc4fl0xEbYpVrqjq73hH687/9izXwgJnhaicLEPoyDTxMHHOFWxpb6L6VheOR2CgzaeNENaHBZetRgoMdf6vCo+fwJsfQ6m2mq3g30HnG8g4FcRJ4CBZCgNGIQNgxGwRMcR2gohFhXI4scjyzr0zQoZ/LZZnkDW3dec404aAUy2Rk1OtfEttUPq8QWQLF/WqLd7ghDGEeYIMBSLmXJ6w8td+neNH+jPD5EziwHxkoLKI37fD/ZdnOffgrNWWne6PpAv8dgEz4KdQU4dJiI88mcdLy/ZKwTienn7nFIez6gwSf1/MqK5OP3eUhFZVGqZYTvH4sJpPSm7savn50j0u5b56cGqjDtBRvJHyyq8b8QoXXxTRnC7yVZd57VXQwfgA/k8j2fb61yFiKnOoXU0iiDs8Ihs01kQlGtIK5CL6+Jw5zXBgfQlXqVyTbSEu4PxsEplumpxA7HbqvbFfXOJPl57uUwsFH 3m7MKfPf KGD27iZAlNEMnaGyDkYQpfX5MfTZfdvvWj4/72/7Fj+yP9fUI5H3CopQZCCV/crYLwLmkOdqIFy3JN/JrASWzxmuKOwgr75iqyuEOdd3bcCP88I6xIaLhJGoPDn5vmUBoRTzyu3e7SAH8HBlep+jc78dEnWjL4gbLkLqa2t+WML4lYKj1FPkwpDzMxdvQ7XFA5RqvAnG9ntharJOHDOJjP0yQnlMONeniwOKHjmNO56whWvW6dIGR2+wBXOhUzWdpRWh4lkTdUz1hA4HlaI/OzNj5aiPNnwAIEyN6vmvVEf30D3sujMin2OEGueDrMY/kYfVTpsLxPw5bf/xOzpOUbtT1oqCjLlTQyi59iYd98MDnsDI= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Now in order to pursue high performance, applications mostly use some high-performance user-mode memory allocators, such as jemalloc or tcmalloc. These memory allocators use madvise(MADV_DONTNEED or MADV_FREE) to release physical memory, but neither MADV_DONTNEED nor MADV_FREE will release page table memory, which may cause huge page table memory usage. 
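To get a feel for the scale involved, the following user-space sketch (not part of the patch) estimates how much memory last-level PTE pages need for a given amount of virtual address space, and runs a touch/MADV_DONTNEED loop so that VmPTE can be read from /proc/self/status afterwards. It assumes 4 KiB pages with 512 entries per PTE page (the x86-64 layout); the helper names pte_pages_kb(), read_vmpte_kb() and touch_and_release() are made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define PAGE_SZ     4096ULL                 /* assumed base page size */
#define PTES_PER_PG 512ULL                  /* assumed entries per PTE page */
#define PTE_SPAN    (PAGE_SZ * PTES_PER_PG) /* 2 MiB mapped by one PTE page */

/* KiB of PTE pages needed if every 2 MiB slot of virt_bytes has one. */
static unsigned long long pte_pages_kb(unsigned long long virt_bytes)
{
	return (virt_bytes / PTE_SPAN) * (PAGE_SZ / 1024);
}

/* Return this process's VmPTE in KiB, or -1 if it cannot be read. */
static long read_vmpte_kb(void)
{
	FILE *f = fopen("/proc/self/status", "r");
	char line[256];
	long kb = -1;

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		if (sscanf(line, "VmPTE: %ld kB", &kb) == 1)
			break;
	}
	fclose(f);
	return kb;
}

/*
 * Touch and then MADV_DONTNEED every 2 MiB chunk of a total-byte anonymous
 * mapping. The mapping is left in place so VmPTE can be inspected afterwards.
 * Returns the 2 MiB-aligned start of the region, or NULL on failure.
 */
static char *touch_and_release(size_t total)
{
	size_t len = total + PTE_SPAN; /* one spare chunk for manual alignment */
	char *raw, *base;
	size_t off;

	if (total % PTE_SPAN)
		return NULL;
	raw = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
	if (raw == MAP_FAILED)
		return NULL;
	base = (char *)(((uintptr_t)raw + PTE_SPAN - 1) &
			~(uintptr_t)(PTE_SPAN - 1));

	for (off = 0; off < total; off += PTE_SPAN) {
		memset(base + off, 1, PTE_SPAN);
		if (madvise(base + off, PTE_SPAN, MADV_DONTNEED)) {
			munmap(raw, len);
			return NULL;
		}
	}
	return base;
}
```

Under these assumptions a fully populated 50 GiB mapping needs 25600 PTE pages, i.e. 102400 KiB, and 55 TiB needs 110 GiB. Calling touch_and_release() and then read_vmpte_kb() on kernels with and without CONFIG_PT_RECLAIM is one way to compare the two behaviours. If transparent huge pages are in use for the mapping, a chunk may be backed by a huge PMD, in which case no PTE page is allocated for it.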
The following is a memory usage snapshot of one process which actually
happened on our server:

        VIRT:  55t
        RES:   590g
        VmPTE: 110g

In this case, most of the page table entries are empty. For such a PTE
page where all entries are empty, we can actually free it back to the
system for others to use.

As a first step, this commit aims to synchronously free the empty PTE
pages in madvise(MADV_DONTNEED) case. We will detect and free empty PTE
pages in zap_pte_range(), and will add zap_details.reclaim_pt to exclude
cases other than madvise(MADV_DONTNEED).

Once an empty PTE page is detected, we first try to hold the pmd lock
within the pte lock. If successful, we clear the pmd entry directly (fast
path). Otherwise, we wait until the pte lock is released, then re-hold the
pmd and pte locks and loop PTRS_PER_PTE times to check pte_none() to
re-detect whether the PTE page is empty and free it (slow path).

For other cases such as madvise(MADV_FREE), consider scanning and freeing
empty PTE pages asynchronously in the future.

The following code snippet can show the effect of optimization:

        mmap 50G
        while (1) {
                for (; i < 1024 * 25; i++) {
                        touch 2M memory
                        madvise MADV_DONTNEED 2M
                }
        }

As we can see, the memory usage of VmPTE is reduced:

                before          after
VIRT           50.0 GB        50.0 GB
RES             3.1 MB         3.1 MB
VmPTE        102640 KB         240 KB

Signed-off-by: Qi Zheng
---
 include/linux/mm.h |  1 +
 mm/Kconfig         | 15 ++++++++++
 mm/Makefile        |  1 +
 mm/internal.h      | 19 +++++++++++++
 mm/madvise.c       |  7 ++++-
 mm/memory.c        | 21 ++++++++++++--
 mm/pt_reclaim.c    | 71 ++++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 132 insertions(+), 3 deletions(-)
 create mode 100644 mm/pt_reclaim.c

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 12fb3b9334269..8f3c824ee5a77 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2319,6 +2319,7 @@ extern void pagefault_out_of_memory(void);
 struct zap_details {
 	struct folio *single_folio;	/* Locked folio to be unmapped */
 	bool even_cows;			/* Zap COWed private pages too? */
+	bool reclaim_pt;		/* Need reclaim page tables? */
 	zap_flags_t zap_flags;		/* Extra flags for zapping */
 };
 
diff --git a/mm/Kconfig b/mm/Kconfig
index 84000b0168086..7949ab121070f 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1301,6 +1301,21 @@ config ARCH_HAS_USER_SHADOW_STACK
 	  The architecture has hardware support for userspace shadow call
 	  stacks (eg, x86 CET, arm64 GCS or RISC-V Zicfiss).
 
+config ARCH_SUPPORTS_PT_RECLAIM
+	def_bool n
+
+config PT_RECLAIM
+	bool "reclaim empty user page table pages"
+	default y
+	depends on ARCH_SUPPORTS_PT_RECLAIM && MMU && SMP
+	select MMU_GATHER_RCU_TABLE_FREE
+	help
+	  Try to reclaim empty user page table pages in paths other than munmap
+	  and exit_mmap path.
+
+	  Note: now only empty user PTE page table pages will be reclaimed.
+
+
 source "mm/damon/Kconfig"
 
 endmenu
diff --git a/mm/Makefile b/mm/Makefile
index dba52bb0da8ab..850386a67b3e0 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -146,3 +146,4 @@ obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o
 obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
 obj-$(CONFIG_EXECMEM) += execmem.o
 obj-$(CONFIG_TMPFS_QUOTA) += shmem_quota.o
+obj-$(CONFIG_PT_RECLAIM) += pt_reclaim.o
diff --git a/mm/internal.h b/mm/internal.h
index 74713b44bedb6..3958a965e56e1 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1545,4 +1545,23 @@ int walk_page_range_mm(struct mm_struct *mm, unsigned long start,
 		unsigned long end, const struct mm_walk_ops *ops,
 		void *private);
 
+/* pt_reclaim.c */
+bool try_get_and_clear_pmd(struct mm_struct *mm, pmd_t *pmd, pmd_t *pmdval);
+void free_pte(struct mm_struct *mm, unsigned long addr, struct mmu_gather *tlb,
+	      pmd_t pmdval);
+void try_to_free_pte(struct mm_struct *mm, pmd_t *pmd, unsigned long addr,
+		     struct mmu_gather *tlb);
+
+#ifdef CONFIG_PT_RECLAIM
+bool reclaim_pt_is_enabled(unsigned long start, unsigned long end,
+			   struct zap_details *details);
+#else
+static inline bool reclaim_pt_is_enabled(unsigned long start, unsigned long end,
+					 struct zap_details *details)
+{
+	return false;
+}
+#endif /* CONFIG_PT_RECLAIM */
+
+
 #endif	/* __MM_INTERNAL_H */
diff --git a/mm/madvise.c b/mm/madvise.c
index 0ceae57da7dad..49f3a75046f63 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -851,7 +851,12 @@ static int madvise_free_single_vma(struct vm_area_struct *vma,
 static long madvise_dontneed_single_vma(struct vm_area_struct *vma,
 					unsigned long start, unsigned long end)
 {
-	zap_page_range_single(vma, start, end - start, NULL);
+	struct zap_details details = {
+		.reclaim_pt = true,
+		.even_cows = true,
+	};
+
+	zap_page_range_single(vma, start, end - start, &details);
 	return 0;
 }
 
diff --git a/mm/memory.c b/mm/memory.c
index 36a59bea289d1..1fc1f14839916 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1436,7 +1436,7 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
 static inline bool should_zap_cows(struct zap_details *details)
 {
 	/* By default, zap all pages */
-	if (!details)
+	if (!details || details->reclaim_pt)
 		return true;
 
 	/* Or, we zap COWed pages only if the caller wants to */
@@ -1710,12 +1710,15 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 				struct zap_details *details)
 {
 	bool force_flush = false, force_break = false;
-	bool any_skipped = false;
 	struct mm_struct *mm = tlb->mm;
 	int rss[NR_MM_COUNTERS];
 	spinlock_t *ptl;
 	pte_t *start_pte;
 	pte_t *pte;
+	pmd_t pmdval;
+	unsigned long start = addr;
+	bool can_reclaim_pt = reclaim_pt_is_enabled(start, end, details);
+	bool direct_reclaim = false;
 	int nr;
 
 retry:
@@ -1728,17 +1731,24 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	flush_tlb_batched_pending(mm);
 	arch_enter_lazy_mmu_mode();
 	do {
+		bool any_skipped = false;
+
 		if (need_resched())
 			break;
 
 		nr = do_zap_pte_range(tlb, vma, pte, addr, end, details, rss,
 				      &force_flush, &force_break, &any_skipped);
+		if (any_skipped)
+			can_reclaim_pt = false;
 		if (unlikely(force_break)) {
 			addr += nr * PAGE_SIZE;
 			break;
 		}
 	} while (pte += nr, addr += PAGE_SIZE * nr, addr != end);
 
+	if (can_reclaim_pt && addr == end)
+		direct_reclaim = try_get_and_clear_pmd(mm, pmd, &pmdval);
+
 	add_mm_rss_vec(mm, rss);
 	arch_leave_lazy_mmu_mode();
 
@@ -1765,6 +1775,13 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 		goto retry;
 	}
 
+	if (can_reclaim_pt) {
+		if (direct_reclaim)
+			free_pte(mm, start, tlb, pmdval);
+		else
+			try_to_free_pte(mm, pmd, start, tlb);
+	}
+
 	return addr;
 }
 
diff --git a/mm/pt_reclaim.c b/mm/pt_reclaim.c
new file mode 100644
index 0000000000000..6540a3115dde8
--- /dev/null
+++ b/mm/pt_reclaim.c
@@ -0,0 +1,71 @@
+// SPDX-License-Identifier: GPL-2.0
+#include
+#include
+#include
+
+#include "internal.h"
+
+bool reclaim_pt_is_enabled(unsigned long start, unsigned long end,
+			   struct zap_details *details)
+{
+	return details && details->reclaim_pt && (end - start >= PMD_SIZE);
+}
+
+bool try_get_and_clear_pmd(struct mm_struct *mm, pmd_t *pmd, pmd_t *pmdval)
+{
+	spinlock_t *pml = pmd_lockptr(mm, pmd);
+
+	if (!spin_trylock(pml))
+		return false;
+
+	*pmdval = pmdp_get_lockless(pmd);
+	pmd_clear(pmd);
+	spin_unlock(pml);
+
+	return true;
+}
+
+void free_pte(struct mm_struct *mm, unsigned long addr, struct mmu_gather *tlb,
+	      pmd_t pmdval)
+{
+	pte_free_tlb(tlb, pmd_pgtable(pmdval), addr);
+	mm_dec_nr_ptes(mm);
+}
+
+void try_to_free_pte(struct mm_struct *mm, pmd_t *pmd, unsigned long addr,
+		     struct mmu_gather *tlb)
+{
+	pmd_t pmdval;
+	spinlock_t *pml, *ptl;
+	pte_t *start_pte, *pte;
+	int i;
+
+	pml = pmd_lock(mm, pmd);
+	start_pte = pte_offset_map_rw_nolock(mm, pmd, addr, &pmdval, &ptl);
+	if (!start_pte)
+		goto out_ptl;
+	if (ptl != pml)
+		spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
+
+	/* Check if it is empty PTE page */
+	for (i = 0, pte = start_pte; i < PTRS_PER_PTE; i++, pte++) {
+		if (!pte_none(ptep_get(pte)))
+			goto out_ptl;
+	}
+	pte_unmap(start_pte);
+
+	pmd_clear(pmd);
+
+	if (ptl != pml)
+		spin_unlock(ptl);
+	spin_unlock(pml);
+
+	free_pte(mm, addr, tlb, pmdval);
+
+	return;
+out_ptl:
+	if (start_pte)
+		pte_unmap_unlock(start_pte, ptl);
+	if (ptl != pml)
+		spin_unlock(pml);
+}

From patchwork Wed Dec 4 11:09:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13893591 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D02E2E7716D for ; Wed, 4 Dec 2024 11:11:39 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 632056B00A3; Wed, 4 Dec 2024 06:11:39 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 5E1B96B00A4; Wed, 4 Dec 2024 06:11:39 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 45B876B00A5; Wed, 4 Dec 2024 06:11:39 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 2791E6B00A3 for ; Wed, 4 Dec 2024 06:11:39 -0500 (EST) Received: from smtpin04.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id D7E45160F4B for ; Wed, 4 Dec 2024 11:11:38 +0000 (UTC) X-FDA: 82857010824.04.1315E10 Received: from mail-pf1-f176.google.com (mail-pf1-f176.google.com [209.85.210.176]) by imf28.hostedemail.com (Postfix) with ESMTP id 7E51DC0010 for ; Wed, 4 Dec 2024 11:11:19 +0000 (UTC) Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=YlzE8FBa; dmarc=pass (policy=quarantine) header.from=bytedance.com; spf=pass (imf28.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.210.176 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1733310690; a=rsa-sha256; cv=none; b=3mTfPbkaMAyjLOJwx/5+HukQY8fQ6vxVlBW5WRDkmN0wD7JWgatl+YjzBRbUf1QR7+UUXM
QXbfPfyEY9ZVHkbWJJPVRR6Bkbx3hNUZV6I2433Fvx0lzQEH40gsVl3y9IaoeNc3usUP0S Vr4kc0GLpiH3EuPJ2vBjekgQekq0S9U= ARC-Authentication-Results: i=1; imf28.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=YlzE8FBa; dmarc=pass (policy=quarantine) header.from=bytedance.com; spf=pass (imf28.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.210.176 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1733310690; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=9fSWWcfeV8hzoFpFk7VaZVUYbnww7eiSUjs7tN8xr0c=; b=JT/y/Ri/z+6/G7kjcAxaUGXVh+lVnjyrb69/VDMPvrR9k4+bOXEaodVlsHUUVkNJ4fB186 jbUNUAhCeVClNzNj/oGjm6jtMqkBqlALsI1bJyZuXqWeI59AtHc5xDOtdtFjv1zMmI7aZ6 56WDCVdWGkIvZ1HrFbUH4yi/1+esT+w= Received: by mail-pf1-f176.google.com with SMTP id d2e1a72fcca58-724f383c5bfso5158277b3a.1 for ; Wed, 04 Dec 2024 03:11:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1733310696; x=1733915496; darn=kvack.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=9fSWWcfeV8hzoFpFk7VaZVUYbnww7eiSUjs7tN8xr0c=; b=YlzE8FBa/d9n1VxLgRP5yPcP6X9yh/0aoS8IdMD0U9pstDmnvGvTBDDVFJLc86JbWG AYhxTLTEYbgCpmeWYupNJSZDWmhGK22vR2Arz4YUsdliT1Esfox6Aw76HcK8Od4mOq1w IKoJwtQ2k+X7mzN/XLv+5qTY7YAfi3iTgKWDYpUZoaWYBvL2pMAGIUpIQYVtiMYRMa8o 7fclpUHgQV7zMZiWY6SrRXuuV3pT38CmikZ4s9M9D873NVG5BxY0+LwYO0E5BsoqfM5R jROumj0KsmZlqSkAkZwk+qld8Hj/0r5ZxGd0hv1Se5alY7ls/U5QiA9nDs+CshtO/PZb jtOA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1733310696; x=1733915496; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=9fSWWcfeV8hzoFpFk7VaZVUYbnww7eiSUjs7tN8xr0c=; b=PBg48eamEV4CA+vvzR1BlfQiVDRnqKrXY1uaCoyb5ZPCKYFHXOElhYcbcB5FHbeoyb 0mCszVIPfWOVJvMkueriR20Xxv/VcSAcpskCExdn8kSh4gWjU5t6cULHGni+Stmf6Vzp 7gyUu0MG15MMkenRnJgvlQttdSIwOguRZym5czuat+8Zxma5CISPQEByQnX9OBNOfJw6 WXE6Zsoekwu6tAZGTzdVlQXGzOaTIB38EP0kbugI0/5eaZqhEf3srPIFU+y/+Ik7nWaw 3K0P4QPF/ux+3HDyKZ8naF9NlQWNd9TExWUrbxB/pw7IFnmhIJPhtKpWpQnK9viyaNos BQkg== X-Forwarded-Encrypted: i=1; AJvYcCWruMDpnY+7plUiYDB6p79o3AjGUGLiidPtLvnzuWQt+KU3n2Kqy5tptMYljCe3PgVjaQ4ZV9MMSw==@kvack.org X-Gm-Message-State: AOJu0YxFGKJtqpH6JZzIyjr2AChHKL7+nNiszz0eWw43UPN8lWQTSXkY o2ajq5CIm+y/+45GwB/oK2j359khM1ZwjBbFV2MA61nWX+L3OjaKkJqUvDj0SPY= X-Gm-Gg: ASbGncvSF8p8prCuFHb0sILgxe/71C0ZibtWMT6SuY2x2BjknS6PjUYCKG8xmAtXXF+ 8RtGysaibQhSd6UsDf6etQm7Q2ZivytE17U6euBzz+LZLNV/pXhAy/Z/ILCJj87kvha9/dl6RSF +m3HDJgcNNs+YQv6ry3RpIZApcXtOt8c4QJQxs2AKKQekyKUCpF8Gx+oyWC5bczNAFd3VsnhBks 4VRZp0p0mnOrkavPlpV9uGOfAXXQ+W5aJAj2OnCIJDVnKYKzaTq2U7v5Nr4c4vSCXon70g+DklZ Gk9HM7U3FEcRsnI= X-Google-Smtp-Source: AGHT+IFovq7hDlm4wLdDhm1VdKh1zZ19bET/1mJY6NLSrbW7cKbNUzk3vayQjzm+UlBEnc1Q8UAnEw== X-Received: by 2002:a17:902:ea10:b0:215:a190:ba28 with SMTP id d9443c01a7336-215d0041378mr59888075ad.22.1733310695765; Wed, 04 Dec 2024 03:11:35 -0800 (PST) Received: from C02DW0BEMD6R.bytedance.net ([203.208.167.148]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-21527515731sm107447495ad.192.2024.12.04.03.11.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Dec 2024 03:11:35 -0800 (PST) From: Qi Zheng To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org, peterx@redhat.com, akpm@linux-foundation.org Cc: mgorman@suse.de, catalin.marinas@arm.com, will@kernel.org, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, x86@kernel.org, 
lorenzo.stoakes@oracle.com, zokeefe@google.com, rientjes@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Qi Zheng Subject: [PATCH v4 10/11] x86: mm: free page table pages by RCU instead of semi RCU Date: Wed, 4 Dec 2024 19:09:50 +0800 Message-Id: <0287d442a973150b0e1019cc406e6322d148277a.1733305182.git.zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: References: MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 7E51DC0010 X-Stat-Signature: r16yz8bobab8jjtqbgyqf8xmspk1o7qa X-HE-Tag: 1733310679-266200 X-HE-Meta: U2FsdGVkX18Br+yUFfhvvf8ppidm4yYoQxG9THLZI3a3rgCiYMg0LrWsx/Q5KujffMPFoRdlr5BG4vBoqsr4I8FSoBlZbU2pUxygtaWbVrkFM3lbzQYoQoNHxuT4S0/NuEyrke7+SJJOeeBIIYd501m/zfsk5NnDs4oUmMpZTwfILLs0inAgABaJc3wvEKKHqKmb9/zIygI7EkQyOTY8tCfsrP1gfjkhV+GWatyMowoghZ6awluOsL0yJhycazoCEsRKmwTGSutRz7V6C00B6KundCY+FS3aham6Spue6KYBHjQvSNx4GyS82mHnxwu+/j0tpVMjED5AErzGAcDI+fXakgyRsXEF/PD7z9NAjjE6UtxqRSoeVeAJhXozvQj0TomF6RqpnT0+eEf2zcxI8TevkO0skwAIP0+CZS52a80Y4yCLGTHjJH9vop0voZlusMsY1SJFnhGGGPJFNzO/91weUA5qCg5uMEDlNQ863aKuaN9cllcFJ/LHnXnjrk9xGXfAcW6TLeEjVJxeVVMTmZJFYBgB4rw792U9/bHoBpSfzUvujUcV81pHNDLTdXaQm8a3cmqwfLzyotpI2IWVqhGrUh0MieFHywSDuJItyCKu3tojTs2V9CTEzmZjCjH9lOdtBdgtlkfZACk/rtS+nNoWrJg4UvdigX5pG0xjtC8hlFppm02cjnjM3+Z+0yIJq/C/QQsgQ2MYglNTLkbD6sLPLaCphp4Osh7GR6XHoMIIQNPJYfwOHeNQTKtl1ezO6pksMmZSuhtA2vjcY3J/NuBHNjBU3OWaBDDqgFB08xgE2ZH9Mxb9cYJvKWdC3k1pOI4+TC37xJnVwLFO5hzibMw08AuwXicmIzN06cX4bvbxgQsZu8gNIm3TXBs0uRfwWBkZe9gyfJJeCnZ+fcK+eza/eWMS86/Npie6Ikxzjq4tdoskIZ0XEHVlYe8JfsRHeTm/qF6qCKHv6I9RSb4 KJM2RRek 
P0m0+h8xi4BE+GZ8RNTgkBn8Bhb/le6yh40jha+UC6NliobBOc7HN9pOhvsSwn5UNXTkg2p5ZRCrHW03kz3n0s9xF2XNaou9+yspn0hnYq3QFO5GA8/ExnmhRSLgL9Gy1GCrAPpIovGybmbMjf5dCXjoSzNReukIW4gk8paMN2oe5Typbm94dE8SaVq6d87A8KrWewxopy45Bo8UFp0W4SPQ+vAZeo9Oq/8mtZU48q6CSsvZFJ13MBhUqlXT6kmpA6BsaD0YbkeDdlFwwRe2pZ54KhW5HZ+iuuTbFmcWb2HmchK68zEIDUhbrxPQJ1O1IdQuOAFtWTJzG6qIjHQV+P1fu9vG/2j9L5SG/qFmUQVtm+15ruTWV7TLM0L8lUipi4AOyv2SIJjhiTtDukAVlAV/OWqAy04tC8U+P X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe:

Now, if CONFIG_MMU_GATHER_RCU_TABLE_FREE is selected, the page table pages
will be freed by semi RCU, that is:

 - batch table freeing: asynchronous free by RCU
 - single table freeing: IPI + synchronous free

In this way, the page table can be traversed locklessly by disabling IRQ in
paths such as fast GUP. But this is not enough to free the empty PTE page
table pages in paths other than munmap and exit_mmap path, because IPI
cannot be synchronized with rcu_read_lock() in pte_offset_map{_lock}().

In preparation for supporting empty PTE page table pages reclamation,
let single table also be freed by RCU like batch table freeing. Then we
can also use pte_offset_map() etc to prevent PTE page from being freed.

Like pte_free_defer(), we can also safely use ptdesc->pt_rcu_head to free
the page table pages:

 - The pt_rcu_head is unioned with pt_list and pmd_huge_pte.

 - For pt_list, it is used to manage the PGD page in x86. Fortunately
   tlb_remove_table() will not be used to free PGD pages, so it is safe
   to use pt_rcu_head.

 - For pmd_huge_pte, it is used for THPs, so it is safe.
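The aliasing argument above can be illustrated with a small user-space model (a sketch only, not kernel code). All types and helpers here (struct cb_head, struct toy_ptdesc, toy_call_rcu(), toy_grace_period(), toy_free_table()) are hypothetical stand-ins: the callback head shares storage with a list node, which is fine as long as a table being freed is never on that list, and the free callback recovers the descriptor from the embedded head with container_of(), in the same spirit as __tlb_remove_table_one_rcu() in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rcu_head. */
struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *head);
};

/* Hypothetical stand-in for struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/* Toy descriptor: the callback head aliases the list node. */
struct toy_ptdesc {
	union {
		struct cb_head pt_rcu_head; /* used only once freeing starts */
		struct list_node pt_list;   /* used only by tables kept on a list */
	};
	int freed;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct cb_head *pending; /* callbacks waiting for a "grace period" */

/* Queue a callback; nothing is freed until toy_grace_period() runs. */
static void toy_call_rcu(struct cb_head *head, void (*func)(struct cb_head *))
{
	head->func = func;
	head->next = pending;
	pending = head;
}

/* Pretend all readers are done: run every queued callback, return the count. */
static int toy_grace_period(void)
{
	struct cb_head *head = pending;
	int n = 0;

	pending = NULL;
	while (head) {
		struct cb_head *next = head->next; /* read before the callback runs */

		head->func(head);
		head = next;
		n++;
	}
	return n;
}

/* Free callback: map the embedded head back to its descriptor. */
static void toy_free_table(struct cb_head *head)
{
	struct toy_ptdesc *pt = container_of(head, struct toy_ptdesc, pt_rcu_head);

	pt->freed = 1;
}
```

The point of the model is the ordering: after toy_call_rcu() the table is still intact, and it is only marked freed once the simulated grace period has elapsed. In the kernel, that gap is what lets a reader holding rcu_read_lock() keep using the PTE page safely.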
After applying this patch, if CONFIG_PT_RECLAIM is enabled, the function
call of free_pte() is as follows:

free_pte
  pte_free_tlb
    __pte_free_tlb
      ___pte_free_tlb
        paravirt_tlb_remove_table
          tlb_remove_table [!CONFIG_PARAVIRT, Xen PV, Hyper-V, KVM]
            [no-free-memory slowpath:]
              tlb_table_invalidate
              tlb_remove_table_one
                __tlb_remove_table_one [frees via RCU]
            [fastpath:]
              tlb_table_flush
                tlb_remove_table_free [frees via RCU]
          native_tlb_remove_table [CONFIG_PARAVIRT on native]
            tlb_remove_table [see above]

Signed-off-by: Qi Zheng
Cc: x86@kernel.org
Cc: Dave Hansen
Cc: Andy Lutomirski
Cc: Peter Zijlstra
---
 arch/x86/include/asm/tlb.h | 20 ++++++++++++++++++++
 arch/x86/kernel/paravirt.c |  7 +++++++
 arch/x86/mm/pgtable.c      | 10 +++++++++-
 include/linux/mm_types.h   |  4 +++-
 mm/mmu_gather.c            |  9 ++++++++-
 5 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index 4d3c9d00d6b6b..73f0786181cc9 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -34,8 +34,28 @@ static inline void __tlb_remove_table(void *table)
 	free_page_and_swap_cache(table);
 }
 
+#ifdef CONFIG_PT_RECLAIM
+static inline void __tlb_remove_table_one_rcu(struct rcu_head *head)
+{
+	struct page *page;
+
+	page = container_of(head, struct page, rcu_head);
+	put_page(page);
+}
+
+static inline void __tlb_remove_table_one(void *table)
+{
+	struct page *page;
+
+	page = table;
+	call_rcu(&page->rcu_head, __tlb_remove_table_one_rcu);
+}
+#define __tlb_remove_table_one __tlb_remove_table_one
+#endif /* CONFIG_PT_RECLAIM */
+
 static inline void invlpg(unsigned long addr)
 {
 	asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
 }
+
 #endif /* _ASM_X86_TLB_H */
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index fec3815335558..89688921ea62e 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -59,10 +59,17 @@ void __init native_pv_lock_init(void)
 		static_branch_enable(&virt_spin_lock_key);
 }
 
+#ifndef CONFIG_PT_RECLAIM
 static void native_tlb_remove_table(struct mmu_gather *tlb, void *table)
 {
 	tlb_remove_page(tlb, table);
 }
+#else
+static void native_tlb_remove_table(struct mmu_gather *tlb, void *table)
+{
+	tlb_remove_table(tlb, table);
+}
+#endif
 
 struct static_key paravirt_steal_enabled;
 struct static_key paravirt_steal_rq_enabled;
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 5745a354a241c..69a357b15974a 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -19,12 +19,20 @@ EXPORT_SYMBOL(physical_mask);
 #endif
 
 #ifndef CONFIG_PARAVIRT
+#ifndef CONFIG_PT_RECLAIM
 static inline
 void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
 {
 	tlb_remove_page(tlb, table);
 }
-#endif
+#else
+static inline
+void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
+{
+	tlb_remove_table(tlb, table);
+}
+#endif /* !CONFIG_PT_RECLAIM */
+#endif /* !CONFIG_PARAVIRT */
 
 gfp_t __userpte_alloc_gfp = GFP_PGTABLE_USER | PGTABLE_HIGHMEM;
 
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 3a35546bac944..706b3c926a089 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -438,7 +438,9 @@ FOLIO_MATCH(compound_head, _head_2a);
  * struct ptdesc -    Memory descriptor for page tables.
  * @__page_flags:     Same as page flags. Powerpc only.
  * @pt_rcu_head:      For freeing page table pages.
- * @pt_list:          List of used page tables. Used for s390 and x86.
+ * @pt_list:          List of used page tables. Used for s390 gmap shadow pages
+ *                    (which are not linked into the user page tables) and x86
+ *                    pgds.
  * @_pt_pad_1:        Padding that aliases with page's compound head.
  * @pmd_huge_pte:     Protected by ptdesc->ptl, used for THPs.
  * @__page_mapping:   Aliases with page->mapping. Unused for page tables.
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index 99b3e9408aa0f..1e21022bcf339 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -311,11 +311,18 @@ static inline void tlb_table_invalidate(struct mmu_gather *tlb)
 	}
 }
 
-static void tlb_remove_table_one(void *table)
+#ifndef __tlb_remove_table_one
+static inline void __tlb_remove_table_one(void *table)
 {
 	tlb_remove_table_sync_one();
 	__tlb_remove_table(table);
 }
+#endif
+
+static void tlb_remove_table_one(void *table)
+{
+	__tlb_remove_table_one(table);
+}
 
 static void tlb_table_flush(struct mmu_gather *tlb)
 {

From patchwork Wed Dec 4 11:09:51 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zheng X-Patchwork-Id: 13893592 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 12D2DE7716D for ; Wed, 4 Dec 2024 11:11:48 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 98C7C6B00A5; Wed, 4 Dec 2024 06:11:47 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 93D5F6B00A6; Wed, 4 Dec 2024 06:11:47 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7B8306B00A7; Wed, 4 Dec 2024 06:11:47 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 5A4AF6B00A5 for ; Wed, 4 Dec 2024 06:11:47 -0500 (EST) Received: from smtpin15.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 03BDE80ED5 for ; Wed, 4 Dec 2024 11:11:46 +0000 (UTC) X-FDA: 82857011076.15.6215F09 Received: from mail-pl1-f176.google.com (mail-pl1-f176.google.com [209.85.214.176]) by imf27.hostedemail.com (Postfix) with ESMTP id 7F0CD4001A for ; Wed, 4 Dec 2024
11:11:27 +0000 (UTC) Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b="cSUJ/vsm"; spf=pass (imf27.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.214.176 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com; dmarc=pass (policy=quarantine) header.from=bytedance.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1733310698; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=S+2TKl5mjR1/DeVU9OKWZNhWFB9aKSgXlEoedfavFGk=; b=dfTp0gYJdGBCoVRhy0iF2YIE9jrdjxHXpVOpFjKSI3zfVlFbA3+RlwLvugu1RS4xKDz8mA ljU0tZt9G6oUuDBFqjTSBWosCXIBAK1HgDIVr/bZlpGYzoiJxP5Ri1mdDvFfvABxVEQ0FX b+I/NOfn6pos5cPbvaV/V8w401ncC9M= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1733310698; a=rsa-sha256; cv=none; b=zxzzDI9vcXbc0jDeLjzhpKoEdn8Aj/1GhokVlSOZ/mc+pVrBV0paT5mzN5Q9zdgZW1gni3 Ah31v8FvYqnCbTBoe1CZa98NHpbwnzeLmj/dGEDgmA9yOysW0s+yc7NDLgE4tteNtiRiqg HokV0/Xx2aQaTJtPxFyJfhbraceKj/s= ARC-Authentication-Results: i=1; imf27.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b="cSUJ/vsm"; spf=pass (imf27.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.214.176 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com; dmarc=pass (policy=quarantine) header.from=bytedance.com Received: by mail-pl1-f176.google.com with SMTP id d9443c01a7336-215e194b65aso3573975ad.1 for ; Wed, 04 Dec 2024 03:11:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1733310704; x=1733915504; darn=kvack.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=S+2TKl5mjR1/DeVU9OKWZNhWFB9aKSgXlEoedfavFGk=; 
b=cSUJ/vsmcbJPJ39NNugmuSgh5smZuWhBO4EIN5TNQyfmtXjTgwQ6ZxPp5N9sZEgPMs aKVi9/uuehI44fn853ypEiQa97Fan3Y0pie7tmOf2DQQcrKmC0NLBP035OF3yRYrrzVc 9kKNN8BMN1sB7AWnIS/EdM+1Xq31VbIpuYD0DD7q/ZHRhHeq8mtjVfXYmehixii5Sgu6 elOSH7hUyfY4eDagmEbxApDrY1QJVvER38gVb3M2yXSZ6L64x+0axH6j3V32D9qCCirq iHNKgWOaex3Ki6ZsX2iuLy+Wrc49lPAW3A1mwstEl2CIYtEcgD8ocfCqAjkM7f4m4pqJ TBKQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1733310704; x=1733915504; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=S+2TKl5mjR1/DeVU9OKWZNhWFB9aKSgXlEoedfavFGk=; b=NI4h73UUVWS24JDG5ev4mbcAmaLxYJ9StbpWImoBwdWYPeRhCmg0lmPvZfRyVZ5HCs 5fDgCXBt089Xw6BL0zyRou8fpqKre58CIA3kqhwu42963isif9x/rVww5Tm/zemao5/5 F81tW2mD0HIRNrcmDjwZ9OX366KuxsiKPYMD+/N+ONGF2DRi91f3xTXzlh2NwXfg2or0 7YnHl2CEq063lnFGRXLmH3s3Zs67V2EeoO9WNmxO7HRmQOj3LXDr5A/eceF3QtqkYCZe VgxwYXFpjMIkV45pPiKWVvpucW2FDx117tGhyLWq82LTKAhqD1jX4IvnWvDJKww3hjdF qjhQ== X-Forwarded-Encrypted: i=1; AJvYcCUszSLLSQeSPDmApSS1RK2igiZmaWaSCrhwSdF1y0kyqXY2Tiahvf4tQPALhN9dAFHBUk9lfxlRVw==@kvack.org X-Gm-Message-State: AOJu0Yy0QalAybjgzU1/XrqXLfHorrMTII5elb4s9p+QbRDWHOQ97tHP mWzw88Y9MuHTmFtqoWbsoXJkVTDnGlb1lYD2Vh9sL0DeEA2TXQ5neCQYggZ0vJA= X-Gm-Gg: ASbGncuK1+6jkzCj5yN1dlz/LAhCTW9q309IHxju6QXgJitY/YlJejbsfi9s8cIuesb G8/KO13GQSeCxuHvNlfl4ufeEe6XZOjwPV+eKEiQL/181KQVgQ48I9Dk95VJEVdDDUiECoewNrl OE0Gct0CT2jm04zhMNcvp1RFShXHM9osr3YKtuijx+A22Wnqmt5wCxupcycQLNGGdUH0HMfhMmw SmT62wK63JFKfQ5WeL8qPYhU5nAsbEQuF9J2m3mu+sTflgD0Gh40kc3Py3YUty7IsaeIrXujOGk jKkZpFdBoBXYlJQ= X-Google-Smtp-Source: AGHT+IGw0dejDQHZRzw7N1ju/UdXqXmsmAXf2JNNoAGHvDVcgOtViNVAc4qWa1MErCGtacjUW7kRdQ== X-Received: by 2002:a17:902:f543:b0:20c:a44b:3221 with SMTP id d9443c01a7336-215bd1cb76emr76174235ad.15.1733310703813; Wed, 04 Dec 2024 03:11:43 -0800 (PST) Received: from C02DW0BEMD6R.bytedance.net ([203.208.167.148]) by smtp.gmail.com with ESMTPSA id 
d9443c01a7336-21527515731sm107447495ad.192.2024.12.04.03.11.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Dec 2024 03:11:42 -0800 (PST) From: Qi Zheng To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org, peterx@redhat.com, akpm@linux-foundation.org Cc: mgorman@suse.de, catalin.marinas@arm.com, will@kernel.org, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, x86@kernel.org, lorenzo.stoakes@oracle.com, zokeefe@google.com, rientjes@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Qi Zheng Subject: [PATCH v4 11/11] x86: select ARCH_SUPPORTS_PT_RECLAIM if X86_64 Date: Wed, 4 Dec 2024 19:09:51 +0800 Message-Id: <841c1f35478d5354872d307888979c9e20de9c09.1733305182.git.zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: References: MIME-Version: 1.0 X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: 7F0CD4001A X-Stat-Signature: 5s753z4zc4ngw8x5cbhddcndmr3nka7f X-Rspam-User: X-HE-Tag: 1733310687-603484 X-HE-Meta: 
U2FsdGVkX1+itke7qyH/pk3moTer2qJduQejPBTmHzGxJuGj47OTwIgP+siXjEsv+yoqeel5/YUKNzJlKZv+2P9qXAAfGlz46AgnthpKQQ5+YDx0udOEdX4+qEtRC2OyvJ0dda+VEWQHT80Dju7IKF6dbQU9ImWqStDGCFZR2BLuFDbb0cS8reSM0oANSNBiYRQBOO19rMRPmNs4C6xeewk1CXknu/f23JMvDy5eqwuKP2UT+cKP7zHwhWjm0e44QTcVAbrzKtcShjN18clcfpa4e4NLSROZTK7uCER50RfGQhg5IjFps2NUGx12gBMz1wwZZAv0awmISfl5od+XTvCvEBMcsqRmWqQGkbNdTQVAtyoWdr9j6ToEZcYSJLvgk2vUnKoOXXGnO2mVDZJWuPu9k3Y7Pchxd4Q+iIoSGwKx5bPPCHg0pP9MG7G/9YlawlQUkBDPHMUX7Ssdjz9syq42vJ/m7tOmQO9s9UekUEe+9tZ4W49gU/6I8JmWoL8Rj4HP+XFmbWxqefyTb32FnDmyN9S2WFQnC13vqTZDB0GUs3F5Ay4gDPLJOATRtButI3ZaPjUinMFSXPpgoXMVo0LOwKNuD0o0RjxpidQxAdl+fKKQdxf18KjQRODzuBWseLnBVJFb1JfRpqcDUxzhCWlRIEmEC8+831lxpG1NMKVudJMuPdmIOa9O+KkVtodiO52SPyy1LttzXvoBl9fGqX89kUtwzkU8DK4kaF5/tYJpHp2f9X9/ueaTdhR8uf3sz+VPilBtn7XzcXBTUae9drlCNkS1V9ugHmfV6IrSoE25B+nMTyY1Fni1W3AjBDKzlRig7mI6VUSZFKRu3GY2x2310e3P0+rXCE1MJ8Lm7jKjJU32piuMTHLYA2EzafBVdxtq6B8G3kAFtIusW5xLUBuw5pnIVDKijrAW2nd1BJ1FvZmn4xDOJFZs0IhyNRfZa28fLjcIQl2GoC/hsGD ijXaxMKq na6hbwcULkx8E562UUF9MZi7zZzPBOW0B4P8m2hwoLhROFcCH9IC2i18NsKEbZX8qXTzJRoLqWleFhcRPPedX8AymnQKvb2bbXJX0zp2tl4AFuFGvVPhBJw9hPiRUQBQbrrXYLsPNvo5vMZo6OHrBKOxLIQDXL/WRusTSz4dnkQoNNkSwp+5TdvI8PzqSunLJWVCrLWjgM51vBSpQ/RHsw+BANDaVhmA6kqYwdClypPmbNHyV4OAFKFAttxcUQAmzXTAuoGq8RTZg2Xzf8bdOz03vswqiCR/jF2zsleEHJsYmuFyJCH+IkyMmllU5DalfgSa4MrLo3DI7XA/zI86QkBk3sfRtCCEIg6OWkLv9HeMeNwj1MYN1G5I3jDwcf9lHbV76DU4VqWpJ/fpvgThQAa2WwSt6OGGF7CdM1BedUvl+Vni0xQPW4cKFLoVxNOp/r50hva17+nhw8lUpuN6RyVZpwg== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Now, x86 has fully supported the CONFIG_PT_RECLAIM feature, and reclaiming PTE pages is profitable only on 64-bit systems, so select ARCH_SUPPORTS_PT_RECLAIM if X86_64. 
Signed-off-by: Qi Zheng
Cc: x86@kernel.org
Cc: Dave Hansen
Cc: Andy Lutomirski
Cc: Peter Zijlstra
---
 arch/x86/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 65f8478fe7a96..77f001c6a5679 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -324,6 +324,7 @@ config X86
 	select FUNCTION_ALIGNMENT_4B
 	imply IMA_SECURE_AND_OR_TRUSTED_BOOT    if EFI
 	select HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
+	select ARCH_SUPPORTS_PT_RECLAIM		if X86_64
 
 config INSTRUCTION_DECODER
 	def_bool y