From patchwork Thu Nov 14 06:59:52 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13874610
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com,
 willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org,
 akpm@linux-foundation.org, peterx@redhat.com
Cc: mgorman@suse.de, catalin.marinas@arm.com, will@kernel.org,
 dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
 x86@kernel.org, lorenzo.stoakes@oracle.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, zokeefe@google.com, rientjes@google.com,
 Qi Zheng
Subject: [PATCH v3 1/9] mm: khugepaged: recheck pmd state in
 retract_page_tables()
Date: Thu, 14 Nov 2024 14:59:52 +0800
X-Mailer: git-send-email 2.24.3 (Apple Git-128)

In retract_page_tables(), the lock of new_folio is still held, we will
be blocked in the page fault path, which prevents the pte entries from
being set again.
So even though the old empty PTE page may be concurrently freed and a
new PTE page is filled into the pmd entry, it is still empty and can be
removed.

So just refactor the retract_page_tables() a little bit and recheck the
pmd state after holding the pmd lock.

Suggested-by: Jann Horn
Signed-off-by: Qi Zheng
---
 mm/khugepaged.c | 45 +++++++++++++++++++++++++++++++--------------
 1 file changed, 31 insertions(+), 14 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 6f8d46d107b4b..99dc995aac110 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -947,17 +947,10 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
 	return SCAN_SUCCEED;
 }
 
-static int find_pmd_or_thp_or_none(struct mm_struct *mm,
-				   unsigned long address,
-				   pmd_t **pmd)
+static inline int check_pmd_state(pmd_t *pmd)
 {
-	pmd_t pmde;
+	pmd_t pmde = pmdp_get_lockless(pmd);
 
-	*pmd = mm_find_pmd(mm, address);
-	if (!*pmd)
-		return SCAN_PMD_NULL;
-
-	pmde = pmdp_get_lockless(*pmd);
 	if (pmd_none(pmde))
 		return SCAN_PMD_NONE;
 	if (!pmd_present(pmde))
@@ -971,6 +964,17 @@ static int find_pmd_or_thp_or_none(struct mm_struct *mm,
 	return SCAN_SUCCEED;
 }
 
+static int find_pmd_or_thp_or_none(struct mm_struct *mm,
+				   unsigned long address,
+				   pmd_t **pmd)
+{
+	*pmd = mm_find_pmd(mm, address);
+	if (!*pmd)
+		return SCAN_PMD_NULL;
+
+	return check_pmd_state(*pmd);
+}
+
 static int check_pmd_still_valid(struct mm_struct *mm,
 				 unsigned long address,
 				 pmd_t *pmd)
@@ -1720,7 +1724,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 		pmd_t *pmd, pgt_pmd;
 		spinlock_t *pml;
 		spinlock_t *ptl;
-		bool skipped_uffd = false;
+		bool success = false;
 
 		/*
 		 * Check vma->anon_vma to exclude MAP_PRIVATE mappings that
@@ -1757,6 +1761,19 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 		mmu_notifier_invalidate_range_start(&range);
 
 		pml = pmd_lock(mm, pmd);
+		/*
+		 * The lock of new_folio is still held, we will be blocked in
+		 * the page fault path, which prevents the pte entries from
+		 * being set again. So even though the old empty PTE page may be
+		 * concurrently freed and a new PTE page is filled into the pmd
+		 * entry, it is still empty and can be removed.
+		 *
+		 * So here we only need to recheck if the state of pmd entry
+		 * still meets our requirements, rather than checking pmd_same()
+		 * like elsewhere.
+		 */
+		if (check_pmd_state(pmd) != SCAN_SUCCEED)
+			goto drop_pml;
 		ptl = pte_lockptr(mm, pmd);
 		if (ptl != pml)
 			spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
@@ -1770,20 +1787,20 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 		 * repeating the anon_vma check protects from one category,
 		 * and repeating the userfaultfd_wp() check from another.
 		 */
-		if (unlikely(vma->anon_vma || userfaultfd_wp(vma))) {
-			skipped_uffd = true;
-		} else {
+		if (likely(!vma->anon_vma && !userfaultfd_wp(vma))) {
 			pgt_pmd = pmdp_collapse_flush(vma, addr, pmd);
 			pmdp_get_lockless_sync();
+			success = true;
 		}
 
 		if (ptl != pml)
 			spin_unlock(ptl);
+drop_pml:
 		spin_unlock(pml);
 
 		mmu_notifier_invalidate_range_end(&range);
 
-		if (!skipped_uffd) {
+		if (success) {
 			mm_dec_nr_ptes(mm);
 			page_table_check_pte_clear_range(mm, addr, pgt_pmd);
 			pte_free_defer(mm, pmd_pgtable(pgt_pmd));

From patchwork Thu Nov 14 06:59:53 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13874611
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com,
 willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org,
 akpm@linux-foundation.org, peterx@redhat.com
Cc: mgorman@suse.de, catalin.marinas@arm.com, will@kernel.org,
 dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
 x86@kernel.org, lorenzo.stoakes@oracle.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, zokeefe@google.com, rientjes@google.com,
 Qi Zheng
Subject: [PATCH v3 2/9] mm: userfaultfd: recheck dst_pmd entry in
 move_pages_pte()
Date: Thu, 14 Nov 2024 14:59:53 +0800
X-Mailer: git-send-email 2.24.3 (Apple Git-128)

In move_pages_pte(), since dst_pte needs to be none, the subsequent
pte_same() check cannot prevent the dst_pte page from being freed
concurrently, so we also need to obtain dst_pmdval and recheck
pmd_same(). Otherwise, once we support empty PTE page reclamation for
anonymous pages, it may result in moving the src_pte page into the
dst_pte page that is about to be freed by RCU.

Signed-off-by: Qi Zheng
---
 mm/userfaultfd.c | 51 +++++++++++++++++++++++++++++++-----------------
 1 file changed, 33 insertions(+), 18 deletions(-)

diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 60a0be33766ff..8e16dc290ddf1 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -1020,6 +1020,14 @@ void double_pt_unlock(spinlock_t *ptl1,
 	__release(ptl2);
 }
 
+static inline bool is_pte_pages_stable(pte_t *dst_pte, pte_t *src_pte,
+				       pte_t orig_dst_pte, pte_t orig_src_pte,
+				       pmd_t *dst_pmd, pmd_t dst_pmdval)
+{
+	return pte_same(ptep_get(src_pte), orig_src_pte) &&
+	       pte_same(ptep_get(dst_pte), orig_dst_pte) &&
+	       pmd_same(dst_pmdval, pmdp_get_lockless(dst_pmd));
+}
 
 static int move_present_pte(struct mm_struct *mm,
 			    struct vm_area_struct *dst_vma,
@@ -1027,6 +1035,7 @@ static int move_present_pte(struct mm_struct *mm,
 			    unsigned long dst_addr, unsigned long src_addr,
 			    pte_t *dst_pte, pte_t *src_pte,
 			    pte_t orig_dst_pte, pte_t orig_src_pte,
+			    pmd_t *dst_pmd, pmd_t dst_pmdval,
 			    spinlock_t *dst_ptl, spinlock_t *src_ptl,
 			    struct folio *src_folio)
 {
@@ -1034,8 +1043,8 @@ static int move_present_pte(struct mm_struct *mm,
 
 	double_pt_lock(dst_ptl, src_ptl);
 
-	if (!pte_same(ptep_get(src_pte), orig_src_pte) ||
-	    !pte_same(ptep_get(dst_pte), orig_dst_pte)) {
+	if (!is_pte_pages_stable(dst_pte, src_pte, orig_dst_pte, orig_src_pte,
+				 dst_pmd, dst_pmdval)) {
 		err = -EAGAIN;
 		goto out;
 	}
@@ -1071,6 +1080,7 @@ static int move_swap_pte(struct mm_struct *mm,
 			 unsigned long dst_addr, unsigned long src_addr,
 			 pte_t *dst_pte, pte_t *src_pte,
 			 pte_t orig_dst_pte, pte_t orig_src_pte,
+			 pmd_t *dst_pmd, pmd_t dst_pmdval,
 			 spinlock_t *dst_ptl, spinlock_t *src_ptl)
 {
 	if (!pte_swp_exclusive(orig_src_pte))
@@ -1078,8 +1088,8 @@ static int move_swap_pte(struct mm_struct *mm,
 
 	double_pt_lock(dst_ptl, src_ptl);
 
-	if (!pte_same(ptep_get(src_pte), orig_src_pte) ||
-	    !pte_same(ptep_get(dst_pte), orig_dst_pte)) {
+	if (!is_pte_pages_stable(dst_pte, src_pte, orig_dst_pte, orig_src_pte,
+				 dst_pmd, dst_pmdval)) {
 		double_pt_unlock(dst_ptl, src_ptl);
 		return -EAGAIN;
 	}
@@ -1097,13 +1107,14 @@ static int move_zeropage_pte(struct mm_struct *mm,
 			     unsigned long dst_addr, unsigned long src_addr,
 			     pte_t *dst_pte, pte_t *src_pte,
 			     pte_t orig_dst_pte, pte_t orig_src_pte,
+			     pmd_t *dst_pmd, pmd_t dst_pmdval,
 			     spinlock_t *dst_ptl, spinlock_t *src_ptl)
 {
 	pte_t zero_pte;
 
 	double_pt_lock(dst_ptl, src_ptl);
-	if (!pte_same(ptep_get(src_pte), orig_src_pte) ||
-	    !pte_same(ptep_get(dst_pte), orig_dst_pte)) {
+	if (!is_pte_pages_stable(dst_pte, src_pte, orig_dst_pte, orig_src_pte,
+				 dst_pmd, dst_pmdval)) {
 		double_pt_unlock(dst_ptl, src_ptl);
 		return -EAGAIN;
 	}
@@ -1136,6 +1147,7 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 	pte_t *src_pte = NULL;
 	pte_t *dst_pte = NULL;
 	pmd_t dummy_pmdval;
+	pmd_t dst_pmdval;
 	struct folio *src_folio = NULL;
 	struct anon_vma *src_anon_vma = NULL;
 	struct mmu_notifier_range range;
@@ -1148,11 +1160,11 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 retry:
 	/*
 	 * Use the maywrite version to indicate that dst_pte will be modified,
-	 * but since we will use pte_same() to detect the change of the pte
-	 * entry, there is no need to get pmdval, so just pass a dummy variable
-	 * to it.
+	 * since dst_pte needs to be none, the subsequent pte_same() check
+	 * cannot prevent the dst_pte page from being freed concurrently, so we
+	 * also need to obtain dst_pmdval and recheck pmd_same() later.
 	 */
-	dst_pte = pte_offset_map_rw_nolock(mm, dst_pmd, dst_addr, &dummy_pmdval,
+	dst_pte = pte_offset_map_rw_nolock(mm, dst_pmd, dst_addr, &dst_pmdval,
 					   &dst_ptl);
 
 	/* Retry if a huge pmd materialized from under us */
@@ -1161,7 +1173,11 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 		goto out;
 	}
 
-	/* same as dst_pte */
+	/*
+	 * Unlike dst_pte, the subsequent pte_same() check can ensure the
+	 * stability of the src_pte page, so there is no need to get pmdval,
+	 * just pass a dummy variable to it.
+	 */
 	src_pte = pte_offset_map_rw_nolock(mm, src_pmd, src_addr, &dummy_pmdval,
 					   &src_ptl);
 
@@ -1213,7 +1229,7 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 		err = move_zeropage_pte(mm, dst_vma, src_vma,
 					dst_addr, src_addr, dst_pte, src_pte,
 					orig_dst_pte, orig_src_pte,
-					dst_ptl, src_ptl);
+					dst_pmd, dst_pmdval, dst_ptl, src_ptl);
 		goto out;
 	}
 
@@ -1303,8 +1319,8 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 
 		err = move_present_pte(mm, dst_vma, src_vma,
 				       dst_addr, src_addr, dst_pte, src_pte,
-				       orig_dst_pte, orig_src_pte,
-				       dst_ptl, src_ptl, src_folio);
+				       orig_dst_pte, orig_src_pte, dst_pmd,
+				       dst_pmdval, dst_ptl, src_ptl, src_folio);
 	} else {
 		entry = pte_to_swp_entry(orig_src_pte);
 		if (non_swap_entry(entry)) {
@@ -1319,10 +1335,9 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 			goto out;
 		}
 
-		err = move_swap_pte(mm, dst_addr, src_addr,
-				    dst_pte, src_pte,
-				    orig_dst_pte, orig_src_pte,
-				    dst_ptl, src_ptl);
+		err = move_swap_pte(mm, dst_addr, src_addr, dst_pte, src_pte,
+				    orig_dst_pte, orig_src_pte, dst_pmd,
+				    dst_pmdval, dst_ptl, src_ptl);
 	}
 
 out:

From patchwork Thu Nov 14 06:59:54 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13874612
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com,
 willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org,
 akpm@linux-foundation.org, peterx@redhat.com
Cc: mgorman@suse.de, catalin.marinas@arm.com, will@kernel.org,
 dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
 x86@kernel.org, lorenzo.stoakes@oracle.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, zokeefe@google.com, rientjes@google.com,
 Qi Zheng
Subject: [PATCH v3 3/9] mm: introduce zap_nonpresent_ptes()
Date: Thu, 14 Nov 2024 14:59:54 +0800
Message-Id: <25e70f171e17370ec65159a301ff4f852991e14c.1731566457.git.zhengqi.arch@bytedance.com>
X-Mailer: git-send-email 2.24.3 (Apple Git-128)

Similar to zap_present_ptes(), let's introduce zap_nonpresent_ptes() to
handle non-present ptes, which can improve code readability.

No functional change.

Signed-off-by: Qi Zheng
Reviewed-by: Jann Horn
Acked-by: David Hildenbrand
---
 mm/memory.c | 136 ++++++++++++++++++++++++++++------------------------
 1 file changed, 73 insertions(+), 63 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 209885a4134f7..bd9ebe0f4471f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1587,6 +1587,76 @@ static inline int zap_present_ptes(struct mmu_gather *tlb,
 	return 1;
 }
 
+static inline int zap_nonpresent_ptes(struct mmu_gather *tlb,
+		struct vm_area_struct *vma, pte_t *pte, pte_t ptent,
+		unsigned int max_nr, unsigned long addr,
+		struct zap_details *details, int *rss)
+{
+	swp_entry_t entry;
+	int nr = 1;
+
+	entry = pte_to_swp_entry(ptent);
+	if (is_device_private_entry(entry) ||
+	    is_device_exclusive_entry(entry)) {
+		struct page *page = pfn_swap_entry_to_page(entry);
+		struct folio *folio = page_folio(page);
+
+		if (unlikely(!should_zap_folio(details, folio)))
+			return 1;
+		/*
+		 * Both device private/exclusive mappings should only
+		 * work with anonymous page so far, so we don't need to
+		 * consider uffd-wp bit when zap. For more information,
+		 * see zap_install_uffd_wp_if_needed().
+		 */
+		WARN_ON_ONCE(!vma_is_anonymous(vma));
+		rss[mm_counter(folio)]--;
+		if (is_device_private_entry(entry))
+			folio_remove_rmap_pte(folio, page, vma);
+		folio_put(folio);
+	} else if (!non_swap_entry(entry)) {
+		/* Genuine swap entries, hence a private anon pages */
+		if (!should_zap_cows(details))
+			return 1;
+
+		nr = swap_pte_batch(pte, max_nr, ptent);
+		rss[MM_SWAPENTS] -= nr;
+		free_swap_and_cache_nr(entry, nr);
+	} else if (is_migration_entry(entry)) {
+		struct folio *folio = pfn_swap_entry_folio(entry);
+
+		if (!should_zap_folio(details, folio))
+			return 1;
+		rss[mm_counter(folio)]--;
+	} else if (pte_marker_entry_uffd_wp(entry)) {
+		/*
+		 * For anon: always drop the marker; for file: only
+		 * drop the marker if explicitly requested.
+		 */
+		if (!vma_is_anonymous(vma) && !zap_drop_markers(details))
+			return 1;
+	} else if (is_guard_swp_entry(entry)) {
+		/*
+		 * Ordinary zapping should not remove guard PTE
+		 * markers. Only do so if we should remove PTE markers
+		 * in general.
+		 */
+		if (!zap_drop_markers(details))
+			return 1;
+	} else if (is_hwpoison_entry(entry) || is_poisoned_swp_entry(entry)) {
+		if (!should_zap_cows(details))
+			return 1;
+	} else {
+		/* We should have covered all the swap entry types */
+		pr_alert("unrecognized swap entry 0x%lx\n", entry.val);
+		WARN_ON_ONCE(1);
+	}
+	clear_not_present_full_ptes(vma->vm_mm, addr, pte, nr, tlb->fullmm);
+	zap_install_uffd_wp_if_needed(vma, addr, pte, nr, details, ptent);
+
+	return nr;
+}
+
 static unsigned long zap_pte_range(struct mmu_gather *tlb,
 				struct vm_area_struct *vma, pmd_t *pmd,
 				unsigned long addr, unsigned long end,
@@ -1598,7 +1668,6 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	spinlock_t *ptl;
 	pte_t *start_pte;
 	pte_t *pte;
-	swp_entry_t entry;
 	int nr;
 
 	tlb_change_page_size(tlb, PAGE_SIZE);
@@ -1611,8 +1680,6 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	arch_enter_lazy_mmu_mode();
 	do {
 		pte_t ptent = ptep_get(pte);
-		struct folio *folio;
-		struct page *page;
 		int max_nr;
 
 		nr = 1;
@@ -1622,8 +1689,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 		if (need_resched())
 			break;
 
+		max_nr = (end - addr) / PAGE_SIZE;
 		if (pte_present(ptent)) {
-			max_nr = (end - addr) / PAGE_SIZE;
 			nr = zap_present_ptes(tlb, vma, pte, ptent, max_nr,
 					      addr, details, rss, &force_flush,
 					      &force_break);
@@ -1631,67 +1698,10 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 				addr += nr * PAGE_SIZE;
 				break;
 			}
-			continue;
-		}
-
-		entry = pte_to_swp_entry(ptent);
-		if (is_device_private_entry(entry) ||
-		    is_device_exclusive_entry(entry)) {
-			page = pfn_swap_entry_to_page(entry);
-			folio = page_folio(page);
-			if (unlikely(!should_zap_folio(details, folio)))
-				continue;
-			/*
-			 * Both device private/exclusive mappings should only
-			 * work with anonymous page so far, so we don't need to
-			 * consider uffd-wp bit when zap. For more information,
-			 * see zap_install_uffd_wp_if_needed().
-			 */
-			WARN_ON_ONCE(!vma_is_anonymous(vma));
-			rss[mm_counter(folio)]--;
-			if (is_device_private_entry(entry))
-				folio_remove_rmap_pte(folio, page, vma);
-			folio_put(folio);
-		} else if (!non_swap_entry(entry)) {
-			max_nr = (end - addr) / PAGE_SIZE;
-			nr = swap_pte_batch(pte, max_nr, ptent);
-			/* Genuine swap entries, hence a private anon pages */
-			if (!should_zap_cows(details))
-				continue;
-			rss[MM_SWAPENTS] -= nr;
-			free_swap_and_cache_nr(entry, nr);
-		} else if (is_migration_entry(entry)) {
-			folio = pfn_swap_entry_folio(entry);
-			if (!should_zap_folio(details, folio))
-				continue;
-			rss[mm_counter(folio)]--;
-		} else if (pte_marker_entry_uffd_wp(entry)) {
-			/*
-			 * For anon: always drop the marker; for file: only
-			 * drop the marker if explicitly requested.
-			 */
-			if (!vma_is_anonymous(vma) &&
-			    !zap_drop_markers(details))
-				continue;
-		} else if (is_guard_swp_entry(entry)) {
-			/*
-			 * Ordinary zapping should not remove guard PTE
-			 * markers. Only do so if we should remove PTE markers
-			 * in general.
-			 */
-			if (!zap_drop_markers(details))
-				continue;
-		} else if (is_hwpoison_entry(entry) ||
-			   is_poisoned_swp_entry(entry)) {
-			if (!should_zap_cows(details))
-				continue;
 		} else {
-			/* We should have covered all the swap entry types */
-			pr_alert("unrecognized swap entry 0x%lx\n", entry.val);
-			WARN_ON_ONCE(1);
+			nr = zap_nonpresent_ptes(tlb, vma, pte, ptent, max_nr,
+						 addr, details, rss);
 		}
-		clear_not_present_full_ptes(mm, addr, pte, nr, tlb->fullmm);
-		zap_install_uffd_wp_if_needed(vma, addr, pte, nr, details, ptent);
 	} while (pte += nr, addr += PAGE_SIZE * nr, addr != end);
 
 	add_mm_rss_vec(mm, rss);

From patchwork Thu Nov 14 06:59:55 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13874613
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org, peterx@redhat.com
Cc: mgorman@suse.de, catalin.marinas@arm.com, will@kernel.org, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, x86@kernel.org, lorenzo.stoakes@oracle.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, zokeefe@google.com, rientjes@google.com, Qi Zheng
Subject: [PATCH v3 4/9] mm: introduce skip_none_ptes()
Date: Thu, 14 Nov 2024 14:59:55 +0800
Message-Id: <574bc9b646c87d878a5048edb63698a1f8483e10.1731566457.git.zhengqi.arch@bytedance.com>

This commit introduces skip_none_ptes() to skip over all consecutive none
ptes in zap_pte_range(), which helps optimize away need_resched() +
force_break + incremental pte/addr increments etc.

Suggested-by: David Hildenbrand
Signed-off-by: Qi Zheng
---
 mm/memory.c | 34 ++++++++++++++++++++++++++++++----
 1 file changed, 30 insertions(+), 4 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index bd9ebe0f4471f..24633d0e1445a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1657,6 +1657,28 @@ static inline int zap_nonpresent_ptes(struct mmu_gather *tlb,
 	return nr;
 }
 
+static inline int skip_none_ptes(pte_t *pte, unsigned long addr,
+				 unsigned long end)
+{
+	pte_t ptent = ptep_get(pte);
+	int max_nr;
+	int nr;
+
+	if (!pte_none(ptent))
+		return 0;
+
+	max_nr = (end - addr) / PAGE_SIZE;
+	nr = 1;
+
+	for (; nr < max_nr; nr++) {
+		ptent = ptep_get(pte + nr);
+		if (!pte_none(ptent))
+			break;
+	}
+
+	return nr;
+}
+
 static unsigned long zap_pte_range(struct mmu_gather *tlb,
 				struct vm_area_struct *vma, pmd_t *pmd,
 				unsigned long addr, unsigned long end,
@@ -1682,13 +1704,17 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 		pte_t ptent = ptep_get(pte);
 		int max_nr;
 
-		nr = 1;
-		if (pte_none(ptent))
-			continue;
-
 		if (need_resched())
 			break;
 
+		nr = skip_none_ptes(pte, addr, end);
+		if (nr) {
+			addr += PAGE_SIZE * nr;
+			if (addr == end)
+				break;
+			pte += nr;
+		}
+
 		max_nr = (end - addr) / PAGE_SIZE;
 		if (pte_present(ptent)) {
 			nr = zap_present_ptes(tlb, vma, pte, ptent, max_nr,

From patchwork Thu Nov 14 06:59:56 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13874614
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org, peterx@redhat.com
Cc: mgorman@suse.de, catalin.marinas@arm.com, will@kernel.org, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, x86@kernel.org, lorenzo.stoakes@oracle.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, zokeefe@google.com, rientjes@google.com, Qi Zheng
Subject: [PATCH v3 5/9] mm: introduce do_zap_pte_range()
Date: Thu, 14 Nov 2024 14:59:56 +0800
Message-Id: <1d8467c1428573cc666ca3150ba66877f7b316cf.1731566457.git.zhengqi.arch@bytedance.com>

This commit introduces do_zap_pte_range() to actually zap the PTEs, which
will help improve code readability and facilitate secondary checking of
the processed PTEs in the future.

No functional change.

Signed-off-by: Qi Zheng
Reviewed-by: Jann Horn
Acked-by: David Hildenbrand
---
 mm/memory.c | 39 ++++++++++++++++++++++++---------------
 1 file changed, 24 insertions(+), 15 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 24633d0e1445a..bf5ac8e0b4656 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1679,6 +1679,25 @@ static inline int skip_none_ptes(pte_t *pte, unsigned long addr,
 	return nr;
 }
 
+/* If PTE_MARKER_UFFD_WP is enabled, the uffd-wp PTEs may be re-installed. */
+static inline int do_zap_pte_range(struct mmu_gather *tlb,
+				   struct vm_area_struct *vma, pte_t *pte,
+				   unsigned long addr, unsigned long end,
+				   struct zap_details *details, int *rss,
+				   bool *force_flush, bool *force_break)
+{
+	pte_t ptent = ptep_get(pte);
+	int max_nr = (end - addr) / PAGE_SIZE;
+
+	if (pte_present(ptent))
+		return zap_present_ptes(tlb, vma, pte, ptent, max_nr,
+					addr, details, rss, force_flush,
+					force_break);
+
+	return zap_nonpresent_ptes(tlb, vma, pte, ptent, max_nr, addr,
+				   details, rss);
+}
+
 static unsigned long zap_pte_range(struct mmu_gather *tlb,
 				struct vm_area_struct *vma, pmd_t *pmd,
 				unsigned long addr, unsigned long end,
@@ -1701,9 +1720,6 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	flush_tlb_batched_pending(mm);
 	arch_enter_lazy_mmu_mode();
 	do {
-		pte_t ptent = ptep_get(pte);
-		int max_nr;
-
 		if (need_resched())
 			break;
 
@@ -1715,18 +1731,11 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 			pte += nr;
 		}
 
-		max_nr = (end - addr) / PAGE_SIZE;
-		if (pte_present(ptent)) {
-			nr = zap_present_ptes(tlb, vma, pte, ptent, max_nr,
-					      addr, details, rss, &force_flush,
-					      &force_break);
-			if (unlikely(force_break)) {
-				addr += nr * PAGE_SIZE;
-				break;
-			}
-		} else {
-			nr = zap_nonpresent_ptes(tlb, vma, pte, ptent, max_nr,
-						 addr, details, rss);
+		nr = do_zap_pte_range(tlb, vma, pte, addr, end, details,
+				      rss, &force_flush, &force_break);
+		if (unlikely(force_break)) {
+			addr += nr * PAGE_SIZE;
+			break;
 		}
 	} while (pte += nr, addr += PAGE_SIZE * nr, addr != end);
 

From patchwork Thu Nov 14 06:59:57 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13874615
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org, peterx@redhat.com
Cc: mgorman@suse.de, catalin.marinas@arm.com, will@kernel.org, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, x86@kernel.org, lorenzo.stoakes@oracle.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, zokeefe@google.com, rientjes@google.com, Qi Zheng
Subject: [PATCH v3 6/9] mm: make zap_pte_range() handle full within-PMD range
Date: Thu, 14 Nov 2024 14:59:57 +0800
Message-Id: <3aaf6c2338372866b85cea78140f5ea497ccc33d.1731566457.git.zhengqi.arch@bytedance.com>

In preparation for reclaiming empty PTE pages, this commit first makes
zap_pte_range() handle the full within-PMD range, so that we can more
easily detect and free PTE pages in this function in subsequent commits.

Signed-off-by: Qi Zheng
Reviewed-by: Jann Horn
---
 mm/memory.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index bf5ac8e0b4656..8b3348ff374ff 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1711,6 +1711,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	pte_t *pte;
 	int nr;
 
+retry:
 	tlb_change_page_size(tlb, PAGE_SIZE);
 	init_rss_vec(rss);
 	start_pte = pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
@@ -1758,6 +1759,13 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	if (force_flush)
 		tlb_flush_mmu(tlb);
 
+	if (addr != end) {
+		cond_resched();
+		force_flush = false;
+		force_break = false;
+		goto retry;
+	}
+
 	return addr;
 }
 

From patchwork Thu Nov 14 06:59:58 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13874616
s=google; t=1731567682; x=1732172482; darn=kvack.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Ao15XNwWqfhJRpavPj5UJluLHg99Tu4ljFLM7T2kOsA=; b=EX/WhNrOIYLs4tR3Ius0LKNFCMEoj3XWTygUGd+EQoYR3p4g/lyaPXEOzyqOQkBWbw FU30Cmeu/WnSkTEeiITX/XQn4EyexfOBP/Z2gS0ZGnY5OytU1WUfsBkG0DLA33r9dWoX CTXF9zQ5tby9QAzjt7ngHBDsWvzSD7ZVn7sO7t7UQBPUYWkrLhEldEal6jpfa6+ozAKF 910G4ENLC+isJt1FuZM1oIoBTnWOoyD/CAvNBGTgNNdxAwKEPqqEpsXllwhUf/roCVjY 4mSnjc9qF9ZvCi92Ji5CDYD04/sDhSsWNXDh9Tg1np1iFfRddg8R7uf9PlB6B37hHcPm 5y6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731567682; x=1732172482; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Ao15XNwWqfhJRpavPj5UJluLHg99Tu4ljFLM7T2kOsA=; b=G3ZKDPvdKpzF1wyNLNvHW1DON2JH39KeJBv9MhtDw95RIXsSdYYqGYmkJSe7FoTcwD LvjD9kgIbRHV7bl8trJsdbNFoQZR52nBfxmG88P+w7FWS25U/l0OvEILP3i9zGq7R0o/ 4kn9RTCZtiQMh4lpB7p/t53Fb68rLvBKiGCn77wdeV+KZP8iFmFcJJt7NK8YvICvXLe4 1WP9kg9ZcjKAvhl8MKl+LoSrolPixrsbo6xACUW3FW+Cex34Ai8dThRsAA2PHDSZZdiE oYB0QP4jEhaB4NSE5Ss84ddCxpF3ungZ2qcy1iEXGR8v0IlbuzPBOmjdBG3FotZbNcE9 fuRw== X-Forwarded-Encrypted: i=1; AJvYcCV6A6RnN4sLb1RbDM3MbSUuzqHI1VM3UxOPJFUrQeHd4MTeLoagcCzSphUDZAgdBcf84wVNXhbsrw==@kvack.org X-Gm-Message-State: AOJu0YxSIdGw13ZhJmSQQcQn+ERNbx9c3k1fgvxlzVKDKqJpLEXMdOJa SJayX01x0RZTV9qnx5XWIklz1EWsjAlqiUZpqHd13q+bHIcMHNUddURG0ZkP3PA= X-Google-Smtp-Source: AGHT+IHw+F+4TmogLhrbU8N9wTkF2jmE4YHNJryvizmOe98BZRuryF1Qlfv8KxZRs6BXORJTlrQqRA== X-Received: by 2002:a17:902:e892:b0:20b:7210:5859 with SMTP id d9443c01a7336-211c50b0c51mr15181305ad.38.1731567681854; Wed, 13 Nov 2024 23:01:21 -0800 (PST) Received: from C02DW0BEMD6R.bytedance.net ([63.216.146.178]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-211c7d389c2sm4119065ad.268.2024.11.13.23.01.14 (version=TLS1_3 
cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 Nov 2024 23:01:21 -0800 (PST) From: Qi Zheng To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org, peterx@redhat.com Cc: mgorman@suse.de, catalin.marinas@arm.com, will@kernel.org, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, x86@kernel.org, lorenzo.stoakes@oracle.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, zokeefe@google.com, rientjes@google.com, Qi Zheng Subject: [PATCH v3 7/9] mm: pgtable: try to reclaim empty PTE page in madvise(MADV_DONTNEED) Date: Thu, 14 Nov 2024 14:59:58 +0800 Message-Id: <9e6f0cff7ae29cd8bd1812d3a0e3513de3f42f42.1731566457.git.zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: References: MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Queue-Id: 9E7F340426 X-Rspamd-Server: rspam11 X-Stat-Signature: gocfhak1bfg6sw4htfn5ae4k1fz74huf X-HE-Tag: 1731567627-517654 X-HE-Meta: U2FsdGVkX1/3Ha+kdnJ2NEQtKcxDb0pRZW1BzWKEorBTaerst3HIJNhL5k/NykHlVUYd8Ts2ivV1Aee1ju5E+BWqvG9M9b47xsbOcdLidg2cs9Hw4Syu30aFV/dVRFQwk6HYQ+hZhq5VoQXEFN4Lm18m04DN1Nl5ajvGtqt0u9Y/08whkDqOPN6RZP09kM8PEIU+4mp2258FvFonJvA9/vkA/0g2f3DWJARKKxMfcbH8Pkw+ALcRo3YP6QvTkOPjSqnysUO4/50W94u5WVfiVdAnUJ4xaGyjDApRNOfVfY9xbr5d/OXxrN41OIitxQnq1EbIOZN5BhVQW1vSfJhFL9xJuXXbl5nbJkUuabOH0qm/ARs4iZF5NCWYZlXIWGUN0eCyUyb+8//JFtqmo4DL25TC1hat3eCaHhaxc265YwFvAnQNiRF8dAJCyzzD7A6C6+8nKZT11PpH/hlqbLDhgPeWuR/5iWDw/XYWBfutUx6uwgWQK2ZDm0frX+nPkxoszJFHF3AaPBhIMpDt8p3h50dSFDU/SrYOrCxXZnFqxO/ePnmnX1K9EOHbq0NentgL0o8Qnd4Ihp/mMDKhaZS4qAdDT1G8WvnRd0RV/J36c4B6X9dPrvIs55TF5k72RiTwVm6fRhlfGIivLfLex/uNhSoXFxBRq6TMuFCENpzXnjvBlUitcjj6CUr+mwLLk+Mq/kCHtbBuv34FzLrBuicLWXpd5e/tCNDRQRqya9IdnnB4i8dSoNqEQmczusgTFBGlIiLSYYdFsdnGHk6UEgTxYTW1J7mgBUDmhgcXW5l10dxWlwLFzphxFujwUaAJ6FI1i13Twd6rSpkCJDGbssNuHZo+GQBjbw0Ponm31l4X3g3v0uiqdfTUyw8JMedBjX6ELJrJLSH2zUVJIfGJgYDqo2dwFr4/kj7rmAKX/oYYD3jG4QFbwBcHXeyo07+TaN8AUyiU+yieG66PcztX5gG 
ekkOGIy8 NT8MKC+8/hWALuJRjm2+B/+SPnvPsdUpQ8EIy/259cCq9UjkrwZ+qF5y/fYHtQv7cpMcmHI28zMRlxDDo58sLnxqtbLZYPGbmQDMaWVoJpsAxJZGVjG/VpRD2OHO+80D55Xt0yb4HKKujP8RraI6MVeY+DyVXa3E1picS/Rs+DzCA6zYNk+9d8hYsyJZR0xzQCs29mZgPBczt/O84s9R/OIULIwHc7mgPIZaVe3j4/IvcgidUflt+hBPqsLdgpPnANftgbxewqdvR9CVbHX9q4sodFYxWGBgF0PdPJTGf5j/H/h7zhznw5MYfLwRPZ4ioPMGSsc7a+TU6uoYRhgyXoAgg+SCnxFAO6UZ6wPL9atrjKcA= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Now in order to pursue high performance, applications mostly use some high-performance user-mode memory allocators, such as jemalloc or tcmalloc. These memory allocators use madvise(MADV_DONTNEED or MADV_FREE) to release physical memory, but neither MADV_DONTNEED nor MADV_FREE will release page table memory, which may cause huge page table memory usage. The following are a memory usage snapshot of one process which actually happened on our server: VIRT: 55t RES: 590g VmPTE: 110g In this case, most of the page table entries are empty. For such a PTE page where all entries are empty, we can actually free it back to the system for others to use. As a first step, this commit aims to synchronously free the empty PTE pages in madvise(MADV_DONTNEED) case. We will detect and free empty PTE pages in zap_pte_range(), and will add zap_details.reclaim_pt to exclude cases other than madvise(MADV_DONTNEED). Once an empty PTE is detected, we first try to hold the pmd lock within the pte lock. If successful, we clear the pmd entry directly (fast path). Otherwise, we wait until the pte lock is released, then re-hold the pmd and pte locks and loop PTRS_PER_PTE times to check pte_none() to re-detect whether the PTE page is empty and free it (slow path). For other cases such as madvise(MADV_FREE), consider scanning and freeing empty PTE pages asynchronously in the future. 
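
For readers who want to see the two detection strategies side by side,
here is a minimal user-space sketch in C. It is only a model under stated
assumptions: model_pte_t, MODEL_PTRS_PER_PTE and the model_* helpers are
hypothetical names invented for illustration, not kernel definitions, and
no locking (pmd lock, pte lock) is modelled. The fast-path idea is to
count none entries while zapping; the slow-path idea is to rescan the
whole table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not kernel definitions. */
#define MODEL_PTRS_PER_PTE 512
typedef uint64_t model_pte_t;

/* Model of pte_none(): an all-zero entry maps nothing. */
static bool model_pte_none(model_pte_t pte)
{
	return pte == 0;
}

/*
 * Fast-path model: clear the entries in [first, last) and return how many
 * of them are none afterwards. If the count equals MODEL_PTRS_PER_PTE,
 * the whole table was covered and is known to be empty without a rescan.
 */
static size_t model_zap_range(model_pte_t *table, size_t first, size_t last)
{
	size_t none_nr = 0;

	assert(first <= last && last <= MODEL_PTRS_PER_PTE);
	for (size_t i = first; i < last; i++) {
		table[i] = 0;
		if (model_pte_none(table[i]))
			none_nr++;
	}
	return none_nr;
}

/* Slow-path model: rescan every entry to re-detect an empty table. */
static bool model_pte_page_is_empty(const model_pte_t *table)
{
	for (size_t i = 0; i < MODEL_PTRS_PER_PTE; i++) {
		if (!model_pte_none(table[i]))
			return false;
	}
	return true;
}
```

In this model a caller would compare the value returned by
model_zap_range() against MODEL_PTRS_PER_PTE and fall back to
model_pte_page_is_empty() when the counts do not match. The real code
additionally has to win the pmd lock before it may clear the pmd entry.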

The following pseudocode shows the effect of the optimization:

        mmap 50G
        while (1) {
                for (; i < 1024 * 25; i++) {
                        touch 2M memory
                        madvise MADV_DONTNEED 2M
                }
        }

As we can see, the memory usage of VmPTE is reduced:

                        before          after
        VIRT           50.0 GB        50.0 GB
        RES             3.1 MB         3.1 MB
        VmPTE        102640 KB         240 KB

Signed-off-by: Qi Zheng
---
 include/linux/mm.h |  1 +
 mm/Kconfig         | 15 ++++++++++
 mm/Makefile        |  1 +
 mm/internal.h      | 19 +++++++++++++
 mm/madvise.c       |  7 ++++-
 mm/memory.c        | 45 ++++++++++++++++++++++++++++-
 mm/pt_reclaim.c    | 71 ++++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 157 insertions(+), 2 deletions(-)
 create mode 100644 mm/pt_reclaim.c

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ca59d165f1f2e..1fcd4172d2c03 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2319,6 +2319,7 @@ extern void pagefault_out_of_memory(void);
 struct zap_details {
 	struct folio *single_folio;	/* Locked folio to be unmapped */
 	bool even_cows;			/* Zap COWed private pages too? */
+	bool reclaim_pt;		/* Need reclaim page tables? */
 	zap_flags_t zap_flags;		/* Extra flags for zapping */
 };

diff --git a/mm/Kconfig b/mm/Kconfig
index 84000b0168086..7949ab121070f 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1301,6 +1301,21 @@ config ARCH_HAS_USER_SHADOW_STACK
 	  The architecture has hardware support for userspace shadow call
 	  stacks (eg, x86 CET, arm64 GCS or RISC-V Zicfiss).

+config ARCH_SUPPORTS_PT_RECLAIM
+	def_bool n
+
+config PT_RECLAIM
+	bool "reclaim empty user page table pages"
+	default y
+	depends on ARCH_SUPPORTS_PT_RECLAIM && MMU && SMP
+	select MMU_GATHER_RCU_TABLE_FREE
+	help
+	  Try to reclaim empty user page table pages in paths other than munmap
+	  and exit_mmap path.
+
+	  Note: now only empty user PTE page table pages will be reclaimed.
+
+
 source "mm/damon/Kconfig"

 endmenu
diff --git a/mm/Makefile b/mm/Makefile
index dba52bb0da8ab..850386a67b3e0 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -146,3 +146,4 @@ obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o
 obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
 obj-$(CONFIG_EXECMEM) += execmem.o
 obj-$(CONFIG_TMPFS_QUOTA) += shmem_quota.o
+obj-$(CONFIG_PT_RECLAIM) += pt_reclaim.o
diff --git a/mm/internal.h b/mm/internal.h
index 5a7302baeed7c..5b2aef61073f1 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1530,4 +1530,23 @@ int walk_page_range_mm(struct mm_struct *mm, unsigned long start,
 		unsigned long end, const struct mm_walk_ops *ops,
 		void *private);

+/* pt_reclaim.c */
+bool try_get_and_clear_pmd(struct mm_struct *mm, pmd_t *pmd, pmd_t *pmdval);
+void free_pte(struct mm_struct *mm, unsigned long addr, struct mmu_gather *tlb,
+	      pmd_t pmdval);
+void try_to_free_pte(struct mm_struct *mm, pmd_t *pmd, unsigned long addr,
+		     struct mmu_gather *tlb);
+
+#ifdef CONFIG_PT_RECLAIM
+bool reclaim_pt_is_enabled(unsigned long start, unsigned long end,
+			   struct zap_details *details);
+#else
+static inline bool reclaim_pt_is_enabled(unsigned long start, unsigned long end,
+					 struct zap_details *details)
+{
+	return false;
+}
+#endif /* CONFIG_PT_RECLAIM */
+
+
 #endif /* __MM_INTERNAL_H */
diff --git a/mm/madvise.c b/mm/madvise.c
index 0ceae57da7dad..49f3a75046f63 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -851,7 +851,12 @@ static int madvise_free_single_vma(struct vm_area_struct *vma,
 static long madvise_dontneed_single_vma(struct vm_area_struct *vma,
 					unsigned long start, unsigned long end)
 {
-	zap_page_range_single(vma, start, end - start, NULL);
+	struct zap_details details = {
+		.reclaim_pt = true,
+		.even_cows = true,
+	};
+
+	zap_page_range_single(vma, start, end - start, &details);
 	return 0;
 }

diff --git a/mm/memory.c b/mm/memory.c
index 8b3348ff374ff..fe93b0648c430 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1436,7 +1436,7 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
 static inline bool should_zap_cows(struct zap_details *details)
 {
 	/* By default, zap all pages */
-	if (!details)
+	if (!details || details->reclaim_pt)
 		return true;

 	/* Or, we zap COWed pages only if the caller wants to */
@@ -1698,6 +1698,30 @@ static inline int do_zap_pte_range(struct mmu_gather *tlb,
 				 details, rss);
 }

+static inline int count_pte_none(pte_t *pte, int nr)
+{
+	int none_nr = 0;
+
+	/*
+	 * If PTE_MARKER_UFFD_WP is enabled, the uffd-wp PTEs may be
+	 * re-installed, so we need to check pte_none() one by one.
+	 * Otherwise, checking a single PTE in a batch is sufficient.
+	 */
+#ifdef CONFIG_PTE_MARKER_UFFD_WP
+	for (;;) {
+		if (pte_none(ptep_get(pte)))
+			none_nr++;
+		if (--nr == 0)
+			break;
+		pte++;
+	}
+#else
+	if (pte_none(ptep_get(pte)))
+		none_nr = nr;
+#endif
+	return none_nr;
+}
+
 static unsigned long zap_pte_range(struct mmu_gather *tlb,
 				struct vm_area_struct *vma, pmd_t *pmd,
 				unsigned long addr, unsigned long end,
@@ -1709,6 +1733,11 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	spinlock_t *ptl;
 	pte_t *start_pte;
 	pte_t *pte;
+	pmd_t pmdval;
+	unsigned long start = addr;
+	bool can_reclaim_pt = reclaim_pt_is_enabled(start, end, details);
+	bool direct_reclaim = false;
+	int none_nr = 0;
 	int nr;

 retry:
@@ -1726,6 +1755,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,

 		nr = skip_none_ptes(pte, addr, end);
 		if (nr) {
+			if (can_reclaim_pt)
+				none_nr += nr;
 			addr += PAGE_SIZE * nr;
 			if (addr == end)
 				break;
@@ -1734,12 +1765,17 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,

 		nr = do_zap_pte_range(tlb, vma, pte, addr, end, details,
 				      rss, &force_flush, &force_break);
+		if (can_reclaim_pt)
+			none_nr += count_pte_none(pte, nr);
 		if (unlikely(force_break)) {
 			addr += nr * PAGE_SIZE;
 			break;
 		}
 	} while (pte += nr, addr += PAGE_SIZE * nr, addr != end);

+	if (can_reclaim_pt && addr == end && (none_nr == PTRS_PER_PTE))
+		direct_reclaim = try_get_and_clear_pmd(mm, pmd, &pmdval);
+
 	add_mm_rss_vec(mm, rss);
 	arch_leave_lazy_mmu_mode();

@@ -1766,6 +1802,13 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 		goto retry;
 	}

+	if (can_reclaim_pt) {
+		if (direct_reclaim)
+			free_pte(mm, start, tlb, pmdval);
+		else
+			try_to_free_pte(mm, pmd, start, tlb);
+	}
+
 	return addr;
 }

diff --git a/mm/pt_reclaim.c b/mm/pt_reclaim.c
new file mode 100644
index 0000000000000..6540a3115dde8
--- /dev/null
+++ b/mm/pt_reclaim.c
@@ -0,0 +1,71 @@
+// SPDX-License-Identifier: GPL-2.0
+#include
+#include
+#include
+
+#include "internal.h"
+
+bool reclaim_pt_is_enabled(unsigned long start, unsigned long end,
+			   struct zap_details *details)
+{
+	return details && details->reclaim_pt && (end - start >= PMD_SIZE);
+}
+
+bool try_get_and_clear_pmd(struct mm_struct *mm, pmd_t *pmd, pmd_t *pmdval)
+{
+	spinlock_t *pml = pmd_lockptr(mm, pmd);
+
+	if (!spin_trylock(pml))
+		return false;
+
+	*pmdval = pmdp_get_lockless(pmd);
+	pmd_clear(pmd);
+	spin_unlock(pml);
+
+	return true;
+}
+
+void free_pte(struct mm_struct *mm, unsigned long addr, struct mmu_gather *tlb,
+	      pmd_t pmdval)
+{
+	pte_free_tlb(tlb, pmd_pgtable(pmdval), addr);
+	mm_dec_nr_ptes(mm);
+}
+
+void try_to_free_pte(struct mm_struct *mm, pmd_t *pmd, unsigned long addr,
+		     struct mmu_gather *tlb)
+{
+	pmd_t pmdval;
+	spinlock_t *pml, *ptl;
+	pte_t *start_pte, *pte;
+	int i;
+
+	pml = pmd_lock(mm, pmd);
+	start_pte = pte_offset_map_rw_nolock(mm, pmd, addr, &pmdval, &ptl);
+	if (!start_pte)
+		goto out_ptl;
+	if (ptl != pml)
+		spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
+
+	/* Check if it is empty PTE page */
+	for (i = 0, pte = start_pte; i < PTRS_PER_PTE; i++, pte++) {
+		if (!pte_none(ptep_get(pte)))
+			goto out_ptl;
+	}
+	pte_unmap(start_pte);
+
+	pmd_clear(pmd);
+
+	if (ptl != pml)
+		spin_unlock(ptl);
+	spin_unlock(pml);
+
+	free_pte(mm, addr, tlb, pmdval);
+
+	return;
+out_ptl:
+	if (start_pte)
+		pte_unmap_unlock(start_pte, ptl);
+	if (ptl != pml)
+		spin_unlock(pml);
+}

From patchwork Thu Nov 14
06:59:59 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13874617
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org, peterx@redhat.com
Cc: mgorman@suse.de, catalin.marinas@arm.com, will@kernel.org, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, x86@kernel.org, lorenzo.stoakes@oracle.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, zokeefe@google.com, rientjes@google.com, Qi Zheng
Subject: [PATCH v3 8/9] x86: mm: free page table pages by RCU instead of semi RCU
Date: Thu, 14 Nov 2024 14:59:59 +0800
X-Mailer: git-send-email 2.24.3 (Apple Git-128)

Now, if CONFIG_MMU_GATHER_RCU_TABLE_FREE is selected, the page table pages
will be freed by semi RCU, that is:

 - batch table freeing: asynchronous free by RCU
 - single table freeing: IPI + synchronous free

In this way, the page table can be locklessly traversed by disabling IRQ
in paths such as fast GUP. But this is not enough to free the empty PTE
page table pages in paths other than the munmap and exit_mmap paths,
because IPI cannot be synchronized with rcu_read_lock() in
pte_offset_map{_lock}().

In preparation for supporting reclamation of empty PTE page table pages,
let single table also be freed by RCU like batch table freeing. Then we
can also use pte_offset_map() etc to prevent PTE page from being freed.

Like pte_free_defer(), we can also safely use ptdesc->pt_rcu_head to free
the page table pages:

 - The pt_rcu_head is unioned with pt_list and pmd_huge_pte.

 - For pt_list, it is used to manage the PGD page in x86. Fortunately
   tlb_remove_table() will not be used to free PGD pages, so it is safe
   to use pt_rcu_head.

 - For pmd_huge_pte, it is used for THPs, so it is safe.

After applying this patch, if CONFIG_PT_RECLAIM is enabled, the call chain
of free_pte() is as follows:

free_pte
  pte_free_tlb
    __pte_free_tlb
      ___pte_free_tlb
        paravirt_tlb_remove_table
          tlb_remove_table [!CONFIG_PARAVIRT, Xen PV, Hyper-V, KVM]
            [no-free-memory slowpath:]
              tlb_table_invalidate
              tlb_remove_table_one
                __tlb_remove_table_one [frees via RCU]
            [fastpath:]
              tlb_table_flush
                tlb_remove_table_free [frees via RCU]
          native_tlb_remove_table [CONFIG_PARAVIRT on native]
            tlb_remove_table [see above]

Signed-off-by: Qi Zheng
Cc: x86@kernel.org
Cc: Dave Hansen
Cc: Andy Lutomirski
Cc: Peter Zijlstra
---
 arch/x86/include/asm/tlb.h | 19 +++++++++++++++++++
 arch/x86/kernel/paravirt.c |  7 +++++++
 arch/x86/mm/pgtable.c      | 10 +++++++++-
 include/linux/mm_types.h   |  4 +++-
 mm/mmu_gather.c            |  9 ++++++++-
 5 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index 580636cdc257b..d134ecf1ada06 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -34,4 +34,23 @@ static inline void __tlb_remove_table(void *table)
 	free_page_and_swap_cache(table);
 }

+#ifdef CONFIG_PT_RECLAIM
+static inline void __tlb_remove_table_one_rcu(struct rcu_head *head)
+{
+	struct page *page;
+
+	page = container_of(head, struct page, rcu_head);
+	put_page(page);
+}
+
+static inline void __tlb_remove_table_one(void *table)
+{
+	struct page *page;
+
+	page = table;
+	call_rcu(&page->rcu_head, __tlb_remove_table_one_rcu);
+}
+#define __tlb_remove_table_one __tlb_remove_table_one
+#endif /* CONFIG_PT_RECLAIM */
+
 #endif /* _ASM_X86_TLB_H */
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index fec3815335558..89688921ea62e 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -59,10 +59,17 @@ void __init native_pv_lock_init(void)
 		static_branch_enable(&virt_spin_lock_key);
 }

+#ifndef CONFIG_PT_RECLAIM
 static void native_tlb_remove_table(struct mmu_gather *tlb, void *table)
 {
 	tlb_remove_page(tlb, table);
 }
+#else
+static void native_tlb_remove_table(struct mmu_gather *tlb, void *table)
+{
+	tlb_remove_table(tlb, table);
+}
+#endif

 struct static_key paravirt_steal_enabled;
 struct static_key paravirt_steal_rq_enabled;
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 5745a354a241c..69a357b15974a 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -19,12 +19,20 @@ EXPORT_SYMBOL(physical_mask);
 #endif

 #ifndef CONFIG_PARAVIRT
+#ifndef CONFIG_PT_RECLAIM
 static inline
 void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
 {
 	tlb_remove_page(tlb, table);
 }
-#endif
+#else
+static inline
+void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
+{
+	tlb_remove_table(tlb, table);
+}
+#endif /* !CONFIG_PT_RECLAIM */
+#endif /* !CONFIG_PARAVIRT */

 gfp_t __userpte_alloc_gfp = GFP_PGTABLE_USER | PGTABLE_HIGHMEM;

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 97e2f4fe1d6c4..266f53b2bb497 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -438,7 +438,9 @@ FOLIO_MATCH(compound_head, _head_2a);
  * struct ptdesc -    Memory descriptor for page tables.
  * @__page_flags:     Same as page flags. Powerpc only.
  * @pt_rcu_head:      For freeing page table pages.
- * @pt_list:          List of used page tables. Used for s390 and x86.
+ * @pt_list:          List of used page tables. Used for s390 gmap shadow pages
+ *                    (which are not linked into the user page tables) and x86
+ *                    pgds.
  * @_pt_pad_1:        Padding that aliases with page's compound head.
  * @pmd_huge_pte:     Protected by ptdesc->ptl, used for THPs.
  * @__page_mapping:   Aliases with page->mapping. Unused for page tables.
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index 99b3e9408aa0f..1e21022bcf339 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -311,11 +311,18 @@ static inline void tlb_table_invalidate(struct mmu_gather *tlb)
 	}
 }

-static void tlb_remove_table_one(void *table)
+#ifndef __tlb_remove_table_one
+static inline void __tlb_remove_table_one(void *table)
 {
 	tlb_remove_table_sync_one();
 	__tlb_remove_table(table);
 }
+#endif
+
+static void tlb_remove_table_one(void *table)
+{
+	__tlb_remove_table_one(table);
+}

 static void tlb_table_flush(struct mmu_gather *tlb)
 {

From patchwork Thu Nov 14 07:00:00 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13874618
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org, peterx@redhat.com
Cc: mgorman@suse.de, catalin.marinas@arm.com, will@kernel.org, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, x86@kernel.org, lorenzo.stoakes@oracle.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, zokeefe@google.com, rientjes@google.com, Qi Zheng
Subject: [PATCH v3 9/9] x86: select ARCH_SUPPORTS_PT_RECLAIM if X86_64
Date: Thu, 14 Nov 2024 15:00:00 +0800
Message-Id: <69c25b17661499afe4b35f1d30b26dd346a649ec.1731566457.git.zhengqi.arch@bytedance.com>
X-Mailer: git-send-email 2.24.3 (Apple Git-128)
X-HE-Meta:
U2FsdGVkX18+upAcm33Nk0yHljv1EkV88k8aXrNWgen9p/vA53uXp/d0JBw9BmUqgjfhlzJABYoUQrzhNcFrJiQVJ8tJYpYcgmiLqdAsNJ32EWasnVHPopHtw4HsdrZcyu3QoHNwrS4ORz+L6jVgXxl8USVbJ8Q67i5FpKuOpwAyE39w8vX3JVcYo+E39ZpLil+87l+BdnocYxtefNjxHaUFjE57JqnrfqoBypCVX2Q0XxWphaxSzLV2zo5dSyXpOaICsr+87t69EuacpL6kOUFzy54kfZ3ui0cewwExzglcjB4En6PhR0wZkB3NAFtn8tcZbipLpIYqlA8XhXgxLnrUY/BSD2CngpUGrmtRNcYzrh3HmM34jxZuIQQD82mmucxDzUJOfHzaOy3rV6x15aielpsl04OnDjpSE1naXPaM0FvdVV3sQUsjgcXH+HprZjTF3O7oUh4WAlfd0Qa89hJ7clYzFfQxYVrfS54dd0RmVmm3U05wjLSZQlzbderzEYlOn92kciXZn3pLMas4t8f7UJXpx1W/sLSKNqLZNK9KI7oXWwwb/xcL1GkhBQ3JoTydaGaqrK6/s+TL6d2hlkvvuz+7zWY2tOpUKXg/GWgpVbbojNQ94hWWM8mh364d68QyC/x23Tnex8INX7VZ4P19RR1WgixvEib8vq1pAMneN0mVWVadIZH9RSAioZgdXg3Yi/NzBgY20SomZXrZu/65aVKVHcAAwcttVoP8gb7pLCeAF49hJXpNi6SBx1XioeazRNCkJtOEevh1tciYyztwnhwi/xzPSQYIoWhzSlyKWNfjO74xAMzIlCz+tV3fEDK8wg3jSmRaemPDI1r/8MQ3dpuFTM/o5Etj2aTX5KMBBH4A7lVZn32+fAlniaIujhyhjM0d2baCClap0CKUXl4Edaw181Mo6Uh1MD1z+DX6RkYH31F1vjidOcDtoE/Qgm6nVvyROjtX4gE5tzF uS886Uo1 Pvwse6tRL2srFyExgi0lNhG7TTB6VcWiGjOs/5M/frI6yaEqbqNXNCfDb8WaoYViJRgdgHVWTpcXGWv1K9OoQfLvMPjv6oBybYnSsU7k7jv7gqPSDK9KIWpKKzSD7yI6WWteCCtGmJZkU8v2khiCAmRnRteKyKpydej8F/gRxNQV/5riD+aiPuz8BhsYLY+xbIh+pt3M2lStluEbIgUFdlVzyfY8XFdUGO0ydAbm+bJ6v68LXMPCdRLsaSDtuByVzaFQ/JHRBF49MPbmvfUJ1J0+pzg8TVJpXWCI0ZEgjyZLMR4aR5cNjEighJIzc7QuSjYHQVTf2kX6Ep2aOJLzDTsOoRImDy0xZrpqokD91i6Tyi620uS692nvij/dtRu1AJiKdosDWc9744saV6tHVPKsEjyt2XDDNL4EZSskBDw3aXMFvBqfHyyzS5Tz/ipbzmrpthPP3iuLib9KUhSNNqS6ytQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Now, x86 has fully supported the CONFIG_PT_RECLAIM feature, and reclaiming PTE pages is profitable only on 64-bit systems, so select ARCH_SUPPORTS_PT_RECLAIM if X86_64. 
Signed-off-by: Qi Zheng
Cc: x86@kernel.org
Cc: Dave Hansen
Cc: Andy Lutomirski
Cc: Peter Zijlstra
---
 arch/x86/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e74a611bff4a6..54526ce2b1d90 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -322,6 +322,7 @@ config X86
 	select FUNCTION_ALIGNMENT_4B
 	imply IMA_SECURE_AND_OR_TRUSTED_BOOT if EFI
 	select HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
+	select ARCH_SUPPORTS_PT_RECLAIM if X86_64
 
 config INSTRUCTION_DECODER
 	def_bool y
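
Note on the conditional select: the `if X86_64` suffix means 32-bit x86 builds never set ARCH_SUPPORTS_PT_RECLAIM, so anything that depends on it stays unavailable there. The fragment below is a minimal sketch of that relationship, not a quote from this series. Only the `select` line under `config X86` comes from the patch above. The two mm-side entries, including the prompt text and the use of `def_bool n`, are assumptions for illustration and may differ from the actual definitions in the earlier patches.

```kconfig
# Sketch only: the next two entries are assumed for illustration and are
# not quoted from this series.

# Arch-capability symbol. It has no prompt, so it can only be turned on
# by an architecture's "select".
config ARCH_SUPPORTS_PT_RECLAIM
	def_bool n

# User-visible feature, offered only when the capability symbol is set.
config PT_RECLAIM
	bool "Reclaim empty user page table pages"
	depends on ARCH_SUPPORTS_PT_RECLAIM

# From the patch above: the capability is selected only on 64-bit x86,
# so PT_RECLAIM cannot be enabled for 32-bit x86 configurations.
config X86
	def_bool y
	select ARCH_SUPPORTS_PT_RECLAIM if X86_64
```

Under these assumptions, an architecture opts in with a single `select` line and the generic option needs no per-architecture conditions of its own.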