From patchwork Tue Dec 31 04:35:38 2024
X-Patchwork-Submitter: Yu Zhao
X-Patchwork-Id: 13923619
Date: Mon, 30 Dec 2024 21:35:38 -0700
In-Reply-To: <20241231043538.4075764-1-yuzhao@google.com>
References: <20241231043538.4075764-1-yuzhao@google.com>
X-Mailer: git-send-email 2.47.1.613.gc27f4b7a9f-goog
Message-ID: <20241231043538.4075764-8-yuzhao@google.com>
Subject: [PATCH mm-unstable v4 7/7] mm/mglru: fix PTE-mapped large folios
From: Yu Zhao
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Yu Zhao,
    Barry Song, Kalesh Singh
Count the accessed bits from PTEs mapping the same large folio as one
access rather than multiple accesses.

The previous patch changed how folios accessed through page tables are
promoted: rather than getting promoted after the accessed bit is
cleared for the first time, a folio is now only promoted thereafter.
Counting the accessed bits from the same large folio as multiple
accesses can cause that folio to be promoted prematurely, which in
turn can cause overprotection of single-use large folios.

This patch reduced the sys time of kernel compilation by [2, 5]%
(95% CI) on an Altra M128-30 with 3GB DRAM, 12GB zram, 16KB THPs
and -j32.

Reported-by: Barry Song
Signed-off-by: Yu Zhao
Tested-by: Kalesh Singh
---
 mm/vmscan.c | 110 ++++++++++++++++++++++++++++++++++------------------
 1 file changed, 72 insertions(+), 38 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 74bc85fc7cdf..a099876fa029 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3431,29 +3431,55 @@ static bool suitable_to_scan(int total, int young)
 	return young * n >= total;
 }
 
+static void walk_update_folio(struct lru_gen_mm_walk *walk, struct folio *folio,
+			      int new_gen, bool dirty)
+{
+	int old_gen;
+
+	if (!folio)
+		return;
+
+	if (dirty && !folio_test_dirty(folio) &&
+	    !(folio_test_anon(folio) && folio_test_swapbacked(folio) &&
+	      !folio_test_swapcache(folio)))
+		folio_mark_dirty(folio);
+
+	if (walk) {
+		old_gen = folio_update_gen(folio, new_gen);
+		if (old_gen >= 0 && old_gen != new_gen)
+			update_batch_size(walk, folio, old_gen, new_gen);
+	} else if (lru_gen_set_refs(folio)) {
+		old_gen = folio_lru_gen(folio);
+		if (old_gen >= 0 && old_gen != new_gen)
+			folio_activate(folio);
+	}
+}
+
 static bool walk_pte_range(pmd_t *pmd, unsigned long start, unsigned long end,
 			   struct mm_walk *args)
 {
 	int i;
+	bool dirty;
 	pte_t *pte;
 	spinlock_t *ptl;
 	unsigned long addr;
 	int total = 0;
 	int young = 0;
+	struct folio *last = NULL;
 	struct lru_gen_mm_walk *walk = args->private;
 	struct mem_cgroup *memcg = lruvec_memcg(walk->lruvec);
 	struct pglist_data *pgdat = lruvec_pgdat(walk->lruvec);
 	DEFINE_MAX_SEQ(walk->lruvec);
-	int old_gen, new_gen = lru_gen_from_seq(max_seq);
+	int gen = lru_gen_from_seq(max_seq);
 	pmd_t pmdval;
 
-	pte = pte_offset_map_rw_nolock(args->mm, pmd, start & PMD_MASK, &pmdval,
-				       &ptl);
+	pte = pte_offset_map_rw_nolock(args->mm, pmd, start & PMD_MASK, &pmdval, &ptl);
 	if (!pte)
 		return false;
+
 	if (!spin_trylock(ptl)) {
 		pte_unmap(pte);
-		return false;
+		return true;
 	}
 
 	if (unlikely(!pmd_same(pmdval, pmdp_get_lockless(pmd)))) {
@@ -3482,19 +3508,23 @@ static bool walk_pte_range(pmd_t *pmd, unsigned long start, unsigned long end,
 		if (!ptep_clear_young_notify(args->vma, addr, pte + i))
 			continue;
 
+		if (last != folio) {
+			walk_update_folio(walk, last, gen, dirty);
+
+			last = folio;
+			dirty = false;
+		}
+
+		if (pte_dirty(ptent))
+			dirty = true;
+
 		young++;
 		walk->mm_stats[MM_LEAF_YOUNG]++;
-
-		if (pte_dirty(ptent) && !folio_test_dirty(folio) &&
-		    !(folio_test_anon(folio) && folio_test_swapbacked(folio) &&
-		      !folio_test_swapcache(folio)))
-			folio_mark_dirty(folio);
-
-		old_gen = folio_update_gen(folio, new_gen);
-		if (old_gen >= 0 && old_gen != new_gen)
-			update_batch_size(walk, folio, old_gen, new_gen);
 	}
 
+	walk_update_folio(walk, last, gen, dirty);
+	last = NULL;
+
 	if (i < PTRS_PER_PTE && get_next_vma(PMD_MASK, PAGE_SIZE, args, &start, &end))
 		goto restart;
 
@@ -3508,13 +3538,15 @@ static void walk_pmd_range_locked(pud_t *pud, unsigned long addr, struct vm_area
 			   struct mm_walk *args, unsigned long *bitmap, unsigned long *first)
 {
 	int i;
+	bool dirty;
 	pmd_t *pmd;
 	spinlock_t *ptl;
+	struct folio *last = NULL;
 	struct lru_gen_mm_walk *walk = args->private;
 	struct mem_cgroup *memcg = lruvec_memcg(walk->lruvec);
 	struct pglist_data *pgdat = lruvec_pgdat(walk->lruvec);
 	DEFINE_MAX_SEQ(walk->lruvec);
-	int old_gen, new_gen = lru_gen_from_seq(max_seq);
+	int gen = lru_gen_from_seq(max_seq);
 
 	VM_WARN_ON_ONCE(pud_leaf(*pud));
 
@@ -3567,20 +3599,23 @@ static void walk_pmd_range_locked(pud_t *pud, unsigned long addr, struct vm_area
 		if (!pmdp_clear_young_notify(vma, addr, pmd + i))
 			goto next;
 
+		if (last != folio) {
+			walk_update_folio(walk, last, gen, dirty);
+
+			last = folio;
+			dirty = false;
+		}
+
+		if (pmd_dirty(pmd[i]))
+			dirty = true;
+
 		walk->mm_stats[MM_LEAF_YOUNG]++;
-
-		if (pmd_dirty(pmd[i]) && !folio_test_dirty(folio) &&
-		    !(folio_test_anon(folio) && folio_test_swapbacked(folio) &&
-		      !folio_test_swapcache(folio)))
-			folio_mark_dirty(folio);
-
-		old_gen = folio_update_gen(folio, new_gen);
-		if (old_gen >= 0 && old_gen != new_gen)
-			update_batch_size(walk, folio, old_gen, new_gen);
 next:
 		i = i > MIN_LRU_BATCH ? 0 : find_next_bit(bitmap, MIN_LRU_BATCH, i) + 1;
 	} while (i <= MIN_LRU_BATCH);
 
+	walk_update_folio(walk, last, gen, dirty);
+
 	arch_leave_lazy_mmu_mode();
 	spin_unlock(ptl);
 done:
@@ -4115,9 +4150,11 @@ static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc)
 bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
 {
 	int i;
+	bool dirty;
 	unsigned long start;
 	unsigned long end;
 	struct lru_gen_mm_walk *walk;
+	struct folio *last = NULL;
 	int young = 1;
 	pte_t *pte = pvmw->pte;
 	unsigned long addr = pvmw->address;
@@ -4128,7 +4165,7 @@ bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
 	struct lruvec *lruvec = mem_cgroup_lruvec(memcg, pgdat);
 	struct lru_gen_mm_state *mm_state = get_mm_state(lruvec);
 	DEFINE_MAX_SEQ(lruvec);
-	int old_gen, new_gen = lru_gen_from_seq(max_seq);
+	int gen = lru_gen_from_seq(max_seq);
 
 	lockdep_assert_held(pvmw->ptl);
 	VM_WARN_ON_ONCE_FOLIO(folio_test_lru(folio), folio);
@@ -4182,24 +4219,21 @@ bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
 		if (!ptep_clear_young_notify(vma, addr, pte + i))
 			continue;
 
-		young++;
+		if (last != folio) {
+			walk_update_folio(walk, last, gen, dirty);
 
-		if (pte_dirty(ptent) && !folio_test_dirty(folio) &&
-		    !(folio_test_anon(folio) && folio_test_swapbacked(folio) &&
-		      !folio_test_swapcache(folio)))
-			folio_mark_dirty(folio);
-
-		if (walk) {
-			old_gen = folio_update_gen(folio, new_gen);
-			if (old_gen >= 0 && old_gen != new_gen)
-				update_batch_size(walk, folio, old_gen, new_gen);
-		} else if (lru_gen_set_refs(folio)) {
-			old_gen = folio_lru_gen(folio);
-			if (old_gen >= 0 && old_gen != new_gen)
-				folio_activate(folio);
+			last = folio;
+			dirty = false;
 		}
+
+		if (pte_dirty(ptent))
+			dirty = true;
+
+		young++;
 	}
 
+	walk_update_folio(walk, last, gen, dirty);
+
 	arch_leave_lazy_mmu_mode();
 
 	/* feedback from rmap walkers to page table walkers */
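
The batching above relies on a simple property: the walks visit PTEs in
address order, so in the common case where a large folio is mapped
contiguously, all of its young PTEs are seen back to back. A single
'last' pointer plus a deferred flush through walk_update_folio() is
then enough to turn N per-PTE updates into one per-folio update. Below
is a minimal standalone userspace sketch of that pattern, assuming
consecutive placement; struct pte_entry, struct folio_stat and
flush_folio() are hypothetical stand-ins for illustration, not kernel
APIs.

/*
 * Sketch of per-folio batching of accessed/dirty bits: entries that
 * map the same folio appear consecutively, so deduplicate with a
 * 'last' pointer and flush once per folio instead of once per PTE.
 */
#include <stdbool.h>
#include <stdio.h>

struct folio_stat {
	int id;
	int accesses;	/* how many times this folio was counted */
	bool dirty;
};

struct pte_entry {
	struct folio_stat *folio;	/* folio this PTE maps into */
	bool young;			/* accessed bit */
	bool dirty;			/* dirty bit */
};

/* Flush the deferred state for one folio: count a single access. */
static void flush_folio(struct folio_stat *folio, bool dirty)
{
	if (!folio)
		return;

	folio->accesses++;		/* one access per folio, not per PTE */
	if (dirty)
		folio->dirty = true;
}

static void walk(struct pte_entry *ptes, int n)
{
	struct folio_stat *last = NULL;
	bool dirty = false;

	for (int i = 0; i < n; i++) {
		if (!ptes[i].young)
			continue;

		if (last != ptes[i].folio) {	/* crossed a folio boundary */
			flush_folio(last, dirty);
			last = ptes[i].folio;
			dirty = false;
		}

		if (ptes[i].dirty)
			dirty = true;
	}

	flush_folio(last, dirty);		/* flush the trailing folio */
}

int main(void)
{
	struct folio_stat a = { .id = 0 }, b = { .id = 1 };
	/* four young PTEs mapping folio a (one dirty), one mapping folio b */
	struct pte_entry ptes[] = {
		{ &a, true, false }, { &a, true, true },
		{ &a, true, false }, { &a, true, false },
		{ &b, true, false },
	};

	walk(ptes, 5);
	printf("folio %d: %d access(es), dirty=%d\n", a.id, a.accesses, a.dirty);
	printf("folio %d: %d access(es), dirty=%d\n", b.id, b.accesses, b.dirty);
	return 0;
}

Compiled with plain cc, the sketch reports one access for each folio
even though the first folio is mapped by four young PTEs, which is the
accounting change this patch makes to avoid premature promotion.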