From patchwork Fri Jun 14 01:51:35 2024
X-Patchwork-Submitter: Lance Yang
X-Patchwork-Id: 13697713
From: Lance Yang <ioworker0@gmail.com>
To: akpm@linux-foundation.org
Cc: willy@infradead.org, sj@kernel.org, baolin.wang@linux.alibaba.com,
	maskray@google.com, ziy@nvidia.com, ryan.roberts@arm.com,
	david@redhat.com, 21cnbao@gmail.com, mhocko@suse.com,
	fengwei.yin@intel.com, zokeefe@google.com, shy828301@gmail.com,
	xiehuan09@gmail.com, libang.li@antgroup.com,
	wangkefeng.wang@huawei.com, songmuchun@bytedance.com,
	peterx@redhat.com, minchan@kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Lance Yang <ioworker0@gmail.com>
Subject: [PATCH v8 0/3] Reclaim lazyfree THP without splitting
Date: Fri, 14 Jun 2024 09:51:35 +0800
Message-Id: <20240614015138.31461-1-ioworker0@gmail.com>
X-Mailer: git-send-email 2.33.1

Hi all,

This series adds support for reclaiming PMD-mapped THPs marked as lazyfree
without first splitting the large folio via split_huge_pmd_address().

When users no longer require the pages, they use madvise(MADV_FREE) to mark
them as lazyfree and typically do not write to that memory again. During
memory reclaim, if we detect that the large folio and its PMD are both still
clean and there are no unexpected references (such as GUP), we can simply
discard the memory lazily, improving the efficiency of memory reclamation in
this case. A minimal userspace sketch of the MADV_FREE usage is included at
the end of this letter.

Performance Testing
===================

On an Intel i5 CPU, reclaiming 1GiB of lazyfree THPs using
mem_cgroup_force_empty() results in the following runtimes in seconds
(shorter is better):

--------------------------------------------
|     Old      |     New      |   Change   |
--------------------------------------------
|   0.683426   |   0.049197   |  -92.80%   |
--------------------------------------------

---

Changes since v7 [7]
====================
- mm/rmap: remove duplicated exit code in pagewalk loop
  - Pick RB from Barry - thanks!
  - Rename walk_done_err to walk_abort (per Barry and David)
- mm/rmap: integrate PMD-mapped folio splitting into pagewalk loop
  - Make page_vma_mapped_walk_restart() more general (per David)
  - Squash page_vma_mapped_walk_restart() into this patch (per David)
- mm/vmscan: avoid split lazyfree THP during shrink_folio_list()
  - Don't unmark a PMD-mapped folio as lazyfree in unmap_huge_pmd_locked()
  - Drop the unused "pmd_mapped" variable (per Baolin)

Changes since v6 [6]
====================
- mm/rmap: remove duplicated exit code in pagewalk loop
  - Pick RB from David - thanks!
- mm/rmap: add helper to restart pgtable walk on changes
  - Add the page_vma_mapped_walk_restart() helper to handle scenarios
    where the page table walk needs to be restarted due to changes in
    the page table (suggested by David)
- mm/rmap: integrate PMD-mapped folio splitting into pagewalk loop
  - Pass 'pvmw.address' to split_huge_pmd_locked() (per David)
  - Drop the check for PMD-mapped THP that is missing the mlock (per David)
- mm/vmscan: avoid split lazyfree THP during shrink_folio_list()
  - Rename the function __discard_trans_pmd_locked() to
    __discard_anon_folio_pmd_locked() (per David)

Changes since v5 [5]
====================
- mm/rmap: remove duplicated exit code in pagewalk loop
  - Pick RB from Baolin Wang - thanks!
- mm/mlock: check for THP missing the mlock in try_to_unmap_one()
  - Merge this patch into patch 2 (per Baolin Wang)
- mm/vmscan: avoid split lazyfree THP during shrink_folio_list()
  - Mark a folio as being backed by swap space if the folio or its PMD
    was redirtied (per Baolin Wang)
  - Use pmdp_huge_clear_flush() to get and flush a PMD entry
    (per Baolin Wang)

Changes since v4 [4]
====================
- mm/rmap: remove duplicated exit code in pagewalk loop
  - Pick RB from Zi Yan - thanks!
- mm/rmap: integrate PMD-mapped folio splitting into pagewalk loop
  - Remove the redundant alignment (per Baolin Wang)
  - Set pvmw.ptl to NULL after unlocking the PTL (per Baolin Wang)
- mm/mlock: check for THP missing the mlock in try_to_unmap_one()
  - Check whether the mlock of PMD-mapped THP was missed
    (suggested by Baolin Wang)
- mm/vmscan: avoid split lazyfree THP during shrink_folio_list()
  - No need to check the TTU_SPLIT_HUGE_PMD flag for
    unmap_huge_pmd_locked() (per Zi Yan)
  - Drain the local mlock batch after folio_remove_rmap_pmd()
    (per Baolin Wang)

Changes since v3 [3]
====================
- mm/rmap: integrate PMD-mapped folio splitting into pagewalk loop
  - Resolve compilation errors by handling the case where
    CONFIG_PGTABLE_HAS_HUGE_LEAVES is undefined (thanks to SeongJae Park)
- mm/vmscan: avoid split lazyfree THP during shrink_folio_list()
  - Remove the unnecessary conditional compilation directives
    (thanks to Barry Song)
  - Resolve compilation errors due to undefined references to
    unmap_huge_pmd_locked and split_huge_pmd_locked (thanks to Barry)

Changes since v2 [2]
====================
- Update the changelog (thanks to David Hildenbrand)
- Support try_to_unmap_one() to unmap PMD-mapped folios (thanks a lot to
  David Hildenbrand and Zi Yan)

Changes since v1 [1]
====================
- Update the changelog
- Follow the exact same logic as in try_to_unmap_one() (per David Hildenbrand)
- Remove the extra code from rmap.c (per Matthew Wilcox)

[1] https://lore.kernel.org/linux-mm/20240417141111.77855-1-ioworker0@gmail.com
[2] https://lore.kernel.org/linux-mm/20240422055213.60231-1-ioworker0@gmail.com
[3] https://lore.kernel.org/linux-mm/20240429132308.38794-1-ioworker0@gmail.com
[4] https://lore.kernel.org/linux-mm/20240501042700.83974-1-ioworker0@gmail.com
[5] https://lore.kernel.org/linux-mm/20240513074712.7608-1-ioworker0@gmail.com
[6] https://lore.kernel.org/linux-mm/20240521040244.48760-1-ioworker0@gmail.com
[7] https://lore.kernel.org/linux-mm/20240610120209.66311-1-ioworker0@gmail.com

Lance Yang (3):
  mm/rmap: remove duplicated exit code in pagewalk loop
  mm/rmap: integrate PMD-mapped folio splitting into pagewalk loop
  mm/vmscan: avoid split lazyfree THP during shrink_folio_list()

 include/linux/huge_mm.h |  15 +++++
 include/linux/rmap.h    |  24 ++++++++
 mm/huge_memory.c        | 118 +++++++++++++++++++++++++++++++++-------
 mm/rmap.c               |  68 ++++++++++++-----------
 4 files changed, 174 insertions(+), 51 deletions(-)

base-commit: fb8d20fa1a94f807336ed209d33da8ec15ae6c3a
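
--
For reference, below is a minimal, untested userspace sketch (not part of this
series) of the MADV_FREE usage described above. The 2MiB region size and the
MADV_HUGEPAGE hint are assumptions for an x86_64 system with 2MiB PMDs; it only
illustrates how memory might end up as a PMD-mapped lazyfree THP that reclaim
can later discard without splitting or swapping.

	#include <string.h>
	#include <sys/mman.h>

	int main(void)
	{
		size_t len = 2UL << 20;	/* assumed PMD size: 2MiB on x86_64 */
		void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (buf == MAP_FAILED)
			return 1;

		/* Hint the kernel to back this range with a THP if possible. */
		madvise(buf, len, MADV_HUGEPAGE);

		/* Fault in and dirty the whole range. */
		memset(buf, 0x5a, len);

		/*
		 * Mark the range lazyfree. If it stays clean, reclaim may
		 * simply discard it instead of writing it to swap; writing to
		 * it again cancels the hint for the touched pages.
		 */
		madvise(buf, len, MADV_FREE);

		return 0;
	}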