X-Patchwork-Id: 13735170
From: Ryan Roberts
To: Andrew Morton, Hugh Dickins, Jonathan Corbet, "Matthew Wilcox (Oracle)", David Hildenbrand, Barry Song, Lance Yang, Baolin Wang, Gavin Shan, Pankaj Raghav, Daniel Gomez
Cc: Ryan Roberts, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH v1 0/4] Control folio sizes used for page cache memory
Date: Wed, 17 Jul 2024 08:12:52 +0100
Message-ID: <20240717071257.4141363-1-ryan.roberts@arm.com>

Hi All,

This series is an RFC that adds sysfs and kernel cmdline controls to configure the set of allowed large folio sizes that can be used when allocating file-memory for the page cache. As part of the control mechanism, it provides for a special-case "preferred folio size for executable mappings" marker.

I'm trying to solve 2 separate problems with this series:

1. Reduce pressure in iTLB and improve performance on arm64: This is a modified approach for the change at [1].
Instead of hardcoding the preferred executable folio size into the arch, user space can now select it. This decouples the arch code and also makes the mechanism more generic; it can be bypassed (the default) or any folio size can be set. For my use case, 64K is preferred, but I've also heard from Willy of a use case where putting all text into 2M PMD-sized folios is preferred. This approach avoids the need for synchronous MADV_COLLAPSE (and therefore faulting in all text ahead of time) to achieve that.

2. Reduce memory fragmentation in systems under high memory pressure (e.g. Android): The theory goes that if all folios are 64K, then failure to allocate a 64K folio should become unlikely. But if the page cache is allocating lots of different orders, with most allocations being smaller than 64K (as is the case today), then the ability to allocate 64K folios diminishes. By providing control over the allowed set of folio sizes, we can tune the system to avoid crucial 64K folio allocation failures.

Additionally, I've heard (second hand) of the need to disable large folios in the page cache entirely due to latency concerns in some settings. These controls allow all of this without kernel changes.

The value of (1) is clear and the performance improvements are documented in patch 2. I don't yet have any data demonstrating the theory for (2) since I can't reproduce the setup that Barry had at [2]. But my view is that by adding these controls we will enable the community to explore further, in the same way that the anon mTHP controls helped harden the understanding for anonymous memory.

---

This series depends on the "mTHP allocation stats for file-backed memory" series at [3], which itself applies on top of yesterday's mm-unstable (650b6752c8a3). All mm selftests have been run; no regressions were observed.
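
For readers less used to thinking in allocation orders, the sizes mentioned above (64K and 2M) can be related to orders with a little arithmetic. The following is only an illustrative sketch, not code from the series: the helper name `folio_order` is hypothetical, and a 4K base page size is assumed as the default (other base page sizes give different orders).

```python
KIB = 1024
MIB = 1024 * KIB


def folio_order(folio_size: int, base_page_size: int = 4 * KIB) -> int:
    """Return n such that folio_size == base_page_size << n.

    Hypothetical helper for illustration; assumes a 4K base page by default.
    """
    if folio_size < base_page_size or folio_size % base_page_size:
        raise ValueError("folio size must be a multiple of the base page size")
    pages = folio_size // base_page_size
    if pages & (pages - 1):
        raise ValueError("folio must span a power-of-two number of pages")
    return pages.bit_length() - 1


if __name__ == "__main__":
    # With 4K base pages: 64K is order 4, 2M is order 9.
    for size in (64 * KIB, 2 * MIB):
        print(f"{size // KIB}K -> order {folio_order(size)}")
```

With 4K base pages, a 64K folio spans 16 pages (order 4) and a 2M folio spans 512 pages (order 9), so "orders below 64K" in problem (2) means orders 0 to 3 under that assumption.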

[1] https://lore.kernel.org/linux-mm/20240215154059.2863126-1-ryan.roberts@arm.com/
[2] https://www.youtube.com/watch?v=ht7eGWqwmNs&list=PLbzoR-pLrL6oj1rVTXLnV7cOuetvjKn9q&index=4
[3] https://lore.kernel.org/linux-mm/20240716135907.4047689-1-ryan.roberts@arm.com/

Thanks,
Ryan

Ryan Roberts (4):
  mm: mTHP user controls to configure pagecache large folio sizes
  mm: Introduce "always+exec" for mTHP file_enabled control
  mm: Override mTHP "enabled" defaults at kernel cmdline
  mm: Override mTHP "file_enabled" defaults at kernel cmdline

 .../admin-guide/kernel-parameters.txt      |  16 ++
 Documentation/admin-guide/mm/transhuge.rst |  66 +++++++-
 include/linux/huge_mm.h                    |  61 ++++---
 mm/filemap.c                               |  26 ++-
 mm/huge_memory.c                           | 158 +++++++++++++++++-
 mm/readahead.c                             |  43 ++++-
 6 files changed, 329 insertions(+), 41 deletions(-)

--
2.43.0