From patchwork Tue Feb 11 11:13:09 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dev Jain
X-Patchwork-Id: 13969515
From: Dev Jain
To: akpm@linux-foundation.org, david@redhat.com, willy@infradead.org,
 kirill.shutemov@linux.intel.com
Cc: npache@redhat.com, ryan.roberts@arm.com, anshuman.khandual@arm.com,
 catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com,
 apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org,
 baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu,
 haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org,
 yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com,
 wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com,
 surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com,
 zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Dev Jain
Subject: [PATCH v2 00/17] khugepaged: Asynchronous mTHP collapse
Date: Tue, 11 Feb 2025 16:43:09 +0530
Message-Id: <20250211111326.14295-1-dev.jain@arm.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
Sender: owner-linux-mm@kvack.org
Precedence: bulk

This patchset extends khugepaged from collapsing only PMD-sized THPs to
collapsing anonymous mTHPs.

mTHPs were introduced in the kernel to improve memory management by
allocating larger chunks of memory, so as to reduce the number of page
faults and TLB misses (due to TLB coalescing), shorten the LRU lists,
etc. However, the mTHP property is often lost due to CoW, swap-in/out,
and when the kernel cannot find enough physically contiguous memory to
allocate on fault. Hence, there is a need to regain mTHPs in the system
asynchronously. This work is an attempt in this direction, starting with
anonymous folios.

In the fault handler, we select the THP order in a greedy manner; the
same approach is used here, along with the same sysfs interface to
control the order of collapse. In contrast to PMD-collapse, we
(hopefully) get rid of the mmap_write_lock().

---------------------------------------------------------
Testing
---------------------------------------------------------

The set has been build-tested on x86_64.
For AArch64,
1. mm-selftests: No regressions.
2. Analyzing with tools/mm/thpmaps on different userspace programs
   mapping aligned VMAs of a large size, faulting in basepages/mTHPs
   (according to sysfs), and then madvise()'ing the VMA, khugepaged is
   able to collapse 100% of the VMAs.

This patchset is rebased on mm-unstable
(4637fa5d47a49c977116321cc575ea22215df22d).

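For readers unfamiliar with the greedy order selection mentioned in the
overview, here is a minimal userspace sketch of the idea. It is not the
code from this series: pick_collapse_order(), the enabled_orders bitmask
and the *_ASSUMED constants are hypothetical names used only for
illustration, and 4K base pages with an order-9 PMD are assumed. The
helper walks from the largest order down and returns the first enabled
order whose naturally aligned block around the address fits inside the
VMA.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for illustration only: 4K base pages, order-9 PMD,
 * and order 2 as the smallest anonymous mTHP order. */
#define PAGE_SHIFT_ASSUMED 12
#define MAX_ORDER_ASSUMED   9
#define MIN_ORDER_ASSUMED   2

/*
 * Hypothetical helper, not taken from the patches.
 *
 * enabled_orders: bit N set means order N is enabled (a stand-in for
 * the per-size sysfs settings).
 * Returns the largest enabled order whose naturally aligned block
 * containing addr lies fully within [vma_start, vma_end), or -1 if
 * no enabled order fits.
 */
int pick_collapse_order(uint64_t addr, uint64_t vma_start,
                        uint64_t vma_end, unsigned long enabled_orders)
{
    assert(vma_start <= addr && addr < vma_end);

    /* Greedy: try the largest order first, fall back to smaller ones. */
    for (int order = MAX_ORDER_ASSUMED; order >= MIN_ORDER_ASSUMED; order--) {
        if (!(enabled_orders & (1UL << order)))
            continue;

        uint64_t size = (uint64_t)1 << (order + PAGE_SHIFT_ASSUMED);
        uint64_t base = addr & ~(size - 1);   /* natural alignment */

        if (base >= vma_start && base + size <= vma_end)
            return order;
    }
    return -1;
}
```

With orders 9 and 4 enabled, a PMD-aligned 4M VMA yields order 9, while
a 128K VMA that is not PMD-aligned falls back to order 4 (64K).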
v1->v2:
 - Handle VMAs less than PMD size (patches 12-15)
 - Do not add mTHP into deferred split queue
 - Drop lock optimization and collapse mTHP under mmap_write_lock()
 - Define policy on what to do when we encounter a folio order larger
   than the order we are scanning for
 - Prevent the creep problem by enforcing tunable simplification
 - Update Documentation
 - Drop patch 12 from v1 updating selftest w.r.t the creep problem
 - Drop patch 1 from v1

v1:
https://lore.kernel.org/all/20241216165105.56185-1-dev.jain@arm.com/

Dev Jain (17):
  khugepaged: Generalize alloc_charge_folio()
  khugepaged: Generalize hugepage_vma_revalidate()
  khugepaged: Generalize __collapse_huge_page_swapin()
  khugepaged: Generalize __collapse_huge_page_isolate()
  khugepaged: Generalize __collapse_huge_page_copy()
  khugepaged: Abstract PMD-THP collapse
  khugepaged: Scan PTEs order-wise
  khugepaged: Introduce vma_collapse_anon_folio()
  khugepaged: Define collapse policy if a larger folio is already mapped
  khugepaged: Exit early on fully-mapped aligned mTHP
  khugepaged: Enable sysfs to control order of collapse
  khugepaged: Enable variable-sized VMA collapse
  khugepaged: Lock all VMAs mapping the PTE table
  khugepaged: Reset scan address to correct alignment
  khugepaged: Delay cond_resched()
  khugepaged: Implement strict policy for mTHP collapse
  Documentation: transhuge: Define khugepaged mTHP collapse policy

 Documentation/admin-guide/mm/transhuge.rst |  49 +-
 include/linux/huge_mm.h                    |   2 +
 mm/huge_memory.c                           |   4 +
 mm/khugepaged.c                            | 603 ++++++++++++++++-----
 4 files changed, 511 insertions(+), 147 deletions(-)