From patchwork Thu Sep 26 01:34:48 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13812675
Date: Thu, 26 Sep 2024 01:34:48 +0000
X-Mailer: git-send-email 2.46.0.792.g87dc391469-goog
Message-ID: <20240926013506.860253-1-jthoughton@google.com>
Subject: [PATCH v7 00/18] mm: multi-gen LRU: Walk secondary MMU page tables
 while aging
From: James Houghton
To: Sean Christopherson, Paolo Bonzini
Cc: Andrew Morton, David Matlack, David Rientjes, James Houghton,
 Jason Gunthorpe, Jonathan Corbet, Marc Zyngier, Oliver Upton, Wei Xu,
 Yu Zhao, Axel Rasmussen, kvm@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Sender: owner-linux-mm@kvack.org
Precedence: bulk

This patchset makes it possible for MGLRU to consult secondary MMUs
while doing aging, not just during eviction.
This allows for more accurate reclaim decisions, which is especially
important for proactive reclaim.

This series includes:
1. Cleanup, add support for lockless memslot walks in KVM (patches
   1-2).
2. Support for lockless aging for x86 TDP MMU (patches 3-4).
3. Further small optimizations (patches 5-6).
4. Support for lockless harvesting of access information for the x86
   shadow MMU (patches 7-10).
5. Some mm cleanup (patch 11).
6. Add fast-only aging MMU notifiers (patches 12-13).
7. Support fast-only aging in KVM/x86 (patches 14-16).
8. Have KVM participate in MGLRU aging (patch 17).
9. Updates to the access_tracking_perf_test to verify MGLRU
   functionality (patch 18).

Patches 1-10 are pure optimizations and could be applied without the
rest of the series, though the lockless shadow MMU patches become more
useful in the context of MGLRU aging.

Please note that mmu_notifier_test_young_fast_only() is added but not
used in this series. I am happy to remove it if that would be
appropriate.

The fast-only notifiers serve a particular purpose: for aging, we
neither want to delay other operations (e.g. unmapping for eviction)
nor do we want to be delayed by these other operations ourselves. By
default, the implementations of test_young() and clear_young() are meant
to be *accurate*, not fast. The fast-only notifiers will only give age
information that can be gathered fast.

The fast-only notifiers are non-trivially implemented only for x86. The
TDP MMU and the shadow MMU are both supported, but the shadow MMU will
not actually age sptes locklessly if A/D bits in the spte have been
disabled (i.e., if L1 disables them).

access_tracking_perf_test now has a mode (-p) to check performance of
MGLRU aging while the VM is faulting memory in.

This series has been tested with access_tracking_perf_test and Sean's
mmu_stress_test[6], both with tdp_mmu=0 and tdp_mmu=1.
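
To illustrate the fast-only semantics described above, here is a minimal
userspace sketch. It is not the KVM implementation: struct toy_mmu,
toy_age_range(), TOY_ACCESSED and TOY_LOCKLESS are hypothetical names
made up for this example, and the "lock" is only modeled by a counter.
The assumed idea is that an entry whose accessed bit can be cleared
atomically is aged on both paths, while any other entry is skipped when
fast_only is set and only visited by the accurate path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout for a toy secondary-MMU entry. */
#define TOY_ACCESSED (1u << 0) /* entry was accessed since last aging */
#define TOY_LOCKLESS (1u << 1) /* accessed bit may be cleared w/o lock */

struct toy_mmu {
	_Atomic uint32_t *entries;
	size_t nr;
	int slow_walks; /* entries that needed the (imaginary) lock */
};

/*
 * Clear the accessed bit for entries in [start, end) and report whether
 * any of them was young.  With fast_only set, entries that cannot be
 * aged locklessly are skipped, so the result may under-report.
 */
static bool toy_age_range(struct toy_mmu *mmu, size_t start, size_t end,
			  bool fast_only)
{
	bool young = false;

	assert(start <= end);

	for (size_t i = start; i < end && i < mmu->nr; i++) {
		uint32_t old = atomic_load(&mmu->entries[i]);

		if (!(old & TOY_LOCKLESS)) {
			if (fast_only)
				continue; /* age info is not cheap here */
			mmu->slow_walks++; /* stands in for taking the lock */
		}

		/* Atomically clear the accessed bit, keep the old value. */
		old = atomic_fetch_and(&mmu->entries[i], ~TOY_ACCESSED);
		if (old & TOY_ACCESSED)
			young = true;
	}

	return young;
}
```

A caller that must not block (aging, in this model) would pass
fast_only=true and accept that some young entries go unreported; a
caller that needs an accurate answer would pass fast_only=false and pay
for the slow walks.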

=== Previous Versions ===

Since v6[1]:
 - Rebased on top of kvm-x86/next and Sean's lockless rmap walking
   changes[6].
 - Removed HAVE_KVM_MMU_NOTIFIER_YOUNG_FAST_ONLY (thanks DavidM).
 - Split up kvm_age_gfn() / kvm_test_age_gfn() optimizations (thanks
   DavidM and Sean).
 - Improved new MMU notifier documentation (thanks DavidH).
 - Dropped arm64 locking change.
 - No longer retry for CAS failure in TDP MMU non-A/D case (thanks
   Sean).
 - Added some R-bys and A-bys.

Since v5[2]:
 - Reworked test_clear_young_fast_only() into a new parameter for the
   existing notifiers (thanks Sean).
 - Added mmu_notifier.has_fast_aging to tell mm if calling fast-only
   notifiers should be done.
 - Added mm_has_fast_young_notifiers() to inform users if calling
   fast-only notifier helpers is worthwhile (for look-around to use).
 - Changed MGLRU to invoke a single notifier instead of two when
   aging and doing look-around (thanks Yu).
 - For KVM/x86, check indirect_shadow_pages > 0 instead of
   kvm_memslots_have_rmaps() when collecting age information
   (thanks Sean).
 - For KVM/arm, some fixes from Oliver.
 - Small fixes to access_tracking_perf_test.
 - Added missing !MMU_NOTIFIER version of mmu_notifier_clear_young().

Since v4[3]:
 - Removed Kconfig that controlled when aging was enabled. Aging will
   be done whenever the architecture supports it (thanks Yu).
 - Added a new MMU notifier, test_clear_young_fast_only(), specifically
   for MGLRU to use.
 - Add kvm_fast_{test_,}age_gfn, implemented by x86.
 - Fix locking for clear_flush_young().
 - Added KVM_MMU_NOTIFIER_YOUNG_LOCKLESS to clean up locking changes
   (thanks Sean).
 - Fix WARN_ON and other cleanup for the arm64 locking changes
   (thanks Oliver).

Since v3[4]:
 - Vastly simplified the series (thanks David). Removed mmu notifier
   batching logic entirely.
 - Cleaned up how locking is done for mmu_notifier_test/clear_young
   (thanks David).
 - Look-around is now only done when there are no secondary MMUs
   subscribed to MMU notifiers.
 - CONFIG_LRU_GEN_WALKS_SECONDARY_MMU has been added.
 - Fixed the lockless implementation of kvm_{test,}age_gfn for x86
   (thanks David).
 - Added MGLRU functional and performance tests to
   access_tracking_perf_test (thanks Axel).
 - In v3, an mm would be completely ignored (for aging) if there was a
   secondary MMU but support for secondary MMU walking was missing. Now,
   missing secondary MMU walking support simply skips the notifier
   calls (except for eviction).
 - Added a sanity check that range->lockless and range->on_lock are
   never both provided for the memslot walk.

For the changes since v2[5], see v3.

Based on latest kvm-x86/next.

[1]: https://lore.kernel.org/linux-mm/20240724011037.3671523-1-jthoughton@google.com/
[2]: https://lore.kernel.org/linux-mm/20240611002145.2078921-1-jthoughton@google.com/
[3]: https://lore.kernel.org/linux-mm/20240529180510.2295118-1-jthoughton@google.com/
[4]: https://lore.kernel.org/linux-mm/20240401232946.1837665-1-jthoughton@google.com/
[5]: https://lore.kernel.org/kvmarm/20230526234435.662652-1-yuzhao@google.com/
[6]: https://lore.kernel.org/kvm/20240809194335.1726916-1-seanjc@google.com/

James Houghton (14):
  KVM: Remove kvm_handle_hva_range helper functions
  KVM: Add lockless memslot walk to KVM
  KVM: x86/mmu: Factor out spte atomic bit clearing routine
  KVM: x86/mmu: Relax locking for kvm_test_age_gfn and kvm_age_gfn
  KVM: x86/mmu: Rearrange kvm_{test_,}age_gfn
  KVM: x86/mmu: Only check gfn age in shadow MMU if
    indirect_shadow_pages > 0
  mm: Add missing mmu_notifier_clear_young for !MMU_NOTIFIER
  mm: Add has_fast_aging to struct mmu_notifier
  mm: Add fast_only bool to test_young and clear_young MMU notifiers
  KVM: Pass fast_only to kvm_{test_,}age_gfn
  KVM: x86/mmu: Locklessly harvest access information from shadow MMU
  KVM: x86/mmu: Enable has_fast_aging
  mm: multi-gen LRU: Have secondary MMUs participate in aging
  KVM: selftests: Add multi-gen LRU aging to access_tracking_perf_test

Sean Christopherson (4):
  KVM: x86/mmu: Refactor low level rmap helpers to prep for walking w/o
    mmu_lock
  KVM: x86/mmu: Add infrastructure to allow walking rmaps outside of
    mmu_lock
  KVM: x86/mmu: Add support for lockless walks of rmap SPTEs
  KVM: x86/mmu: Support rmap walks without holding mmu_lock when aging
    gfns

 Documentation/admin-guide/mm/multigen_lru.rst |   6 +-
 arch/x86/include/asm/kvm_host.h               |   4 +-
 arch/x86/kvm/Kconfig                          |   1 +
 arch/x86/kvm/mmu/mmu.c                        | 355 ++++++++++++----
 arch/x86/kvm/mmu/tdp_iter.h                   |  27 +-
 arch/x86/kvm/mmu/tdp_mmu.c                    |  57 ++-
 include/linux/kvm_host.h                      |   2 +
 include/linux/mmu_notifier.h                  |  82 +++-
 include/linux/mmzone.h                        |   6 +-
 include/trace/events/kvm.h                    |  19 +-
 mm/damon/vaddr.c                              |   2 -
 mm/mmu_notifier.c                             |  38 +-
 mm/rmap.c                                     |   9 +-
 mm/vmscan.c                                   | 148 +++++--
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/access_tracking_perf_test.c | 369 +++++++++++++++--
 .../selftests/kvm/include/lru_gen_util.h      |  55 +++
 .../testing/selftests/kvm/lib/lru_gen_util.c  | 391 ++++++++++++++++++
 virt/kvm/Kconfig                              |   3 +
 virt/kvm/kvm_main.c                           | 124 +++---
 20 files changed, 1451 insertions(+), 248 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/include/lru_gen_util.h
 create mode 100644 tools/testing/selftests/kvm/lib/lru_gen_util.c

base-commit: 3cc25d5adcfd2a2c33baa0b2a1979c2dbc9b990b