From patchwork Wed Jul 24 01:10:25 2024
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13740496
Date: Wed, 24 Jul 2024 01:10:25 +0000
Message-ID: <20240724011037.3671523-1-jthoughton@google.com>
Subject: [PATCH v6 00/11] mm: multi-gen LRU: Walk secondary MMU page tables while aging
From: James Houghton
To: Andrew Morton, Paolo Bonzini
Cc: Ankit Agrawal, Axel Rasmussen, Catalin Marinas, David Matlack,
 David Rientjes, James Houghton, James Morse, Jason Gunthorpe,
 Jonathan Corbet, Marc Zyngier, Oliver Upton, Raghavendra Rao Ananta,
 Ryan Roberts, Sean Christopherson, Shaoqin Huang, Suzuki K Poulose,
 Wei Xu, Will Deacon, Yu Zhao, Zenghui Yu,
 kvmarm@lists.linux.dev, kvm@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org

This patchset makes it possible for MGLRU to consult secondary MMUs
while doing aging, not just during eviction. This allows for more
accurate reclaim decisions, which is especially important for proactive
reclaim.

This series does the following:
 1. Improve locking for the existing test/clear_young notifiers for x86
    and arm64.
 2. Add a fast_only parameter into test_young() and clear_young(), and
    add helper functions for using the new parameter (e.g.
    mmu_notifier_clear_young_fast_only()). Non-trivially implement the
    fast-only test_young() and clear_young() for x86_64.
 3. Incorporate mmu_notifier_clear_young_fast_only() into MGLRU aging.
 4. Add an MGLRU mode (-l) to access_tracking_perf_test to show that
    aging is working properly.

Please note that mmu_notifier_test_young_fast_only() is added but not
used in this series. I am happy to remove it if that would be
appropriate.

The fast-only notifiers serve a particular purpose: for aging, we
neither want to delay other operations (e.g. unmapping for eviction)
nor do we want to be delayed by these other operations ourselves. By
default, the implementations of test_young() and clear_young() are
meant to be *accurate*, not fast. The fast-only notifiers will only
give age information that can be gathered fast.

The fast-only notifiers are non-trivially implemented for only x86_64
right now (as the KVM/x86 TDP MMU is the only secondary MMU that
supports lockless Accessed bit harvesting).

To make aging work for more than just x86, the fast-only clear_young()
notifier must be non-trivially implemented by those other architectures
and HAVE_KVM_MMU_NOTIFIER_YOUNG_FAST_ONLY needs to be set.

access_tracking_perf_test now has a mode (-p) to check performance of
MGLRU aging while the VM is faulting memory in. See the v4 cover
letter[2] for performance data collected with this test.

Previous versions of this series included logic in MGLRU and KVM to
support batching the updates to secondary page tables.
This version removes this logic, as it was complex and not necessary to
enable proactive reclaim. This optimization, as well as enabling aging
for arm64 and powerpc, can be done in a later series.

=== Previous Versions ===

Since v5[1]:
- Reworked test_clear_young_fast_only() into a new parameter for the
  existing notifiers (thanks Sean).
- Added mmu_notifier.has_fast_aging to tell mm if calling fast-only
  notifiers should be done.
- Added mm_has_fast_young_notifiers() to inform users if calling
  fast-only notifier helpers is worthwhile (for look-around to use).
- Changed MGLRU to invoke a single notifier instead of two when aging
  and doing look-around (thanks Yu).
- For KVM/x86, check indirect_shadow_pages > 0 instead of
  kvm_memslots_have_rmaps() when collecting age information
  (thanks Sean).
- For KVM/arm, some fixes from Oliver.
- Small fixes to access_tracking_perf_test.
- Added missing !MMU_NOTIFIER version of mmu_notifier_clear_young().

Since v4[2]:
- Removed Kconfig that controlled when aging was enabled. Aging will
  be done whenever the architecture supports it (thanks Yu).
- Added a new MMU notifier, test_clear_young_fast_only(), specifically
  for MGLRU to use.
- Add kvm_fast_{test_,}age_gfn, implemented by x86.
- Fix locking for clear_flush_young().
- Added KVM_MMU_NOTIFIER_YOUNG_LOCKLESS to clean up locking changes
  (thanks Sean).
- Fix WARN_ON and other cleanup for the arm64 locking changes
  (thanks Oliver).

Since v3[3]:
- Vastly simplified the series (thanks David). Removed mmu notifier
  batching logic entirely.
- Cleaned up how locking is done for mmu_notifier_test/clear_young
  (thanks David).
- Look-around is now only done when there are no secondary MMUs
  subscribed to MMU notifiers.
- CONFIG_LRU_GEN_WALKS_SECONDARY_MMU has been added.
- Fixed the lockless implementation of kvm_{test_,}age_gfn for x86
  (thanks David).
- Added MGLRU functional and performance tests to
  access_tracking_perf_test (thanks Axel).
- In v3, an mm would be completely ignored (for aging) if there was a
  secondary MMU but support for secondary MMU walking was missing. Now,
  missing secondary MMU walking support simply skips the notifier calls
  (except for eviction).
- Added a sanity check that range->lockless and range->on_lock are
  never both provided for the memslot walk.

For the changes since v2[4], see v3.

Based on latest kvm/next.

[1]: https://lore.kernel.org/linux-mm/20240611002145.2078921-1-jthoughton@google.com/
[2]: https://lore.kernel.org/linux-mm/20240529180510.2295118-1-jthoughton@google.com/
[3]: https://lore.kernel.org/linux-mm/20240401232946.1837665-1-jthoughton@google.com/
[4]: https://lore.kernel.org/kvmarm/20230526234435.662652-1-yuzhao@google.com/

James Houghton (11):
  KVM: Add lockless memslot walk to KVM
  KVM: x86: Relax locking for kvm_test_age_gfn and kvm_age_gfn
  KVM: arm64: Relax locking for kvm_test_age_gfn and kvm_age_gfn
  mm: Add missing mmu_notifier_clear_young for !MMU_NOTIFIER
  mm: Add fast_only bool to test_young and clear_young MMU notifiers
  mm: Add has_fast_aging to struct mmu_notifier
  KVM: Pass fast_only to kvm_{test_,}age_gfn
  KVM: x86: Optimize kvm_{test_,}age_gfn a little bit
  KVM: x86: Implement fast_only versions of kvm_{test_,}age_gfn
  mm: multi-gen LRU: Have secondary MMUs participate in aging
  KVM: selftests: Add multi-gen LRU aging to access_tracking_perf_test

 Documentation/admin-guide/mm/multigen_lru.rst |   6 +-
 arch/arm64/kvm/Kconfig                        |   1 +
 arch/arm64/kvm/hyp/pgtable.c                  |  15 +-
 arch/arm64/kvm/mmu.c                          |  30 +-
 arch/x86/include/asm/kvm_host.h               |   1 +
 arch/x86/kvm/Kconfig                          |   2 +
 arch/x86/kvm/mmu/mmu.c                        |  23 +-
 arch/x86/kvm/mmu/tdp_iter.h                   |  27 +-
 arch/x86/kvm/mmu/tdp_mmu.c                    |  67 ++-
 include/linux/kvm_host.h                      |   2 +
 include/linux/mmu_notifier.h                  |  67 ++-
 include/linux/mmzone.h                        |   6 +-
 include/trace/events/kvm.h                    |  19 +-
 mm/damon/vaddr.c                              |   2 -
 mm/mmu_notifier.c                             |  38 +-
 mm/rmap.c                                     |   9 +-
 mm/vmscan.c                                   | 148 +++++--
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/access_tracking_perf_test.c | 369 +++++++++++++++--
 .../selftests/kvm/include/lru_gen_util.h      |  55 +++
 .../testing/selftests/kvm/lib/lru_gen_util.c  | 391 ++++++++++++++++++
 virt/kvm/Kconfig                              |   7 +
 virt/kvm/kvm_main.c                           |  73 ++--
 23 files changed, 1194 insertions(+), 165 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/include/lru_gen_util.h
 create mode 100644 tools/testing/selftests/kvm/lib/lru_gen_util.c

base-commit: 332d2c1d713e232e163386c35a3ba0c1b90df83f
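
Illustrative note (not part of the series): the difference between the
accurate and the fast-only clear_young() behaviour described in the cover
letter can be sketched with a small user-space model. Everything below is
hypothetical: toy_pte, toy_mmu, lockless_ok, slow_path_entries and
toy_clear_young() are invented for illustration and are not kernel or KVM
APIs. The sketch only assumes what the cover letter states, namely that a
fast-only call reports age information that can be gathered quickly (here:
entries whose Accessed bit can be cleared without a lock) and leaves
everything else alone.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
#define TOY_ACCESSED 0x1u

struct toy_pte {
	atomic_uint bits;	/* bit 0 models the Accessed bit */
	bool lockless_ok;	/* assumption: entry can be aged without a lock */
};

struct toy_mmu {
	struct toy_pte *ptes;
	size_t nr_ptes;
	/* Counts entries that needed the slow path (stand-in for a lock). */
	unsigned long slow_path_entries;
};

/*
 * Clear the Accessed bit for entries in [start, end) and return true if
 * any visited entry was young. With fast_only set, entries that cannot be
 * handled locklessly are skipped, so the result may under-report age.
 */
bool toy_clear_young(struct toy_mmu *mmu, size_t start, size_t end,
		     bool fast_only)
{
	bool young = false;

	for (size_t i = start; i < end && i < mmu->nr_ptes; i++) {
		struct toy_pte *pte = &mmu->ptes[i];

		if (!pte->lockless_ok) {
			if (fast_only)
				continue;	/* not gatherable fast: skip */
			mmu->slow_path_entries++;	/* would take a lock here */
		}

		/* Atomically clear the Accessed bit and inspect the old value. */
		unsigned int old = atomic_fetch_and(&pte->bits, ~TOY_ACCESSED);
		if (old & TOY_ACCESSED)
			young = true;
	}

	return young;
}
```

In this model, a fast-only pass over a mix of entries clears and reports only
the lockless-capable ones, and a later accurate pass (fast_only == false)
picks up the remainder at the cost of the slow path. That mirrors the
trade-off stated above: aging should neither delay nor be delayed by other
operations, even if it means seeing less age information.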