From patchwork Wed Jul 24 01:10:26 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13740498
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id BE6B8C3DA64 for ; Wed, 24 Jul 2024 01:11:20 +0000 (UTC)
Date: Wed, 24 Jul 2024 01:10:26 +0000
In-Reply-To: <20240724011037.3671523-1-jthoughton@google.com>
Mime-Version: 1.0
References: <20240724011037.3671523-1-jthoughton@google.com>
X-Mailer: git-send-email 2.46.0.rc1.232.g9752f9e123-goog
Message-ID: <20240724011037.3671523-2-jthoughton@google.com>
Subject: [PATCH v6 01/11] KVM: Add lockless memslot walk to KVM
From: James Houghton
To: Andrew Morton, Paolo Bonzini
Cc: Ankit Agrawal, Axel Rasmussen, Catalin Marinas, David Matlack,
 David Rientjes, James Houghton, James Morse, Jason Gunthorpe,
 Jonathan Corbet, Marc Zyngier, Oliver Upton, Raghavendra Rao Ananta,
 Ryan Roberts, Sean Christopherson, Shaoqin Huang, Suzuki K Poulose,
 Wei Xu, Will Deacon, Yu Zhao, Zenghui Yu, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org
Sender:
 owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:
List-Subscribe:
List-Unsubscribe:

Give architectures the flexibility to synchronize as optimally as they
can instead of always taking the MMU lock for writing.

Architectures that do their own locking must select
CONFIG_KVM_MMU_NOTIFIER_YOUNG_LOCKLESS.

The immediate application is to allow architectures to implement the
test/clear_young MMU notifiers more cheaply.

Suggested-by: Yu Zhao
Signed-off-by: James Houghton
Reviewed-by: David Matlack
---
 include/linux/kvm_host.h |  1 +
 virt/kvm/Kconfig         |  3 +++
 virt/kvm/kvm_main.c      | 26 +++++++++++++++++++-------
 3 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 689e8be873a7..8cd80f969cff 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -266,6 +266,7 @@ struct kvm_gfn_range {
 	gfn_t end;
 	union kvm_mmu_notifier_arg arg;
 	bool may_block;
+	bool lockless;
 };
 bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range);
 bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range);
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index b14e14cdbfb9..632334861001 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -100,6 +100,9 @@ config KVM_GENERIC_MMU_NOTIFIER
        select MMU_NOTIFIER
        bool
 
+config KVM_MMU_NOTIFIER_YOUNG_LOCKLESS
+       bool
+
 config KVM_GENERIC_MEMORY_ATTRIBUTES
        depends on KVM_GENERIC_MMU_NOTIFIER
        bool
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d0788d0a72cc..33f8997a5c29 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -555,6 +555,7 @@ struct kvm_mmu_notifier_range {
 	on_lock_fn_t on_lock;
 	bool flush_on_ret;
 	bool may_block;
+	bool lockless;
 };
 
 /*
@@ -609,6 +610,10 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
 			 IS_KVM_NULL_FN(range->handler)))
 		return r;
 
+	/* on_lock will never be called for lockless walks */
+	if (WARN_ON_ONCE(range->lockless && !IS_KVM_NULL_FN(range->on_lock)))
+		return r;
+
 	idx = srcu_read_lock(&kvm->srcu);
 
 	for (i = 0; i < kvm_arch_nr_memslot_as_ids(kvm); i++) {
@@ -640,15 +645,18 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
 			gfn_range.start = hva_to_gfn_memslot(hva_start, slot);
 			gfn_range.end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, slot);
 			gfn_range.slot = slot;
+			gfn_range.lockless = range->lockless;
 
 			if (!r.found_memslot) {
 				r.found_memslot = true;
-				KVM_MMU_LOCK(kvm);
-				if (!IS_KVM_NULL_FN(range->on_lock))
-					range->on_lock(kvm);
-
-				if (IS_KVM_NULL_FN(range->handler))
-					goto mmu_unlock;
+				if (!range->lockless) {
+					KVM_MMU_LOCK(kvm);
+					if (!IS_KVM_NULL_FN(range->on_lock))
+						range->on_lock(kvm);
+
+					if (IS_KVM_NULL_FN(range->handler))
+						goto mmu_unlock;
+				}
 			}
 			r.ret |= range->handler(kvm, &gfn_range);
 		}
@@ -658,7 +666,7 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
 		kvm_flush_remote_tlbs(kvm);
 
 mmu_unlock:
-	if (r.found_memslot)
+	if (r.found_memslot && !range->lockless)
 		KVM_MMU_UNLOCK(kvm);
 
 	srcu_read_unlock(&kvm->srcu, idx);
@@ -679,6 +687,8 @@ static __always_inline int kvm_handle_hva_range(struct mmu_notifier *mn,
 		.on_lock	= (void *)kvm_null_fn,
 		.flush_on_ret	= true,
 		.may_block	= false,
+		.lockless	=
+			IS_ENABLED(CONFIG_KVM_MMU_NOTIFIER_YOUNG_LOCKLESS),
 	};
 
 	return __kvm_handle_hva_range(kvm, &range).ret;
@@ -697,6 +707,8 @@ static __always_inline int kvm_handle_hva_range_no_flush(struct mmu_notifier *mn
 		.on_lock	= (void *)kvm_null_fn,
 		.flush_on_ret	= false,
 		.may_block	= false,
+		.lockless	=
+			IS_ENABLED(CONFIG_KVM_MMU_NOTIFIER_YOUNG_LOCKLESS),
 	};
 
 	return __kvm_handle_hva_range(kvm, &range).ret;

From patchwork Wed Jul 24 01:10:27 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13740497
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from
 kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D3F99C3DA49 for ; Wed, 24 Jul 2024 01:11:18 +0000 (UTC)
Date: Wed, 24 Jul 2024 01:10:27 +0000
In-Reply-To: <20240724011037.3671523-1-jthoughton@google.com>
Mime-Version: 1.0
References: <20240724011037.3671523-1-jthoughton@google.com>
X-Mailer: git-send-email 2.46.0.rc1.232.g9752f9e123-goog
Message-ID: <20240724011037.3671523-3-jthoughton@google.com>
Subject: [PATCH v6 02/11] KVM: x86: Relax locking for kvm_test_age_gfn and kvm_age_gfn
From: James Houghton
To: Andrew Morton, Paolo Bonzini
Cc: Ankit Agrawal, Axel Rasmussen, Catalin Marinas, David Matlack,
 David Rientjes, James Houghton, James Morse, Jason Gunthorpe,
 Jonathan Corbet, Marc Zyngier, Oliver Upton, Raghavendra Rao Ananta,
 Ryan Roberts, Sean Christopherson, Shaoqin Huang, Suzuki K Poulose,
 Wei Xu, Will Deacon, Yu Zhao, Zenghui Yu, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:
List-Subscribe:
List-Unsubscribe:

Walk the TDP MMU in an RCU read-side critical section.
This requires a way to do RCU-safe walking of the tdp_mmu_roots; do this
with a new macro. The PTE modifications are now done atomically, and
kvm_tdp_mmu_spte_need_atomic_write() has been updated to account for the
fact that kvm_age_gfn can now locklessly update the Accessed bit and the
R/X bits.

If the cmpxchg for marking the spte for access tracking fails, we simply
retry if the spte is still a leaf PTE. If it isn't, we return false
to continue the walk.

Harvesting age information from the shadow MMU is still done while
holding the MMU write lock.

Suggested-by: Yu Zhao
Signed-off-by: James Houghton
Reviewed-by: David Matlack
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/Kconfig            |  1 +
 arch/x86/kvm/mmu/mmu.c          | 10 ++++-
 arch/x86/kvm/mmu/tdp_iter.h     | 27 +++++++------
 arch/x86/kvm/mmu/tdp_mmu.c      | 67 +++++++++++++++++++++++++--------
 5 files changed, 77 insertions(+), 29 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 950a03e0181e..096988262005 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1456,6 +1456,7 @@ struct kvm_arch {
 	 * tdp_mmu_page set.
 	 *
 	 * For reads, this list is protected by:
+	 *   RCU alone or
 	 *   the MMU lock in read mode + RCU or
 	 *   the MMU lock in write mode
 	 *
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 4287a8071a3a..6ac43074c5e9 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -23,6 +23,7 @@ config KVM
 	depends on X86_LOCAL_APIC
 	select KVM_COMMON
 	select KVM_GENERIC_MMU_NOTIFIER
+	select KVM_MMU_NOTIFIER_YOUNG_LOCKLESS
 	select HAVE_KVM_IRQCHIP
 	select HAVE_KVM_PFNCACHE
 	select HAVE_KVM_DIRTY_RING_TSO
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 901be9e420a4..7b93ce8f0680 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1633,8 +1633,11 @@ bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 {
 	bool young = false;
 
-	if (kvm_memslots_have_rmaps(kvm))
+	if (kvm_memslots_have_rmaps(kvm)) {
+		write_lock(&kvm->mmu_lock);
 		young = kvm_handle_gfn_range(kvm, range, kvm_age_rmap);
+		write_unlock(&kvm->mmu_lock);
+	}
 
 	if (tdp_mmu_enabled)
 		young |= kvm_tdp_mmu_age_gfn_range(kvm, range);
@@ -1646,8 +1649,11 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 {
 	bool young = false;
 
-	if (kvm_memslots_have_rmaps(kvm))
+	if (kvm_memslots_have_rmaps(kvm)) {
+		write_lock(&kvm->mmu_lock);
 		young = kvm_handle_gfn_range(kvm, range, kvm_test_age_rmap);
+		write_unlock(&kvm->mmu_lock);
+	}
 
 	if (tdp_mmu_enabled)
 		young |= kvm_tdp_mmu_test_age_gfn(kvm, range);
diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h
index 2880fd392e0c..510936a8455a 100644
--- a/arch/x86/kvm/mmu/tdp_iter.h
+++ b/arch/x86/kvm/mmu/tdp_iter.h
@@ -25,6 +25,13 @@ static inline u64 kvm_tdp_mmu_write_spte_atomic(tdp_ptep_t sptep, u64 new_spte)
 	return xchg(rcu_dereference(sptep), new_spte);
 }
 
+static inline u64 tdp_mmu_clear_spte_bits_atomic(tdp_ptep_t sptep, u64 mask)
+{
+	atomic64_t *sptep_atomic = (atomic64_t *)rcu_dereference(sptep);
+
+	return (u64)atomic64_fetch_and(~mask, sptep_atomic);
+}
+
 static inline void __kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 new_spte)
 {
 	KVM_MMU_WARN_ON(is_ept_ve_possible(new_spte));
@@ -32,10 +39,11 @@ static inline void __kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 new_spte)
 }
 
 /*
- * SPTEs must be modified atomically if they are shadow-present, leaf
- * SPTEs, and have volatile bits, i.e. has bits that can be set outside
- * of mmu_lock. The Writable bit can be set by KVM's fast page fault
- * handler, and Accessed and Dirty bits can be set by the CPU.
+ * SPTEs must be modified atomically if they have bits that can be set outside
+ * of the mmu_lock. This can happen for any shadow-present leaf SPTEs, as the
+ * Writable bit can be set by KVM's fast page fault handler, the Accessed and
+ * Dirty bits can be set by the CPU, and the Accessed and R/X bits can be
+ * cleared by age_gfn_range.
  *
  * Note, non-leaf SPTEs do have Accessed bits and those bits are
  * technically volatile, but KVM doesn't consume the Accessed bit of
@@ -46,8 +54,7 @@ static inline void __kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 new_spte)
 static inline bool kvm_tdp_mmu_spte_need_atomic_write(u64 old_spte, int level)
 {
 	return is_shadow_present_pte(old_spte) &&
-	       is_last_spte(old_spte, level) &&
-	       spte_has_volatile_bits(old_spte);
+	       is_last_spte(old_spte, level);
 }
 
 static inline u64 kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 old_spte,
@@ -63,12 +70,8 @@ static inline u64 kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 old_spte,
 static inline u64 tdp_mmu_clear_spte_bits(tdp_ptep_t sptep, u64 old_spte,
 					  u64 mask, int level)
 {
-	atomic64_t *sptep_atomic;
-
-	if (kvm_tdp_mmu_spte_need_atomic_write(old_spte, level)) {
-		sptep_atomic = (atomic64_t *)rcu_dereference(sptep);
-		return (u64)atomic64_fetch_and(~mask, sptep_atomic);
-	}
+	if (kvm_tdp_mmu_spte_need_atomic_write(old_spte, level))
+		return tdp_mmu_clear_spte_bits_atomic(sptep, mask);
 
 	__kvm_tdp_mmu_write_spte(sptep, old_spte & ~mask);
 	return old_spte;
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index c7dc49ee7388..3f13b2db53de 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -29,6 +29,11 @@ static __always_inline bool kvm_lockdep_assert_mmu_lock_held(struct kvm *kvm,
 
 	return true;
 }
+static __always_inline bool kvm_lockdep_assert_rcu_read_lock_held(void)
+{
+	WARN_ON_ONCE(!rcu_read_lock_held());
+	return true;
+}
 
 void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm)
 {
@@ -178,6 +183,15 @@ static struct kvm_mmu_page *tdp_mmu_next_root(struct kvm *kvm,
 		     ((_only_valid) && (_root)->role.invalid))) {		\
 		} else
 
+/*
+ * Iterate over all TDP MMU roots in an RCU read-side critical section.
+ */
+#define for_each_tdp_mmu_root_rcu(_kvm, _root, _as_id)				\
+	list_for_each_entry_rcu(_root, &_kvm->arch.tdp_mmu_roots, link)	\
+		if (kvm_lockdep_assert_rcu_read_lock_held() &&			\
+		    (_as_id >= 0 && kvm_mmu_page_as_id(_root) != _as_id)) {	\
+		} else
+
 #define for_each_tdp_mmu_root(_kvm, _root, _as_id)			\
 	__for_each_tdp_mmu_root(_kvm, _root, _as_id, false)
 
@@ -1224,6 +1238,27 @@ static __always_inline bool kvm_tdp_mmu_handle_gfn(struct kvm *kvm,
 	return ret;
 }
 
+static __always_inline bool kvm_tdp_mmu_handle_gfn_lockless(
+		struct kvm *kvm,
+		struct kvm_gfn_range *range,
+		tdp_handler_t handler)
+{
+	struct kvm_mmu_page *root;
+	struct tdp_iter iter;
+	bool ret = false;
+
+	rcu_read_lock();
+
+	for_each_tdp_mmu_root_rcu(kvm, root, range->slot->as_id) {
+		tdp_root_for_each_leaf_pte(iter, root, range->start, range->end)
+			ret |= handler(kvm, &iter, range);
+	}
+
+	rcu_read_unlock();
+
+	return ret;
+}
+
 /*
  * Mark the SPTEs range of GFNs [start, end) unaccessed and return non-zero
  * if any of the GFNs in the range have been accessed.
@@ -1237,28 +1272,30 @@ static bool age_gfn_range(struct kvm *kvm, struct tdp_iter *iter,
 {
 	u64 new_spte;
 
+retry:
 	/* If we have a non-accessed entry we don't need to change the pte. */
 	if (!is_accessed_spte(iter->old_spte))
 		return false;
 
 	if (spte_ad_enabled(iter->old_spte)) {
-		iter->old_spte = tdp_mmu_clear_spte_bits(iter->sptep,
-							 iter->old_spte,
-							 shadow_accessed_mask,
-							 iter->level);
+		iter->old_spte = tdp_mmu_clear_spte_bits_atomic(iter->sptep,
+						shadow_accessed_mask);
 		new_spte = iter->old_spte & ~shadow_accessed_mask;
 	} else {
-		/*
-		 * Capture the dirty status of the page, so that it doesn't get
-		 * lost when the SPTE is marked for access tracking.
-		 */
+		new_spte = mark_spte_for_access_track(iter->old_spte);
+		if (__tdp_mmu_set_spte_atomic(iter, new_spte)) {
+			/*
+			 * The cmpxchg failed. If the spte is still a
+			 * last-level spte, we can safely retry.
+			 */
+			if (is_shadow_present_pte(iter->old_spte) &&
+			    is_last_spte(iter->old_spte, iter->level))
+				goto retry;
+			/* Otherwise, continue walking. */
+			return false;
+		}
 		if (is_writable_pte(iter->old_spte))
 			kvm_set_pfn_dirty(spte_to_pfn(iter->old_spte));
-
-		new_spte = mark_spte_for_access_track(iter->old_spte);
-		iter->old_spte = kvm_tdp_mmu_write_spte(iter->sptep,
-							iter->old_spte, new_spte,
-							iter->level);
 	}
 
 	trace_kvm_tdp_mmu_spte_changed(iter->as_id, iter->gfn, iter->level,
@@ -1268,7 +1305,7 @@ static bool age_gfn_range(struct kvm *kvm, struct tdp_iter *iter,
 
 bool kvm_tdp_mmu_age_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
 {
-	return kvm_tdp_mmu_handle_gfn(kvm, range, age_gfn_range);
+	return kvm_tdp_mmu_handle_gfn_lockless(kvm, range, age_gfn_range);
 }
 
 static bool test_age_gfn(struct kvm *kvm, struct tdp_iter *iter,
@@ -1279,7 +1316,7 @@ static bool test_age_gfn(struct kvm *kvm, struct tdp_iter *iter,
 
 bool kvm_tdp_mmu_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 {
-	return kvm_tdp_mmu_handle_gfn(kvm, range, test_age_gfn);
+	return kvm_tdp_mmu_handle_gfn_lockless(kvm, range, test_age_gfn);
 }
 
 /*

From patchwork Wed Jul 24 01:10:28 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13740499
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 01340C3DA49 for ; Wed, 24 Jul 2024 01:11:22 +0000 (UTC)
Date: Wed, 24 Jul 2024 01:10:28 +0000
In-Reply-To: <20240724011037.3671523-1-jthoughton@google.com>
Mime-Version: 1.0
References: <20240724011037.3671523-1-jthoughton@google.com>
X-Mailer: git-send-email 2.46.0.rc1.232.g9752f9e123-goog
Message-ID: <20240724011037.3671523-4-jthoughton@google.com>
Subject: [PATCH v6 03/11] KVM: arm64: Relax locking for kvm_test_age_gfn and kvm_age_gfn
From: James Houghton
To: Andrew Morton, Paolo Bonzini
Cc: Ankit Agrawal, Axel Rasmussen, Catalin Marinas, David Matlack,
 David Rientjes, James Houghton, James Morse, Jason Gunthorpe,
 Jonathan Corbet, Marc Zyngier, Oliver Upton, Raghavendra Rao Ananta,
 Ryan Roberts, Sean Christopherson, Shaoqin Huang, Suzuki K Poulose,
 Wei Xu, Will Deacon, Yu Zhao, Zenghui Yu, kvmarm@lists.linux.dev,
kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: D34F614000E X-Stat-Signature: 4knu43tkydo3qut173cu4k9r64wec1op X-Rspam-User: X-HE-Tag: 1721783476-902587 X-HE-Meta: U2FsdGVkX19uof8qD5E4VH+YWkGtiLNdpihg1DhAc1Ug2Z0CJQPQeNtxxp4dbHKPMIhE+blxdsVDxomtWNH/SHbg2sDXldhbpG2YASF9XO/SYzTiJnxNM0hgYnZiZ80Q4A9wrYpcrGHRIDXrZTk5xFWT2YkBMdYYmlZhp69U6jEP6/KN0gfjzEn5PJuE9nUn6eb2/g6C7YGZXKQS7V8sPIAkoJ5ZPq7TA7NPi4J+kwt33pGtqHWaTPAEdt7VSg1bU2IKKteV0GqTX5H1/QKkU397vgI4uXIS8V0AWAu4rrpz8zCh9zwkq3mGrms7ErUIaMJbh8icB3SBe43EP/gkGDylp65+fk5CZCZGAZEDcBRwa/O0ljAUD39NryXVoyQU7oY7zaq2WX5efO4gazt05W4TpSh1cikJmu6V3WxmXR924SXMwpqr06cY1rfVa7Co5RPNwJDzy6x5Jg839Wn76yJAmW4tDKaUyjjHoSJLxYymrBDWbGhYPV8mnzGCUi7k9a5yEKgEdNfngS6Ddmi50/by8jCsojkrGHLO93SUhcD8agLkeC2vuHlBvpUGVPSDscG+ztsvjc1ajVawIHLB0aLQOQSjDyb5NYBT2ImY+ano3YJdM412rX8ciSXptTyyyGgEHouACZQnTBgG39acmXZK6ZlkvGwkcDeowpsBujJiZchSX5qJFaQwd1q2ndyxhs/ezSMpzaJDpM1ouWKMklbCOf6ZvnEgLB7kuoMcNpogsodJ0RIUV6xDYVLi460cjii7uAUxtGsVPx8DzQDKLEjHEdTZKJV9lseSPM9vymHa350ATD/A1vg2pb8WMuykh3il4LASDuMRQgBJ3GE9pTJ7tVP08ttF2tbYblTul6IY3vaGYsR1sqlNyjTvi0s6bvyk2XZqtIgSLYWkJqcD7tTpBPYfEUfz9F2H5nXOlH+fcAi2vZVYFrnHSk2jz0CjMu0i51F3SD/EhC6lH9H fW7x1zM1 lErPLBv2U+aNT8q95xh7a7Tig7quDFd+pQOCBmmznJv+tZA8mdtsUpqHV9Fi84u1tgzOsfajfg3LIaMJcTOuAz3csn3lLFccUVOL8qYHAW0+DfDmjHFjUWIeeR6k4c/28T47WJram/ygREhqNjvlTfc/f9nZ/YZpV3iQ1DMOJ03Q1fs52jYlO0P8Gp9gAVDHS1k5wq2u8k7k+69cbHKUWD5Feyn3iVol4I1zhwwv9xCg7oKWzwGIuj/B3xrypJ52xIh0hIY+8RE90hqhPwp0EyI9exyWoKsph8a26AYrFZC1Ohm5jXMZqwedtzrdXgeitKfvlf6rx8ahIMbkJYxqoqwaoyxRD6JHQeKjUpItA3E/GA4tUOXB18BmJu+aMTx77q8Y7/A9xnsTenUt7L97375vqnH2S8qB6HwYMs9Tgu9jeRlqLdst2mHYLQmM8M1U2neYQ1vdHiFQX1UAvTuiNsgamhron+pNC16mADFT8DME96tbT27zB84Y0uijdHmxy21Uq6RU0EX2LueASiuoFeL73mzRPY4jaVbg6uQ/ROfuGyEDKmJBy1MhuKleJ++wum1tK0FHUs6If0+J+hwgUWXWpT8vSQ24ceORm X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org 

Replace the MMU write locks (taken in the memslot iteration loop) with
read locks.

Grabbing the read lock instead of the write lock is safe because the
only requirement we have is that the stage-2 page tables do not get
deallocated while we are walking them. The stage2_age_walker() callback
is safe to race with itself; update the comment to reflect the
synchronization change.

Signed-off-by: James Houghton
---
 arch/arm64/kvm/Kconfig       |  1 +
 arch/arm64/kvm/hyp/pgtable.c | 15 +++++++++------
 arch/arm64/kvm/mmu.c         | 30 ++++++++++++++++++++++--------
 3 files changed, 32 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 58f09370d17e..7a1af8141c0e 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -22,6 +22,7 @@ menuconfig KVM
 	select KVM_COMMON
 	select KVM_GENERIC_HARDWARE_ENABLING
 	select KVM_GENERIC_MMU_NOTIFIER
+	select KVM_MMU_NOTIFIER_YOUNG_LOCKLESS
 	select HAVE_KVM_CPU_RELAX_INTERCEPT
 	select KVM_MMIO
 	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 9e2bbee77491..a24a2a857456 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -1319,10 +1319,10 @@ static int stage2_age_walker(const struct kvm_pgtable_visit_ctx *ctx,
 	data->young = true;
 
 	/*
-	 * stage2_age_walker() is always called while holding the MMU lock for
-	 * write, so this will always succeed. Nonetheless, this deliberately
-	 * follows the race detection pattern of the other stage-2 walkers in
-	 * case the locking mechanics of the MMU notifiers is ever changed.
+	 * This walk is not exclusive; the PTE is permitted to change from
+	 * under us. If there is a race to update this PTE, then the GFN is
+	 * most likely young, so failing to clear the AF is likely to be
+	 * inconsequential.
 	 */
 	if (data->mkold && !stage2_try_set_pte(ctx, new))
 		return -EAGAIN;
@@ -1345,10 +1345,13 @@ bool kvm_pgtable_stage2_test_clear_young(struct kvm_pgtable *pgt, u64 addr,
 	struct kvm_pgtable_walker walker = {
 		.cb		= stage2_age_walker,
 		.arg		= &data,
-		.flags		= KVM_PGTABLE_WALK_LEAF,
+		.flags		= KVM_PGTABLE_WALK_LEAF |
+				  KVM_PGTABLE_WALK_SHARED,
 	};
+	int r;
 
-	WARN_ON(kvm_pgtable_walk(pgt, addr, size, &walker));
+	r = kvm_pgtable_walk(pgt, addr, size, &walker);
+	WARN_ON_ONCE(r && r != -EAGAIN);
 	return data.young;
 }
 
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 6981b1bc0946..e37765f6f2a1 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1912,29 +1912,43 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
 bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 {
 	u64 size = (range->end - range->start) << PAGE_SHIFT;
+	bool young = false;
+
+	read_lock(&kvm->mmu_lock);
 
 	if (!kvm->arch.mmu.pgt)
-		return false;
+		goto out;
 
-	return kvm_pgtable_stage2_test_clear_young(kvm->arch.mmu.pgt,
-						   range->start << PAGE_SHIFT,
-						   size, true);
+	young = kvm_pgtable_stage2_test_clear_young(kvm->arch.mmu.pgt,
+						    range->start << PAGE_SHIFT,
+						    size, true);
 	/*
 	 * TODO: Handle nested_mmu structures here using the reverse mapping in
 	 * a later version of patch series.
 	 */
+
+out:
+	read_unlock(&kvm->mmu_lock);
+	return young;
 }
 
 bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 {
 	u64 size = (range->end - range->start) << PAGE_SHIFT;
+	bool young = false;
+
+	read_lock(&kvm->mmu_lock);
 
 	if (!kvm->arch.mmu.pgt)
-		return false;
+		goto out;
 
-	return kvm_pgtable_stage2_test_clear_young(kvm->arch.mmu.pgt,
-						   range->start << PAGE_SHIFT,
-						   size, false);
+	young = kvm_pgtable_stage2_test_clear_young(kvm->arch.mmu.pgt,
+						    range->start << PAGE_SHIFT,
+						    size, false);
+
+out:
+	read_unlock(&kvm->mmu_lock);
+	return young;
 }
 
 phys_addr_t kvm_mmu_get_httbr(void)

From patchwork Wed Jul 24 01:10:29 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13740500
Date: Wed, 24 Jul 2024 01:10:29 +0000
In-Reply-To: <20240724011037.3671523-1-jthoughton@google.com>
References: <20240724011037.3671523-1-jthoughton@google.com>
Message-ID: <20240724011037.3671523-5-jthoughton@google.com>
Subject: [PATCH v6 04/11] mm: Add missing mmu_notifier_clear_young for !MMU_NOTIFIER
From: James Houghton
To: Andrew Morton, Paolo Bonzini
Cc: Ankit Agrawal, Axel Rasmussen, Catalin Marinas, David Matlack,
 David Rientjes, James Houghton, James Morse, Jason Gunthorpe,
 Jonathan Corbet, Marc Zyngier, Oliver Upton, Raghavendra Rao Ananta,
 Ryan Roberts, Sean Christopherson, Shaoqin Huang, Suzuki K Poulose,
 Wei Xu, Will Deacon, Yu Zhao, Zenghui Yu, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org

Remove the now unnecessary ifdef in mm/damon/vaddr.c as well.

Signed-off-by: James Houghton
Acked-by: David Hildenbrand
---
 include/linux/mmu_notifier.h | 7 +++++++
 mm/damon/vaddr.c             | 2 --
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index d39ebb10caeb..e2dd57ca368b 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -606,6 +606,13 @@ static inline int mmu_notifier_clear_flush_young(struct mm_struct *mm,
 	return 0;
 }
 
+static inline int mmu_notifier_clear_young(struct mm_struct *mm,
+					   unsigned long start,
+					   unsigned long end)
+{
+	return 0;
+}
+
 static inline int mmu_notifier_test_young(struct mm_struct *mm,
 					  unsigned long address)
 {
diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
index 381559e4a1fa..a453d77565e6 100644
--- a/mm/damon/vaddr.c
+++ b/mm/damon/vaddr.c
@@ -351,11 +351,9 @@ static void damon_hugetlb_mkold(pte_t *pte, struct mm_struct *mm,
 		set_huge_pte_at(mm, addr, pte, entry, psize);
 	}
 
-#ifdef CONFIG_MMU_NOTIFIER
 	if (mmu_notifier_clear_young(mm, addr,
 				     addr + huge_page_size(hstate_vma(vma))))
 		referenced = true;
-#endif /* CONFIG_MMU_NOTIFIER */
 
 	if (referenced)
 		folio_set_young(folio);

From patchwork Wed Jul 24 01:10:30 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13740501
Date: Wed, 24 Jul 2024 01:10:30 +0000
In-Reply-To: <20240724011037.3671523-1-jthoughton@google.com>
References: <20240724011037.3671523-1-jthoughton@google.com>
Message-ID: <20240724011037.3671523-6-jthoughton@google.com>
Subject: [PATCH v6 05/11] mm: Add fast_only bool to test_young and clear_young MMU notifiers
From: James Houghton
To: Andrew Morton, Paolo Bonzini
Cc: Ankit Agrawal, Axel Rasmussen, Catalin Marinas, David Matlack,
 David Rientjes, James Houghton, James Morse, Jason Gunthorpe,
 Jonathan Corbet, Marc Zyngier, Oliver Upton, Raghavendra Rao Ananta,
 Ryan Roberts, Sean Christopherson, Shaoqin Huang, Suzuki K Poulose,
 Wei Xu, Will Deacon, Yu Zhao, Zenghui Yu, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org

For implementers, the fast_only bool indicates that the age information
needs to be harvested such that we do not slow down other MMU operations,
and ideally that we are not ourselves slowed down by other MMU
operations. Usually this means that the implementation should be
lockless.

Also add mmu_notifier_test_young_fast_only() and
mmu_notifier_clear_young_fast_only() helpers to set fast_only for these
notifiers.
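
As an illustration only (not part of this patch), here is a minimal
userspace sketch of how an implementer might honor fast_only. Everything
in it is hypothetical: struct toy_page, toy_test_young(),
toy_clear_young() and the toy lock are stand-ins invented for the
example, not kernel APIs. It also assumes that a fast_only request
which cannot be served without the lock simply reports "not young";
this patch does not specify that behavior.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for one secondary-MMU mapping. */
struct toy_page {
	atomic_bool accessed;	/* models the accessed/young bit */
	bool needs_lock;	/* models a mapping that cannot be aged locklessly */
};

/* Hypothetical toy spinlock standing in for the implementer's MMU lock. */
static atomic_flag toy_mmu_lock = ATOMIC_FLAG_INIT;

static void toy_lock(void)
{
	while (atomic_flag_test_and_set(&toy_mmu_lock))
		;	/* spin until the lock is free */
}

static void toy_unlock(void)
{
	atomic_flag_clear(&toy_mmu_lock);
}

/* Report whether the page is young without changing it. */
int toy_test_young(struct toy_page *p, bool fast_only)
{
	int young;

	if (!p->needs_lock)
		return atomic_load(&p->accessed);	/* lockless path */

	if (fast_only)
		return 0;	/* assumption: give up rather than take the lock */

	toy_lock();
	young = atomic_load(&p->accessed);
	toy_unlock();
	return young;
}

/* Report whether the page was young and mark it old. */
int toy_clear_young(struct toy_page *p, bool fast_only)
{
	int young;

	if (!p->needs_lock)
		/* Atomic exchange clears the bit and returns the old value. */
		return atomic_exchange(&p->accessed, false);

	if (fast_only)
		return 0;	/* assumption: leave the bit untouched */

	toy_lock();
	young = atomic_exchange(&p->accessed, false);
	toy_unlock();
	return young;
}
```

In this sketch a caller that passes fast_only == true never spins on the
toy lock, so it cannot be delayed by, or delay, a lock holder. The cost
is that pages needing the lock look old to that caller even when they
were recently accessed.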

Signed-off-by: James Houghton
---
 include/linux/mmu_notifier.h | 46 +++++++++++++++++++++++++++++++-----
 include/trace/events/kvm.h   | 19 +++++++++------
 mm/mmu_notifier.c            | 12 ++++++----
 virt/kvm/kvm_main.c          | 12 ++++++----
 4 files changed, 67 insertions(+), 22 deletions(-)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index e2dd57ca368b..45c5995ebd84 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -110,7 +110,8 @@ struct mmu_notifier_ops {
 	int (*clear_young)(struct mmu_notifier *subscription,
 			   struct mm_struct *mm,
 			   unsigned long start,
-			   unsigned long end);
+			   unsigned long end,
+			   bool fast_only);
 
 	/*
 	 * test_young is called to check the young/accessed bitflag in
@@ -120,7 +121,8 @@ struct mmu_notifier_ops {
 	 */
 	int (*test_young)(struct mmu_notifier *subscription,
 			  struct mm_struct *mm,
-			  unsigned long address);
+			  unsigned long address,
+			  bool fast_only);
 
 	/*
 	 * invalidate_range_start() and invalidate_range_end() must be
@@ -380,9 +382,11 @@ extern int __mmu_notifier_clear_flush_young(struct mm_struct *mm,
 					    unsigned long end);
 extern int __mmu_notifier_clear_young(struct mm_struct *mm,
 				      unsigned long start,
-				      unsigned long end);
+				      unsigned long end,
+				      bool fast_only);
 extern int __mmu_notifier_test_young(struct mm_struct *mm,
-				     unsigned long address);
+				     unsigned long address,
+				     bool fast_only);
 extern int __mmu_notifier_invalidate_range_start(struct mmu_notifier_range *r);
 extern void __mmu_notifier_invalidate_range_end(struct mmu_notifier_range *r);
 extern void __mmu_notifier_arch_invalidate_secondary_tlbs(struct mm_struct *mm,
@@ -416,7 +420,16 @@ static inline int mmu_notifier_clear_young(struct mm_struct *mm,
 					   unsigned long end)
 {
 	if (mm_has_notifiers(mm))
-		return __mmu_notifier_clear_young(mm, start, end);
+		return __mmu_notifier_clear_young(mm, start, end, false);
+	return 0;
+}
+
+static inline int mmu_notifier_clear_young_fast_only(struct mm_struct *mm,
+						     unsigned long start,
+						     unsigned long end)
+{
+	if (mm_has_notifiers(mm))
+		return __mmu_notifier_clear_young(mm, start, end, true);
 	return 0;
 }
 
@@ -424,7 +437,15 @@ static inline int mmu_notifier_test_young(struct mm_struct *mm,
 					  unsigned long address)
 {
 	if (mm_has_notifiers(mm))
-		return __mmu_notifier_test_young(mm, address);
+		return __mmu_notifier_test_young(mm, address, false);
+	return 0;
+}
+
+static inline int mmu_notifier_test_young_fast_only(struct mm_struct *mm,
+						    unsigned long address)
+{
+	if (mm_has_notifiers(mm))
+		return __mmu_notifier_test_young(mm, address, true);
 	return 0;
 }
 
@@ -613,12 +634,25 @@ static inline int mmu_notifier_clear_young(struct mm_struct *mm,
 	return 0;
 }
 
+static inline int mmu_notifier_clear_young_fast_only(struct mm_struct *mm,
+						     unsigned long start,
+						     unsigned long end)
+{
+	return 0;
+}
+
 static inline int mmu_notifier_test_young(struct mm_struct *mm,
 					  unsigned long address)
 {
 	return 0;
 }
 
+static inline int mmu_notifier_test_young_fast_only(struct mm_struct *mm,
+						    unsigned long address)
+{
+	return 0;
+}
+
 static inline void
 mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
 {
diff --git a/include/trace/events/kvm.h b/include/trace/events/kvm.h
index 74e40d5d4af4..6d9485cf3e51 100644
--- a/include/trace/events/kvm.h
+++ b/include/trace/events/kvm.h
@@ -457,36 +457,41 @@ TRACE_EVENT(kvm_unmap_hva_range,
 );
 
 TRACE_EVENT(kvm_age_hva,
-	TP_PROTO(unsigned long start, unsigned long end),
-	TP_ARGS(start, end),
+	TP_PROTO(unsigned long start, unsigned long end, bool fast_only),
+	TP_ARGS(start, end, fast_only),
 
 	TP_STRUCT__entry(
 		__field(	unsigned long,	start		)
 		__field(	unsigned long,	end		)
+		__field(	bool,		fast_only	)
 	),
 
 	TP_fast_assign(
 		__entry->start		= start;
 		__entry->end		= end;
+		__entry->fast_only	= fast_only;
 	),
 
-	TP_printk("mmu notifier age hva: %#016lx -- %#016lx",
-		  __entry->start, __entry->end)
+	TP_printk("mmu notifier age hva: %#016lx -- %#016lx fast_only: %d",
+		  __entry->start, __entry->end, __entry->fast_only)
 );
 
 TRACE_EVENT(kvm_test_age_hva,
-	TP_PROTO(unsigned long hva),
-	TP_ARGS(hva),
+	TP_PROTO(unsigned long hva, bool fast_only),
+	TP_ARGS(hva, fast_only),
 
 	TP_STRUCT__entry(
 		__field(	unsigned long,	hva		)
+		__field(	bool,		fast_only	)
 	),
 
 	TP_fast_assign(
 		__entry->hva		= hva;
+		__entry->fast_only	= fast_only;
 	),
 
-	TP_printk("mmu notifier test age hva: %#016lx", __entry->hva)
+	TP_printk("mmu notifier test age hva: %#016lx fast_only: %d",
+		  __entry->hva, __entry->fast_only)
 );
 
 #endif /* _TRACE_KVM_MAIN_H */
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 8982e6139d07..f9a0ca6ffe65 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -384,7 +384,8 @@ int __mmu_notifier_clear_flush_young(struct mm_struct *mm,
 
 int __mmu_notifier_clear_young(struct mm_struct *mm,
 			       unsigned long start,
-			       unsigned long end)
+			       unsigned long end,
+			       bool fast_only)
 {
 	struct mmu_notifier *subscription;
 	int young = 0, id;
@@ -395,7 +396,8 @@ int __mmu_notifier_clear_young(struct mm_struct *mm,
 				 srcu_read_lock_held(&srcu)) {
 		if (subscription->ops->clear_young)
 			young |= subscription->ops->clear_young(subscription,
-								mm, start, end);
+								mm, start, end,
+								fast_only);
 	}
 	srcu_read_unlock(&srcu, id);
 
@@ -403,7 +405,8 @@ int __mmu_notifier_clear_young(struct mm_struct *mm,
 }
 
 int __mmu_notifier_test_young(struct mm_struct *mm,
-			      unsigned long address)
+			      unsigned long address,
+			      bool fast_only)
 {
 	struct mmu_notifier *subscription;
 	int young = 0, id;
@@ -414,7 +417,8 @@ int __mmu_notifier_test_young(struct mm_struct *mm,
 				 srcu_read_lock_held(&srcu)) {
 		if (subscription->ops->test_young) {
 			young = subscription->ops->test_young(subscription, mm,
-							      address);
+							      address,
+							      fast_only);
 			if (young)
 				break;
 		}
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 33f8997a5c29..959b6d5d8ce4 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -874,7 +874,7 @@ static int kvm_mmu_notifier_clear_flush_young(struct mmu_notifier *mn,
 					      unsigned long start,
 					      unsigned long end)
 {
-	trace_kvm_age_hva(start, end);
+	trace_kvm_age_hva(start, end, false);
 
 	return kvm_handle_hva_range(mn, start, end, kvm_age_gfn);
 }
@@ -882,9 +882,10 @@ static int kvm_mmu_notifier_clear_flush_young(struct mmu_notifier *mn,
 static int kvm_mmu_notifier_clear_young(struct mmu_notifier *mn,
 					struct mm_struct *mm,
 					unsigned long start,
-					unsigned long end)
+					unsigned long end,
+					bool fast_only)
 {
-	trace_kvm_age_hva(start, end);
+	trace_kvm_age_hva(start, end, fast_only);
 
 	/*
 	 * Even though we do not flush TLB, this will still adversely
@@ -904,9 +905,10 @@ static int kvm_mmu_notifier_clear_young(struct mmu_notifier *mn,
 
 static int kvm_mmu_notifier_test_young(struct mmu_notifier *mn,
 				       struct mm_struct *mm,
-				       unsigned long address)
+				       unsigned long address,
+				       bool fast_only)
 {
-	trace_kvm_test_age_hva(address);
+	trace_kvm_test_age_hva(address, fast_only);
 
 	return kvm_handle_hva_range_no_flush(mn, address, address + 1,
 					     kvm_test_age_gfn);

From patchwork Wed Jul 24 01:10:31 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13740502
X-Received: from jthoughton.c.googlers.com ([fda3:e722:ac3:cc00:14:4d90:c0a8:2a4f]) (user=jthoughton job=sendgmr) by 2002:a05:6902:20ca:b0:e08:5554:b2be with SMTP id 3f1490d57ef6-e0b0e60b370mr1189276.9.1721783478588; Tue, 23 Jul 2024 18:11:18 -0700 (PDT) Date: Wed, 24 Jul 2024 01:10:31 +0000 In-Reply-To: <20240724011037.3671523-1-jthoughton@google.com> Mime-Version: 1.0 References: <20240724011037.3671523-1-jthoughton@google.com> X-Mailer: git-send-email 2.46.0.rc1.232.g9752f9e123-goog Message-ID: <20240724011037.3671523-7-jthoughton@google.com> Subject: [PATCH v6 06/11] mm: Add has_fast_aging to struct mmu_notifier From: James Houghton To: Andrew Morton , Paolo Bonzini Cc: Ankit Agrawal , Axel Rasmussen , Catalin Marinas , David Matlack , David Rientjes , James Houghton , James Morse , Jason Gunthorpe , Jonathan Corbet , Marc Zyngier , Oliver Upton , Raghavendra Rao Ananta , Ryan Roberts , Sean Christopherson , Shaoqin Huang , Suzuki K Poulose , Wei Xu , Will Deacon , Yu Zhao , Zenghui Yu , kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org X-Rspamd-Queue-Id: 9335A40009 X-Stat-Signature: s88kjjzqi6k5pji9uj1npunt9utkck6i X-Rspamd-Server: rspam09 X-Rspam-User: X-HE-Tag: 1721783479-424134 X-HE-Meta: 
U2FsdGVkX19j+xNg0EbbCnSYjZJwT10oEe2dju/tcFBxUEfu60fg4jISPd5+fD9jla5Or0LWStmXsxHqiN3UOURKQmZN12xqHYIyN2h8rYWoXlYkYJvVw/W+ujcooduv+AUn6hFUx3yApoNCejfzlap22HNs+1nCkJYdkaE8odmxgwryjukl5d8kQno2H+caP3Gi6ShKTznyQD3fUOP9dWzojI1KRgQY6+SptHMogQoUUrFubaFd0sCc+AFasKU7gfa6Uxm+6toBO/wwDSRw8DtffGaByAj51QO57xUNxYa0TeFKsOmPbuFMi2wP8oU6+4DEEFCLWCctE9n5Jqrv5cMMPtzDtrz8sQkOjHgbj0KbpglvjEi/rdwDKbU2BhHE8A9MCuGYheNEb71XAv/tvsT9z4glCjWgLvFZTTpMcBdeDmYORG/BDd2YtBnOH4EGZEQVB4ygKaDPCi48s8JtZ8qKpgzaysFFAuxWSOX+ArStjIuYEGaYCn/0T9lG+TnDBQZ7dMrrhqio3iEY98rAmHWGHjU2E39IWlKN3g/zHi+MK2+FLlmzgwotYoYy8hbvclEf6ne797YrXzoRmlF0q1nVVEI7pR7jCHMmHXTwsVUiHkmswaSYGm5cQp9A+98Ojj1gCgr8TLXHYZCLWDJTll92tJDxswx0QnooNYSALzzQKcGCcfTZN5zVjc5tBakQSIiaQYEA40Ipo1BDZFiGIm+/N6hoYOP/rWouW+8iIvKpPzLvpJuM9VAknVxCb3jyjcVDJLke76RZh2l1MmW7JWbRCEPbmQ9LdFTyNtANlC9B46iHoDU42q2bJj0xL/Vmg/dDQ6r7QJYB1bXwn1Pl8pvxOtZxqUT30J+P+e9CkjoORpzcEIwJ+FqdUAV6JbGsdP1wUKbf+dHIiWyB9DqOUrvZ+AIJ8NOdbYoQknGcPCzzrMkLLxprYPj+/hmEs/lfufR6FlvrMtfLjvfKoLe MBQBQ6oi vso9yqsjoyIVhXDtTNWL09kBP3bXWh1+QQB2PfpQj8wvokO66FhklaZZnsYgJ1AGnvJvJVVtBaal11F8RKVmBwTBw9F6lifhicSo7kW9UBo+dSDStD1PnxhmlxmoyXfnOy5SFPeJQ4fUAGtWCt5Lt7f+wvis0/MO9mal60W4SY+SMWm8Gg/YoidxZ+BPWHv2gJ+wuZLv9NjpRKMOM0a0wrgmwkol5wAYtg+BnSRi6rxlo0bu1hUqZLq4VqF93e92Zxf+6xVBPlSetynoUfn94vsIm7k1r1eneSrrPVxJC3MZYBHOjDqnhccwgWlL3yC3iRhxtC13iA9jMywApiqr8si/6wmzq6cHjHduFbgHLJO7qHS1ealEXBGtcq0BVRX7aWJmP5YU2lUEM0Cl3015yleiop0X2Yd9tGfEjqDQMwixt87FFGOdEyAYsNT1InPHLQESpNoWnEJNX5IXJlwo34lR91lIQdDTNGgCvjKInQ2RPNWfM//mNn+gMu+jLr1JkdfsB0Ftg3Loob64wZEDoYp+drzyT3RbKzD5PK8RdH/o8Zyuv8WQLx1JZeWpdePCaxtZeavofGKf5W4cZGGw7hRPlIkpZoi9cNGu+ X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: has_fast_aging should be set by subscribers that non-trivially implement fast_only versions of both test_young() and clear_young(). Fast aging must be opt-in. 
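The opt-in rule can be sketched in plain userspace C. This is only a
model under stated assumptions, not the kernel implementation: struct
toy_subscriber, toy_always_young(), toy_test_young() and
toy_has_fast_young() are hypothetical names invented for the example.
Only the has_fast_aging flag and the "skip when fast_only is set and the
subscriber has not opted in" check come from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mmu_notifier; not the kernel type. */
struct toy_subscriber {
	bool has_fast_aging;	/* opt-in flag, named after the patch's field */
	int (*test_young)(unsigned long address, bool fast_only);
};

/* Toy callback that always reports the address as young. */
static int toy_always_young(unsigned long address, bool fast_only)
{
	(void)address;
	(void)fast_only;
	return 1;
}

/* A fast_only walk skips every subscriber that has not opted in. */
static int toy_test_young(const struct toy_subscriber *subs, size_t n,
			  unsigned long address, bool fast_only)
{
	int young = 0;
	size_t i;

	assert(subs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (fast_only && !subs[i].has_fast_aging)
			continue;
		if (subs[i].test_young) {
			young = subs[i].test_young(address, fast_only);
			if (young)
				break;
		}
	}
	return young;
}

/* True if at least one subscriber has opted in to fast aging. */
static bool toy_has_fast_young(const struct toy_subscriber *subs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (subs[i].has_fast_aging)
			return true;
	return false;
}
```

In this model, a subscriber that leaves has_fast_aging unset is never
called for a fast_only query, so the caller pays no indirect call for it,
while a regular query still reaches it.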
For a subscriber that has not been enlightened with "fast aging",
test_young() and clear_young() behave identically whether or not
fast_only is given.

Given that KVM is the only implementer of test_young() and
clear_young(), we could instead add an equivalent check in KVM, but
doing so would incur an indirect function call every time, even if the
notifier ends up being a no-op.

Add mm_has_fast_young_notifiers() in case a caller wants to know if it
should skip many calls to the mmu notifiers that may not be necessary
(like MGLRU look-around).

Signed-off-by: James Houghton
---
 include/linux/mmu_notifier.h | 14 ++++++++++++++
 mm/mmu_notifier.c            | 26 ++++++++++++++++++++++++++
 2 files changed, 40 insertions(+)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 45c5995ebd84..e23fc10f864b 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -233,6 +233,7 @@ struct mmu_notifier {
 	struct mm_struct *mm;
 	struct rcu_head rcu;
 	unsigned int users;
+	bool has_fast_aging;
 };
 
 /**
@@ -387,6 +388,7 @@ extern int __mmu_notifier_clear_young(struct mm_struct *mm,
 extern int __mmu_notifier_test_young(struct mm_struct *mm,
 				     unsigned long address,
 				     bool fast_only);
+extern bool __mm_has_fast_young_notifiers(struct mm_struct *mm);
 extern int __mmu_notifier_invalidate_range_start(struct mmu_notifier_range *r);
 extern void __mmu_notifier_invalidate_range_end(struct mmu_notifier_range *r);
 extern void __mmu_notifier_arch_invalidate_secondary_tlbs(struct mm_struct *mm,
@@ -449,6 +451,13 @@ static inline int mmu_notifier_test_young_fast_only(struct mm_struct *mm,
 	return 0;
 }
 
+static inline bool mm_has_fast_young_notifiers(struct mm_struct *mm)
+{
+	if (mm_has_notifiers(mm))
+		return __mm_has_fast_young_notifiers(mm);
+	return 0;
+}
+
 static inline void
 mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
 {
@@ -653,6 +662,11 @@ static inline int mmu_notifier_test_young_fast_only(struct mm_struct *mm,
 	return 0;
 }
 
+static inline bool mm_has_fast_young_notifiers(struct mm_struct *mm)
+{
+	return 0;
+}
+
 static inline void
 mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
 {
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index f9a0ca6ffe65..f9ec810c8a1b 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -382,6 +382,26 @@ int __mmu_notifier_clear_flush_young(struct mm_struct *mm,
 	return young;
 }
 
+bool __mm_has_fast_young_notifiers(struct mm_struct *mm)
+{
+	struct mmu_notifier *subscription;
+	bool has_fast_aging = false;
+	int id;
+
+	id = srcu_read_lock(&srcu);
+	hlist_for_each_entry_rcu(subscription,
+				 &mm->notifier_subscriptions->list, hlist,
+				 srcu_read_lock_held(&srcu)) {
+		if (subscription->has_fast_aging) {
+			has_fast_aging = true;
+			break;
+		}
+	}
+	srcu_read_unlock(&srcu, id);
+
+	return has_fast_aging;
+}
+
 int __mmu_notifier_clear_young(struct mm_struct *mm,
 			       unsigned long start,
 			       unsigned long end,
@@ -394,6 +414,9 @@ int __mmu_notifier_clear_young(struct mm_struct *mm,
 	hlist_for_each_entry_rcu(subscription,
 				 &mm->notifier_subscriptions->list, hlist,
 				 srcu_read_lock_held(&srcu)) {
+		if (fast_only && !subscription->has_fast_aging)
+			continue;
+
 		if (subscription->ops->clear_young)
 			young |= subscription->ops->clear_young(subscription,
 								mm, start, end,
@@ -415,6 +438,9 @@ int __mmu_notifier_test_young(struct mm_struct *mm,
 	hlist_for_each_entry_rcu(subscription,
 				 &mm->notifier_subscriptions->list, hlist,
 				 srcu_read_lock_held(&srcu)) {
+		if (fast_only && !subscription->has_fast_aging)
+			continue;
+
 		if (subscription->ops->test_young) {
 			young = subscription->ops->test_young(subscription, mm,
 							      address,

From patchwork Wed Jul 24 01:10:32 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13740503
Date: Wed, 24 Jul 2024 01:10:32 +0000
In-Reply-To: <20240724011037.3671523-1-jthoughton@google.com>
References: <20240724011037.3671523-1-jthoughton@google.com>
Message-ID: <20240724011037.3671523-8-jthoughton@google.com>
Subject: [PATCH v6 07/11] KVM: Pass fast_only to kvm_{test_,}age_gfn
From: James Houghton
To: Andrew Morton, Paolo Bonzini
Cc: Ankit Agrawal, Axel Rasmussen, Catalin Marinas, David Matlack,
 David Rientjes, James Houghton, James Morse, Jason Gunthorpe,
 Jonathan Corbet, Marc Zyngier, Oliver Upton, Raghavendra Rao Ananta,
 Ryan Roberts, Sean Christopherson, Shaoqin Huang, Suzuki K Poulose,
 Wei Xu, Will Deacon, Yu Zhao, Zenghui Yu, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org

Provide the basics for architectures to implement a fast-only version of
kvm_{test_,}age_gfn.
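As a rough illustration of what a fast-only handler might look like, the
sketch below models an architecture callback that honours a fast_only
flag carried in the range argument. Everything in it is an assumption
made for the example: struct toy_mmu, struct toy_gfn_range, union
toy_notifier_arg and toy_test_age_gfn() are hypothetical and only loosely
mirror the kernel types; the split into a "lockless" and a "locked"
lookup is a stand-in for whatever fast and slow paths an architecture
actually has.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the notifier argument union; example only. */
union toy_notifier_arg {
	unsigned long attributes;
	bool fast_only;
};

/* Hypothetical mirror of a gfn range with its per-call argument. */
struct toy_gfn_range {
	unsigned long start;
	unsigned long end;
	union toy_notifier_arg arg;
};

/* Toy MMU state: one answer is cheap, the other needs a lock. */
struct toy_mmu {
	bool lockless_young;	/* result available without locking */
	bool locked_young;	/* result that requires taking the lock */
	int lock_acquisitions;	/* how often the slow path was taken */
};

/*
 * Consult the lockless path first. Fall back to the locked path only if
 * nothing was found and the caller did not ask for fast_only.
 */
static bool toy_test_age_gfn(struct toy_mmu *mmu,
			     const struct toy_gfn_range *range)
{
	bool young;

	assert(range->start < range->end);

	young = mmu->lockless_young;
	if (young || range->arg.fast_only)
		return young;

	mmu->lock_acquisitions++;	/* stands in for lock/unlock */
	return mmu->locked_young;
}
```

The point of the design is that a fast_only caller may get a less
precise answer, but it is never forced onto the slow, locked path.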

Add CONFIG_HAVE_KVM_MMU_NOTIFIER_YOUNG_FAST_ONLY that architectures will
set if they non-trivially implement test_young() and clear_young() when
called with fast_only.

Signed-off-by: James Houghton
---
 include/linux/kvm_host.h |  1 +
 virt/kvm/Kconfig         |  4 ++++
 virt/kvm/kvm_main.c      | 37 +++++++++++++++++++++----------------
 3 files changed, 26 insertions(+), 16 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 8cd80f969cff..944c5fba2344 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -258,6 +258,7 @@ int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu);
 #ifdef CONFIG_KVM_GENERIC_MMU_NOTIFIER
 union kvm_mmu_notifier_arg {
 	unsigned long attributes;
+	bool fast_only;
 };
 
 struct kvm_gfn_range {
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 632334861001..cb4d5384c2f2 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -103,6 +103,10 @@ config KVM_GENERIC_MMU_NOTIFIER
 config KVM_MMU_NOTIFIER_YOUNG_LOCKLESS
        bool
 
+config HAVE_KVM_MMU_NOTIFIER_YOUNG_FAST_ONLY
+       select KVM_GENERIC_MMU_NOTIFIER
+       bool
+
 config KVM_GENERIC_MEMORY_ATTRIBUTES
        depends on KVM_GENERIC_MMU_NOTIFIER
        bool
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 959b6d5d8ce4..86fb2b560d98 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -697,18 +697,20 @@ static __always_inline int kvm_handle_hva_range(struct mmu_notifier *mn,
 static __always_inline int kvm_handle_hva_range_no_flush(struct mmu_notifier *mn,
 							 unsigned long start,
 							 unsigned long end,
-							 gfn_handler_t handler)
+							 gfn_handler_t handler,
+							 bool fast_only)
 {
 	struct kvm *kvm = mmu_notifier_to_kvm(mn);
 	const struct kvm_mmu_notifier_range range = {
-		.start		= start,
-		.end		= end,
-		.handler	= handler,
-		.on_lock	= (void *)kvm_null_fn,
-		.flush_on_ret	= false,
-		.may_block	= false,
-		.lockless	=
+		.start			= start,
+		.end			= end,
+		.handler		= handler,
+		.on_lock		= (void *)kvm_null_fn,
+		.flush_on_ret		= false,
+		.may_block		= false,
+		.lockless		=
 			IS_ENABLED(CONFIG_KVM_MMU_NOTIFIER_YOUNG_LOCKLESS),
+		.arg.fast_only		= fast_only,
 	};
 
 	return __kvm_handle_hva_range(kvm, &range).ret;
@@ -900,7 +902,8 @@ static int kvm_mmu_notifier_clear_young(struct mmu_notifier *mn,
 	 * cadence. If we find this inaccurate, we might come up with a
 	 * more sophisticated heuristic later.
 	 */
-	return kvm_handle_hva_range_no_flush(mn, start, end, kvm_age_gfn);
+	return kvm_handle_hva_range_no_flush(mn, start, end, kvm_age_gfn,
+					     fast_only);
 }
 
 static int kvm_mmu_notifier_test_young(struct mmu_notifier *mn,
@@ -911,7 +914,7 @@ static int kvm_mmu_notifier_test_young(struct mmu_notifier *mn,
 	trace_kvm_test_age_hva(address, fast_only);
 
 	return kvm_handle_hva_range_no_flush(mn, address, address + 1,
-					     kvm_test_age_gfn);
+					     kvm_test_age_gfn, fast_only);
 }
 
 static void kvm_mmu_notifier_release(struct mmu_notifier *mn,
@@ -926,17 +929,19 @@ static void kvm_mmu_notifier_release(struct mmu_notifier *mn,
 }
 
 static const struct mmu_notifier_ops kvm_mmu_notifier_ops = {
-	.invalidate_range_start	= kvm_mmu_notifier_invalidate_range_start,
-	.invalidate_range_end	= kvm_mmu_notifier_invalidate_range_end,
-	.clear_flush_young	= kvm_mmu_notifier_clear_flush_young,
-	.clear_young		= kvm_mmu_notifier_clear_young,
-	.test_young		= kvm_mmu_notifier_test_young,
-	.release		= kvm_mmu_notifier_release,
+	.invalidate_range_start		= kvm_mmu_notifier_invalidate_range_start,
+	.invalidate_range_end		= kvm_mmu_notifier_invalidate_range_end,
+	.clear_flush_young		= kvm_mmu_notifier_clear_flush_young,
+	.clear_young			= kvm_mmu_notifier_clear_young,
+	.test_young			= kvm_mmu_notifier_test_young,
+	.release			= kvm_mmu_notifier_release,
 };
 
 static int kvm_init_mmu_notifier(struct kvm *kvm)
 {
 	kvm->mmu_notifier.ops = &kvm_mmu_notifier_ops;
+	kvm->mmu_notifier.has_fast_aging =
+		IS_ENABLED(CONFIG_HAVE_KVM_MMU_NOTIFIER_YOUNG_FAST_ONLY);
 	return mmu_notifier_register(&kvm->mmu_notifier, current->mm);
 }
 

From patchwork Wed Jul 24 01:10:33 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13740504
Date: Wed, 24 Jul 2024 01:10:33 +0000
In-Reply-To: <20240724011037.3671523-1-jthoughton@google.com>
References: <20240724011037.3671523-1-jthoughton@google.com>
Message-ID: <20240724011037.3671523-9-jthoughton@google.com>
Subject: [PATCH v6 08/11] KVM: x86: Optimize kvm_{test_,}age_gfn a little bit
From: James Houghton
To: Andrew Morton, Paolo Bonzini
Cc: Ankit Agrawal, Axel Rasmussen, Catalin Marinas, David Matlack,
 David Rientjes, James Houghton, James Morse, Jason Gunthorpe,
 Jonathan Corbet, Marc Zyngier, Oliver Upton, Raghavendra Rao Ananta,
 Ryan Roberts, Sean Christopherson, Shaoqin Huang, Suzuki K Poulose,
 Wei Xu, Will Deacon, Yu Zhao, Zenghui Yu, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org

Optimize the interaction of both kvm_age_gfn and kvm_test_age_gfn with
the shadow MMU: rather than checking whether the memslots have rmaps,
check whether there are any indirect_shadow_pages at all.

Also, check the TDP MMU first. For kvm_test_age_gfn, if the TDP MMU
finds that the range is young, we do not need to check the shadow MMU.
For kvm_age_gfn, both MMUs are still aged, so the shadow MMU result is
OR-ed into the TDP MMU result instead of overwriting it.

Signed-off-by: James Houghton
---
 arch/x86/kvm/mmu/mmu.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 7b93ce8f0680..919d59385f89 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1629,19 +1629,24 @@ static void rmap_add(struct kvm_vcpu *vcpu, const struct kvm_memory_slot *slot,
 	__rmap_add(vcpu->kvm, cache, slot, spte, gfn, access);
 }
 
+static bool kvm_has_shadow_mmu_sptes(struct kvm *kvm)
+{
+	return !tdp_mmu_enabled || READ_ONCE(kvm->arch.indirect_shadow_pages);
+}
+
 bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 {
 	bool young = false;
 
-	if (kvm_memslots_have_rmaps(kvm)) {
+	if (tdp_mmu_enabled)
+		young |= kvm_tdp_mmu_age_gfn_range(kvm, range);
+
+	if (kvm_has_shadow_mmu_sptes(kvm)) {
 		write_lock(&kvm->mmu_lock);
-		young = kvm_handle_gfn_range(kvm, range, kvm_age_rmap);
+		young |= kvm_handle_gfn_range(kvm, range, kvm_age_rmap);
 		write_unlock(&kvm->mmu_lock);
 	}
 
-	if (tdp_mmu_enabled)
-		young |= kvm_tdp_mmu_age_gfn_range(kvm, range);
-
 	return young;
 }
 
@@ -1649,15 +1654,15 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 {
 	bool young = false;
 
-	if (kvm_memslots_have_rmaps(kvm)) {
+	if (tdp_mmu_enabled)
+		young |= kvm_tdp_mmu_test_age_gfn(kvm, range);
+
+	if (!young && kvm_has_shadow_mmu_sptes(kvm)) {
 		write_lock(&kvm->mmu_lock);
 		young = kvm_handle_gfn_range(kvm, range, kvm_test_age_rmap);
 		write_unlock(&kvm->mmu_lock);
 	}
 
-	if (tdp_mmu_enabled)
-		young |= kvm_tdp_mmu_test_age_gfn(kvm, range);
-
 	return young;
 }
 

From patchwork Wed Jul 24
01:10:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13740505 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D8CCEC3DA63 for ; Wed, 24 Jul 2024 01:11:40 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A38DA6B0099; Tue, 23 Jul 2024 21:11:24 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9E9BE6B0098; Tue, 23 Jul 2024 21:11:24 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7C6EE6B0099; Tue, 23 Jul 2024 21:11:24 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 531EC6B0096 for ; Tue, 23 Jul 2024 21:11:24 -0400 (EDT) Received: from smtpin08.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 14C7040546 for ; Wed, 24 Jul 2024 01:11:24 +0000 (UTC) X-FDA: 82372868088.08.9E77D48 Received: from mail-ua1-f74.google.com (mail-ua1-f74.google.com [209.85.222.74]) by imf07.hostedemail.com (Postfix) with ESMTP id 5A8E540008 for ; Wed, 24 Jul 2024 01:11:22 +0000 (UTC) Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=4DFSQj5R; spf=pass (imf07.hostedemail.com: domain of 3uVSgZgoKCCAFPDKQCDPKJCKKCHA.8KIHEJQT-IIGR68G.KNC@flex--jthoughton.bounces.google.com designates 209.85.222.74 as permitted sender) smtp.mailfrom=3uVSgZgoKCCAFPDKQCDPKJCKKCHA.8KIHEJQT-IIGR68G.KNC@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1721783419; 
h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=P6265Vr7ZBVKTm4Fxp+PtFVTNDBMV13LE1gmbTEV1Oo=; b=8QcE/Ya8JIoDVwY+l6iXL6O3XgeLce/vo/MbftAW/a5ssaarc0rzId9bmSabrlgys0b+vD AJU0hsdWq2rFmus7CbkA3y3ClEB8yN+1RckeWcqnRyYeOC1yWeuKtya7cc3pCtCMcKmWVl pCB0MnAucsE3EtVsE+JMrjp53lMqfIQ= ARC-Authentication-Results: i=1; imf07.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=4DFSQj5R; spf=pass (imf07.hostedemail.com: domain of 3uVSgZgoKCCAFPDKQCDPKJCKKCHA.8KIHEJQT-IIGR68G.KNC@flex--jthoughton.bounces.google.com designates 209.85.222.74 as permitted sender) smtp.mailfrom=3uVSgZgoKCCAFPDKQCDPKJCKKCHA.8KIHEJQT-IIGR68G.KNC@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1721783419; a=rsa-sha256; cv=none; b=S7sCkkEXOHTtcE/b3rtDBJnRM6Ts0tCTTvpZboUeSxKY7N0bHS1UMvvk584ldcY8AKyoL0 Rm4DscjxMXdlihOPhO93byhjpk5TbgGJmbXprlqmP+u6aOuMWtRGZ82WH6LaPmmy50cvt5 Ds7OPz4t2iGE2ioS9GroWrICKWfJmS8= Received: by mail-ua1-f74.google.com with SMTP id a1e0cc1a2514c-821ba96962bso1585438241.1 for ; Tue, 23 Jul 2024 18:11:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1721783481; x=1722388281; darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=P6265Vr7ZBVKTm4Fxp+PtFVTNDBMV13LE1gmbTEV1Oo=; b=4DFSQj5R/jRcSeHwzG03ImFWqyqdOLPDw1Orn7gQOYnA8V7Te3Tcdd2lNjlt7l5I4p DLvxCqn5YC0ivO6/qHFsgDHh2N4GOoDIuk8ztXCNoNmQADuyHmdYhDrz+6OwfGddmZtJ B+kI9PeUWzAdW72UfgaDy4r5RV73ekLK/+R876wsOVXul83rUrx4F6B3fjArl1/MrDI3 9wH/+UYZ/DVdPCSxqMPWcj8g0RM1oz9ZXHa3VC0BHaeYB7dM0QjYvIw7Pm/XXJ6njAfO Y+HetsSeM8dIKKzYRuwJq7ZQJsoQuBN3mYdrNGXU4iy26P9Du5hbc+fH4H7XBvfVfcW/ lL2A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=1e100.net; s=20230601; t=1721783481; x=1722388281; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=P6265Vr7ZBVKTm4Fxp+PtFVTNDBMV13LE1gmbTEV1Oo=; b=qiPF7PYT4tpGKwtxKsKLWyyeGGRHkyANh4izn9m8pnkHl7NlUkrHrV1FZvB5ALoTUc WU5u7HGEcdKrh1VrnNcq9NwPNlD/us23eCDfsq4pEVXUnIEQrSrU2xwaeYPYD71MKBge XvbL1VHkNN+5U367Z+QdhK1DPCq5lpPGNpn17U50sQ91gj0WpbLSwaxyT2bm6SPtnd/1 1neH9lGtNrF+kf1x9q1DO/D7o5QaPpxgSdPBSsxfYkAMSiIAS/loLb7TbrUGG8m6PCDX wdTISg4THbpj5m2Q2ydUk5kO+jb/Yh9nQatYxh8SD9r6dKuQrsUuoMU4qlf3LM5cu5Th 70Kw== X-Forwarded-Encrypted: i=1; AJvYcCVXdEVvBOFVcOLfWV4e0iSa7P0SGm37VIWC6ZCkhdBUSCtFxsJKxnddJvBb8mmfvnT1ScsapE39zgT5iwzejqhw0iY= X-Gm-Message-State: AOJu0YxJLokvQ1qvtDXBPN8b+CjngCnFaOgtLSPN5iykYMAT+vz16b2g I5bSxXi399mZsQmmzcNOG3wHabPjcofCbooWZ4ZMyfuEKgeW4yelSjsAB7t7duRcLi+88Z7EX6+ 9QD/dyXxuipioPfFuHQ== X-Google-Smtp-Source: AGHT+IGFz1kYjtJ2CB63i3kpIK8cbKQBNz86a9KHHAO5xzKcY+Rn5KgcFneR9rdE9iF/27GJgdzSYep0JZHDL1qb X-Received: from jthoughton.c.googlers.com ([fda3:e722:ac3:cc00:14:4d90:c0a8:2a4f]) (user=jthoughton job=sendgmr) by 2002:a05:6102:358f:b0:48c:403d:4428 with SMTP id ada2fe7eead31-493c4b17e64mr40150137.4.1721783481412; Tue, 23 Jul 2024 18:11:21 -0700 (PDT) Date: Wed, 24 Jul 2024 01:10:34 +0000 In-Reply-To: <20240724011037.3671523-1-jthoughton@google.com> Mime-Version: 1.0 References: <20240724011037.3671523-1-jthoughton@google.com> X-Mailer: git-send-email 2.46.0.rc1.232.g9752f9e123-goog Message-ID: <20240724011037.3671523-10-jthoughton@google.com> Subject: [PATCH v6 09/11] KVM: x86: Implement fast_only versions of kvm_{test_,}age_gfn From: James Houghton To: Andrew Morton , Paolo Bonzini Cc: Ankit Agrawal , Axel Rasmussen , Catalin Marinas , David Matlack , David Rientjes , James Houghton , James Morse , Jason Gunthorpe , Jonathan Corbet , Marc Zyngier , Oliver Upton , Raghavendra Rao Ananta , Ryan Roberts , Sean Christopherson , Shaoqin Huang , Suzuki K Poulose 
, Wei Xu , Will Deacon , Yu Zhao , Zenghui Yu , kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: 5A8E540008 X-Stat-Signature: g1oqp3we3bje5t5eygom8pm1mwf74mcp X-Rspam-User: X-HE-Tag: 1721783482-824348 X-HE-Meta: U2FsdGVkX19Jm887IRUSXv/S5Z/tSpffqEe94uJXo9hlK9mY7rm9BeLjQGuYbIFp2RQQVzjfAbpCLL/7g9FNCEaoOCRFbDSITgplwknOagQIZGk8ObVqOimLq02xgccQ0wH7mkxUYOpuMwd0aUDj6rmJwKrqXDcYLoI6PpTc1rJKeMnXx/brWt5gsnpcnmvZyLyoWLcRgmM/5Mye9c028nSFuKnz27aWKhFWwFIIUYlhqjEj9VEi/J+V1V8ADovqJOf1vNWvniaItZ9NhRhe5PjAll4Ysq4ZCC3JwOVY4aBaALg4edJEqX/X3JEx3rlq1Kh1BGyyjaqIm6w6wqVa8lJSVNvQ9N3npP5VCdlWLdzcrU4Zq1+Pygwr8M1Hcic0qA6XQ9YnU+wxWg3txVzkVL99Dd6nUv6LNDWyk4k6tVLGHkOE/Rs4vnVvnx8S1Ztv0DluvV+7ip/Zkibv6gILvX4mAtiUHLh9YyCCULSjQdE2NF2ryq2WJz8K3fdfiK/ztlFk4w8bYC/5JqbsiHMpJtjEAg+mbzRLZ/xGtWToMu+EyvYOkldAaFrRyjm9UpL7mpwRDVtZ6eTmrOJc+WwURBhTVqKKej82jN/MS/Ru+s7TtuISF3GI3oVDYi9VslJg5PbUmB5gIzyaatTBrGeA8VexEwgP/oNkjFf60kkToBOHp89yFGAMy/X0zjVphqH6q2is2pj+gvn8kBcABhItLKyAVVUF79SOeXXPtM0fhaZdvxi7Ma8rW6n6I1PSeB4rA1iqsxfK9ZJccm4RdBLq0jMkjtRksMziajC/YKr8Mn97wU75JPwA6ok32mkJG8Yj9P8V61ZZwpDrvV8oPOe7ww4xfDTLzj/8LMEjL6FjUOjgTBBsf9B4J7r/bqUdym6wg6S792D8uz8hF01Xf/i3PvOQHbmpIEYqFBYAg4YTTyhsFIYSoiCVk1DNQVRIBI+N8kYzJoKjYhVoOBE4PSd v7Ju5NOU Xk6DX8EVzzYfWcaHZdHtaG6Nk5n/ukO/Jb+eQ0X/v1e0e1+tNWeFeed2+mC1AUpxlHCp7f/TYuf2VmsZnhVZLwwCmBO6E9wRdYCCS/tlfLh18flFB4sbiQTgLaVQwgqBXGbVxzBoAqWfpNC7p4/WF+vyUaQJDL4ZMH9LQ74L6C0mexzwZDg7xEBPqN3vS7UwDZmOI8heuKoDx+wiL2yIapajhY64L4aoeTZjx8xRY2mmH4IRGNi2jY16d6Qp+NQR7mS3ZDsDDh4yHFqvqep8Y1LEsOETX0vrzqt5av6b5Mfj0oEeTLOzb9J7JigmzQr8aCf0RJxH/rcOR2CRwGgP3q454OJpa4ExeW7vbG1gn3XH2s/QGSE/9zBrpWNGQU502GSqyLQOQox6mbIkUS6xOOIGlNmR+wqPtTcVqNpleYa/UVp/cuHP+BbgMH3e6wuBMW+UPX2jJ3O4+b58sL/yMNxRBk+jn6hu4ujfGHKkBr48YtNTiMCdAjfMyKCUw1HFozffvFHyy6ZKuag18CiXiibsl+QZeJzPkqtPt7kQ0L8AEDEjGau47MzNaIaFyt1BVq8ygjzsbYCZIwpKnBirPO7xwjQ== X-Bogosity: Ham, tests=bogofilter, 
spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: These fast-only versions simply ignore the shadow MMU. We can locklessly handle the shadow MMU later. Set HAVE_KVM_MMU_NOTIFIER_YOUNG_FAST_ONLY for X86_64 only, as that is the only case where the TDP MMU might be used. Without the TDP MMU, the fast-only notifiers will always be no-ops. It would be ideal not to report has_fast_only if !tdp_mmu_enabled, but tdp_mmu_enabled can be changed at any time. Signed-off-by: James Houghton --- arch/x86/kvm/Kconfig | 1 + arch/x86/kvm/mmu/mmu.c | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index 6ac43074c5e9..ed9049cf1255 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -24,6 +24,7 @@ config KVM select KVM_COMMON select KVM_GENERIC_MMU_NOTIFIER select KVM_MMU_NOTIFIER_YOUNG_LOCKLESS + select HAVE_KVM_MMU_NOTIFIER_YOUNG_FAST_ONLY if X86_64 select HAVE_KVM_IRQCHIP select HAVE_KVM_PFNCACHE select HAVE_KVM_DIRTY_RING_TSO diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 919d59385f89..3c6c9442434a 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1641,7 +1641,7 @@ bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) if (tdp_mmu_enabled) young |= kvm_tdp_mmu_age_gfn_range(kvm, range); - if (kvm_has_shadow_mmu_sptes(kvm)) { + if (!range->arg.fast_only && kvm_has_shadow_mmu_sptes(kvm)) { write_lock(&kvm->mmu_lock); young = kvm_handle_gfn_range(kvm, range, kvm_age_rmap); write_unlock(&kvm->mmu_lock); @@ -1657,7 +1657,7 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) if (tdp_mmu_enabled) young |= kvm_tdp_mmu_test_age_gfn(kvm, range); - if (!young && kvm_has_shadow_mmu_sptes(kvm)) { + if (!young && !range->arg.fast_only && kvm_has_shadow_mmu_sptes(kvm)) { write_lock(&kvm->mmu_lock); young = kvm_handle_gfn_range(kvm, range, 
kvm_test_age_rmap); write_unlock(&kvm->mmu_lock); From patchwork Wed Jul 24 01:10:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13740506 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B0B76C3DA49 for ; Wed, 24 Jul 2024 01:11:43 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id BC06B6B0098; Tue, 23 Jul 2024 21:11:25 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B71346B009A; Tue, 23 Jul 2024 21:11:25 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 975E86B009B; Tue, 23 Jul 2024 21:11:25 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 644886B0098 for ; Tue, 23 Jul 2024 21:11:25 -0400 (EDT) Received: from smtpin08.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 29448160527 for ; Wed, 24 Jul 2024 01:11:25 +0000 (UTC) X-FDA: 82372868130.08.E1683AA Received: from mail-yw1-f201.google.com (mail-yw1-f201.google.com [209.85.128.201]) by imf21.hostedemail.com (Postfix) with ESMTP id 65C2A1C000E for ; Wed, 24 Jul 2024 01:11:23 +0000 (UTC) Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b="MToV/bOd"; spf=pass (imf21.hostedemail.com: domain of 3ulSgZgoKCCEGQELRDEQLKDLLDIB.9LJIFKRU-JJHS79H.LOD@flex--jthoughton.bounces.google.com designates 209.85.128.201 as permitted sender) smtp.mailfrom=3ulSgZgoKCCEGQELRDEQLKDLLDIB.9LJIFKRU-JJHS79H.LOD@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; 
d=hostedemail.com; t=1721783459; a=rsa-sha256; cv=none; b=2WeNG/IzoqNHDThGoHi3dPH3aHfP44h1dG29UtaoTNSAFBZAB/e7KqAVtzqeiUbAqCt6mI hQIQjr9l+x1TlRHzQxjEqq2rCHg2vyEtHXqGdLwiOF1MSYV98Zwf/3Gely0LgCY3LKJplv 6Pxbd/EyHAe+TjE23yLZ2YMigrqPDIY= ARC-Authentication-Results: i=1; imf21.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b="MToV/bOd"; spf=pass (imf21.hostedemail.com: domain of 3ulSgZgoKCCEGQELRDEQLKDLLDIB.9LJIFKRU-JJHS79H.LOD@flex--jthoughton.bounces.google.com designates 209.85.128.201 as permitted sender) smtp.mailfrom=3ulSgZgoKCCEGQELRDEQLKDLLDIB.9LJIFKRU-JJHS79H.LOD@flex--jthoughton.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1721783459; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=l61walSFLT3wujjZG3IB9eT72Sms+glBrguecMkZhLU=; b=xPHPdz243xXBAWSGpeawVY5FmjtWXrpBlJxAPbkCAiMVS5XtXft3QA/wI4RexO+Uw+AYyp HdghybSBMffDMJY5jENi/41LjrTyyG68iP16VBHFbGCpmzShQtRqrE4lMFLmoXBgVb2b47 LNT8Loyxd6rViQP6ej8OxcYYfcKkxlw= Received: by mail-yw1-f201.google.com with SMTP id 00721157ae682-65b985bb059so186115467b3.2 for ; Tue, 23 Jul 2024 18:11:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1721783482; x=1722388282; darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=l61walSFLT3wujjZG3IB9eT72Sms+glBrguecMkZhLU=; b=MToV/bOdwOBDQ3ymRl1MkOcUW7sMdOjp8eaXdo9Dq6vVw6qG/eSMoOzWLpQ6jZSrLT gjyOwVqvbA3FtjpqTcUtfbJ/x3IkCqmHGZfLvCkfpogFiYDwmKWJYna9Q/tj0bxxxgMW Vg+lAIWic2PJrZqDQzrEUnL9FNYYCb9oS9iSO2fGD5Kz0n/hfoA+2fjOeFI56RmFmIOh aGDLaXPKSTyggaNrtTcBy3EdVRGvT175orSnewotcGpQGmg213mLxhbPCrru0jdqj5UE 
NFK710XZrzTGxj1J22RgBPl+6fBYlI2H2sIyNHHHJuH+PAaTzJPRCmPyUCg7asI5Rp76 rdsw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1721783482; x=1722388282; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=l61walSFLT3wujjZG3IB9eT72Sms+glBrguecMkZhLU=; b=MIy06yfQWJFUmUetLNDx7kgjFF7EP0claduR4c9ujH5f0m+xOodN7S9xfp5WaIkMja 0WaR8jb0vTZbIWxnxB7r0Rf5R5hNTClHqJVbd//UM8W+8051YEFeHxos8Xcf/82jWquF hwHeJF9Fu2HsPCZMsmQHypGstzPKtbf8sayoVCHSFxfjOaklycAxiBM+bAC4obW9chY0 jd0U4rDT/vjc6kbU3ZPgwM1LKh/dB/r9A4hS+Pr7R+K9asJTixsEdmUIWaFGuFyp0Cp1 RxF/EdEloIK/6DHeUiiX5BZ1sYnQ2zLpifouSI/Bxa+r1w9RpmCpvkofzN5sIwAr/oBD e0Pg== X-Forwarded-Encrypted: i=1; AJvYcCW50tjTFqon8SXZURIIiJYNhEyhBtOfihrJQfD71FE9tabm+cnbc4j9odxodlNdJ3pCv0EOiyeU4OysaByhHxJMbYQ= X-Gm-Message-State: AOJu0Ywmy0Opvg5iynm1W4SZdTgGf1gv7q3hOVDfVa1gaeIfa+HE4B5k Yrll6dYbXTdPoqodWuoIUnK2DBuH01c4B6YbHWYRuE1pmr+/pdK9V/EmXQ+n4MRAoKASgK0YDil 40LM1vOsfDBX4Z6/+XA== X-Google-Smtp-Source: AGHT+IEg/H2lbIriLQJuQZOeuvv/uRDEcOC7gwIqFMsyzw2ax8lq+eZHzig2mgaym4s4zLkdQm+/Llq7W+RvG6Hz X-Received: from jthoughton.c.googlers.com ([fda3:e722:ac3:cc00:14:4d90:c0a8:2a4f]) (user=jthoughton job=sendgmr) by 2002:a05:690c:4286:b0:664:c5e0:6574 with SMTP id 00721157ae682-66a68f79a06mr2287047b3.9.1721783482414; Tue, 23 Jul 2024 18:11:22 -0700 (PDT) Date: Wed, 24 Jul 2024 01:10:35 +0000 In-Reply-To: <20240724011037.3671523-1-jthoughton@google.com> Mime-Version: 1.0 References: <20240724011037.3671523-1-jthoughton@google.com> X-Mailer: git-send-email 2.46.0.rc1.232.g9752f9e123-goog Message-ID: <20240724011037.3671523-11-jthoughton@google.com> Subject: [PATCH v6 10/11] mm: multi-gen LRU: Have secondary MMUs participate in aging From: James Houghton To: Andrew Morton , Paolo Bonzini Cc: Ankit Agrawal , Axel Rasmussen , Catalin Marinas , David Matlack , David Rientjes , James Houghton , James Morse , Jason Gunthorpe , Jonathan Corbet , Marc 
Zyngier , Oliver Upton , Raghavendra Rao Ananta , Ryan Roberts , Sean Christopherson , Shaoqin Huang , Suzuki K Poulose , Wei Xu , Will Deacon , Yu Zhao , Zenghui Yu , kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org X-Stat-Signature: 8hxdfibf99srg41t7fyuwkubig391ttk X-Rspamd-Queue-Id: 65C2A1C000E X-Rspam-User: X-Rspamd-Server: rspam10 X-HE-Tag: 1721783483-145116 X-HE-Meta: U2FsdGVkX1/LqJwgLDiV398GAXVakgTXaYjfsfjHayGFoXBTh4jk7j7BOsa8hI+8WRSVkLDa/ikJhC+Zr7AsSi9KXQaoD62dij6VpQXXMcy5VY8zbbXjQWO0rBltYeTtZcwJy7y+v70yAnrpx+L7Gp5olf2uof9kU00d6/M1E+9shuJ99xVsSEVDr98FsruiCMfbQ069Y0PSRZbKg3udxaUKxx0Q1BPy1a52J6dxodfYuVMr8Wdk8uhzOMZHYY8TKCrkdiqcFEfYWCR9LsGBfWBqAbolqPytlFarnDAqT2CjdjBmpBH2imdf3Hmjg9En9eOPR5o6xf2UlPupgAJ2AI9pZFWhT68DUEBmkkgwlN7b+VB4d7k2aZC0EMurM4Oi5XUnthqHODKQWvGmptxJEZlijfJsLI0zRB74pK6WmxPLRquZpB5rr791Oma3QnCtxWAviC03ZrmibO6XtHOZQA52B+YMzEFpzpy5nIXbI9xeo+XMD6vI91uwVKmIMeL7dZtihKOSdtdcYSTD4iEoM/yhBDk2fYkkuQOB1rWrSPYZMRAfUg8ueJqiBvyBIdBx0S49BVwpcskEHx4AsTDsHZAMzVuexvUvb+Edq/E2l0ITfAmn798LyxgRtL0EvjPGZTvpmordelLs12ABrCPA1TG1MSFx3OPLFI5F4UKaeLS+4eRIg9OW/7Am0HgDYy+GOE4XGMM1cGidbcIjEjAjCiXs8C2sKhAMbvDAIXPLOchEK2RHamUW/zIrAW7LXWfy0hMHMlbAvJ6Ccl3+VJYdNWE9y4gUi1n83cjeU6UWh8vzgBuxBBxtPQ/Z9QAzw4CspcffKvBeRNsbRnS/6dduRl+Nlpd2ZPxMJF4mYiGRc2viVDK/qyn+B8udoioGyui8bmLEz+WYHDdNdsKld6ftazdZqzhYJeUa50I7kcsyT6Vj8Y+4KIEncBDBrvU7c3XOd+Ixxsg+dJnucDsOdZb /o97Qkzm 
uUP4aJEgiEsjmWMJxHsjrBob46S5GZmEsyWxbEBtRJcGEM9j2RyRXAxzWZD7mKEx1I5GzS2SRrOd+wgcsVpvwiU9RZkhLWntUcG/4mE8wCyy7tG6ZS6IlISJWLdNkZvlNfX+zqwWK3pdl/ZmCo8zjs+Dt7Nx28tU6rJuSPw6CyIbzXKLMVKIUa82q7hlzgUV3j39vhDtJw4XLsm7Fc0CbrXPYXkfLqtka+YbM7yHsimnF0/ArVGryjHmcVZjnbkVrJSasN7TMMfj9oVpQ6koczIpDE1ZoIr0QLwDl/NSXTEFgaepMg0+A+CWaW9qrYfy223ocVqRMGaTNc3UpCVaQwyi1BLUbPbbdLq1mfkii9PGpsmgMYp2/eVTPttP2t4w6mpXDukgaq1bciTKtvUZSNp3NcByPst1tAVQO0nM0CXMk7FdCExhDg8b4og/gOMicg6bLrvHAmuyLeMwBdjcQVpiF4NGWnlWO7ys25EljmmwO22eJ7tqBx+FAXowcJSAcAHUHqWPwoRKRplL7XGO77I3ISw5esBLI7Qmb X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Secondary MMUs are currently consulted for access/age information at eviction time, but before then, we don't get accurate age information. That is, pages that are mostly accessed through a secondary MMU (like guest memory, used by KVM) will always just proceed down to the oldest generation, and then at eviction time, if KVM reports the page to be young, the page will be activated/promoted back to the youngest generation. The added feature bit (0x8), if disabled, will make MGLRU behave as if there are no secondary MMUs subscribed to MMU notifiers except at eviction time. Implement aging with the new mmu_notifier_clear_young_fast_only() notifier. For architectures that do not support this notifier, this becomes a no-op. For architectures that do implement it, it should be fast enough to make aging worth it (usually the case if the notifier is implemented locklessly). 
Suggested-by: Yu Zhao Signed-off-by: James Houghton --- Documentation/admin-guide/mm/multigen_lru.rst | 6 +- include/linux/mmzone.h | 6 +- mm/rmap.c | 9 +- mm/vmscan.c | 148 ++++++++++++++---- 4 files changed, 127 insertions(+), 42 deletions(-) diff --git a/Documentation/admin-guide/mm/multigen_lru.rst b/Documentation/admin-guide/mm/multigen_lru.rst index 33e068830497..e1862407652c 100644 --- a/Documentation/admin-guide/mm/multigen_lru.rst +++ b/Documentation/admin-guide/mm/multigen_lru.rst @@ -48,6 +48,10 @@ Values Components verified on x86 varieties other than Intel and AMD. If it is disabled, the multi-gen LRU will suffer a negligible performance degradation. +0x0008 Clear the accessed bit in secondary MMU page tables when aging + instead of waiting until eviction time. This results in accurate + page age information for pages that are mainly used by a + secondary MMU. [yYnN] Apply to all the components above. ====== =============================================================== @@ -56,7 +60,7 @@ E.g., echo y >/sys/kernel/mm/lru_gen/enabled cat /sys/kernel/mm/lru_gen/enabled - 0x0007 + 0x000f echo 5 >/sys/kernel/mm/lru_gen/enabled cat /sys/kernel/mm/lru_gen/enabled 0x0005 diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 586a8f0104d7..ee82e635e75b 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -400,6 +400,7 @@ enum { LRU_GEN_CORE, LRU_GEN_MM_WALK, LRU_GEN_NONLEAF_YOUNG, + LRU_GEN_SECONDARY_MMU_WALK, NR_LRU_GEN_CAPS }; @@ -557,7 +558,7 @@ struct lru_gen_memcg { void lru_gen_init_pgdat(struct pglist_data *pgdat); void lru_gen_init_lruvec(struct lruvec *lruvec); -void lru_gen_look_around(struct page_vma_mapped_walk *pvmw); +bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw); void lru_gen_init_memcg(struct mem_cgroup *memcg); void lru_gen_exit_memcg(struct mem_cgroup *memcg); @@ -576,8 +577,9 @@ static inline void lru_gen_init_lruvec(struct lruvec *lruvec) { } -static inline void lru_gen_look_around(struct 
page_vma_mapped_walk *pvmw) +static inline bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw) { + return false; } static inline void lru_gen_init_memcg(struct mem_cgroup *memcg) diff --git a/mm/rmap.c b/mm/rmap.c index e8fc5ecb59b2..24a3ff639919 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -870,13 +870,10 @@ static bool folio_referenced_one(struct folio *folio, continue; } - if (pvmw.pte) { - if (lru_gen_enabled() && - pte_young(ptep_get(pvmw.pte))) { - lru_gen_look_around(&pvmw); + if (lru_gen_enabled() && pvmw.pte) { + if (lru_gen_look_around(&pvmw)) referenced++; - } - + } else if (pvmw.pte) { if (ptep_clear_flush_young_notify(vma, address, pvmw.pte)) referenced++; diff --git a/mm/vmscan.c b/mm/vmscan.c index 2e34de9cd0d4..e4fa52c8f714 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -56,6 +56,7 @@ #include #include #include +#include #include #include @@ -2579,6 +2580,11 @@ static bool should_clear_pmd_young(void) return arch_has_hw_nonleaf_pmd_young() && get_cap(LRU_GEN_NONLEAF_YOUNG); } +static bool should_walk_secondary_mmu(void) +{ + return get_cap(LRU_GEN_SECONDARY_MMU_WALK); +} + /****************************************************************************** * shorthand helpers ******************************************************************************/ @@ -3276,7 +3282,8 @@ static bool get_next_vma(unsigned long mask, unsigned long size, struct mm_walk return false; } -static unsigned long get_pte_pfn(pte_t pte, struct vm_area_struct *vma, unsigned long addr) +static unsigned long get_pte_pfn(pte_t pte, struct vm_area_struct *vma, unsigned long addr, + struct pglist_data *pgdat) { unsigned long pfn = pte_pfn(pte); @@ -3291,10 +3298,15 @@ static unsigned long get_pte_pfn(pte_t pte, struct vm_area_struct *vma, unsigned if (WARN_ON_ONCE(!pfn_valid(pfn))) return -1; + /* try to avoid unnecessary memory loads */ + if (pfn < pgdat->node_start_pfn || pfn >= pgdat_end_pfn(pgdat)) + return -1; + return pfn; } -static unsigned long get_pmd_pfn(pmd_t pmd, 
struct vm_area_struct *vma, unsigned long addr) +static unsigned long get_pmd_pfn(pmd_t pmd, struct vm_area_struct *vma, unsigned long addr, + struct pglist_data *pgdat) { unsigned long pfn = pmd_pfn(pmd); @@ -3309,6 +3321,10 @@ static unsigned long get_pmd_pfn(pmd_t pmd, struct vm_area_struct *vma, unsigned if (WARN_ON_ONCE(!pfn_valid(pfn))) return -1; + /* try to avoid unnecessary memory loads */ + if (pfn < pgdat->node_start_pfn || pfn >= pgdat_end_pfn(pgdat)) + return -1; + return pfn; } @@ -3317,10 +3333,6 @@ static struct folio *get_pfn_folio(unsigned long pfn, struct mem_cgroup *memcg, { struct folio *folio; - /* try to avoid unnecessary memory loads */ - if (pfn < pgdat->node_start_pfn || pfn >= pgdat_end_pfn(pgdat)) - return NULL; - folio = pfn_folio(pfn); if (folio_nid(folio) != pgdat->node_id) return NULL; @@ -3343,6 +3355,26 @@ static bool suitable_to_scan(int total, int young) return young * n >= total; } +static bool lru_gen_notifier_clear_young(struct mm_struct *mm, + unsigned long start, + unsigned long end) +{ + return should_walk_secondary_mmu() && + mmu_notifier_clear_young_fast_only(mm, start, end); +} + +static bool lru_gen_pmdp_test_and_clear_young(struct vm_area_struct *vma, + unsigned long addr, + pmd_t *pmd) +{ + bool young = pmdp_test_and_clear_young(vma, addr, pmd); + + if (lru_gen_notifier_clear_young(vma->vm_mm, addr, addr + PMD_SIZE)) + young = true; + + return young; +} + static bool walk_pte_range(pmd_t *pmd, unsigned long start, unsigned long end, struct mm_walk *args) { @@ -3357,8 +3389,9 @@ static bool walk_pte_range(pmd_t *pmd, unsigned long start, unsigned long end, struct pglist_data *pgdat = lruvec_pgdat(walk->lruvec); DEFINE_MAX_SEQ(walk->lruvec); int old_gen, new_gen = lru_gen_from_seq(max_seq); + struct mm_struct *mm = args->mm; - pte = pte_offset_map_nolock(args->mm, pmd, start & PMD_MASK, &ptl); + pte = pte_offset_map_nolock(mm, pmd, start & PMD_MASK, &ptl); if (!pte) return false; if (!spin_trylock(ptl)) { @@ -3376,11 
+3409,11 @@ static bool walk_pte_range(pmd_t *pmd, unsigned long start, unsigned long end, total++; walk->mm_stats[MM_LEAF_TOTAL]++; - pfn = get_pte_pfn(ptent, args->vma, addr); + pfn = get_pte_pfn(ptent, args->vma, addr, pgdat); if (pfn == -1) continue; - if (!pte_young(ptent)) { + if (!pte_young(ptent) && !mm_has_notifiers(mm)) { walk->mm_stats[MM_LEAF_OLD]++; continue; } @@ -3389,8 +3422,14 @@ static bool walk_pte_range(pmd_t *pmd, unsigned long start, unsigned long end, if (!folio) continue; - if (!ptep_test_and_clear_young(args->vma, addr, pte + i)) - VM_WARN_ON_ONCE(true); + if (!lru_gen_notifier_clear_young(mm, addr, addr + PAGE_SIZE) && + !pte_young(ptent)) { + walk->mm_stats[MM_LEAF_OLD]++; + continue; + } + + if (pte_young(ptent)) + ptep_test_and_clear_young(args->vma, addr, pte + i); young++; walk->mm_stats[MM_LEAF_YOUNG]++; @@ -3456,22 +3495,25 @@ static void walk_pmd_range_locked(pud_t *pud, unsigned long addr, struct vm_area /* don't round down the first address */ addr = i ? 
(*first & PMD_MASK) + i * PMD_SIZE : *first; - pfn = get_pmd_pfn(pmd[i], vma, addr); - if (pfn == -1) - goto next; - - if (!pmd_trans_huge(pmd[i])) { - if (should_clear_pmd_young()) + if (pmd_present(pmd[i]) && !pmd_trans_huge(pmd[i])) { + if (should_clear_pmd_young() && + !should_walk_secondary_mmu()) pmdp_test_and_clear_young(vma, addr, pmd + i); goto next; } + pfn = get_pmd_pfn(pmd[i], vma, addr, pgdat); + if (pfn == -1) + goto next; + folio = get_pfn_folio(pfn, memcg, pgdat, walk->can_swap); if (!folio) goto next; - if (!pmdp_test_and_clear_young(vma, addr, pmd + i)) + if (!lru_gen_pmdp_test_and_clear_young(vma, addr, pmd + i)) { + walk->mm_stats[MM_LEAF_OLD]++; goto next; + } walk->mm_stats[MM_LEAF_YOUNG]++; @@ -3528,19 +3570,18 @@ static void walk_pmd_range(pud_t *pud, unsigned long start, unsigned long end, } if (pmd_trans_huge(val)) { - unsigned long pfn = pmd_pfn(val); struct pglist_data *pgdat = lruvec_pgdat(walk->lruvec); + unsigned long pfn = get_pmd_pfn(val, vma, addr, pgdat); walk->mm_stats[MM_LEAF_TOTAL]++; - if (!pmd_young(val)) { - walk->mm_stats[MM_LEAF_OLD]++; + if (pfn == -1) continue; - } - /* try to avoid unnecessary memory loads */ - if (pfn < pgdat->node_start_pfn || pfn >= pgdat_end_pfn(pgdat)) + if (!pmd_young(val) && !mm_has_notifiers(args->mm)) { + walk->mm_stats[MM_LEAF_OLD]++; continue; + } walk_pmd_range_locked(pud, addr, vma, args, bitmap, &first); continue; @@ -3548,7 +3589,7 @@ static void walk_pmd_range(pud_t *pud, unsigned long start, unsigned long end, walk->mm_stats[MM_NONLEAF_TOTAL]++; - if (should_clear_pmd_young()) { + if (should_clear_pmd_young() && !should_walk_secondary_mmu()) { if (!pmd_young(val)) continue; @@ -3994,6 +4035,31 @@ static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc) * rmap/PT walk feedback ******************************************************************************/ +static bool should_look_around(struct vm_area_struct *vma, unsigned long addr, + pte_t *pte, int *young) +{ + 
int secondary_young = mmu_notifier_clear_young( + vma->vm_mm, addr, addr + PAGE_SIZE); + + /* + * Look around if (1) the PTE is young or (2) the secondary PTE was + * young and one of the "fast" MMUs of one of the secondary MMUs + * reported that the page was young. + */ + if (pte_young(ptep_get(pte))) { + ptep_test_and_clear_young(vma, addr, pte); + *young = true; + return true; + } + + if (secondary_young) { + *young = true; + return mm_has_fast_young_notifiers(vma->vm_mm); + } + + return false; +} + /* * This function exploits spatial locality when shrink_folio_list() walks the * rmap. It scans the adjacent PTEs of a young PTE and promotes hot pages. If @@ -4001,7 +4067,7 @@ static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc) * the PTE table to the Bloom filter. This forms a feedback loop between the * eviction and the aging. */ -void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) +bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw) { int i; unsigned long start; @@ -4019,16 +4085,20 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) struct lru_gen_mm_state *mm_state = get_mm_state(lruvec); DEFINE_MAX_SEQ(lruvec); int old_gen, new_gen = lru_gen_from_seq(max_seq); + struct mm_struct *mm = pvmw->vma->vm_mm; lockdep_assert_held(pvmw->ptl); VM_WARN_ON_ONCE_FOLIO(folio_test_lru(folio), folio); + if (!should_look_around(vma, addr, pte, &young)) + return young; + if (spin_is_contended(pvmw->ptl)) - return; + return young; /* exclude special VMAs containing anon pages from COW */ if (vma->vm_flags & VM_SPECIAL) - return; + return young; /* avoid taking the LRU lock under the PTL when possible */ walk = current->reclaim_state ? 
current->reclaim_state->mm_walk : NULL; @@ -4036,6 +4106,9 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) start = max(addr & PMD_MASK, vma->vm_start); end = min(addr | ~PMD_MASK, vma->vm_end - 1) + 1; + if (end - start == PAGE_SIZE) + return young; + if (end - start > MIN_LRU_BATCH * PAGE_SIZE) { if (addr - start < MIN_LRU_BATCH * PAGE_SIZE / 2) end = start + MIN_LRU_BATCH * PAGE_SIZE; @@ -4049,7 +4122,7 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) /* folio_update_gen() requires stable folio_memcg() */ if (!mem_cgroup_trylock_pages(memcg)) - return; + return young; arch_enter_lazy_mmu_mode(); @@ -4059,19 +4132,23 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) unsigned long pfn; pte_t ptent = ptep_get(pte + i); - pfn = get_pte_pfn(ptent, vma, addr); + pfn = get_pte_pfn(ptent, vma, addr, pgdat); if (pfn == -1) continue; - if (!pte_young(ptent)) + if (!pte_young(ptent) && !mm_has_notifiers(mm)) continue; folio = get_pfn_folio(pfn, memcg, pgdat, can_swap); if (!folio) continue; - if (!ptep_test_and_clear_young(vma, addr, pte + i)) - VM_WARN_ON_ONCE(true); + if (!lru_gen_notifier_clear_young(mm, addr, addr + PAGE_SIZE) && + !pte_young(ptent)) + continue; + + if (pte_young(ptent)) + ptep_test_and_clear_young(vma, addr, pte + i); young++; @@ -4101,6 +4178,8 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) /* feedback from rmap walkers to page table walkers */ if (mm_state && suitable_to_scan(i, young)) update_bloom_filter(mm_state, max_seq, pvmw->pmd); + + return young; } /****************************************************************************** @@ -5137,6 +5216,9 @@ static ssize_t enabled_show(struct kobject *kobj, struct kobj_attribute *attr, c if (should_clear_pmd_young()) caps |= BIT(LRU_GEN_NONLEAF_YOUNG); + if (should_walk_secondary_mmu()) + caps |= BIT(LRU_GEN_SECONDARY_MMU_WALK); + return sysfs_emit(buf, "0x%04x\n", caps); } From patchwork Wed Jul 24 01:10:36 2024 Content-Type: text/plain; 
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13740507 Date: Wed, 24 Jul 2024 01:10:36 +0000 In-Reply-To: <20240724011037.3671523-1-jthoughton@google.com> References: <20240724011037.3671523-1-jthoughton@google.com> X-Mailer: git-send-email 2.46.0.rc1.232.g9752f9e123-goog Message-ID: <20240724011037.3671523-12-jthoughton@google.com> Subject: [PATCH v6 11/11] KVM: selftests: Add multi-gen LRU aging to access_tracking_perf_test From: James Houghton To: Andrew Morton, Paolo Bonzini Cc: Ankit Agrawal, Axel Rasmussen, Catalin Marinas, David Matlack, David Rientjes, James Houghton, James Morse, Jason Gunthorpe, Jonathan Corbet, Marc Zyngier, Oliver Upton, Raghavendra Rao Ananta, Ryan Roberts, Sean Christopherson, Shaoqin Huang, Suzuki K Poulose, Wei Xu, Will Deacon, Yu Zhao, Zenghui Yu, kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Sender:
owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: This test now has two modes of operation: 1. (default) To check how much vCPU performance was affected by access tracking (previously existed, now supports MGLRU aging). 2. (-p) To also benchmark how fast MGLRU can do aging while vCPUs are faulting in memory. Mode (1) also serves as a way to verify that aging is working properly for pages only accessed by KVM. It will fail if one does not have the 0x8 lru_gen feature bit. To support MGLRU, the test creates a memory cgroup, moves itself into it, then uses the lru_gen debugfs output to track memory in that cgroup. The logic to parse the lru_gen debugfs output has been put into selftests/kvm/lib/lru_gen_util.c. Co-developed-by: Axel Rasmussen Signed-off-by: Axel Rasmussen Signed-off-by: James Houghton --- tools/testing/selftests/kvm/Makefile | 1 + .../selftests/kvm/access_tracking_perf_test.c | 369 +++++++++++++++-- .../selftests/kvm/include/lru_gen_util.h | 55 +++ .../testing/selftests/kvm/lib/lru_gen_util.c | 391 ++++++++++++++++++ 4 files changed, 786 insertions(+), 30 deletions(-) create mode 100644 tools/testing/selftests/kvm/include/lru_gen_util.h create mode 100644 tools/testing/selftests/kvm/lib/lru_gen_util.c diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index b084ba2262a0..0ab8d3f4628c 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -22,6 +22,7 @@ LIBKVM += lib/elf.c LIBKVM += lib/guest_modes.c LIBKVM += lib/io.c LIBKVM += lib/kvm_util.c +LIBKVM += lib/lru_gen_util.c LIBKVM += lib/memstress.c LIBKVM += lib/guest_sprintf.c LIBKVM += lib/rbtree.c diff --git a/tools/testing/selftests/kvm/access_tracking_perf_test.c b/tools/testing/selftests/kvm/access_tracking_perf_test.c index 3c7defd34f56..6ff64ac349a9 100644 --- a/tools/testing/selftests/kvm/access_tracking_perf_test.c +++ 
b/tools/testing/selftests/kvm/access_tracking_perf_test.c @@ -38,6 +38,7 @@ #include #include #include +#include #include #include #include @@ -47,6 +48,20 @@ #include "memstress.h" #include "guest_modes.h" #include "processor.h" +#include "lru_gen_util.h" + +static const char *TEST_MEMCG_NAME = "access_tracking_perf_test"; +static const int LRU_GEN_ENABLED = 0x1; +static const int LRU_GEN_MM_WALK = 0x2; +static const int LRU_GEN_SECONDARY_MMU_WALK = 0x8; +static const char *CGROUP_PROCS = "cgroup.procs"; +/* + * If using MGLRU, this test assumes a cgroup v2 or cgroup v1 memory hierarchy + * is mounted at cgroup_root. + * + * Can be changed with -r. + */ +static const char *cgroup_root = "/sys/fs/cgroup"; /* Global variable used to synchronize all of the vCPU threads. */ static int iteration; @@ -62,6 +77,9 @@ static enum { /* The iteration that was last completed by each vCPU. */ static int vcpu_last_completed_iteration[KVM_MAX_VCPUS]; +/* The time at which the last iteration was completed */ +static struct timespec vcpu_last_completed_time[KVM_MAX_VCPUS]; + /* Whether to overlap the regions of memory vCPUs access. */ static bool overlap_memory_access; @@ -74,6 +92,12 @@ struct test_params { /* The number of vCPUs to create in the VM. */ int nr_vcpus; + + /* Whether to use lru_gen aging instead of idle page tracking. */ + bool lru_gen; + + /* Whether to test the performance of aging itself. 
*/ + bool benchmark_lru_gen; }; static uint64_t pread_uint64(int fd, const char *filename, uint64_t index) @@ -89,6 +113,50 @@ static uint64_t pread_uint64(int fd, const char *filename, uint64_t index) } +static void write_file_long(const char *path, long v) +{ + FILE *f; + + f = fopen(path, "w"); + TEST_ASSERT(f, "fopen(%s) failed", path); + TEST_ASSERT(fprintf(f, "%ld\n", v) > 0, + "fprintf to %s failed", path); + TEST_ASSERT(!fclose(f), "fclose(%s) failed", path); +} + +static char *path_join(const char *parent, const char *child) +{ + char *out = NULL; + + return asprintf(&out, "%s/%s", parent, child) >= 0 ? out : NULL; +} + +static char *memcg_path(const char *memcg) +{ + return path_join(cgroup_root, memcg); +} + +static char *memcg_file_path(const char *memcg, const char *file) +{ + char *mp = memcg_path(memcg); + char *fp; + + if (!mp) + return NULL; + fp = path_join(mp, file); + free(mp); + return fp; +} + +static void move_to_memcg(const char *memcg, pid_t pid) +{ + char *procs = memcg_file_path(memcg, CGROUP_PROCS); + + TEST_ASSERT(procs, "Failed to construct cgroup.procs path"); + write_file_long(procs, pid); + free(procs); +} + #define PAGEMAP_PRESENT (1ULL << 63) #define PAGEMAP_PFN_MASK ((1ULL << 55) - 1) @@ -242,6 +310,8 @@ static void vcpu_thread_main(struct memstress_vcpu_args *vcpu_args) }; vcpu_last_completed_iteration[vcpu_idx] = current_iteration; + clock_gettime(CLOCK_MONOTONIC, + &vcpu_last_completed_time[vcpu_idx]); } } @@ -253,38 +323,68 @@ static void spin_wait_for_vcpu(int vcpu_idx, int target_iteration) } } +static bool all_vcpus_done(int target_iteration, int nr_vcpus) +{ + for (int i = 0; i < nr_vcpus; ++i) + if (READ_ONCE(vcpu_last_completed_iteration[i]) != + target_iteration) + return false; + + return true; +} + /* The type of memory accesses to perform in the VM. 
*/ enum access_type { ACCESS_READ, ACCESS_WRITE, }; -static void run_iteration(struct kvm_vm *vm, int nr_vcpus, const char *description) +static void run_iteration(struct kvm_vm *vm, int nr_vcpus, const char *description, + bool wait) { - struct timespec ts_start; - struct timespec ts_elapsed; int next_iteration, i; /* Kick off the vCPUs by incrementing iteration. */ next_iteration = ++iteration; - clock_gettime(CLOCK_MONOTONIC, &ts_start); - /* Wait for all vCPUs to finish the iteration. */ - for (i = 0; i < nr_vcpus; i++) - spin_wait_for_vcpu(i, next_iteration); + if (wait) { + struct timespec ts_start; + struct timespec ts_elapsed; + + clock_gettime(CLOCK_MONOTONIC, &ts_start); - ts_elapsed = timespec_elapsed(ts_start); - pr_info("%-30s: %ld.%09lds\n", - description, ts_elapsed.tv_sec, ts_elapsed.tv_nsec); + for (i = 0; i < nr_vcpus; i++) + spin_wait_for_vcpu(i, next_iteration); + + ts_elapsed = timespec_elapsed(ts_start); + + pr_info("%-30s: %ld.%09lds\n", + description, ts_elapsed.tv_sec, ts_elapsed.tv_nsec); + } else + pr_info("%-30s\n", description); } -static void access_memory(struct kvm_vm *vm, int nr_vcpus, - enum access_type access, const char *description) +static void _access_memory(struct kvm_vm *vm, int nr_vcpus, + enum access_type access, const char *description, + bool wait) { memstress_set_write_percent(vm, (access == ACCESS_READ) ? 
0 : 100); iteration_work = ITERATION_ACCESS_MEMORY; - run_iteration(vm, nr_vcpus, description); + run_iteration(vm, nr_vcpus, description, wait); +} + +static void access_memory(struct kvm_vm *vm, int nr_vcpus, + enum access_type access, const char *description) +{ + return _access_memory(vm, nr_vcpus, access, description, true); +} + +static void access_memory_async(struct kvm_vm *vm, int nr_vcpus, + enum access_type access, + const char *description) +{ + return _access_memory(vm, nr_vcpus, access, description, false); } static void mark_memory_idle(struct kvm_vm *vm, int nr_vcpus) @@ -297,19 +397,115 @@ static void mark_memory_idle(struct kvm_vm *vm, int nr_vcpus) */ pr_debug("Marking VM memory idle (slow)...\n"); iteration_work = ITERATION_MARK_IDLE; - run_iteration(vm, nr_vcpus, "Mark memory idle"); + run_iteration(vm, nr_vcpus, "Mark memory idle", true); } -static void run_test(enum vm_guest_mode mode, void *arg) +static void create_memcg(const char *memcg) +{ + const char *full_memcg_path = memcg_path(memcg); + int ret; + + TEST_ASSERT(full_memcg_path, "Failed to construct full memcg path"); +retry: + ret = mkdir(full_memcg_path, 0755); + if (ret && errno == EEXIST) { + TEST_ASSERT(!rmdir(full_memcg_path), + "Found existing memcg at %s, but rmdir failed", + full_memcg_path); + goto retry; + } + TEST_ASSERT(!ret, "Creating the memcg failed: mkdir(%s) failed", + full_memcg_path); + + pr_info("Created memcg at %s\n", full_memcg_path); +} + +/* + * Test lru_gen aging speed while vCPUs are faulting memory in. + * + * This test will run lru_gen aging until the vCPUs have finished all of + * the faulting work, reporting: + * - vcpu wall time (wall time for slowest vCPU) + * - average aging pass duration + * - total number of aging passes + * - total time spent aging + * + * This test produces the most useful results when the vcpu wall time and the + * total time spent aging are similar (i.e., we want to avoid timing aging + * while the vCPUs aren't doing any work). 
+ */ +static void run_benchmark(enum vm_guest_mode mode, struct kvm_vm *vm, + struct test_params *params) { - struct test_params *params = arg; - struct kvm_vm *vm; int nr_vcpus = params->nr_vcpus; + struct memcg_stats stats; + struct timespec ts_start, ts_max, ts_vcpus_elapsed, + ts_aging_elapsed, ts_aging_elapsed_avg; + int num_passes = 0; - vm = memstress_create_vm(mode, nr_vcpus, params->vcpu_memory_bytes, 1, - params->backing_src, !overlap_memory_access); + printf("Running lru_gen benchmark...\n"); - memstress_start_vcpu_threads(nr_vcpus, vcpu_thread_main); + clock_gettime(CLOCK_MONOTONIC, &ts_start); + access_memory_async(vm, nr_vcpus, ACCESS_WRITE, + "Populating memory (async)"); + while (!all_vcpus_done(iteration, nr_vcpus)) { + lru_gen_do_aging_quiet(&stats, TEST_MEMCG_NAME); + ++num_passes; + } + + ts_aging_elapsed = timespec_elapsed(ts_start); + ts_aging_elapsed_avg = timespec_div(ts_aging_elapsed, num_passes); + + /* Find out when the slowest vCPU finished. */ + ts_max = ts_start; + for (int i = 0; i < nr_vcpus; ++i) { + struct timespec *vcpu_ts = &vcpu_last_completed_time[i]; + + if (ts_max.tv_sec < vcpu_ts->tv_sec || + (ts_max.tv_sec == vcpu_ts->tv_sec && + ts_max.tv_nsec < vcpu_ts->tv_nsec)) + ts_max = *vcpu_ts; + } + + ts_vcpus_elapsed = timespec_sub(ts_max, ts_start); + + pr_info("%-30s: %ld.%09lds\n", "vcpu wall time", + ts_vcpus_elapsed.tv_sec, ts_vcpus_elapsed.tv_nsec); + + pr_info("%-30s: %ld.%09lds, (passes:%d, total:%ld.%09lds)\n", + "lru_gen avg pass duration", + ts_aging_elapsed_avg.tv_sec, + ts_aging_elapsed_avg.tv_nsec, + num_passes, + ts_aging_elapsed.tv_sec, + ts_aging_elapsed.tv_nsec); +} + +/* + * Test how much access tracking affects vCPU performance. + * + * Supports two modes of access tracking: + * - idle page tracking + * - lru_gen aging + * + * When using lru_gen, this test additionally verifies that the pages are in + * fact getting younger and older, otherwise the performance data would be + * invalid. 
+ * + * The forced lru_gen aging can race with aging that occurs naturally. + */ +static void run_test(enum vm_guest_mode mode, struct kvm_vm *vm, + struct test_params *params) +{ + int nr_vcpus = params->nr_vcpus; + bool lru_gen = params->lru_gen; + struct memcg_stats stats; + // If guest_page_size is larger than the host's page size, the + // guest (memstress) will only fault in a subset of the host's pages. + long total_pages = nr_vcpus * params->vcpu_memory_bytes / + max(memstress_args.guest_page_size, + (uint64_t)getpagesize()); + int found_gens[5]; pr_info("\n"); access_memory(vm, nr_vcpus, ACCESS_WRITE, "Populating memory"); @@ -319,11 +515,78 @@ static void run_test(enum vm_guest_mode mode, void *arg) access_memory(vm, nr_vcpus, ACCESS_READ, "Reading from populated memory"); /* Repeat on memory that has been marked as idle. */ - mark_memory_idle(vm, nr_vcpus); + if (lru_gen) { + /* Do an initial page table scan */ + lru_gen_do_aging(&stats, TEST_MEMCG_NAME); + TEST_ASSERT(sum_memcg_stats(&stats) >= total_pages, + "Not all pages tracked in lru_gen stats.\n" + "Is lru_gen enabled? Did the memcg get created properly?"); + + /* Find the generation we're currently in (probably youngest) */ + found_gens[0] = lru_gen_find_generation(&stats, total_pages); + + /* Do an aging pass now */ + lru_gen_do_aging(&stats, TEST_MEMCG_NAME); + + /* Same generation, but a newer generation has been made */ + found_gens[1] = lru_gen_find_generation(&stats, total_pages); + TEST_ASSERT(found_gens[1] == found_gens[0], + "unexpected gen change: %d vs. 
%d", + found_gens[1], found_gens[0]); + } else + mark_memory_idle(vm, nr_vcpus); + access_memory(vm, nr_vcpus, ACCESS_WRITE, "Writing to idle memory"); - mark_memory_idle(vm, nr_vcpus); + + if (lru_gen) { + /* Scan the page tables again */ + lru_gen_do_aging(&stats, TEST_MEMCG_NAME); + + /* The pages should now be young again, so in a newer generation */ + found_gens[2] = lru_gen_find_generation(&stats, total_pages); + TEST_ASSERT(found_gens[2] > found_gens[1], + "pages did not get younger"); + + /* Do another aging pass */ + lru_gen_do_aging(&stats, TEST_MEMCG_NAME); + + /* Same generation; new generation has been made */ + found_gens[3] = lru_gen_find_generation(&stats, total_pages); + TEST_ASSERT(found_gens[3] == found_gens[2], + "unexpected gen change: %d vs. %d", + found_gens[3], found_gens[2]); + } else + mark_memory_idle(vm, nr_vcpus); + access_memory(vm, nr_vcpus, ACCESS_READ, "Reading from idle memory"); + if (lru_gen) { + /* Scan the page tables again */ + lru_gen_do_aging(&stats, TEST_MEMCG_NAME); + + /* The pages should now be young again, so in a newer generation */ + found_gens[4] = lru_gen_find_generation(&stats, total_pages); + TEST_ASSERT(found_gens[4] > found_gens[3], + "pages did not get younger"); + } +} + +static void setup_vm_and_run(enum vm_guest_mode mode, void *arg) +{ + struct test_params *params = arg; + int nr_vcpus = params->nr_vcpus; + struct kvm_vm *vm; + + vm = memstress_create_vm(mode, nr_vcpus, params->vcpu_memory_bytes, 1, + params->backing_src, !overlap_memory_access); + + memstress_start_vcpu_threads(nr_vcpus, vcpu_thread_main); + + if (params->benchmark_lru_gen) + run_benchmark(mode, vm, params); + else + run_test(mode, vm, params); + memstress_join_vcpu_threads(nr_vcpus); memstress_destroy_vm(vm); } @@ -331,8 +594,8 @@ static void run_test(enum vm_guest_mode mode, void *arg) static void help(char *name) { puts(""); - printf("usage: %s [-h] [-m mode] [-b vcpu_bytes] [-v vcpus] [-o] [-s mem_type]\n", - name); + printf("usage: 
%s [-h] [-m mode] [-b vcpu_bytes] [-v vcpus] [-o]" + " [-s mem_type] [-l] [-p] [-r memcg_root]\n", name); puts(""); printf(" -h: Display this help message."); guest_modes_help(); @@ -342,6 +605,9 @@ static void help(char *name) printf(" -v: specify the number of vCPUs to run.\n"); printf(" -o: Overlap guest memory accesses instead of partitioning\n" " them into a separate region of memory for each vCPU.\n"); + printf(" -l: Use MGLRU aging instead of idle page tracking\n"); + printf(" -p: Benchmark MGLRU aging while faulting memory in (requires -l)\n"); + printf(" -r: The memory cgroup hierarchy root to use (when -l is given)\n"); backing_src_help("-s"); puts(""); exit(0); @@ -353,13 +619,15 @@ int main(int argc, char *argv[]) .backing_src = DEFAULT_VM_MEM_SRC, .vcpu_memory_bytes = DEFAULT_PER_VCPU_MEM_SIZE, .nr_vcpus = 1, + .lru_gen = false, + .benchmark_lru_gen = false, }; int page_idle_fd; int opt; guest_modes_append_default(); - while ((opt = getopt(argc, argv, "hm:b:v:os:")) != -1) { + while ((opt = getopt(argc, argv, "hm:b:v:os:lr:p")) != -1) { switch (opt) { case 'm': guest_modes_cmdline(optarg); @@ -376,6 +644,15 @@ int main(int argc, char *argv[]) case 's': params.backing_src = parse_backing_src_type(optarg); break; + case 'l': + params.lru_gen = true; + break; + case 'p': + params.benchmark_lru_gen = true; + break; + case 'r': + cgroup_root = strdup(optarg); + break; case 'h': default: help(argv[0]); @@ -383,12 +660,44 @@ int main(int argc, char *argv[]) } } - page_idle_fd = open("/sys/kernel/mm/page_idle/bitmap", O_RDWR); - __TEST_REQUIRE(page_idle_fd >= 0, - "CONFIG_IDLE_PAGE_TRACKING is not enabled"); - close(page_idle_fd); + if (!params.lru_gen) { + page_idle_fd = open("/sys/kernel/mm/page_idle/bitmap", O_RDWR); + __TEST_REQUIRE(page_idle_fd >= 0, + "CONFIG_IDLE_PAGE_TRACKING is not enabled"); + close(page_idle_fd); + } else { + int lru_gen_fd, lru_gen_debug_fd; + long mglru_features; + char mglru_feature_str[8] = {}; + + lru_gen_fd = 
open("/sys/kernel/mm/lru_gen/enabled", O_RDONLY); + __TEST_REQUIRE(lru_gen_fd >= 0, + "CONFIG_LRU_GEN is not enabled"); + TEST_ASSERT(read(lru_gen_fd, &mglru_feature_str, 7) > 0, + "couldn't read lru_gen features"); + mglru_features = strtol(mglru_feature_str, NULL, 16); + __TEST_REQUIRE(mglru_features & LRU_GEN_ENABLED, + "lru_gen is not enabled"); + __TEST_REQUIRE(mglru_features & LRU_GEN_MM_WALK, + "lru_gen does not support MM_WALK"); + __TEST_REQUIRE(mglru_features & LRU_GEN_SECONDARY_MMU_WALK, + "lru_gen does not support SECONDARY_MMU_WALK"); + + lru_gen_debug_fd = open(DEBUGFS_LRU_GEN, O_RDWR); + __TEST_REQUIRE(lru_gen_debug_fd >= 0, + "Cannot access %s", DEBUGFS_LRU_GEN); + close(lru_gen_debug_fd); + } + + TEST_ASSERT(!params.benchmark_lru_gen || params.lru_gen, + "-p specified without -l"); + + if (params.lru_gen) { + create_memcg(TEST_MEMCG_NAME); + move_to_memcg(TEST_MEMCG_NAME, getpid()); + } - for_each_guest_mode(run_test, &params); + for_each_guest_mode(setup_vm_and_run, &params); return 0; } diff --git a/tools/testing/selftests/kvm/include/lru_gen_util.h b/tools/testing/selftests/kvm/include/lru_gen_util.h new file mode 100644 index 000000000000..4eef8085a3cb --- /dev/null +++ b/tools/testing/selftests/kvm/include/lru_gen_util.h @@ -0,0 +1,55 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Tools for integrating with lru_gen, like parsing the lru_gen debugfs output. + * + * Copyright (C) 2024, Google LLC. + */ +#ifndef SELFTEST_KVM_LRU_GEN_UTIL_H +#define SELFTEST_KVM_LRU_GEN_UTIL_H + +#include +#include +#include + +#include "test_util.h" + +#define MAX_NR_GENS 16 /* MAX_NR_GENS in include/linux/mmzone.h */ +#define MAX_NR_NODES 4 /* Maximum number of nodes we support */ + +static const char *DEBUGFS_LRU_GEN = "/sys/kernel/debug/lru_gen"; + +struct generation_stats { + int gen; + long age_ms; + long nr_anon; + long nr_file; +}; + +struct node_stats { + int node; + int nr_gens; /* Number of populated gens entries. 
*/ + struct generation_stats gens[MAX_NR_GENS]; +}; + +struct memcg_stats { + unsigned long memcg_id; + int nr_nodes; /* Number of populated nodes entries. */ + struct node_stats nodes[MAX_NR_NODES]; +}; + +void print_memcg_stats(const struct memcg_stats *stats, const char *name); + +void read_memcg_stats(struct memcg_stats *stats, const char *memcg); + +void read_print_memcg_stats(struct memcg_stats *stats, const char *memcg); + +long sum_memcg_stats(const struct memcg_stats *stats); + +void lru_gen_do_aging(struct memcg_stats *stats, const char *memcg); + +void lru_gen_do_aging_quiet(struct memcg_stats *stats, const char *memcg); + +int lru_gen_find_generation(const struct memcg_stats *stats, + unsigned long total_pages); + +#endif /* SELFTEST_KVM_LRU_GEN_UTIL_H */ diff --git a/tools/testing/selftests/kvm/lib/lru_gen_util.c b/tools/testing/selftests/kvm/lib/lru_gen_util.c new file mode 100644 index 000000000000..3c02a635a9f7 --- /dev/null +++ b/tools/testing/selftests/kvm/lib/lru_gen_util.c @@ -0,0 +1,391 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2024, Google LLC. + */ + +#include + +#include "lru_gen_util.h" + +/* + * Tracks state while we parse memcg lru_gen stats. 
The file we're parsing is + * structured like this (some extra whitespace elided): + * + * memcg (id) (path) + * node (id) + * (gen_nr) (age_in_ms) (nr_anon_pages) (nr_file_pages) + */ +struct memcg_stats_parse_context { + bool consumed; /* Whether or not this line was consumed */ + /* Next parse handler to invoke */ + void (*next_handler)(struct memcg_stats *, + struct memcg_stats_parse_context *, char *); + int current_node_idx; /* Current index in nodes array */ + const char *name; /* The name of the memcg we're looking for */ +}; + +static void memcg_stats_handle_searching(struct memcg_stats *stats, + struct memcg_stats_parse_context *ctx, + char *line); +static void memcg_stats_handle_in_memcg(struct memcg_stats *stats, + struct memcg_stats_parse_context *ctx, + char *line); +static void memcg_stats_handle_in_node(struct memcg_stats *stats, + struct memcg_stats_parse_context *ctx, + char *line); + +struct split_iterator { + char *str; + char *save; +}; + +static char *split_next(struct split_iterator *it) +{ + char *ret = strtok_r(it->str, " \t\n\r", &it->save); + + it->str = NULL; + return ret; +} + +static void memcg_stats_handle_searching(struct memcg_stats *stats, + struct memcg_stats_parse_context *ctx, + char *line) +{ + struct split_iterator it = { .str = line }; + char *prefix = split_next(&it); + char *memcg_id = split_next(&it); + char *memcg_name = split_next(&it); + char *end; + + ctx->consumed = true; + + if (!prefix || strcmp("memcg", prefix)) + return; /* Not a memcg line (maybe empty), skip */ + + TEST_ASSERT(memcg_id && memcg_name, + "malformed memcg line; no memcg id or memcg_name"); + + if (strcmp(memcg_name + 1, ctx->name)) + return; /* Wrong memcg, skip */ + + /* Found it! */ + + stats->memcg_id = strtoul(memcg_id, &end, 10); + TEST_ASSERT(*end == '\0', "malformed memcg id '%s'", memcg_id); + if (!stats->memcg_id) + return; /* Removed memcg? 
*/ + + ctx->next_handler = memcg_stats_handle_in_memcg; +} + +static void memcg_stats_handle_in_memcg(struct memcg_stats *stats, + struct memcg_stats_parse_context *ctx, + char *line) +{ + struct split_iterator it = { .str = line }; + char *prefix = split_next(&it); + char *id = split_next(&it); + long found_node_id; + char *end; + + ctx->consumed = true; + ctx->current_node_idx = -1; + + if (!prefix) + return; /* Skip empty lines */ + + if (!strcmp("memcg", prefix)) { + /* Memcg done, found next one; stop. */ + ctx->next_handler = NULL; + return; + } else if (strcmp("node", prefix)) + TEST_ASSERT(false, "found malformed line after 'memcg ...'," + "token: '%s'", prefix); + + /* At this point we know we have a node line. Parse the ID. */ + + TEST_ASSERT(id, "malformed node line; no node id"); + + found_node_id = strtol(id, &end, 10); + TEST_ASSERT(*end == '\0', "malformed node id '%s'", id); + + ctx->current_node_idx = stats->nr_nodes++; + TEST_ASSERT(ctx->current_node_idx < MAX_NR_NODES, + "memcg has stats for too many nodes, max is %d", + MAX_NR_NODES); + stats->nodes[ctx->current_node_idx].node = found_node_id; + + ctx->next_handler = memcg_stats_handle_in_node; +} + +static void memcg_stats_handle_in_node(struct memcg_stats *stats, + struct memcg_stats_parse_context *ctx, + char *line) +{ + /* Have to copy since we might not consume */ + char *my_line = strdup(line); + struct split_iterator it = { .str = my_line }; + char *gen, *age, *nr_anon, *nr_file; + struct node_stats *node_stats; + struct generation_stats *gen_stats; + char *end; + + TEST_ASSERT(it.str, "failed to copy input line"); + + gen = split_next(&it); + + /* Skip empty lines */ + if (!gen) + goto out_consume; /* Skip empty lines */ + + if (!strcmp("memcg", gen) || !strcmp("node", gen)) { + /* + * Reached next memcg or node section. Don't consume, let the + * other handler deal with this. 
+		 */
+		ctx->next_handler = memcg_stats_handle_in_memcg;
+		goto out;
+	}
+
+	node_stats = &stats->nodes[ctx->current_node_idx];
+	TEST_ASSERT(node_stats->nr_gens < MAX_NR_GENS,
+		    "found too many generation lines; max is %d",
+		    MAX_NR_GENS);
+	gen_stats = &node_stats->gens[node_stats->nr_gens++];
+
+	age = split_next(&it);
+	nr_anon = split_next(&it);
+	nr_file = split_next(&it);
+
+	TEST_ASSERT(age && nr_anon && nr_file,
+		    "malformed generation line; not enough tokens");
+
+	gen_stats->gen = (int)strtol(gen, &end, 10);
+	TEST_ASSERT(*end == '\0', "malformed generation number '%s'", gen);
+
+	gen_stats->age_ms = strtol(age, &end, 10);
+	TEST_ASSERT(*end == '\0', "malformed generation age '%s'", age);
+
+	gen_stats->nr_anon = strtol(nr_anon, &end, 10);
+	TEST_ASSERT(*end == '\0', "malformed anonymous page count '%s'",
+		    nr_anon);
+
+	gen_stats->nr_file = strtol(nr_file, &end, 10);
+	TEST_ASSERT(*end == '\0', "malformed file page count '%s'", nr_file);
+
+out_consume:
+	ctx->consumed = true;
+out:
+	free(my_line);
+}
+
+/* Pretty-print lru_gen @stats. */
+void print_memcg_stats(const struct memcg_stats *stats, const char *name)
+{
+	int node, gen;
+
+	fprintf(stderr, "stats for memcg %s (id %lu):\n",
+		name, stats->memcg_id);
+	for (node = 0; node < stats->nr_nodes; ++node) {
+		fprintf(stderr, "\tnode %d\n", stats->nodes[node].node);
+		for (gen = 0; gen < stats->nodes[node].nr_gens; ++gen) {
+			const struct generation_stats *gstats =
+				&stats->nodes[node].gens[gen];
+
+			fprintf(stderr,
+				"\t\tgen %d\tage_ms %ld"
+				"\tnr_anon %ld\tnr_file %ld\n",
+				gstats->gen, gstats->age_ms, gstats->nr_anon,
+				gstats->nr_file);
+		}
+	}
+}
+
+/* Re-read lru_gen debugfs information for @memcg into @stats. */
+void read_memcg_stats(struct memcg_stats *stats, const char *memcg)
+{
+	FILE *f;
+	ssize_t read = 0;
+	char *line = NULL;
+	size_t bufsz = 0;
+	struct memcg_stats_parse_context ctx = {
+		.next_handler = memcg_stats_handle_searching,
+		.name = memcg,
+	};
+
+	memset(stats, 0, sizeof(struct memcg_stats));
+
+	f = fopen(DEBUGFS_LRU_GEN, "r");
+	TEST_ASSERT(f, "fopen(%s) failed", DEBUGFS_LRU_GEN);
+
+	while (ctx.next_handler && (read = getline(&line, &bufsz, f)) > 0) {
+		ctx.consumed = false;
+
+		do {
+			ctx.next_handler(stats, &ctx, line);
+			if (!ctx.next_handler)
+				break;
+		} while (!ctx.consumed);
+	}
+
+	if (read < 0 && !feof(f))
+		TEST_ASSERT(false, "getline(%s) failed", DEBUGFS_LRU_GEN);
+
+	TEST_ASSERT(stats->memcg_id > 0, "Couldn't find memcg: %s\n"
+		    "Did the memcg get created in the proper mount?",
+		    memcg);
+	if (line)
+		free(line);
+	TEST_ASSERT(!fclose(f), "fclose(%s) failed", DEBUGFS_LRU_GEN);
+}
+
+/*
+ * Count the pages tracked by lru_gen for this memcg in generation @target_gen.
+ *
+ * If @target_gen is negative, count the pages in all generations.
+ */
+static long sum_memcg_stats_for_gen(int target_gen,
+				    const struct memcg_stats *stats)
+{
+	int node, gen;
+	long total_nr = 0;
+
+	for (node = 0; node < stats->nr_nodes; ++node) {
+		const struct node_stats *node_stats = &stats->nodes[node];
+
+		for (gen = 0; gen < node_stats->nr_gens; ++gen) {
+			const struct generation_stats *gen_stats =
+				&node_stats->gens[gen];
+
+			if (target_gen >= 0 && gen_stats->gen != target_gen)
+				continue;
+
+			total_nr += gen_stats->nr_anon + gen_stats->nr_file;
+		}
+	}
+
+	return total_nr;
+}
+
+/* Count all pages tracked by lru_gen for this memcg. */
+long sum_memcg_stats(const struct memcg_stats *stats)
+{
+	return sum_memcg_stats_for_gen(-1, stats);
+}
+
+/* Read the memcg stats and optionally print if this is a debug build. */
+void read_print_memcg_stats(struct memcg_stats *stats, const char *memcg)
+{
+	read_memcg_stats(stats, memcg);
+#ifdef DEBUG
+	print_memcg_stats(stats, memcg);
+#endif
+}
+
+/*
+ * Whether lru_gen aging should force page table scanning.
+ *
+ * If you want to set this to false, you will need to do eviction
+ * before doing extra aging passes.
+ */
+static const bool force_scan = true;
+
+static void run_aging_impl(unsigned long memcg_id, int node_id, int max_gen)
+{
+	FILE *f = fopen(DEBUGFS_LRU_GEN, "w");
+	char *command;
+	int sz; /* signed: asprintf() returns -1 on failure */
+
+	TEST_ASSERT(f, "fopen(%s) failed", DEBUGFS_LRU_GEN);
+	sz = asprintf(&command, "+ %lu %d %d 1 %d\n",
+		      memcg_id, node_id, max_gen, force_scan);
+	TEST_ASSERT(sz > 0, "creating aging command failed");
+
+	pr_debug("Running aging command: %s", command);
+	if (fwrite(command, sizeof(char), sz, f) < (size_t)sz) {
+		TEST_ASSERT(false, "writing aging command %s to %s failed",
+			    command, DEBUGFS_LRU_GEN);
+	}
+
+	TEST_ASSERT(!fclose(f), "fclose(%s) failed", DEBUGFS_LRU_GEN);
+}
+
+static void _lru_gen_do_aging(struct memcg_stats *stats, const char *memcg,
+			      bool verbose)
+{
+	int node, gen;
+	struct timespec ts_start;
+	struct timespec ts_elapsed;
+
+	pr_debug("lru_gen: invoking aging...\n");
+
+	/* Must read memcg stats to construct the proper aging command. */
+	read_print_memcg_stats(stats, memcg);
+
+	if (verbose)
+		clock_gettime(CLOCK_MONOTONIC, &ts_start);
+
+	for (node = 0; node < stats->nr_nodes; ++node) {
+		int max_gen = 0;
+
+		for (gen = 0; gen < stats->nodes[node].nr_gens; ++gen) {
+			int this_gen = stats->nodes[node].gens[gen].gen;
+
+			max_gen = max_gen > this_gen ? max_gen : this_gen;
+		}
+
+		run_aging_impl(stats->memcg_id, stats->nodes[node].node,
+			       max_gen);
+	}
+
+	if (verbose) {
+		ts_elapsed = timespec_elapsed(ts_start);
+		pr_info("%-30s: %ld.%09lds\n", "lru_gen: Aging",
+			ts_elapsed.tv_sec, ts_elapsed.tv_nsec);
+	}
+
+	/* Re-read so callers get updated information */
+	read_print_memcg_stats(stats, memcg);
+}
+
+/* Do aging, and print how long it took. */
+void lru_gen_do_aging(struct memcg_stats *stats, const char *memcg)
+{
+	_lru_gen_do_aging(stats, memcg, true);
+}
+
+/* Do aging, don't print anything. */
+void lru_gen_do_aging_quiet(struct memcg_stats *stats, const char *memcg)
+{
+	_lru_gen_do_aging(stats, memcg, false);
+}
+
+/*
+ * Find which generation contains more than half of @total_pages, assuming that
+ * such a generation exists.
+ */
+int lru_gen_find_generation(const struct memcg_stats *stats,
+			    unsigned long total_pages)
+{
+	int node, gen, gen_idx, min_gen = INT_MAX, max_gen = -1;
+
+	for (node = 0; node < stats->nr_nodes; ++node)
+		for (gen_idx = 0; gen_idx < stats->nodes[node].nr_gens;
+		     ++gen_idx) {
+			gen = stats->nodes[node].gens[gen_idx].gen;
+			max_gen = gen > max_gen ? gen : max_gen;
+			min_gen = gen < min_gen ? gen : min_gen;
+		}
+
+	for (gen = min_gen; gen <= max_gen; ++gen)
+		/* See if this generation holds the majority of pages. */
+		if (sum_memcg_stats_for_gen(gen, stats) >
+				total_pages / 2)
+			return gen;
+
+	TEST_ASSERT(false, "No generation includes majority of %lu pages.",
+		    total_pages);
+
+	/* unreachable, but make the compiler happy */
+	return -1;
+}