From patchwork Fri Jun 11 23:56:54 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Matlack <dmatlack@google.com>
X-Patchwork-Id: 12316715
Date: Fri, 11 Jun 2021 23:56:54 +0000
In-Reply-To: <20210611235701.3941724-1-dmatlack@google.com>
Message-Id: <20210611235701.3941724-2-dmatlack@google.com>
Mime-Version: 1.0
References: <20210611235701.3941724-1-dmatlack@google.com>
X-Mailer: git-send-email 2.32.0.272.g935e593368-goog
Subject: [PATCH 1/8] KVM: x86/mmu: Refactor is_tdp_mmu_root()
From: David Matlack <dmatlack@google.com>
To: kvm@vger.kernel.org
Cc: Ben Gardon <bgardon@google.com>, Joerg Roedel <joro@8bytes.org>,
        Jim Mattson <jmattson@google.com>,
        Wanpeng Li <wanpengli@tencent.com>,
        Vitaly Kuznetsov <vkuznets@redhat.com>,
        Sean Christopherson <seanjc@google.com>,
        Paolo Bonzini <pbonzini@redhat.com>,
        Junaid Shahid <junaids@google.com>,
        Andrew Jones <drjones@redhat.com>,
        David Matlack <dmatlack@google.com>
Precedence: bulk
List-ID: <kvm.vger.kernel.org>
X-Mailing-List: kvm@vger.kernel.org

Refactor is_tdp_mmu_root() into is_vcpu_using_tdp_mmu(), which takes only
a vcpu. Every caller passed vcpu->kvm and vcpu->arch.mmu->root_hpa, so
this removes duplicated code at call sites and makes the code more
readable.
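
As a rough illustration of the call-site change, here is a minimal
userspace sketch. The struct layouts, to_shadow_page() and the
tdp_mmu_page flag are simplified stand-ins (assumptions), not the kernel
definitions, and the old helper is kept only for comparison:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel types. */
typedef uint64_t hpa_t;
#define INVALID_PAGE ((hpa_t)-1)

struct kvm_mmu_page { bool tdp_mmu_page; };
struct kvm { bool tdp_mmu_enabled; };
struct kvm_mmu { hpa_t root_hpa; };
struct kvm_vcpu {
	struct kvm *kvm;
	struct { struct kvm_mmu *mmu; } arch;
};

/* Mock lookup: the "hpa" is simply the address of the shadow page. */
static struct kvm_mmu_page *to_shadow_page(hpa_t hpa)
{
	return (struct kvm_mmu_page *)(uintptr_t)hpa;
}

/* Old shape: every caller spells out both arguments. */
static bool is_tdp_mmu_root(struct kvm *kvm, hpa_t hpa)
{
	struct kvm_mmu_page *sp;

	if (!kvm->tdp_mmu_enabled)
		return false;
	if (hpa == INVALID_PAGE)
		return false;

	sp = to_shadow_page(hpa);
	return sp && sp->tdp_mmu_page;	/* simplified final check */
}

/* New shape: the helper derives both values from the vcpu. */
static bool is_vcpu_using_tdp_mmu(struct kvm_vcpu *vcpu)
{
	return is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa);
}

/* Builds a mock vcpu and checks that both forms give the same answer. */
static bool demo_vcpu_uses_tdp_mmu(bool tdp_enabled, bool root_is_tdp)
{
	struct kvm_mmu_page sp = { .tdp_mmu_page = root_is_tdp };
	struct kvm kvm = { .tdp_mmu_enabled = tdp_enabled };
	struct kvm_mmu mmu = { .root_hpa = (hpa_t)(uintptr_t)&sp };
	struct kvm_vcpu vcpu = { .kvm = &kvm, .arch = { .mmu = &mmu } };
	bool old_form = is_tdp_mmu_root(vcpu.kvm, vcpu.arch.mmu->root_hpa);
	bool new_form = is_vcpu_using_tdp_mmu(&vcpu);

	assert(old_form == new_form);
	return new_form;
}
```

In the patch itself the old helper is renamed rather than wrapped; the
wrapper above only makes the equivalence easy to see.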

Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c     | 10 +++++-----
 arch/x86/kvm/mmu/tdp_mmu.c |  2 +-
 arch/x86/kvm/mmu/tdp_mmu.h |  8 +++++---
 3 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 0144c40d09c7..eccd889d20a5 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3545,7 +3545,7 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 		return reserved;
 	}
 
-	if (is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa))
+	if (is_vcpu_using_tdp_mmu(vcpu))
 		leaf = kvm_tdp_mmu_get_walk(vcpu, addr, sptes, &root);
 	else
 		leaf = get_walk(vcpu, addr, sptes, &root);
@@ -3729,7 +3729,7 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
 	if (page_fault_handle_page_track(vcpu, error_code, gfn))
 		return RET_PF_EMULATE;
 
-	if (!is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa)) {
+	if (!is_vcpu_using_tdp_mmu(vcpu)) {
 		r = fast_page_fault(vcpu, gpa, error_code);
 		if (r != RET_PF_INVALID)
 			return r;
@@ -3751,7 +3751,7 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
 
 	r = RET_PF_RETRY;
 
-	if (is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa))
+	if (is_vcpu_using_tdp_mmu(vcpu))
 		read_lock(&vcpu->kvm->mmu_lock);
 	else
 		write_lock(&vcpu->kvm->mmu_lock);
@@ -3762,7 +3762,7 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
 	if (r)
 		goto out_unlock;
 
-	if (is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa))
+	if (is_vcpu_using_tdp_mmu(vcpu))
 		r = kvm_tdp_mmu_map(vcpu, gpa, error_code, map_writable, max_level,
 				    pfn, prefault);
 	else
@@ -3770,7 +3770,7 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
 				 prefault, is_tdp);
 
 out_unlock:
-	if (is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa))
+	if (is_vcpu_using_tdp_mmu(vcpu))
 		read_unlock(&vcpu->kvm->mmu_lock);
 	else
 		write_unlock(&vcpu->kvm->mmu_lock);
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 237317b1eddd..f4cc79dabeae 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -979,7 +979,7 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
 
 	if (WARN_ON(!VALID_PAGE(vcpu->arch.mmu->root_hpa)))
 		return RET_PF_RETRY;
-	if (WARN_ON(!is_tdp_mmu_root(vcpu->kvm, vcpu->arch.mmu->root_hpa)))
+	if (WARN_ON(!is_vcpu_using_tdp_mmu(vcpu)))
 		return RET_PF_RETRY;
 
 	level = kvm_mmu_hugepage_adjust(vcpu, gfn, max_level, &pfn,
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index 5fdf63090451..c8cf12809fcf 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -91,16 +91,18 @@ static inline bool is_tdp_mmu_enabled(struct kvm *kvm) { return false; }
 static inline bool is_tdp_mmu_page(struct kvm_mmu_page *sp) { return false; }
 #endif
 
-static inline bool is_tdp_mmu_root(struct kvm *kvm, hpa_t hpa)
+static inline bool is_vcpu_using_tdp_mmu(struct kvm_vcpu *vcpu)
 {
+	struct kvm *kvm = vcpu->kvm;
 	struct kvm_mmu_page *sp;
+	hpa_t root_hpa = vcpu->arch.mmu->root_hpa;
 
 	if (!is_tdp_mmu_enabled(kvm))
 		return false;
-	if (WARN_ON(!VALID_PAGE(hpa)))
+	if (WARN_ON(!VALID_PAGE(root_hpa)))
 		return false;
 
-	sp = to_shadow_page(hpa);
+	sp = to_shadow_page(root_hpa);
 	if (WARN_ON(!sp))
 		return false;
 

From patchwork Fri Jun 11 23:56:55 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Matlack <dmatlack@google.com>
X-Patchwork-Id: 12316719
Date: Fri, 11 Jun 2021 23:56:55 +0000
In-Reply-To: <20210611235701.3941724-1-dmatlack@google.com>
Message-Id: <20210611235701.3941724-3-dmatlack@google.com>
Mime-Version: 1.0
References: <20210611235701.3941724-1-dmatlack@google.com>
X-Mailer: git-send-email 2.32.0.272.g935e593368-goog
Subject: [PATCH 2/8] KVM: x86/mmu: Rename cr2_or_gpa to gpa in fast_page_fault
From: David Matlack <dmatlack@google.com>
To: kvm@vger.kernel.org
Cc: Ben Gardon <bgardon@google.com>, Joerg Roedel <joro@8bytes.org>,
        Jim Mattson <jmattson@google.com>,
        Wanpeng Li <wanpengli@tencent.com>,
        Vitaly Kuznetsov <vkuznets@redhat.com>,
        Sean Christopherson <seanjc@google.com>,
        Paolo Bonzini <pbonzini@redhat.com>,
        Junaid Shahid <junaids@google.com>,
        Andrew Jones <drjones@redhat.com>,
        David Matlack <dmatlack@google.com>
Precedence: bulk
List-ID: <kvm.vger.kernel.org>
X-Mailing-List: kvm@vger.kernel.org

fast_page_fault() is only called from direct_page_fault(), where the
address is known to be a gpa. Rename the cr2_or_gpa parameter to gpa to
reflect this.

Fixes: 736c291c9f36 ("KVM: x86: Use gpa_t for cr2/gpa to fix TDP support on 32-bit KVM")
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index eccd889d20a5..1d0fe1445e04 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3007,8 +3007,7 @@ static bool is_access_allowed(u32 fault_err_code, u64 spte)
 /*
  * Returns one of RET_PF_INVALID, RET_PF_FIXED or RET_PF_SPURIOUS.
  */
-static int fast_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
-			   u32 error_code)
+static int fast_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code)
 {
 	struct kvm_shadow_walk_iterator iterator;
 	struct kvm_mmu_page *sp;
@@ -3024,7 +3023,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 	do {
 		u64 new_spte;
 
-		for_each_shadow_entry_lockless(vcpu, cr2_or_gpa, iterator, spte)
+		for_each_shadow_entry_lockless(vcpu, gpa, iterator, spte)
 			if (!is_shadow_present_pte(spte))
 				break;
 
@@ -3103,8 +3102,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 
 	} while (true);
 
-	trace_fast_page_fault(vcpu, cr2_or_gpa, error_code, iterator.sptep,
-			      spte, ret);
+	trace_fast_page_fault(vcpu, gpa, error_code, iterator.sptep, spte, ret);
 	walk_shadow_page_lockless_end(vcpu);
 
 	return ret;

From patchwork Fri Jun 11 23:56:56 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Matlack <dmatlack@google.com>
X-Patchwork-Id: 12316717
Date: Fri, 11 Jun 2021 23:56:56 +0000
In-Reply-To: <20210611235701.3941724-1-dmatlack@google.com>
Message-Id: <20210611235701.3941724-4-dmatlack@google.com>
Mime-Version: 1.0
References: <20210611235701.3941724-1-dmatlack@google.com>
X-Mailer: git-send-email 2.32.0.272.g935e593368-goog
Subject: [PATCH 3/8] KVM: x86/mmu: Fix use of enums in trace_fast_page_fault
From: David Matlack <dmatlack@google.com>
To: kvm@vger.kernel.org
Cc: Ben Gardon <bgardon@google.com>, Joerg Roedel <joro@8bytes.org>,
        Jim Mattson <jmattson@google.com>,
        Wanpeng Li <wanpengli@tencent.com>,
        Vitaly Kuznetsov <vkuznets@redhat.com>,
        Sean Christopherson <seanjc@google.com>,
        Paolo Bonzini <pbonzini@redhat.com>,
        Junaid Shahid <junaids@google.com>,
        Andrew Jones <drjones@redhat.com>,
        David Matlack <dmatlack@google.com>
Precedence: bulk
List-ID: <kvm.vger.kernel.org>
X-Mailing-List: kvm@vger.kernel.org

Enum values have to be exported to userspace since the formatting is not
done in the kernel. Without this, perf maps RET_PF_FIXED and
RET_PF_SPURIOUS to 0, which results in incorrect output:

  $ perf record -a -e kvmmmu:fast_page_fault --filter "ret==3" -- ./access_tracking_perf_test
  $ perf script | head -1
   [...] new 610006048d25877 spurious 0 fixed 0  <------ should be 1

Fix this by exporting the enum values to userspace with TRACE_DEFINE_ENUM.
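
The failure mode can be modelled with a toy userspace resolver. This is
only a sketch of the behaviour described above, not how perf is
implemented: struct enum_map, resolve_symbol() and fixed_column() are
hypothetical names, RET_PF_FIXED == 3 is inferred from the "ret==3"
filter, and RET_PF_SPURIOUS == 4 is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name -> value table, as a parser might build it. */
struct enum_map {
	const char *name;
	long value;
};

/* What the parser knows once the enum values have been exported. */
static const struct enum_map exported[] = {
	{ "RET_PF_FIXED", 3 },		/* inferred from the ret==3 filter */
	{ "RET_PF_SPURIOUS", 4 },	/* assumed value */
};

/*
 * Resolve a symbol that appears in a print format. Unknown symbols
 * resolve to 0, mirroring the behaviour reported in this patch.
 */
static long resolve_symbol(const struct enum_map *map, size_t n,
			   const char *sym)
{
	assert(sym != NULL);

	for (size_t i = 0; i < n; i++)
		if (strcmp(map[i].name, sym) == 0)
			return map[i].value;
	return 0;
}

/* Evaluate the "fixed" column, i.e. ret == RET_PF_FIXED. */
static int fixed_column(const struct enum_map *map, size_t n, long ret)
{
	return ret == resolve_symbol(map, n, "RET_PF_FIXED");
}
```

With an empty table, fixed_column(NULL, 0, 3) compares 3 against 0 and
reports 0; with the exported table it reports 1.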

Fixes: c4371c2a682e ("KVM: x86/mmu: Return unique RET_PF_* values if the fault was fixed")
Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/x86/kvm/mmu/mmutrace.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/mmu/mmutrace.h b/arch/x86/kvm/mmu/mmutrace.h
index e798489b56b5..669b1405c60d 100644
--- a/arch/x86/kvm/mmu/mmutrace.h
+++ b/arch/x86/kvm/mmu/mmutrace.h
@@ -244,6 +244,9 @@ TRACE_EVENT(
 		  __entry->access)
 );
 
+TRACE_DEFINE_ENUM(RET_PF_FIXED);
+TRACE_DEFINE_ENUM(RET_PF_SPURIOUS);
+
 TRACE_EVENT(
 	fast_page_fault,
 	TP_PROTO(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u32 error_code,

From patchwork Fri Jun 11 23:56:57 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Matlack <dmatlack@google.com>
X-Patchwork-Id: 12316731
Date: Fri, 11 Jun 2021 23:56:57 +0000
In-Reply-To: <20210611235701.3941724-1-dmatlack@google.com>
Message-Id: <20210611235701.3941724-5-dmatlack@google.com>
Mime-Version: 1.0
References: <20210611235701.3941724-1-dmatlack@google.com>
X-Mailer: git-send-email 2.32.0.272.g935e593368-goog
Subject: [PATCH 4/8] KVM: x86/mmu: Common API for lockless shadow page walks
From: David Matlack <dmatlack@google.com>
To: kvm@vger.kernel.org
Cc: Ben Gardon <bgardon@google.com>, Joerg Roedel <joro@8bytes.org>,
        Jim Mattson <jmattson@google.com>,
        Wanpeng Li <wanpengli@tencent.com>,
        Vitaly Kuznetsov <vkuznets@redhat.com>,
        Sean Christopherson <seanjc@google.com>,
        Paolo Bonzini <pbonzini@redhat.com>,
        Junaid Shahid <junaids@google.com>,
        Andrew Jones <drjones@redhat.com>,
        David Matlack <dmatlack@google.com>
Precedence: bulk
List-ID: <kvm.vger.kernel.org>
X-Mailing-List: kvm@vger.kernel.org

Introduce a common API for walking the shadow page tables locklessly
that abstracts away whether the TDP MMU is enabled or not. This will be
used in a follow-up patch to support the TDP MMU in fast_page_fault.

The API can be used as follows:

  struct shadow_page_walk walk;

  walk_shadow_page_lockless_begin(vcpu);
  if (!walk_shadow_page_lockless(vcpu, addr, &walk))
    goto out;

  ... use `walk` ...

out:
  walk_shadow_page_lockless_end(vcpu);

Note: Separating walk_shadow_page_lockless_begin() from
walk_shadow_page_lockless() seems superfluous at first glance but is
needed to support fast_page_fault() since it performs multiple walks
under the same begin/end block.
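
To make the note above concrete, here is a self-contained userspace
sketch of two walks under a single begin/end block. Everything in it is
a mock: struct mock_vcpu, ROOT_MAX_LEVEL, SPTE_PRESENT and the walk_*()
helpers are hypothetical stand-ins, and the walk ignores huge pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ROOT_MAX_LEVEL	5	/* stand-in for PT64_ROOT_MAX_LEVEL */
#define SPTE_PRESENT	1ull	/* hypothetical present bit */

struct shadow_page_walk {
	int root_level;
	int last_level;
	uint64_t sptes[ROOT_MAX_LEVEL + 1];
};

/* Mock vcpu: one spte per level models the path for one address. */
struct mock_vcpu {
	int root_level;
	uint64_t path[ROOT_MAX_LEVEL + 1];
	int lockless_depth;	/* tracks begin/end pairing */
};

static void walk_begin(struct mock_vcpu *v)
{
	assert(v->lockless_depth == 0);
	v->lockless_depth++;
}

static void walk_end(struct mock_vcpu *v)
{
	assert(v->lockless_depth == 1);
	v->lockless_depth--;
}

/* Walk from the root until a non-present spte or level 1 is reached. */
static bool walk_lockless(struct mock_vcpu *v, struct shadow_page_walk *walk)
{
	bool walk_ok = false;

	assert(v->lockless_depth == 1);	/* must be inside begin/end */
	walk->root_level = v->root_level;

	for (int level = v->root_level; level >= 1; level--) {
		walk_ok = true;
		walk->last_level = level;
		walk->sptes[level] = v->path[level];
		if (!(v->path[level] & SPTE_PRESENT))
			break;
	}
	return walk_ok;
}

/* Two walks under one begin/end; the path changes between them. */
static bool demo_two_walks(int *first_last, int *second_last)
{
	struct mock_vcpu v = { .root_level = 4 };
	struct shadow_page_walk walk;
	bool ok = false;

	v.path[4] = SPTE_PRESENT;
	v.path[3] = SPTE_PRESENT;	/* levels 2 and 1 start non-present */

	walk_begin(&v);
	if (!walk_lockless(&v, &walk))
		goto out;
	*first_last = walk.last_level;

	/* Simulate the lower levels being populated, then walk again. */
	v.path[2] = SPTE_PRESENT;
	v.path[1] = SPTE_PRESENT;
	if (!walk_lockless(&v, &walk))
		goto out;
	*second_last = walk.last_level;
	ok = true;
out:
	walk_end(&v);
	return ok;
}
```

In this mock the first walk stops at level 2 and the second reaches
level 1, both inside one begin/end pair.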

No functional change intended.

Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c          | 96 ++++++++++++++++++++-------------
 arch/x86/kvm/mmu/mmu_internal.h | 15 ++++++
 arch/x86/kvm/mmu/tdp_mmu.c      | 34 ++++++------
 arch/x86/kvm/mmu/tdp_mmu.h      |  6 ++-
 4 files changed, 96 insertions(+), 55 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 1d0fe1445e04..8140c262f4d3 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -623,6 +623,11 @@ static bool mmu_spte_age(u64 *sptep)
 
 static void walk_shadow_page_lockless_begin(struct kvm_vcpu *vcpu)
 {
+	if (is_vcpu_using_tdp_mmu(vcpu)) {
+		kvm_tdp_mmu_walk_lockless_begin();
+		return;
+	}
+
 	/*
 	 * Prevent page table teardown by making any free-er wait during
 	 * kvm_flush_remote_tlbs() IPI to all active vcpus.
@@ -638,6 +643,11 @@ static void walk_shadow_page_lockless_begin(struct kvm_vcpu *vcpu)
 
 static void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu)
 {
+	if (is_vcpu_using_tdp_mmu(vcpu)) {
+		kvm_tdp_mmu_walk_lockless_end();
+		return;
+	}
+
 	/*
 	 * Make sure the write to vcpu->mode is not reordered in front of
 	 * reads to sptes.  If it does, kvm_mmu_commit_zap_page() can see us
@@ -3501,59 +3511,61 @@ static bool mmio_info_in_cache(struct kvm_vcpu *vcpu, u64 addr, bool direct)
 }
 
 /*
- * Return the level of the lowest level SPTE added to sptes.
- * That SPTE may be non-present.
+ * Walks the shadow page table for the given address until a leaf or non-present
+ * spte is encountered.
+ *
+ * Returns false if no walk could be performed, in which case `walk` does not
+ * contain any valid data.
+ *
+ * Must be called between walk_shadow_page_lockless_{begin,end}.
  */
-static int get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, int *root_level)
+static bool walk_shadow_page_lockless(struct kvm_vcpu *vcpu, u64 addr,
+				      struct shadow_page_walk *walk)
 {
-	struct kvm_shadow_walk_iterator iterator;
-	int leaf = -1;
+	struct kvm_shadow_walk_iterator it;
+	bool walk_ok = false;
 	u64 spte;
 
-	walk_shadow_page_lockless_begin(vcpu);
+	if (is_vcpu_using_tdp_mmu(vcpu))
+		return kvm_tdp_mmu_walk_lockless(vcpu, addr, walk);
 
-	for (shadow_walk_init(&iterator, vcpu, addr),
-	     *root_level = iterator.level;
-	     shadow_walk_okay(&iterator);
-	     __shadow_walk_next(&iterator, spte)) {
-		leaf = iterator.level;
-		spte = mmu_spte_get_lockless(iterator.sptep);
+	shadow_walk_init(&it, vcpu, addr);
+	walk->root_level = it.level;
 
-		sptes[leaf] = spte;
+	for (; shadow_walk_okay(&it); __shadow_walk_next(&it, spte)) {
+		walk_ok = true;
+
+		spte = mmu_spte_get_lockless(it.sptep);
+		walk->last_level = it.level;
+		walk->sptes[it.level] = spte;
 
 		if (!is_shadow_present_pte(spte))
 			break;
 	}
 
-	walk_shadow_page_lockless_end(vcpu);
-
-	return leaf;
+	return walk_ok;
 }
 
 /* return true if reserved bit(s) are detected on a valid, non-MMIO SPTE. */
 static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 {
-	u64 sptes[PT64_ROOT_MAX_LEVEL + 1];
+	struct shadow_page_walk walk;
 	struct rsvd_bits_validate *rsvd_check;
-	int root, leaf, level;
+	int last_level, level;
 	bool reserved = false;
 
-	if (!VALID_PAGE(vcpu->arch.mmu->root_hpa)) {
-		*sptep = 0ull;
+	*sptep = 0ull;
+
+	if (!VALID_PAGE(vcpu->arch.mmu->root_hpa))
 		return reserved;
-	}
 
-	if (is_vcpu_using_tdp_mmu(vcpu))
-		leaf = kvm_tdp_mmu_get_walk(vcpu, addr, sptes, &root);
-	else
-		leaf = get_walk(vcpu, addr, sptes, &root);
+	walk_shadow_page_lockless_begin(vcpu);
 
-	if (unlikely(leaf < 0)) {
-		*sptep = 0ull;
-		return reserved;
-	}
+	if (!walk_shadow_page_lockless(vcpu, addr, &walk))
+		goto out;
 
-	*sptep = sptes[leaf];
+	last_level = walk.last_level;
+	*sptep = walk.sptes[last_level];
 
 	/*
 	 * Skip reserved bits checks on the terminal leaf if it's not a valid
@@ -3561,29 +3573,37 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 	 * design, always have reserved bits set.  The purpose of the checks is
 	 * to detect reserved bits on non-MMIO SPTEs. i.e. buggy SPTEs.
 	 */
-	if (!is_shadow_present_pte(sptes[leaf]))
-		leaf++;
+	if (!is_shadow_present_pte(walk.sptes[last_level]))
+		last_level++;
 
 	rsvd_check = &vcpu->arch.mmu->shadow_zero_check;
 
-	for (level = root; level >= leaf; level--)
+	for (level = walk.root_level; level >= last_level; level--) {
+		u64 spte = walk.sptes[level];
+
 		/*
 		 * Use a bitwise-OR instead of a logical-OR to aggregate the
 		 * reserved bit and EPT's invalid memtype/XWR checks to avoid
 		 * adding a Jcc in the loop.
 		 */
-		reserved |= __is_bad_mt_xwr(rsvd_check, sptes[level]) |
-			    __is_rsvd_bits_set(rsvd_check, sptes[level], level);
+		reserved |= __is_bad_mt_xwr(rsvd_check, spte) |
+			    __is_rsvd_bits_set(rsvd_check, spte, level);
+	}
 
 	if (reserved) {
 		pr_err("%s: reserved bits set on MMU-present spte, addr 0x%llx, hierarchy:\n",
 		       __func__, addr);
-		for (level = root; level >= leaf; level--)
+		for (level = walk.root_level; level >= last_level; level--) {
+			u64 spte = walk.sptes[level];
+
 			pr_err("------ spte = 0x%llx level = %d, rsvd bits = 0x%llx",
-			       sptes[level], level,
-			       rsvd_check->rsvd_bits_mask[(sptes[level] >> 7) & 1][level-1]);
+			       spte, level,
+			       rsvd_check->rsvd_bits_mask[(spte >> 7) & 1][level-1]);
+		}
 	}
 
+out:
+	walk_shadow_page_lockless_end(vcpu);
 	return reserved;
 }
 
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index d64ccb417c60..26da6ca30fbf 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -165,4 +165,19 @@ void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc);
 void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp);
 void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp);
 
+struct shadow_page_walk {
+	/* The level of the root spte in the walk. */
+	int root_level;
+
+	/*
+	 * The level of the last spte in the walk. The last spte is either the
+	 * leaf of the walk (which may or may not be present) or the first
+	 * non-present spte encountered during the walk.
+	 */
+	int last_level;
+
+	/* The spte value at each level. */
+	u64 sptes[PT64_ROOT_MAX_LEVEL + 1];
+};
+
 #endif /* __KVM_X86_MMU_INTERNAL_H */
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index f4cc79dabeae..36f4844a5f95 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1504,28 +1504,32 @@ bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm,
 	return spte_set;
 }
 
-/*
- * Return the level of the lowest level SPTE added to sptes.
- * That SPTE may be non-present.
- */
-int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
-			 int *root_level)
+void kvm_tdp_mmu_walk_lockless_begin(void)
+{
+	rcu_read_lock();
+}
+
+void kvm_tdp_mmu_walk_lockless_end(void)
+{
+	rcu_read_unlock();
+}
+
+bool kvm_tdp_mmu_walk_lockless(struct kvm_vcpu *vcpu, u64 addr,
+			       struct shadow_page_walk *walk)
 {
 	struct tdp_iter iter;
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
 	gfn_t gfn = addr >> PAGE_SHIFT;
-	int leaf = -1;
+	bool walk_ok = false;
 
-	*root_level = vcpu->arch.mmu->shadow_root_level;
-
-	rcu_read_lock();
+	walk->root_level = vcpu->arch.mmu->shadow_root_level;
 
 	tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) {
-		leaf = iter.level;
-		sptes[leaf] = iter.old_spte;
-	}
+		walk_ok = true;
 
-	rcu_read_unlock();
+		walk->last_level = iter.level;
+		walk->sptes[iter.level] = iter.old_spte;
+	}
 
-	return leaf;
+	return walk_ok;
 }
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index c8cf12809fcf..772d11bbb92a 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -76,8 +76,10 @@ bool kvm_tdp_mmu_zap_collapsible_sptes(struct kvm *kvm,
 bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm,
 				   struct kvm_memory_slot *slot, gfn_t gfn);
 
-int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
-			 int *root_level);
+void kvm_tdp_mmu_walk_lockless_begin(void);
+void kvm_tdp_mmu_walk_lockless_end(void);
+bool kvm_tdp_mmu_walk_lockless(struct kvm_vcpu *vcpu, u64 addr,
+			       struct shadow_page_walk *walk);
 
 #ifdef CONFIG_X86_64
 void kvm_mmu_init_tdp_mmu(struct kvm *kvm);

From patchwork Fri Jun 11 23:56:58 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Matlack <dmatlack@google.com>
X-Patchwork-Id: 12316721
Return-Path: <kvm-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED,
	DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,
	INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,
	USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no
	version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 7FFDEC48BE5
	for <kvm@archiver.kernel.org>; Fri, 11 Jun 2021 23:57:41 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 6A147613C3
	for <kvm@archiver.kernel.org>; Fri, 11 Jun 2021 23:57:41 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S230331AbhFKX7i (ORCPT <rfc822;kvm@archiver.kernel.org>);
        Fri, 11 Jun 2021 19:59:38 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38428 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S230211AbhFKX7g (ORCPT <rfc822;kvm@vger.kernel.org>);
        Fri, 11 Jun 2021 19:59:36 -0400
Received: from mail-qt1-x84a.google.com (mail-qt1-x84a.google.com
 [IPv6:2607:f8b0:4864:20::84a])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 25B2BC0613A3
        for <kvm@vger.kernel.org>; Fri, 11 Jun 2021 16:57:28 -0700 (PDT)
Received: by mail-qt1-x84a.google.com with SMTP id
 b9-20020ac86bc90000b029024a9c2c55b2so883105qtt.15
        for <kvm@vger.kernel.org>; Fri, 11 Jun 2021 16:57:28 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20161025;
        h=date:in-reply-to:message-id:mime-version:references:subject:from:to
         :cc;
        bh=naUUfLFw1Sdu4x0S/DKaQk2Ombp4kYpoKKh+TUQmQU8=;
        b=jb5KUhubSkySRC3x+qAf54PBwP0MYH1Pfsz9zfRfXiEcWcWyTvogHSqK9eO9ONq1nM
         +fDPvWg+vXAEDO8TSl3p4c6UaSLAGCTZ7dU7N0hcsrKOJxgL44HAAQj52u+XRNFnkGAp
         k/WdL7uwCeFL55esTEdguc4GigZoBOxYfGa8gOAUiXoxzTWoL4VZgSdGvBkijFknKme1
         0JT8mVurglPTUjiQMcOaQ2zMD4kT0vJuGhWcaUlXQWCPaW9DTDy04KHwLYtckcKt+uYk
         O7kqLFgZVTM9da7OQUzEDwbVYe0xWBkHTpqEnvxqx4mktEge41pLS/Q47oVSJWpvBCpq
         XRDQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:date:in-reply-to:message-id:mime-version
         :references:subject:from:to:cc;
        bh=naUUfLFw1Sdu4x0S/DKaQk2Ombp4kYpoKKh+TUQmQU8=;
        b=rer17ouJalVOV06bIVoHjWdHX1iqwVunMWpZDnypG+cFFGh8LATHq52BO7POHw5NzB
         SBCR65Igo6F3ZDhC233+EkobFNsJ9Dch2S80bfbz8aDa5ekpy4K6CO9KBkujyAo2+5uo
         Oq7jyoJDZuawEPp+hEGg5y9f2G/x2G26lcc43sLx/XFxDtQxQr/JCpM/7VmxkrDsLkVT
         q1qonqdsKHJtyMNcFFl51HpHUAvPvGLpCchcMZ3HzyY9NKGD2+LXD/ppMZEMgBTHJ/g0
         tBBBoSFh8CErC5edjt7ZxV+/mLTGn88r/Q1GNc9+L4wyeA3hceG7HGrJTzUm3C1wg3n7
         YpLw==
X-Gm-Message-State: AOAM531VK1quEVRKNAWTsRoXvfkvcKiJJZEtoswWDVxcvww/P9/tNh8A
        q9D0VFfJDJTKuNwDO4ujKuLZHeDBUodYwIlW2XlhRPgDOMFooCItzFvPORHGivbMq/gzOaviOs3
        uhV4b7ZMFyBrEo5DvkWNZAKurPKMFqajRZJ1YSepdpPoJ5TNXz4Vn6OT1sNQjrfg=
X-Google-Smtp-Source: 
 ABdhPJyKaiFtMzGhA+PpHuXaFUqGRyK8EKQ4bagNa44YZtpO2l2dHtlPuYw0cZ0H8k2Yvf/SiH59uJwwb1mLEg==
X-Received: from dmatlack-heavy.c.googlers.com
 ([fda3:e722:ac3:10:7f:e700:c0a8:19cd])
 (user=dmatlack job=sendgmr) by 2002:ad4:4bc7:: with SMTP id
 l7mr7421689qvw.7.1623455847214; Fri, 11 Jun 2021 16:57:27 -0700 (PDT)
Date: Fri, 11 Jun 2021 23:56:58 +0000
In-Reply-To: <20210611235701.3941724-1-dmatlack@google.com>
Message-Id: <20210611235701.3941724-6-dmatlack@google.com>
Mime-Version: 1.0
References: <20210611235701.3941724-1-dmatlack@google.com>
X-Mailer: git-send-email 2.32.0.272.g935e593368-goog
Subject: [PATCH 5/8] KVM: x86/mmu: Also record spteps in shadow_page_walk
From: David Matlack <dmatlack@google.com>
To: kvm@vger.kernel.org
Cc: Ben Gardon <bgardon@google.com>, Joerg Roedel <joro@8bytes.org>,
        Jim Mattson <jmattson@google.com>,
        Wanpeng Li <wanpengli@tencent.com>,
        Vitaly Kuznetsov <vkuznets@redhat.com>,
        Sean Christopherson <seanjc@google.com>,
        Paolo Bonzini <pbonzini@redhat.com>,
        Junaid Shahid <junaids@google.com>,
        Andrew Jones <drjones@redhat.com>,
        David Matlack <dmatlack@google.com>
Precedence: bulk
List-ID: <kvm.vger.kernel.org>
X-Mailing-List: kvm@vger.kernel.org

In order to use walk_shadow_page_lockless() in fast_page_fault() we need
to also record the spteps.

No functional change intended.

Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c          | 1 +
 arch/x86/kvm/mmu/mmu_internal.h | 3 +++
 arch/x86/kvm/mmu/tdp_mmu.c      | 1 +
 3 files changed, 5 insertions(+)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 8140c262f4d3..765f5b01768d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3538,6 +3538,7 @@ static bool walk_shadow_page_lockless(struct kvm_vcpu *vcpu, u64 addr,
 		spte = mmu_spte_get_lockless(it.sptep);
 		walk->last_level = it.level;
 		walk->sptes[it.level] = spte;
+		walk->spteps[it.level] = it.sptep;
 
 		if (!is_shadow_present_pte(spte))
 			break;
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 26da6ca30fbf..0fefbd5d6c95 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -178,6 +178,9 @@ struct shadow_page_walk {
 
 	/* The spte value at each level. */
 	u64 sptes[PT64_ROOT_MAX_LEVEL + 1];
+
+	/* The spte pointers at each level. */
+	u64 *spteps[PT64_ROOT_MAX_LEVEL + 1];
 };
 
 #endif /* __KVM_X86_MMU_INTERNAL_H */
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 36f4844a5f95..7279d17817a1 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1529,6 +1529,7 @@ bool kvm_tdp_mmu_walk_lockless(struct kvm_vcpu *vcpu, u64 addr,
 
 		walk->last_level = iter.level;
 		walk->sptes[iter.level] = iter.old_spte;
+		walk->spteps[iter.level] = iter.sptep;
 	}
 
 	return walk_ok;

From patchwork Fri Jun 11 23:56:59 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Matlack <dmatlack@google.com>
X-Patchwork-Id: 12316727
Return-Path: <kvm-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED,
	DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,
	INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,
	USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no
	version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id A4BD7C48BE6
	for <kvm@archiver.kernel.org>; Fri, 11 Jun 2021 23:58:38 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 8A22D613C6
	for <kvm@archiver.kernel.org>; Fri, 11 Jun 2021 23:58:38 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S230380AbhFLAAg (ORCPT <rfc822;kvm@archiver.kernel.org>);
        Fri, 11 Jun 2021 20:00:36 -0400
Received: from mail-pf1-f201.google.com ([209.85.210.201]:39438 "EHLO
        mail-pf1-f201.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S230348AbhFLAAe (ORCPT <rfc822;kvm@vger.kernel.org>);
        Fri, 11 Jun 2021 20:00:34 -0400
Received: by mail-pf1-f201.google.com with SMTP id
 j206-20020a6280d70000b02902e9e02e1654so4110251pfd.6
        for <kvm@vger.kernel.org>; Fri, 11 Jun 2021 16:58:29 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20161025;
        h=date:in-reply-to:message-id:mime-version:references:subject:from:to
         :cc;
        bh=rs43XjvFocbBYfeRQNGhmBNnR9F38flGi3d9HcssKs8=;
        b=BL/bSA0nQYIes/M0Zr2z5Kk1DbXXupF67tUzFmDLzGvuBo7GtbKD4AinbYfrQyPLUW
         UQoVx4dSLpVTL5lkpXMNIyt8AJXWxBFsvFDcMiAee/F8ttti6GifbmtZPN6zRnEjbWU4
         uvb8Ku8GnBSb0RQJK/t/OQK3yUQkGNTwTrHg+n1xfinc7Vu4kUpeuYs2RnTjClgUlcmP
         Fhfa3/LkxukA1z0nCqMlfo6GZT6xHT8aB9kdS9H6j7pEjaqntaNJfxlzp8kNC98Yrecm
         6nY+wJ/G9JAsgGO+8Ak+DmjAso13xNyeSd1rOzpriJcTKrTeoezw5MTb+JDe1UKbiif9
         3GqQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:date:in-reply-to:message-id:mime-version
         :references:subject:from:to:cc;
        bh=rs43XjvFocbBYfeRQNGhmBNnR9F38flGi3d9HcssKs8=;
        b=fl/Fpp55dBYsz6G6ucmmNUBsPOq0+woso3Bzj5MWlQnOgL3GlOrgdsCMTsjSo5difo
         74DACEY9UxfUtRcd7Dev6mitp8C5JRzJHRPXh1IH8pE5v+fPtsgoPf+Waog/2+4M55fY
         tOexl4tvlA84oVBb3iCqB1uFrg9gNJ0S7QqfgbJCThoAYEtSJu9BcZotEKR8Z8gAyavM
         TgcTmimVgtrGcwzui2qD1Pm+kn0oTEJCtMruFUV+VdM6ieWt5VVEJYwYby7nsVxfs5zG
         ie3lTlKDNuRaZINEClhILhP6fnDZk2V8o9iTglqgwewuoAZACTM928roaNXCPX/Cey2I
         oyxQ==
X-Gm-Message-State: AOAM530s1IOvfqO70rKLwAX4NVURic3dr5vKxt1JoPNET0EFxzbTHafl
        flYAVHhsWX/sAnn68E0z6Ekr6Qu1y/D5YZwPFnpBfdEnItufivvYgvHYnQRoVmCxqIedvovGQim
        LuQ7xJydoIf4rkKxIEozVgnkJJS5CsSezjUZUawfcBoH1Ea9RVjAlp1GSkyr/gwk=
X-Google-Smtp-Source: 
 ABdhPJz39p6EOFQiDLMXN85np0Af5aeGr5Z5LvMSBxVcMsztht/KJmQsK3znalwig2YZl9OQysnmSEhvgzukDQ==
X-Received: from dmatlack-heavy.c.googlers.com
 ([fda3:e722:ac3:10:7f:e700:c0a8:19cd])
 (user=dmatlack job=sendgmr) by 2002:a62:6581:0:b029:2ef:bcb1:c406 with SMTP
 id z123-20020a6265810000b02902efbcb1c406mr10674990pfb.28.1623455848780; Fri,
 11 Jun 2021 16:57:28 -0700 (PDT)
Date: Fri, 11 Jun 2021 23:56:59 +0000
In-Reply-To: <20210611235701.3941724-1-dmatlack@google.com>
Message-Id: <20210611235701.3941724-7-dmatlack@google.com>
Mime-Version: 1.0
References: <20210611235701.3941724-1-dmatlack@google.com>
X-Mailer: git-send-email 2.32.0.272.g935e593368-goog
Subject: [PATCH 6/8] KVM: x86/mmu: fast_page_fault support for the TDP MMU
From: David Matlack <dmatlack@google.com>
To: kvm@vger.kernel.org
Cc: Ben Gardon <bgardon@google.com>, Joerg Roedel <joro@8bytes.org>,
        Jim Mattson <jmattson@google.com>,
        Wanpeng Li <wanpengli@tencent.com>,
        Vitaly Kuznetsov <vkuznets@redhat.com>,
        Sean Christopherson <seanjc@google.com>,
        Paolo Bonzini <pbonzini@redhat.com>,
        Junaid Shahid <junaids@google.com>,
        Andrew Jones <drjones@redhat.com>,
        David Matlack <dmatlack@google.com>
Precedence: bulk
List-ID: <kvm.vger.kernel.org>
X-Mailing-List: kvm@vger.kernel.org

This commit enables the fast_page_fault handler to work when the TDP MMU
is enabled by leveraging the new walk_shadow_page_lockless* API to
collect page walks regardless of whether the TDP MMU is in use.

fast_page_fault was already using
walk_shadow_page_lockless_{begin,end}(), so we only have to change the
actual walk to use walk_shadow_page_lockless(), which does the right
thing if the TDP MMU is in use.
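
For illustration only, here is a userspace toy model of the pattern
fast_page_fault relies on: snapshot each level's value and pointer
during the walk, then fix the leaf with a compare-and-exchange so the
update is dropped if someone else changed the entry in the meantime.
All toy_* names and TOY_* bits are hypothetical, the table has one
entry per level, and the GCC/Clang __atomic builtins stand in for the
kernel's cmpxchg64(); this is a sketch, not the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_LEVEL	2
#define TOY_PRESENT	(1ull << 0)
#define TOY_WRITABLE	(1ull << 1)

/* Toy analogue of struct shadow_page_walk; not the kernel definition. */
struct toy_walk {
	int root_level;
	int last_level;
	uint64_t sptes[TOY_MAX_LEVEL + 1];
	uint64_t *spteps[TOY_MAX_LEVEL + 1];
};

/* Walk from the root down, stopping at the first non-present entry. */
static void toy_walk_lockless(uint64_t *table, struct toy_walk *walk)
{
	int level;

	walk->root_level = TOY_MAX_LEVEL;
	for (level = TOY_MAX_LEVEL; level >= 1; level--) {
		uint64_t spte = __atomic_load_n(&table[level], __ATOMIC_RELAXED);

		walk->last_level = level;
		walk->sptes[level] = spte;
		walk->spteps[level] = &table[level];
		if (!(spte & TOY_PRESENT))
			break;
	}
}

/* Make the leaf writable only if it still holds the value the walk saw. */
static bool toy_fast_fix(struct toy_walk *walk)
{
	uint64_t old = walk->sptes[walk->last_level];
	uint64_t *sptep = walk->spteps[walk->last_level];

	/* Bail out unless the walk ended on a present level-1 (leaf) entry. */
	if (!(old & TOY_PRESENT) || walk->last_level != 1)
		return false;

	return __atomic_compare_exchange_n(sptep, &old, old | TOY_WRITABLE,
					   false, __ATOMIC_SEQ_CST,
					   __ATOMIC_SEQ_CST);
}
```

A caller would run toy_walk_lockless() and then toy_fast_fix(), and
retry the walk if the fix reports that the entry changed underneath it.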

Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/mmu.c | 52 +++++++++++++++++-------------------------
 1 file changed, 21 insertions(+), 31 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 765f5b01768d..5562727c3699 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -657,6 +657,9 @@ static void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu)
 	local_irq_enable();
 }
 
+static bool walk_shadow_page_lockless(struct kvm_vcpu *vcpu, u64 addr,
+				      struct shadow_page_walk *walk);
+
 static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect)
 {
 	int r;
@@ -2967,14 +2970,9 @@ static bool page_fault_can_be_fast(u32 error_code)
  * Returns true if the SPTE was fixed successfully. Otherwise,
  * someone else modified the SPTE from its original value.
  */
-static bool
-fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
-			u64 *sptep, u64 old_spte, u64 new_spte)
+static bool fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, gpa_t gpa,
+				    u64 *sptep, u64 old_spte, u64 new_spte)
 {
-	gfn_t gfn;
-
-	WARN_ON(!sp->role.direct);
-
 	/*
 	 * Theoretically we could also set dirty bit (and flush TLB) here in
 	 * order to eliminate unnecessary PML logging. See comments in
@@ -2990,14 +2988,8 @@ fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	if (cmpxchg64(sptep, old_spte, new_spte) != old_spte)
 		return false;
 
-	if (is_writable_pte(new_spte) && !is_writable_pte(old_spte)) {
-		/*
-		 * The gfn of direct spte is stable since it is
-		 * calculated by sp->gfn.
-		 */
-		gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
-		kvm_vcpu_mark_page_dirty(vcpu, gfn);
-	}
+	if (is_writable_pte(new_spte) && !is_writable_pte(old_spte))
+		kvm_vcpu_mark_page_dirty(vcpu, gpa >> PAGE_SHIFT);
 
 	return true;
 }
@@ -3019,10 +3011,9 @@ static bool is_access_allowed(u32 fault_err_code, u64 spte)
  */
 static int fast_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code)
 {
-	struct kvm_shadow_walk_iterator iterator;
-	struct kvm_mmu_page *sp;
 	int ret = RET_PF_INVALID;
 	u64 spte = 0ull;
+	u64 *sptep = NULL;
 	uint retry_count = 0;
 
 	if (!page_fault_can_be_fast(error_code))
@@ -3031,17 +3022,19 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code)
 	walk_shadow_page_lockless_begin(vcpu);
 
 	do {
+		struct shadow_page_walk walk;
 		u64 new_spte;
 
-		for_each_shadow_entry_lockless(vcpu, gpa, iterator, spte)
-			if (!is_shadow_present_pte(spte))
-				break;
+		if (!walk_shadow_page_lockless(vcpu, gpa, &walk))
+			break;
+
+		spte = walk.sptes[walk.last_level];
+		sptep = walk.spteps[walk.last_level];
 
 		if (!is_shadow_present_pte(spte))
 			break;
 
-		sp = sptep_to_sp(iterator.sptep);
-		if (!is_last_spte(spte, sp->role.level))
+		if (!is_last_spte(spte, walk.last_level))
 			break;
 
 		/*
@@ -3084,7 +3077,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code)
 			 *
 			 * See the comments in kvm_arch_commit_memory_region().
 			 */
-			if (sp->role.level > PG_LEVEL_4K)
+			if (walk.last_level > PG_LEVEL_4K)
 				break;
 		}
 
@@ -3098,8 +3091,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code)
 		 * since the gfn is not stable for indirect shadow page. See
 		 * Documentation/virt/kvm/locking.rst to get more detail.
 		 */
-		if (fast_pf_fix_direct_spte(vcpu, sp, iterator.sptep, spte,
-					    new_spte)) {
+		if (fast_pf_fix_direct_spte(vcpu, gpa, sptep, spte, new_spte)) {
 			ret = RET_PF_FIXED;
 			break;
 		}
@@ -3112,7 +3104,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code)
 
 	} while (true);
 
-	trace_fast_page_fault(vcpu, gpa, error_code, iterator.sptep, spte, ret);
+	trace_fast_page_fault(vcpu, gpa, error_code, sptep, spte, ret);
 	walk_shadow_page_lockless_end(vcpu);
 
 	return ret;
@@ -3748,11 +3740,9 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
 	if (page_fault_handle_page_track(vcpu, error_code, gfn))
 		return RET_PF_EMULATE;
 
-	if (!is_vcpu_using_tdp_mmu(vcpu)) {
-		r = fast_page_fault(vcpu, gpa, error_code);
-		if (r != RET_PF_INVALID)
-			return r;
-	}
+	r = fast_page_fault(vcpu, gpa, error_code);
+	if (r != RET_PF_INVALID)
+		return r;
 
 	r = mmu_topup_memory_caches(vcpu, false);
 	if (r)

From patchwork Fri Jun 11 23:57:00 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Matlack <dmatlack@google.com>
X-Patchwork-Id: 12316725
Return-Path: <kvm-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED,
	DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,
	INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,
	USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no
	version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id B6033C48BD1
	for <kvm@archiver.kernel.org>; Fri, 11 Jun 2021 23:58:37 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 9E151613C6
	for <kvm@archiver.kernel.org>; Fri, 11 Jun 2021 23:58:37 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S229874AbhFLAAf (ORCPT <rfc822;kvm@archiver.kernel.org>);
        Fri, 11 Jun 2021 20:00:35 -0400
Received: from mail-pf1-f201.google.com ([209.85.210.201]:36389 "EHLO
        mail-pf1-f201.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S230346AbhFLAAe (ORCPT <rfc822;kvm@vger.kernel.org>);
        Fri, 11 Jun 2021 20:00:34 -0400
Received: by mail-pf1-f201.google.com with SMTP id
 l145-20020a6288970000b02902e9f6a5c2c3so4101525pfd.3
        for <kvm@vger.kernel.org>; Fri, 11 Jun 2021 16:58:30 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20161025;
        h=date:in-reply-to:message-id:mime-version:references:subject:from:to
         :cc;
        bh=5pAQV1Lw2Y47BZl4IaSZrS+FJXZs41SS8a9vGzneCvw=;
        b=AeiuNS4S9JPhjK1Kz82bMHe7Dh6sDgYMa8jUFh1Z+naBWT1MD4c/Ed/pRqvPvrrX7c
         JuIr4EY1Pia0nCQda/RzoiogTXrtxxMQXJBXf1QPRTQwzd72nZPMxQ2E5hZ/eqTbe2IY
         V7JHD+GjmYosvHpzSV+vwMW3fwJ/WIDAMS5VHuGUdQOEewW2xP0g78M4o6id/pveaO8A
         qS2hgu3fO9ugNk2+iY2VPmJ9RLlKfpK3AW5YRWerzW8s81XBVoRtNcVhYwISbbdZraMO
         RHEqh5D5wdrbqmuR6LRem5jDqncOLFwqqCzeB238oNZi0kbD11H8CvWN9nb9vjjEzFyM
         DFqA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:date:in-reply-to:message-id:mime-version
         :references:subject:from:to:cc;
        bh=5pAQV1Lw2Y47BZl4IaSZrS+FJXZs41SS8a9vGzneCvw=;
        b=p97gI07+kOqGrfhCiRy0A8R13jUC+LUb4lI0O+V6ikC91rhywzeKOJi+aVHVI+9Rtv
         CRSEOo4DsD5DppdaHTJLglYL0JHUqyiF/RIUBDXKw1lBgXxhKmtE8HvhjahOvVt3vqwp
         e6ioB8Chlb414AXRSFlENUpHBbnFlL6qnLk4UOGeoTVPIK7lNqbB4DQslddJ2nCy98yr
         Y1pBps02ONbjuTLGWgDPbSpaKxgHEVA4MZGoE5hBCie8mHNKj0djeZnoDbINpXtTn34/
         RqtCICfE9olOnci9WK0loYoYLjX53h7DFRmx/kjHWlsP02f7bAQZk6oOhtYp/iOsucoi
         R0dg==
X-Gm-Message-State: AOAM5317Is2wr9mFMIlM+aD6bh1ClCu1KOfE9QRjlulPfUg3TjJ5v273
        N6T1Pz7y8MZB+D7vnwtZKxq10bm5ANH1mb2QOS/pJCGh1ZXsj/NULZFLrKs3699gzVO+9XM/FBy
        RakfwtQ4dozcpAc9fCvkXr+pYwXnSa4IVSTb12AwLTeBu8RFOk5R/0ehO769FNsA=
X-Google-Smtp-Source: 
 ABdhPJzkUGwIsXX5wNHuGCo+J59Fy9E5QZsfvNo5Axwp2Ru3MG7XonR+Or9TcQnK1KTkgzI9Ol/8tzTmfl8KgQ==
X-Received: from dmatlack-heavy.c.googlers.com
 ([fda3:e722:ac3:10:7f:e700:c0a8:19cd])
 (user=dmatlack job=sendgmr) by 2002:aa7:8202:0:b029:2d8:c24d:841d with SMTP
 id k2-20020aa782020000b02902d8c24d841dmr10586583pfi.57.1623455850345; Fri, 11
 Jun 2021 16:57:30 -0700 (PDT)
Date: Fri, 11 Jun 2021 23:57:00 +0000
In-Reply-To: <20210611235701.3941724-1-dmatlack@google.com>
Message-Id: <20210611235701.3941724-8-dmatlack@google.com>
Mime-Version: 1.0
References: <20210611235701.3941724-1-dmatlack@google.com>
X-Mailer: git-send-email 2.32.0.272.g935e593368-goog
Subject: [PATCH 7/8] KVM: selftests: Fix missing break in dirty_log_perf_test
 arg parsing
From: David Matlack <dmatlack@google.com>
To: kvm@vger.kernel.org
Cc: Ben Gardon <bgardon@google.com>, Joerg Roedel <joro@8bytes.org>,
        Jim Mattson <jmattson@google.com>,
        Wanpeng Li <wanpengli@tencent.com>,
        Vitaly Kuznetsov <vkuznets@redhat.com>,
        Sean Christopherson <seanjc@google.com>,
        Paolo Bonzini <pbonzini@redhat.com>,
        Junaid Shahid <junaids@google.com>,
        Andrew Jones <drjones@redhat.com>,
        David Matlack <dmatlack@google.com>
Precedence: bulk
List-ID: <kvm.vger.kernel.org>
X-Mailing-List: kvm@vger.kernel.org

The 'o' case is missing a break statement, so it falls through to the
's' case, where optarg is NULL and parsing it causes a segmentation
fault.
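
As a standalone illustration (not the selftest code), the sketch below
parses the same two options with getopt(). struct params, parse_args()
and the "anonymous" default are hypothetical stand-ins; the point is
only that the break keeps "-o" from ever reaching the code that reads
optarg.

```c
#define _GNU_SOURCE
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for the test's parameter struct. */
struct params {
	bool partition_vcpu_memory_access;
	const char *backing_src;
};

static void parse_args(int argc, char *argv[], struct params *p)
{
	int opt;

	p->partition_vcpu_memory_access = true;
	p->backing_src = "anonymous";
	optind = 1;	/* restart scanning so the helper can be called again */

	while ((opt = getopt(argc, argv, "os:")) != -1) {
		switch (opt) {
		case 'o':
			p->partition_vcpu_memory_access = false;
			break;	/* without this, case 's' runs with optarg == NULL */
		case 's':
			p->backing_src = optarg;
			break;
		default:
			break;
		}
	}
}
```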

Fixes: 9e965bb75aae ("KVM: selftests: Add backing src parameter to dirty_log_perf_test")
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
---
 tools/testing/selftests/kvm/dirty_log_perf_test.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
index 04a2641261be..80cbd3a748c0 100644
--- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
@@ -312,6 +312,7 @@ int main(int argc, char *argv[])
 			break;
 		case 'o':
 			p.partition_vcpu_memory_access = false;
+			break;
 		case 's':
 			p.backing_src = parse_backing_src_type(optarg);
 			break;

From patchwork Fri Jun 11 23:57:01 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Matlack <dmatlack@google.com>
X-Patchwork-Id: 12316729
Return-Path: <kvm-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-26.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED,
	DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,
	INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,
	USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no
	version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 624C3C48BE8
	for <kvm@archiver.kernel.org>; Fri, 11 Jun 2021 23:58:39 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 4271161288
	for <kvm@archiver.kernel.org>; Fri, 11 Jun 2021 23:58:39 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S230443AbhFLAAg (ORCPT <rfc822;kvm@archiver.kernel.org>);
        Fri, 11 Jun 2021 20:00:36 -0400
Received: from mail-pf1-f202.google.com ([209.85.210.202]:33705 "EHLO
        mail-pf1-f202.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S230349AbhFLAAe (ORCPT <rfc822;kvm@vger.kernel.org>);
        Fri, 11 Jun 2021 20:00:34 -0400
Received: by mail-pf1-f202.google.com with SMTP id
 i13-20020aa78b4d0000b02902ea019ef670so4113523pfd.0
        for <kvm@vger.kernel.org>; Fri, 11 Jun 2021 16:58:32 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20161025;
        h=date:in-reply-to:message-id:mime-version:references:subject:from:to
         :cc;
        bh=DcsPY5Lu0+b3afx6g2ezWU+d2rrcbU3Zjbgj10xg5/I=;
        b=ObvEN3IQkOFY2+ySBaczorNsCHZKTj7p+HDcc6Ke1inPbf+4DobbKthT1PbfoRzBUf
         Wh25WDBaqgGAzdkkQqi7/7++n5PwlMwPXb3qUP+90LStE1aSkLG93FxEdHe3dDbEtEJ5
         KTm6dsx7BEW+w68PT7H6yzBlLpmQlZF0fGBVEYcB+bIwA9no1WWV1AA/HF9TkyNR93KI
         2uLtCfQNQtwapQxvRSN9s9n1WYqn5rO2kMaNNakn085zJmlYm2TlUt9pQoByHGsXGV+i
         NaqwN+9/cBfXd1MAQKpW5XeAaJ7yPZFeLI71YiQJIlP9EOBMhlB8Uo3OkDnIflojGMVn
         8Mpg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:date:in-reply-to:message-id:mime-version
         :references:subject:from:to:cc;
        bh=DcsPY5Lu0+b3afx6g2ezWU+d2rrcbU3Zjbgj10xg5/I=;
        b=sQV8af/oAwt6YXY5jyU675cXqskUhPuJzNqhZLTEOtFQtTL2MQPHn4lE2NVpj68iFR
         y8RKIt+crSHkuuDYfJ5FJU7KbNlq2EZ8/Nur7TdVnbWU7gUF4PJsm+SwqY97Z+QVrHnN
         Ny6PxzaiUdGSwl8yWVne8VAgmH4T6kpRtJWJTx5qayqwCv9uvDLZQoeijZrQ4IDS3kOC
         wttlM6uSKsr5qNLCrm9k+Jz6gJbfakVdy7OcGTwTf5Rds0uABsbSJ8hyCRiKSOCFb95N
         U36II+gXjReJlQprn7V9/Qj8rfpRIj7+ZezZqJgIiwFGLjmNmHp+9ZICSCJKLsP115y1
         BrEQ==
X-Gm-Message-State: AOAM531wzmSLXuKxRO0zuxOMtrAxK5eCz4dpn86sG96lYUMpjli4rO6g
        wEozv9A7ahPrEDKLsMB54J/9xX94HT1lsWcVs6qzqd/eQ9NfktWJd3XOKIJlz6wVjPc+MAfbWWa
        3nLPjBk0VglWTTItx6G6O/GA31ctcXvBwq8o9Ocym+sw6vGxtdI+1DH3ixjzoVm8=
X-Google-Smtp-Source: 
 ABdhPJyIqzY3AvLczlhlpMOdifjhUaurzlXQIGZxe56+Ptg+DaHVNr0CZv053XKolBiN4ae1KxUXSfrxLN68kw==
X-Received: from dmatlack-heavy.c.googlers.com
 ([fda3:e722:ac3:10:7f:e700:c0a8:19cd])
 (user=dmatlack job=sendgmr) by 2002:a17:90b:3a8c:: with SMTP id
 om12mr6961326pjb.103.1623455851970; Fri, 11 Jun 2021 16:57:31 -0700 (PDT)
Date: Fri, 11 Jun 2021 23:57:01 +0000
In-Reply-To: <20210611235701.3941724-1-dmatlack@google.com>
Message-Id: <20210611235701.3941724-9-dmatlack@google.com>
Mime-Version: 1.0
References: <20210611235701.3941724-1-dmatlack@google.com>
X-Mailer: git-send-email 2.32.0.272.g935e593368-goog
Subject: [PATCH 8/8] KVM: selftests: Introduce access_tracking_perf_test
From: David Matlack <dmatlack@google.com>
To: kvm@vger.kernel.org
Cc: Ben Gardon <bgardon@google.com>, Joerg Roedel <joro@8bytes.org>,
        Jim Mattson <jmattson@google.com>,
        Wanpeng Li <wanpengli@tencent.com>,
        Vitaly Kuznetsov <vkuznets@redhat.com>,
        Sean Christopherson <seanjc@google.com>,
        Paolo Bonzini <pbonzini@redhat.com>,
        Junaid Shahid <junaids@google.com>,
        Andrew Jones <drjones@redhat.com>,
        David Matlack <dmatlack@google.com>
Precedence: bulk
List-ID: <kvm.vger.kernel.org>
X-Mailing-List: kvm@vger.kernel.org

This test measures the performance effects of KVM's access tracking.
Access tracking is driven by the MMU notifiers test_young, clear_young,
and clear_flush_young. These notifiers do not have a direct userspace
API. However, the clear_young notifier can be triggered by marking
pages as idle in /sys/kernel/mm/page_idle/bitmap. This test leverages
that mechanism to enable access tracking on guest memory.
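
For readers unfamiliar with page_idle, here is a minimal sketch of
marking a single page idle. It assumes the layout described in the
kernel's idle page tracking documentation: the bitmap file is an array
of 64-bit words, bit N of the file corresponds to PFN N, and accesses
must be 8-byte aligned. The helper names are hypothetical and this is
not necessarily how the test itself does it; translating a virtual
address to a PFN (for example via /proc/self/pagemap) is not shown.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Byte offset of the 64-bit bitmap word that covers @pfn. */
static off_t page_idle_offset(uint64_t pfn)
{
	return (off_t)(pfn / 64) * 8;
}

/* Bit within that word that corresponds to @pfn. */
static uint64_t page_idle_mask(uint64_t pfn)
{
	return 1ull << (pfn % 64);
}

/*
 * Mark one page idle. @fd is assumed to be an open, writable descriptor
 * for /sys/kernel/mm/page_idle/bitmap. Returns 0 on success, -1 on error.
 */
static int mark_pfn_idle(int fd, uint64_t pfn)
{
	uint64_t bits = page_idle_mask(pfn);
	ssize_t ret;

	ret = pwrite(fd, &bits, sizeof(bits), page_idle_offset(pfn));
	return ret == (ssize_t)sizeof(bits) ? 0 : -1;
}
```

Only the bit for the requested PFN is set in the written word; as far
as I understand the interface, written values are OR-ed into the
bitmap, so the other pages covered by that word are left alone.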

To measure performance this test runs a VM with a configurable number of
vCPUs that each touch every page in disjoint regions of memory.
Performance is measured in the time it takes all vCPUs to finish
touching their predefined region.

Example invocation:

  $ ./access_tracking_perf_test -v 8
  Testing guest mode: PA-bits:ANY, VA-bits:48,  4K pages
  guest physical test memory offset: 0xffdfffff000

  Populating memory             : 1.337752570s
  Writing to populated memory   : 0.010177640s
  Reading from populated memory : 0.009548239s
  Mark memory idle              : 23.973131748s
  Writing to idle memory        : 0.063584496s
  Mark memory idle              : 24.924652964s
  Reading from idle memory      : 0.062042814s

Breaking down the results:

 * "Populating memory": The time it takes for all vCPUs to perform the
   first write to every page in their region.

 * "Writing to populated memory" / "Reading from populated memory": The
   time it takes for all vCPUs to write to or read from every page in their
   region after it has been populated. This serves as a control for the
   later results.

 * "Mark memory idle": The time it takes for every vCPU to mark every
   page in their region as idle through page_idle.

 * "Writing to idle memory" / "Reading from idle memory": The time it
   takes for all vCPUs to write to or read from every page in their region
   after it has been marked idle.

This test should be portable across architectures but it is only enabled
for x86_64 since that's all I have tested.

Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
---
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/access_tracking_perf_test.c | 419 ++++++++++++++++++
 3 files changed, 421 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/access_tracking_perf_test.c

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index bd83158e0e0b..32a362d71e05 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -34,6 +34,7 @@
 /x86_64/xen_vmcall_test
 /x86_64/xss_msr_test
 /x86_64/vmx_pmu_msrs_test
+/access_tracking_perf_test
 /demand_paging_test
 /dirty_log_test
 /dirty_log_perf_test
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index e439d027939d..9f1b478da92b 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -67,6 +67,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/tsc_msrs_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_pmu_msrs_test
 TEST_GEN_PROGS_x86_64 += x86_64/xen_shinfo_test
 TEST_GEN_PROGS_x86_64 += x86_64/xen_vmcall_test
+TEST_GEN_PROGS_x86_64 += access_tracking_perf_test
 TEST_GEN_PROGS_x86_64 += demand_paging_test
 TEST_GEN_PROGS_x86_64 += dirty_log_test
 TEST_GEN_PROGS_x86_64 += dirty_log_perf_test
diff --git a/tools/testing/selftests/kvm/access_tracking_perf_test.c b/tools/testing/selftests/kvm/access_tracking_perf_test.c
new file mode 100644
index 000000000000..60828f2d780f
--- /dev/null
+++ b/tools/testing/selftests/kvm/access_tracking_perf_test.c
@@ -0,0 +1,419 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * access_tracking_perf_test
+ *
+ * Copyright (C) 2021, Google, Inc.
+ *
+ * This test measures the performance effects of KVM's access tracking.
+ * Access tracking is driven by the MMU notifiers test_young, clear_young, and
+ * clear_flush_young. These notifiers do not have a direct userspace API,
+ * however the clear_young notifier can be triggered by marking pages as idle
+ * in /sys/kernel/mm/page_idle/bitmap. This test leverages that mechanism to
+ * enable access tracking on guest memory.
+ *
+ * To measure performance this test runs a VM with a configurable number of
+ * vCPUs that each touch every page in disjoint regions of memory. Performance
+ * is measured in the time it takes all vCPUs to finish touching their
+ * predefined region.
+ *
+ * Note that a deterministic correctness test of access tracking is not possible
+ * by using page_idle as it exists today. This is for a few reasons:
+ *
+ * 1. page_idle only issues clear_young notifiers, which lack a TLB flush. This
+ *    means subsequent guest accesses are not guaranteed to see page table
+ *    updates made by KVM until some time in the future.
+ *
+ * 2. page_idle only operates on LRU pages. Newly allocated pages are not
+ *    immediately allocated to LRU lists. Instead they are held in a "pagevec",
+ *    which is drained to LRU lists some time in the future. There is no
+ *    userspace API to force this drain to occur.
+ *
+ * These limitations are worked around in this test by using a large enough
+ * region of memory for each vCPU such that the number of translations cached in
+ * the TLB and the number of pages held in pagevecs are a small fraction of the
+ * overall workload. If either of those conditions does not hold, this test
+ * will fail rather than silently passing.
+ */
+#include <inttypes.h>
+#include <limits.h>
+#include <pthread.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+
+#include "kvm_util.h"
+#include "test_util.h"
+#include "perf_test_util.h"
+#include "guest_modes.h"
+
+/* Global variable used to synchronize all of the vCPU threads. */
+static int iteration = -1;
+
+/* Defines what vCPU threads should do during a given iteration. */
+static enum {
+	/* Run the vCPU to access all its memory. */
+	ITERATION_ACCESS_MEMORY,
+	/* Mark the vCPU's memory idle in page_idle. */
+	ITERATION_MARK_IDLE,
+} iteration_work;
+
+/* Set to true when vCPU threads should exit. */
+static bool done;
+
+/* The iteration that was last completed by each vCPU. */
+static int vcpu_last_completed_iteration[KVM_MAX_VCPUS];
+
+/* Whether to overlap the regions of memory vCPUs access. */
+static bool overlap_memory_access;
+
+struct test_params {
+	/* The backing source for the region of memory. */
+	enum vm_mem_backing_src_type backing_src;
+
+	/* The amount of memory to allocate for each vCPU. */
+	uint64_t vcpu_memory_bytes;
+
+	/* The number of vCPUs to create in the VM. */
+	int vcpus;
+};
+
+static uint64_t pread_uint64(int fd, const char *filename, uint64_t index)
+{
+	uint64_t value;
+	off_t offset = index * sizeof(value);
+
+	TEST_ASSERT(pread(fd, &value, sizeof(value), offset) == sizeof(value),
+		    "pread from %s offset 0x%" PRIx64 " failed!",
+		    filename, offset);
+
+	return value;
+
+}
+
+static uint64_t lookup_pfn(int pagemap_fd, struct kvm_vm *vm, uint64_t gva)
+{
+	uint64_t hva = (uint64_t) addr_gva2hva(vm, gva);
+	uint64_t entry;
+
+	entry = pread_uint64(pagemap_fd, "pagemap", hva / getpagesize());
+	if (!(entry & (1ULL << 63)))
+		return 0;
+
+	return (entry & ((1ULL << 55) - 1));
+}
+
+static bool is_page_idle(int page_idle_fd, uint64_t pfn)
+{
+	uint64_t bits = pread_uint64(page_idle_fd, "page_idle", pfn / 64);
+
+	return !!((bits >> (pfn % 64)) & 1);
+}
+
+static void mark_page_idle(int page_idle_fd, uint64_t pfn)
+{
+	uint64_t bits = 1ULL << (pfn % 64);
+
+	TEST_ASSERT(pwrite(page_idle_fd, &bits, 8, 8 * (pfn / 64)) == 8,
+		    "Failed to set page_idle bits for PFN 0x%" PRIx64, pfn);
+}
+
+static void mark_vcpu_memory_idle(struct kvm_vm *vm, int vcpu_id)
+{
+	uint64_t base_gva = perf_test_args.vcpu_args[vcpu_id].gva;
+	uint64_t pages = perf_test_args.vcpu_args[vcpu_id].pages;
+	uint64_t page;
+	uint64_t still_idle = 0;
+	uint64_t no_pfn = 0;
+	int page_idle_fd;
+	int pagemap_fd;
+
+	/* If vCPUs are using an overlapping region, let vCPU 0 mark it idle. */
+	if (overlap_memory_access && vcpu_id)
+		return;
+
+	page_idle_fd = open("/sys/kernel/mm/page_idle/bitmap", O_RDWR);
+	TEST_ASSERT(page_idle_fd >= 0, "Failed to open page_idle.");
+
+	pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
+	TEST_ASSERT(pagemap_fd >= 0, "Failed to open pagemap.");
+
+	for (page = 0; page < pages; page++) {
+		uint64_t gva = base_gva + page * perf_test_args.guest_page_size;
+		uint64_t pfn = lookup_pfn(pagemap_fd, vm, gva);
+
+		if (!pfn) {
+			no_pfn++;
+			continue;
+		}
+
+		if (is_page_idle(page_idle_fd, pfn)) {
+			still_idle++;
+			continue;
+		}
+
+		mark_page_idle(page_idle_fd, pfn);
+	}
+
+	/*
+	 * Assumption: Less than 1% of pages are going to be swapped out from
+	 * under us during this test.
+	 */
+	TEST_ASSERT(no_pfn < pages / 100,
+		    "vCPU %d: No PFN for %" PRIu64 " out of %" PRIu64 " pages.",
+		    vcpu_id, no_pfn, pages);
+
+	/*
+	 * Test that at least 90% of memory has been marked idle (the rest might
+	 * not be marked idle because the pages have not yet made it to an LRU
+	 * list or the translations are still cached in the TLB). 90% is
+	 * arbitrary; high enough that we ensure most memory access went through
+	 * access tracking but low enough as to not make the test too brittle
+	 * over time and across architectures.
+	 */
+	TEST_ASSERT(still_idle < pages / 10,
+		    "vCPU %d: Too many pages still idle (%" PRIu64 " out of %"
+		    PRIu64 ").",
+		    vcpu_id, still_idle, pages);
+
+	close(page_idle_fd);
+	close(pagemap_fd);
+}
+
+static void assert_ucall(struct kvm_vm *vm, uint32_t vcpu_id,
+			 uint64_t expected_ucall)
+{
+	struct ucall uc;
+	uint64_t actual_ucall = get_ucall(vm, vcpu_id, &uc);
+
+	TEST_ASSERT(expected_ucall == actual_ucall,
+		    "Guest exited unexpectedly (expected ucall %" PRIu64
+		    ", got %" PRIu64 ")",
+		    expected_ucall, actual_ucall);
+}
+
+static bool spin_wait_for_next_iteration(int *current_iteration)
+{
+	int last_iteration = *current_iteration;
+
+	do {
+		if (READ_ONCE(done))
+			return false;
+
+		*current_iteration = READ_ONCE(iteration);
+	} while (last_iteration == *current_iteration);
+
+	return true;
+}
+
+static void *vcpu_thread_main(void *arg)
+{
+	struct perf_test_vcpu_args *vcpu_args = arg;
+	struct kvm_vm *vm = perf_test_args.vm;
+	int vcpu_id = vcpu_args->vcpu_id;
+	int current_iteration = -1;
+
+	vcpu_args_set(vm, vcpu_id, 1, vcpu_id);
+
+	while (spin_wait_for_next_iteration(&current_iteration)) {
+		switch (READ_ONCE(iteration_work)) {
+		case ITERATION_ACCESS_MEMORY:
+			vcpu_run(vm, vcpu_id);
+			assert_ucall(vm, vcpu_id, UCALL_SYNC);
+			break;
+		case ITERATION_MARK_IDLE:
+			mark_vcpu_memory_idle(vm, vcpu_id);
+			break;
+		}
+
+		vcpu_last_completed_iteration[vcpu_id] = current_iteration;
+	}
+
+	return NULL;
+}
+
+static void spin_wait_for_vcpu(int vcpu_id, int target_iteration)
+{
+	while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id]) !=
+	       target_iteration) {
+		continue;
+	}
+}
+
+/* The type of memory accesses to perform in the VM. */
+enum access_type {
+	ACCESS_READ,
+	ACCESS_WRITE,
+};
+
+static void run_iteration(struct kvm_vm *vm, int vcpus, const char *description)
+{
+	struct timespec ts_start;
+	struct timespec ts_elapsed;
+	int next_iteration;
+	int vcpu_id;
+
+	/* Kick off the vCPUs by incrementing iteration. */
+	next_iteration = ++iteration;
+
+	clock_gettime(CLOCK_MONOTONIC, &ts_start);
+
+	/* Wait for all vCPUs to finish the iteration. */
+	for (vcpu_id = 0; vcpu_id < vcpus; vcpu_id++)
+		spin_wait_for_vcpu(vcpu_id, next_iteration);
+
+	ts_elapsed = timespec_elapsed(ts_start);
+	pr_info("%-30s: %ld.%09lds\n",
+		description, ts_elapsed.tv_sec, ts_elapsed.tv_nsec);
+}
+
+static void access_memory(struct kvm_vm *vm, int vcpus, enum access_type access,
+			  const char *description)
+{
+	perf_test_args.wr_fract = (access == ACCESS_READ) ? INT_MAX : 1;
+	sync_global_to_guest(vm, perf_test_args);
+	iteration_work = ITERATION_ACCESS_MEMORY;
+	run_iteration(vm, vcpus, description);
+}
+
+static void mark_memory_idle(struct kvm_vm *vm, int vcpus)
+{
+	/*
+	 * Even though this parallelizes the work across vCPUs, this is still a
+	 * very slow operation because page_idle forces the test to mark one pfn
+	 * at a time and the clear_young notifier serializes on the KVM MMU
+	 * lock.
+	 */
+	pr_debug("Marking VM memory idle (slow)...\n");
+	iteration_work = ITERATION_MARK_IDLE;
+	run_iteration(vm, vcpus, "Mark memory idle");
+}
+
+static pthread_t *create_vcpu_threads(int vcpus)
+{
+	pthread_t *vcpu_threads;
+	int i;
+
+	vcpu_threads = malloc(vcpus * sizeof(vcpu_threads[0]));
+	TEST_ASSERT(vcpu_threads, "Failed to allocate vcpu_threads.");
+
+	for (i = 0; i < vcpus; i++) {
+		vcpu_last_completed_iteration[i] = iteration;
+		pthread_create(&vcpu_threads[i], NULL, vcpu_thread_main,
+			       &perf_test_args.vcpu_args[i]);
+	}
+
+	return vcpu_threads;
+}
+
+static void terminate_vcpu_threads(pthread_t *vcpu_threads, int vcpus)
+{
+	int i;
+
+	/* Set done to signal the vCPU threads to exit */
+	done = true;
+
+	for (i = 0; i < vcpus; i++)
+		pthread_join(vcpu_threads[i], NULL);
+}
+
+static void run_test(enum vm_guest_mode mode, void *arg)
+{
+	struct test_params *params = arg;
+	struct kvm_vm *vm;
+	pthread_t *vcpu_threads;
+	int vcpus = params->vcpus;
+
+	vm = perf_test_create_vm(mode, vcpus, params->vcpu_memory_bytes,
+				 params->backing_src);
+
+	perf_test_setup_vcpus(vm, vcpus, params->vcpu_memory_bytes,
+			      !overlap_memory_access);
+
+	vcpu_threads = create_vcpu_threads(vcpus);
+
+	pr_info("\n");
+	access_memory(vm, vcpus, ACCESS_WRITE, "Populating memory");
+
+	/* As a control, read and write to the populated memory first. */
+	access_memory(vm, vcpus, ACCESS_WRITE, "Writing to populated memory");
+	access_memory(vm, vcpus, ACCESS_READ, "Reading from populated memory");
+
+	/* Repeat on memory that has been marked as idle. */
+	mark_memory_idle(vm, vcpus);
+	access_memory(vm, vcpus, ACCESS_WRITE, "Writing to idle memory");
+	mark_memory_idle(vm, vcpus);
+	access_memory(vm, vcpus, ACCESS_READ, "Reading from idle memory");
+
+	terminate_vcpu_threads(vcpu_threads, vcpus);
+	free(vcpu_threads);
+	perf_test_destroy_vm(vm);
+}
+
+static void help(char *name)
+{
+	puts("");
+	printf("usage: %s [-h] [-m mode] [-b vcpu_bytes] [-v vcpus] [-o] [-s mem_type]\n",
+	       name);
+	puts("");
+	printf(" -h: Display this help message.\n");
+	guest_modes_help();
+	printf(" -b: specify the size of the memory region which should be\n"
+	       "     accessed by each vCPU, e.g. 10M or 3G.\n"
+	       "     (default: 1G)\n");
+	printf(" -v: specify the number of vCPUs to run.\n");
+	printf(" -o: Overlap guest memory accesses instead of partitioning\n"
+	       "     them into a separate region of memory for each vCPU.\n");
+	printf(" -s: specify the type of memory that should be used to\n"
+	       "     back the guest data region.\n\n");
+	backing_src_help();
+	puts("");
+	exit(0);
+}
+
+int main(int argc, char *argv[])
+{
+	struct test_params params = {
+		.backing_src = VM_MEM_SRC_ANONYMOUS,
+		.vcpu_memory_bytes = DEFAULT_PER_VCPU_MEM_SIZE,
+		.vcpus = 1,
+	};
+	int page_idle_fd;
+	int opt;
+
+	guest_modes_append_default();
+
+	while ((opt = getopt(argc, argv, "hm:b:v:os:")) != -1) {
+		switch (opt) {
+		case 'm':
+			guest_modes_cmdline(optarg);
+			break;
+		case 'b':
+			params.vcpu_memory_bytes = parse_size(optarg);
+			break;
+		case 'v':
+			params.vcpus = atoi(optarg);
+			break;
+		case 'o':
+			overlap_memory_access = true;
+			break;
+		case 's':
+			params.backing_src = parse_backing_src_type(optarg);
+			break;
+		case 'h':
+		default:
+			help(argv[0]);
+			break;
+		}
+	}
+
+	page_idle_fd = open("/sys/kernel/mm/page_idle/bitmap", O_RDWR);
+	if (page_idle_fd < 0) {
+		print_skip("CONFIG_IDLE_PAGE_TRACKING is not enabled");
+		exit(KSFT_SKIP);
+	}
+	close(page_idle_fd);
+
+	for_each_guest_mode(run_test, &params);
+
+	return 0;
+}
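
For readers following the index arithmetic in lookup_pfn(), is_page_idle(),
and mark_page_idle(), here is a minimal standalone sketch of the same
computations. It is not part of the patch. The helper and macro names are
hypothetical, and it assumes the layouts described in the kernel's pagemap and
idle page tracking documentation: bit 63 of a pagemap entry means "present",
bits 0-54 hold the PFN, and the page_idle bitmap is an array of 64-bit words
with one bit per PFN.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed pagemap layout: bit 63 = page present, bits 0-54 = PFN. */
#define PAGEMAP_PRESENT		(1ULL << 63)
#define PAGEMAP_PFN_MASK	((1ULL << 55) - 1)

/* Hypothetical helper: decode a raw pagemap entry, 0 if not present. */
static inline uint64_t pagemap_entry_to_pfn(uint64_t entry)
{
	if (!(entry & PAGEMAP_PRESENT))
		return 0;

	return entry & PAGEMAP_PFN_MASK;
}

/* Byte offset of the 64-bit page_idle bitmap word that covers @pfn. */
static inline uint64_t page_idle_offset(uint64_t pfn)
{
	return (pfn / 64) * sizeof(uint64_t);
}

/* Mask selecting @pfn's bit within its 64-bit bitmap word. */
static inline uint64_t page_idle_mask(uint64_t pfn)
{
	return 1ULL << (pfn % 64);
}

/* True if @word, read at page_idle_offset(@pfn), has @pfn's bit set. */
static inline bool page_idle_test(uint64_t word, uint64_t pfn)
{
	return (word & page_idle_mask(pfn)) != 0;
}
```

A caller would pread() 8 bytes at page_idle_offset(pfn) and pass the result to
page_idle_test(), or pwrite() page_idle_mask(pfn) at the same offset to mark
the page idle. Writing only the one-bit mask should leave the other 63 pages
in the word untouched, since the idle page tracking documentation describes
zero bits in a write as ignored.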