
[01/22] kvm: mmu: Separate making SPTEs from set_spte

Message ID: 20200925212302.3979661-2-bgardon@google.com
State: New, archived
Series: Introduce the TDP MMU

Commit Message

Ben Gardon Sept. 25, 2020, 9:22 p.m. UTC
Separate the functions for generating leaf page table entries from the
function that inserts them into the paging structure. This refactoring
will facilitate changes to the MMU synchronization model to use atomic
compare / exchanges (which are not guaranteed to succeed) instead of a
monolithic MMU lock.
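
For context, a rough sketch (not part of this patch; the helper name and
retry policy here are hypothetical) of the kind of caller the split enables:
build the new SPTE value with no side effects, then try to install it with a
compare/exchange that may fail and be retried.

/*
 * Hypothetical caller shape: compute the new SPTE separately, then attempt
 * an atomic install that fails if another thread changed the SPTE
 * concurrently.  cmpxchg64() returns the value it found at *sptep.
 */
static bool try_install_spte(u64 *sptep, u64 old_spte, u64 new_spte)
{
        return cmpxchg64(sptep, old_spte, new_spte) == old_spte;
}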

No functional change expected.

Tested by running kvm-unit-tests and KVM selftests on an Intel Haswell
machine. This commit introduced no new failures.

This series can be viewed in Gerrit at:
	https://linux-review.googlesource.com/c/virt/kvm/kvm/+/2538

Signed-off-by: Ben Gardon <bgardon@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
---
 arch/x86/kvm/mmu/mmu.c | 52 +++++++++++++++++++++++++++---------------
 1 file changed, 34 insertions(+), 18 deletions(-)

Comments

Sean Christopherson Sept. 30, 2020, 4:55 a.m. UTC | #1
On Fri, Sep 25, 2020 at 02:22:41PM -0700, Ben Gardon wrote:
> +static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
> +		    unsigned int pte_access, int level,
> +		    gfn_t gfn, kvm_pfn_t pfn, bool speculative,
> +		    bool can_unsync, bool host_writable)
> +{
> +	u64 spte = 0;
> +	struct kvm_mmu_page *sp;
> +	int ret = 0;
> +
> +	if (set_mmio_spte(vcpu, sptep, gfn, pfn, pte_access))
> +		return 0;
> +
> +	sp = sptep_to_sp(sptep);
> +
> +	spte = make_spte(vcpu, pte_access, level, gfn, pfn, *sptep, speculative,
> +			 can_unsync, host_writable, sp_ad_disabled(sp), &ret);
> +	if (!spte)
> +		return 0;

This is an impossible condition.  Well, maybe it's theoretically possible
if page track is active, with EPT exec-only support (shadow_present_mask is
zero), and pfn==0.  But in that case, returning early is wrong.

Rather than return the spte, what about returning 'ret', passing 'new_spte'
as a u64 *, and dropping the bail early path?  That would also eliminate
the minor wart of make_spte() relying on the caller to initialize 'ret'.
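
Concretely, the suggested shape would look something like this (illustrative
only, not code from the series):

static int make_spte(struct kvm_vcpu *vcpu, unsigned int pte_access, int level,
                     gfn_t gfn, kvm_pfn_t pfn, u64 old_spte, bool speculative,
                     bool can_unsync, bool host_writable, bool ad_disabled,
                     u64 *new_spte);

with set_spte() then doing:

        ret = make_spte(vcpu, pte_access, level, gfn, pfn, *sptep, speculative,
                        can_unsync, host_writable, sp_ad_disabled(sp), &spte);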

> +
> +	if (spte & PT_WRITABLE_MASK)
> +		kvm_vcpu_mark_page_dirty(vcpu, gfn);
> +
>  	if (mmu_spte_update(sptep, spte))
>  		ret |= SET_SPTE_NEED_REMOTE_TLB_FLUSH;
>  	return ret;
> -- 
> 2.28.0.709.gb0816b6eb0-goog
>
Ben Gardon Sept. 30, 2020, 11:03 p.m. UTC | #2
On Tue, Sep 29, 2020 at 9:55 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Fri, Sep 25, 2020 at 02:22:41PM -0700, Ben Gardon wrote:
> > +static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
> > +                 unsigned int pte_access, int level,
> > +                 gfn_t gfn, kvm_pfn_t pfn, bool speculative,
> > +                 bool can_unsync, bool host_writable)
> > +{
> > +     u64 spte = 0;
> > +     struct kvm_mmu_page *sp;
> > +     int ret = 0;
> > +
> > +     if (set_mmio_spte(vcpu, sptep, gfn, pfn, pte_access))
> > +             return 0;
> > +
> > +     sp = sptep_to_sp(sptep);
> > +
> > +     spte = make_spte(vcpu, pte_access, level, gfn, pfn, *sptep, speculative,
> > +                      can_unsync, host_writable, sp_ad_disabled(sp), &ret);
> > +     if (!spte)
> > +             return 0;
>
> This is an impossible condition.  Well, maybe it's theoretically possible
> if page track is active, with EPT exec-only support (shadow_present_mask is
> zero), and pfn==0.  But in that case, returning early is wrong.
>
> Rather than return the spte, what about returning 'ret', passing 'new_spte'
> as a u64 *, and dropping the bail early path?  That would also eliminate
> the minor wart of make_spte() relying on the caller to initialize 'ret'.

I agree that would make this much cleaner.

>
> > +
> > +     if (spte & PT_WRITABLE_MASK)
> > +             kvm_vcpu_mark_page_dirty(vcpu, gfn);
> > +
> >       if (mmu_spte_update(sptep, spte))
> >               ret |= SET_SPTE_NEED_REMOTE_TLB_FLUSH;
> >       return ret;
> > --
> > 2.28.0.709.gb0816b6eb0-goog
> >

Patch

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 71aa3da2a0b7b..81240b558d67f 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2971,20 +2971,14 @@  static bool kvm_is_mmio_pfn(kvm_pfn_t pfn)
 #define SET_SPTE_WRITE_PROTECTED_PT	BIT(0)
 #define SET_SPTE_NEED_REMOTE_TLB_FLUSH	BIT(1)
 
-static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
-		    unsigned int pte_access, int level,
-		    gfn_t gfn, kvm_pfn_t pfn, bool speculative,
-		    bool can_unsync, bool host_writable)
+static u64 make_spte(struct kvm_vcpu *vcpu, unsigned int pte_access, int level,
+		     gfn_t gfn, kvm_pfn_t pfn, u64 old_spte, bool speculative,
+		     bool can_unsync, bool host_writable, bool ad_disabled,
+		     int *ret)
 {
 	u64 spte = 0;
-	int ret = 0;
-	struct kvm_mmu_page *sp;
-
-	if (set_mmio_spte(vcpu, sptep, gfn, pfn, pte_access))
-		return 0;
 
-	sp = sptep_to_sp(sptep);
-	if (sp_ad_disabled(sp))
+	if (ad_disabled)
 		spte |= SPTE_AD_DISABLED_MASK;
 	else if (kvm_vcpu_ad_need_write_protect(vcpu))
 		spte |= SPTE_AD_WRPROT_ONLY_MASK;
@@ -3037,27 +3031,49 @@  static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 		 * is responsibility of mmu_get_page / kvm_sync_page.
 		 * Same reasoning can be applied to dirty page accounting.
 		 */
-		if (!can_unsync && is_writable_pte(*sptep))
-			goto set_pte;
+		if (!can_unsync && is_writable_pte(old_spte))
+			return spte;
 
 		if (mmu_need_write_protect(vcpu, gfn, can_unsync)) {
 			pgprintk("%s: found shadow page for %llx, marking ro\n",
 				 __func__, gfn);
-			ret |= SET_SPTE_WRITE_PROTECTED_PT;
+			*ret |= SET_SPTE_WRITE_PROTECTED_PT;
 			pte_access &= ~ACC_WRITE_MASK;
 			spte &= ~(PT_WRITABLE_MASK | SPTE_MMU_WRITEABLE);
 		}
 	}
 
-	if (pte_access & ACC_WRITE_MASK) {
-		kvm_vcpu_mark_page_dirty(vcpu, gfn);
+	if (pte_access & ACC_WRITE_MASK)
 		spte |= spte_shadow_dirty_mask(spte);
-	}
 
 	if (speculative)
 		spte = mark_spte_for_access_track(spte);
 
-set_pte:
+	return spte;
+}
+
+static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
+		    unsigned int pte_access, int level,
+		    gfn_t gfn, kvm_pfn_t pfn, bool speculative,
+		    bool can_unsync, bool host_writable)
+{
+	u64 spte = 0;
+	struct kvm_mmu_page *sp;
+	int ret = 0;
+
+	if (set_mmio_spte(vcpu, sptep, gfn, pfn, pte_access))
+		return 0;
+
+	sp = sptep_to_sp(sptep);
+
+	spte = make_spte(vcpu, pte_access, level, gfn, pfn, *sptep, speculative,
+			 can_unsync, host_writable, sp_ad_disabled(sp), &ret);
+	if (!spte)
+		return 0;
+
+	if (spte & PT_WRITABLE_MASK)
+		kvm_vcpu_mark_page_dirty(vcpu, gfn);
+
 	if (mmu_spte_update(sptep, spte))
 		ret |= SET_SPTE_NEED_REMOTE_TLB_FLUSH;
 	return ret;