[05/24] kvm: x86/mmu: Fix yielding in TDP MMU

Message ID: 20210112181041.356734-6-bgardon@google.com (mailing list archive)
State: New, archived
Series: Allow parallel page faults with TDP MMU

Commit Message

Ben Gardon Jan. 12, 2021, 6:10 p.m. UTC
There are two problems with the way the TDP MMU yields in long-running
functions. 1.) Under certain conditions, a function may not yield
reliably or frequently enough. 2.) In some functions the TDP iter risks
not making forward progress if two threads livelock yielding to
one another.

Case 1 is possible if, for example, a paging structure was very large
but had few, if any, writable entries. wrprot_gfn_range could traverse
many entries before finding a writable entry and yielding.

Case 2 is possible if two threads were trying to execute wrprot_gfn_range.
Each could write protect an entry and then yield. This would reset the
tdp_iter's walk over the paging structure and the loop would end up
repeating the same entry over and over, preventing either thread from
making forward progress.

Fix these issues by moving the yield to the beginning of the loop,
before other checks and only yielding if the loop has made forward
progress since the last yield.

Fixes: a6a0b05da9f3 ("kvm: x86/mmu: Support dirty logging for the TDP MMU")
Reviewed-by: Peter Feiner <pfeiner@google.com>

Signed-off-by: Ben Gardon <bgardon@google.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 83 +++++++++++++++++++++++++++++++-------
 1 file changed, 69 insertions(+), 14 deletions(-)
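For orientation, here is a condensed sketch of the loop shape the patch moves
to, simplified from the wrprot_gfn_range() hunk in the diff at the bottom of
this page (an illustration, not a verbatim excerpt):

	for_each_tdp_pte_min_level(iter, root->spt, root->role.level,
				   min_level, start, end) {
		/*
		 * Yield at the top of the loop, and only if the iterator has
		 * advanced since the last yield; otherwise two threads could
		 * livelock by repeatedly resetting each other's walk.
		 */
		if (iter.goal_gfn != last_goal_gfn &&
		    tdp_mmu_iter_cond_resched(kvm, &iter)) {
			last_goal_gfn = iter.goal_gfn;
			/* Yielding reset the walk; continue from the root. */
			continue;
		}

		/* ... write-protect the SPTE as before ... */
	}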

Comments

Sean Christopherson Jan. 20, 2021, 7:28 p.m. UTC | #1
On Tue, Jan 12, 2021, Ben Gardon wrote:
> There are two problems with the way the TDP MMU yields in long running
> functions. 1.) Given certain conditions, the function may not yield
> reliably / frequently enough. 2.) In some functions the TDP iter risks
> not making forward progress if two threads livelock yielding to
> one another.
> 
> Case 1 is possible if for example, a paging structure was very large
> but had few, if any writable entries. wrprot_gfn_range could traverse many
> entries before finding a writable entry and yielding.
> 
> Case 2 is possible if two threads were trying to execute wrprot_gfn_range.
> Each could write protect an entry and then yield. This would reset the
> tdp_iter's walk over the paging structure and the loop would end up
> repeating the same entry over and over, preventing either thread from
> making forward progress.
> 
> Fix these issues by moving the yield to the beginning of the loop,
> before other checks and only yielding if the loop has made forward
> progress since the last yield.

I think it'd be best to split this into two patches, e.g. ensure forward
progress and then yield more aggressively.  They are two separate bugs, and I
don't think that ensuring forward progress would exacerbate case #1.  I'm not
worried about breaking things so much as getting more helpful shortlogs; "Fix
yielding in TDP MMU" doesn't provide any insight into what exactly was broken.
E.g. something like:

  KVM: x86/mmu: Ensure forward progress when yielding in TDP MMU iter
  KVM: x86/mmu: Yield in TDP MMU iter even if no real work was done

> Fixes: a6a0b05da9f3 ("kvm: x86/mmu: Support dirty logging for the TDP MMU")
> Reviewed-by: Peter Feiner <pfeiner@google.com>
> 
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
>  arch/x86/kvm/mmu/tdp_mmu.c | 83 +++++++++++++++++++++++++++++++-------
>  1 file changed, 69 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index b2784514ca2d..1987da0da66e 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -470,9 +470,23 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
>  			  gfn_t start, gfn_t end, bool can_yield)
>  {
>  	struct tdp_iter iter;
> +	gfn_t last_goal_gfn = start;
>  	bool flush_needed = false;
>  
>  	tdp_root_for_each_pte(iter, root, start, end) {
> +		/* Ensure forward progress has been made before yielding. */
> +		if (can_yield && iter.goal_gfn != last_goal_gfn &&

Make last_goal_gfn a property of the iterator, that way all this logic can be
shoved into tdp_mmu_iter_flush_cond_resched(), and the comments about ensuring
forward progress and effectively invalidating/resetting the iterator (the
comment below) can be a function comment, as opposed to being copied everywhere.
E.g. there can be a big scary warning in the function comment stating that the
caller must restart its loop if the helper yielded.

Tangentially related, the name goal_gfn is quite confusing.  "goal" and "end"
are synonyms, but "goal" is often initialized with "start", and it's not used to
terminate the walk.  Maybe next_gfn instead?  And maybe yielded_gfn, since
last_next_gfn is pretty horrendous.
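As a rough sketch of that suggestion (a hypothetical illustration showing only
the two gfn fields, not the final layout of struct tdp_iter):

struct tdp_iter {
	/*
	 * The gfn the walk is currently trying to reach a last-level entry
	 * for ("goal_gfn" in the patch as posted, "next_gfn" per the
	 * suggestion above).
	 */
	gfn_t next_gfn;
	/*
	 * Snapshot of next_gfn taken the last time the iterator yielded, so
	 * the cond_resched helper can refuse to yield again until the walk
	 * has made forward progress.
	 */
	gfn_t yielded_gfn;
	/* ... remaining iterator state (pt_path, sptep, gfn, level, ...) ... */
};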

> +		    tdp_mmu_iter_flush_cond_resched(kvm, &iter)) {

This isn't quite correct, as tdp_mmu_iter_flush_cond_resched() will do an
expensive remote TLB flush on every yield, even if no flush is needed.  The
cleanest solution is likely to drop tdp_mmu_iter_flush_cond_resched() and
instead add a @flush param to tdp_mmu_iter_cond_resched().  If it's tagged
__always_inline, then the callers that unconditionally pass true/false will
optimize out the conditional code.

At that point, I think it would also make sense to fold tdp_iter_refresh_walk()
into tdp_mmu_iter_cond_resched(), because really we shouldn't be mucking with
the guts of the iter except for the yield case.

> +			last_goal_gfn = iter.goal_gfn;

Another argument for both renaming goal_gfn and moving last_*_gfn into the iter:
it's not at all obvious that updating the last gfn _after_ tdp_iter_refresh_walk()
is indeed correct.

You can also avoid a local variable by doing max(iter->next_gfn, iter->gfn) when
calling tdp_iter_refresh_walk().  IMO, that's also a bit easier to understand
than an open-coded equivalent.

E.g. putting it all together, with yielded_gfn set by tdp_iter_start():

static __always_inline bool tdp_mmu_iter_cond_resched(struct kvm *kvm,
						     struct tdp_iter *iter,
						     bool flush)
{
	/* Ensure forward progress has been made since the last yield. */
	if (iter->next_gfn == iter->yielded_gfn)
		return false;

	if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
		if (flush)
			kvm_flush_remote_tlbs(kvm);
		cond_resched_lock(&kvm->mmu_lock);

		/*
		 * Restart the walk over the paging structure from the root,
		 * starting from the highest gfn the iterator had previously
		 * reached.  The entire paging structure, except the root, may
		 * have been completely torn down and rebuilt while we yielded.
		 */
		tdp_iter_start(iter, iter->pt_path[iter->root_level - 1],
			       iter->root_level, iter->min_level,
			       max(iter->next_gfn, iter->gfn));
		return true;
	}

	return false;
}
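A call site would then follow roughly this pattern (a hedged sketch based on
the zap_gfn_range() hunk below, using the proposed @flush parameter):

	tdp_root_for_each_pte(iter, root, start, end) {
		/*
		 * If the helper yielded, it flushed (when asked to) and reset
		 * the walk, so restart the loop body from the new position.
		 */
		if (can_yield &&
		    tdp_mmu_iter_cond_resched(kvm, &iter, flush_needed)) {
			flush_needed = false;
			continue;
		}

		if (!is_shadow_present_pte(iter.old_spte))
			continue;

		/* ... zap the SPTE and set flush_needed = true ... */
	}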

> +			flush_needed = false;
> +			/*
> +			 * Yielding caused the paging structure walk to be
> +			 * reset so skip to the next iteration to continue the
> +			 * walk from the root.
> +			 */
> +			continue;
> +		}
> +
>  		if (!is_shadow_present_pte(iter.old_spte))
>  			continue;
>
Ben Gardon Jan. 22, 2021, 1:06 a.m. UTC | #2
On Wed, Jan 20, 2021 at 11:28 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Jan 12, 2021, Ben Gardon wrote:
> > There are two problems with the way the TDP MMU yields in long running
> > functions. 1.) Given certain conditions, the function may not yield
> > reliably / frequently enough. 2.) In some functions the TDP iter risks
> > not making forward progress if two threads livelock yielding to
> > one another.
> >
> > Case 1 is possible if for example, a paging structure was very large
> > but had few, if any writable entries. wrprot_gfn_range could traverse many
> > entries before finding a writable entry and yielding.
> >
> > Case 2 is possible if two threads were trying to execute wrprot_gfn_range.
> > Each could write protect an entry and then yield. This would reset the
> > tdp_iter's walk over the paging structure and the loop would end up
> > repeating the same entry over and over, preventing either thread from
> > making forward progress.
> >
> > Fix these issues by moving the yield to the beginning of the loop,
> > before other checks and only yielding if the loop has made forward
> > progress since the last yield.
>
> I think it'd be best to split this into two patches, e.g. ensure forward
> progress and then yield more aggressively.  They are two separate bugs, and I
> don't think that ensuring forward progress would exacerbate case #1.  I'm not
> worried about breaking things so much as getting more helpful shortlogs; "Fix
> yielding in TDP MMU" doesn't provide any insight into what exactly was broken.
> E.g. something like:
>
>   KVM: x86/mmu: Ensure forward progress when yielding in TDP MMU iter
>   KVM: x86/mmu: Yield in TDP MMU iter even if no real work was done
>
> > Fixes: a6a0b05da9f3 ("kvm: x86/mmu: Support dirty logging for the TDP MMU")
> > Reviewed-by: Peter Feiner <pfeiner@google.com>
> >
> > Signed-off-by: Ben Gardon <bgardon@google.com>
> > ---
> >  arch/x86/kvm/mmu/tdp_mmu.c | 83 +++++++++++++++++++++++++++++++-------
> >  1 file changed, 69 insertions(+), 14 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> > index b2784514ca2d..1987da0da66e 100644
> > --- a/arch/x86/kvm/mmu/tdp_mmu.c
> > +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> > @@ -470,9 +470,23 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
> >                         gfn_t start, gfn_t end, bool can_yield)
> >  {
> >       struct tdp_iter iter;
> > +     gfn_t last_goal_gfn = start;
> >       bool flush_needed = false;
> >
> >       tdp_root_for_each_pte(iter, root, start, end) {
> > +             /* Ensure forward progress has been made before yielding. */
> > +             if (can_yield && iter.goal_gfn != last_goal_gfn &&
>
> Make last_goal_gfn a property of the iterator, that way all this logic can be
> shoved into tdp_mmu_iter_flush_cond_resched(), and the comments about ensuring
> forward progress and effectively invalidating/resetting the iterator (the
> comment below) can be a function comment, as opposed to being copied everywhere.
> E.g. there can be a big scary warning in the function comment stating that the
> caller must restart its loop if the helper yielded.
>
> Tangentially related, the name goal_gfn is quite confusing.  "goal" and "end"
> are synonyms, but "goal" is often initialized with "start", and it's not used to
> terminate the walk.  Maybe next_gfn instead?  And maybe yielded_gfn, since
> last_next_gfn is pretty horrendous.

All these are excellent suggestions and definitely make the code
cleaner. I'll adopt yielded_gfn. While I agree goal_gfn is
a little odd, I think next_gfn could be more misleading because the
goal_gfn is really more of a target than the next step. It might take
4 or 5 steps to actually reach a last-level entry mapping that gfn.
target_last_level_gfn or next_last_level_gfn would probably be the
most accurate option.


Patch

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index b2784514ca2d..1987da0da66e 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -470,9 +470,23 @@  static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 			  gfn_t start, gfn_t end, bool can_yield)
 {
 	struct tdp_iter iter;
+	gfn_t last_goal_gfn = start;
 	bool flush_needed = false;
 
 	tdp_root_for_each_pte(iter, root, start, end) {
+		/* Ensure forward progress has been made before yielding. */
+		if (can_yield && iter.goal_gfn != last_goal_gfn &&
+		    tdp_mmu_iter_flush_cond_resched(kvm, &iter)) {
+			last_goal_gfn = iter.goal_gfn;
+			flush_needed = false;
+			/*
+			 * Yielding caused the paging structure walk to be
+			 * reset so skip to the next iteration to continue the
+			 * walk from the root.
+			 */
+			continue;
+		}
+
 		if (!is_shadow_present_pte(iter.old_spte))
 			continue;
 
@@ -487,12 +501,7 @@  static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 			continue;
 
 		tdp_mmu_set_spte(kvm, &iter, 0);
-
-		if (can_yield)
-			flush_needed = !tdp_mmu_iter_flush_cond_resched(kvm,
-									&iter);
-		else
-			flush_needed = true;
+		flush_needed = true;
 	}
 	return flush_needed;
 }
@@ -850,12 +859,25 @@  static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 {
 	struct tdp_iter iter;
 	u64 new_spte;
+	gfn_t last_goal_gfn = start;
 	bool spte_set = false;
 
 	BUG_ON(min_level > KVM_MAX_HUGEPAGE_LEVEL);
 
 	for_each_tdp_pte_min_level(iter, root->spt, root->role.level,
 				   min_level, start, end) {
+		/* Ensure forward progress has been made before yielding. */
+		if (iter.goal_gfn != last_goal_gfn &&
+		    tdp_mmu_iter_cond_resched(kvm, &iter)) {
+			last_goal_gfn = iter.goal_gfn;
+			/*
+			 * Yielding caused the paging structure walk to be
+			 * reset so skip to the next iteration to continue the
+			 * walk from the root.
+			 */
+			continue;
+		}
+
 		if (!is_shadow_present_pte(iter.old_spte) ||
 		    !is_last_spte(iter.old_spte, iter.level))
 			continue;
@@ -864,8 +886,6 @@  static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 
 		tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte);
 		spte_set = true;
-
-		tdp_mmu_iter_cond_resched(kvm, &iter);
 	}
 	return spte_set;
 }
@@ -906,9 +926,22 @@  static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 {
 	struct tdp_iter iter;
 	u64 new_spte;
+	gfn_t last_goal_gfn = start;
 	bool spte_set = false;
 
 	tdp_root_for_each_leaf_pte(iter, root, start, end) {
+		/* Ensure forward progress has been made before yielding. */
+		if (iter.goal_gfn != last_goal_gfn &&
+		    tdp_mmu_iter_cond_resched(kvm, &iter)) {
+			last_goal_gfn = iter.goal_gfn;
+			/*
+			 * Yielding caused the paging structure walk to be
+			 * reset so skip to the next iteration to continue the
+			 * walk from the root.
+			 */
+			continue;
+		}
+
 		if (spte_ad_need_write_protect(iter.old_spte)) {
 			if (is_writable_pte(iter.old_spte))
 				new_spte = iter.old_spte & ~PT_WRITABLE_MASK;
@@ -923,8 +956,6 @@  static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 
 		tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte);
 		spte_set = true;
-
-		tdp_mmu_iter_cond_resched(kvm, &iter);
 	}
 	return spte_set;
 }
@@ -1029,9 +1060,22 @@  static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 {
 	struct tdp_iter iter;
 	u64 new_spte;
+	gfn_t last_goal_gfn = start;
 	bool spte_set = false;
 
 	tdp_root_for_each_pte(iter, root, start, end) {
+		/* Ensure forward progress has been made before yielding. */
+		if (iter.goal_gfn != last_goal_gfn &&
+		    tdp_mmu_iter_cond_resched(kvm, &iter)) {
+			last_goal_gfn = iter.goal_gfn;
+			/*
+			 * Yielding caused the paging structure walk to be
+			 * reset so skip to the next iteration to continue the
+			 * walk from the root.
+			 */
+			continue;
+		}
+
 		if (!is_shadow_present_pte(iter.old_spte))
 			continue;
 
@@ -1039,8 +1083,6 @@  static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
 
 		tdp_mmu_set_spte(kvm, &iter, new_spte);
 		spte_set = true;
-
-		tdp_mmu_iter_cond_resched(kvm, &iter);
 	}
 
 	return spte_set;
@@ -1078,9 +1120,23 @@  static void zap_collapsible_spte_range(struct kvm *kvm,
 {
 	struct tdp_iter iter;
 	kvm_pfn_t pfn;
+	gfn_t last_goal_gfn = start;
 	bool spte_set = false;
 
 	tdp_root_for_each_pte(iter, root, start, end) {
+		/* Ensure forward progress has been made before yielding. */
+		if (iter.goal_gfn != last_goal_gfn &&
+		    tdp_mmu_iter_flush_cond_resched(kvm, &iter)) {
+			last_goal_gfn = iter.goal_gfn;
+			spte_set = false;
+			/*
+			 * Yielding caused the paging structure walk to be
+			 * reset so skip to the next iteration to continue the
+			 * walk from the root.
+			 */
+			continue;
+		}
+
 		if (!is_shadow_present_pte(iter.old_spte) ||
 		    is_last_spte(iter.old_spte, iter.level))
 			continue;
@@ -1091,8 +1147,7 @@  static void zap_collapsible_spte_range(struct kvm *kvm,
 			continue;
 
 		tdp_mmu_set_spte(kvm, &iter, 0);
-
-		spte_set = !tdp_mmu_iter_flush_cond_resched(kvm, &iter);
+		spte_set = true;
 	}
 
 	if (spte_set)