[v5,12/13] KVM: Optimize gfn lookup in kvm_zap_gfn_range()

Message ID	062df8ac9eb280440a5f0c11159616b1bbb1c2c4.1632171479.git.maciej.szmigiero@oracle.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <kvm-owner@kernel.org> From: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name> To: Paolo Bonzini <pbonzini@redhat.com>, Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Sean Christopherson <seanjc@google.com>, Wanpeng Li <wanpengli@tencent.com>, Jim Mattson <jmattson@google.com>, Igor Mammedov <imammedo@redhat.com>, Marc Zyngier <maz@kernel.org>, James Morse <james.morse@arm.com>, Julien Thierry <julien.thierry.kdev@gmail.com>, Suzuki K Poulose <suzuki.poulose@arm.com>, Huacai Chen <chenhuacai@kernel.org>, Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>, Paul Mackerras <paulus@ozlabs.org>, Christian Borntraeger <borntraeger@de.ibm.com>, Janosch Frank <frankja@linux.ibm.com>, David Hildenbrand <david@redhat.com>, Cornelia Huck <cohuck@redhat.com>, Claudio Imbrenda <imbrenda@linux.ibm.com>, Joerg Roedel <joro@8bytes.org>, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v5 12/13] KVM: Optimize gfn lookup in kvm_zap_gfn_range() Date: Mon, 20 Sep 2021 23:39:00 +0200 Message-Id: <062df8ac9eb280440a5f0c11159616b1bbb1c2c4.1632171479.git.maciej.szmigiero@oracle.com> In-Reply-To: <cover.1632171478.git.maciej.szmigiero@oracle.com> References: <cover.1632171478.git.maciej.szmigiero@oracle.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	KVM: Scalable memslots implementation \| expand [v5,00/13] KVM: Scalable memslots implementation [v5,01/13] KVM: x86: Cache total page count to avoid traversing the memslot array [v5,02/13] KVM: x86: Don't call kvm_mmu_change_mmu_pages() if the count hasn't changed [v5,03/13] KVM: Add "old" memslot parameter to kvm_arch_prepare_memory_region() [v5,04/13] KVM: x86: Move n_memslots_pages recalc to kvm_arch_prepare_memory_region() [v5,05/13] KVM: Integrate gfn_to_memslot_approx() into search_memslots() [v5,06/13] KVM: Move WARN on invalid memslot index to update_memslots() [v5,07/13] KVM: Just resync arch fields when slots_arch_lock gets reacquired [v5,08/13] KVM: Resolve memslot ID via a hash table instead of via a static array [v5,09/13] KVM: Use interval tree to do fast hva lookup in memslots [v5,10/13] KVM: s390: Introduce kvm_s390_get_gfn_end() [v5,11/13] KVM: Keep memslots in tree-based structures instead of array-based ones [v5,12/13] KVM: Optimize gfn lookup in kvm_zap_gfn_range() [v5,13/13] KVM: Optimize overlapping memslots check

Message ID

062df8ac9eb280440a5f0c11159616b1bbb1c2c4.1632171479.git.maciej.szmigiero@oracle.com (mailing list archive)

State

New, archived

Headers

From: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name>
To: Paolo Bonzini <pbonzini@redhat.com>,
        Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>,
        Wanpeng Li <wanpengli@tencent.com>,
        Jim Mattson <jmattson@google.com>,
        Igor Mammedov <imammedo@redhat.com>,
        Marc Zyngier <maz@kernel.org>,
        James Morse <james.morse@arm.com>,
        Julien Thierry <julien.thierry.kdev@gmail.com>,
        Suzuki K Poulose <suzuki.poulose@arm.com>,
        Huacai Chen <chenhuacai@kernel.org>,
        Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>,
        Paul Mackerras <paulus@ozlabs.org>,
        Christian Borntraeger <borntraeger@de.ibm.com>,
        Janosch Frank <frankja@linux.ibm.com>,
        David Hildenbrand <david@redhat.com>,
        Cornelia Huck <cohuck@redhat.com>,
        Claudio Imbrenda <imbrenda@linux.ibm.com>,
        Joerg Roedel <joro@8bytes.org>, kvm@vger.kernel.org,
        linux-kernel@vger.kernel.org
Subject: [PATCH v5 12/13] KVM: Optimize gfn lookup in kvm_zap_gfn_range()
Date: Mon, 20 Sep 2021 23:39:00 +0200
Message-Id: 
 <062df8ac9eb280440a5f0c11159616b1bbb1c2c4.1632171479.git.maciej.szmigiero@oracle.com>
In-Reply-To: <cover.1632171478.git.maciej.szmigiero@oracle.com>
References: <cover.1632171478.git.maciej.szmigiero@oracle.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk

Series

KVM: Scalable memslots implementation | expand

Commit Message

Maciej S. Szmigiero Sept. 20, 2021, 9:39 p.m. UTC

From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>

Introduce a memslots gfn upper bound operation and use it to optimize
kvm_zap_gfn_range().
This way this handler can do a quick lookup for intersecting gfns and won't
have to do a linear scan of the whole memslot set.

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
 arch/x86/kvm/mmu/mmu.c   | 11 +++++--
 include/linux/kvm_host.h | 69 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 78 insertions(+), 2 deletions(-)

Comments

Sean Christopherson Oct. 20, 2021, 11:47 p.m. UTC | #1

On Mon, Sep 20, 2021, Maciej S. Szmigiero wrote:

Some mechanical comments while they're on my mind, I'll get back to a full review
tomorrow.

> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 6433efff447a..9ae5f7341cf5 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -833,6 +833,75 @@ struct kvm_memory_slot *id_to_memslot(struct kvm_memslots *slots, int id)
>  	return NULL;
>  }
>  
> +static inline
> +struct rb_node *kvm_memslots_gfn_upper_bound(struct kvm_memslots *slots, gfn_t gfn)

Function attributes should go on the same line as the function unless there's a
really good reason to do otherwise.

In this case, I would honestly just drop the helper.  It's really hard to express
what this function does in a name that isn't absurdly long, and there's exactly
one user at the end of the series.

https://lkml.kernel.org/r/20210930192417.1332877-1-keescook@chromium.org

> +{
> +	int idx = slots->node_idx;
> +	struct rb_node *node, *result = NULL;
> +
> +	for (node = slots->gfn_tree.rb_node; node; ) {
> +		struct kvm_memory_slot *slot;

My personal preference is to put declarations outside of the for loop.  I find it
easier to read, it's harder to have shadowing issues if all variables are declared
at the top, especially when using relatively generic names.

> +
> +		slot = container_of(node, struct kvm_memory_slot, gfn_node[idx]);
> +		if (gfn < slot->base_gfn) {
> +			result = node;
> +			node = node->rb_left;
> +		} else

Needs braces since the "if" has braces.

> +			node = node->rb_right;
> +	}
> +
> +	return result;
> +}
> +
> +static inline
> +struct rb_node *kvm_for_each_in_gfn_first(struct kvm_memslots *slots, gfn_t start)

The kvm_for_each_in_gfn prefix is _really_ confusing.  I get that these are all
helpers for "kvm_for_each_memslot...", but it's hard not to think these are all
iterators on their own.  I would gladly sacrifice namespacing for readability in
this case.

I also wouldn't worry about capturing the details.  For most folks reading this
code, the important part is understanding the control flow of
kvm_for_each_memslot_in_gfn_range().  Capturing the under-the-hood details in the
name isn't a priority since anyone modifying this code is going to have to do a
lot of staring no matter what :-)

> +static inline
> +bool kvm_for_each_in_gfn_no_more(struct kvm_memslots *slots, struct rb_node *node, gfn_t end)
> +{
> +	struct kvm_memory_slot *memslot;
> +
> +	memslot = container_of(node, struct kvm_memory_slot, gfn_node[slots->node_idx]);
> +
> +	/*
> +	 * If this slot starts beyond or at the end of the range so does
> +	 * every next one
> +	 */
> +	return memslot->base_gfn >= end;
> +}
> +
> +/* Iterate over each memslot *possibly* intersecting [start, end) range */
> +#define kvm_for_each_memslot_in_gfn_range(node, slots, start, end)	\
> +	for (node = kvm_for_each_in_gfn_first(slots, start);		\
> +	     node && !kvm_for_each_in_gfn_no_more(slots, node, end);	\

I think it makes sense to move the NULL check into the validation helper?  We had
a similar case in KVM's legacy MMU where a "null" check was left to the caller,
and it ended up with a bunch of redundant and confusing code.  I don't see that
happening here, but at the same time it's odd for the validator to not sanity
check @node.

> +	     node = rb_next(node))					\

It's silly, but I'd add a wrapper for this one, just to make it easy to follow
the control flow.

Maybe this as delta?  I'm definitely not set on the names, was just trying to
find something that's short and to the point.

---
 include/linux/kvm_host.h | 60 +++++++++++++++++++++-------------------
 1 file changed, 31 insertions(+), 29 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 9ae5f7341cf5..a88bd5d9e4aa 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -833,36 +833,29 @@ struct kvm_memory_slot *id_to_memslot(struct kvm_memslots *slots, int id)
 	return NULL;
 }

-static inline
-struct rb_node *kvm_memslots_gfn_upper_bound(struct kvm_memslots *slots, gfn_t gfn)
+static inline struct rb_node *kvm_get_first_node(struct kvm_memslots *slots,
+						 gfn_t start)
 {
+	struct kvm_memory_slot *slot;
+	struct rb_node *node, *tmp;
 	int idx = slots->node_idx;
-	struct rb_node *node, *result = NULL;
-
-	for (node = slots->gfn_tree.rb_node; node; ) {
-		struct kvm_memory_slot *slot;
-
-		slot = container_of(node, struct kvm_memory_slot, gfn_node[idx]);
-		if (gfn < slot->base_gfn) {
-			result = node;
-			node = node->rb_left;
-		} else
-			node = node->rb_right;
-	}
-
-	return result;
-}
-
-static inline
-struct rb_node *kvm_for_each_in_gfn_first(struct kvm_memslots *slots, gfn_t start)
-{
-	struct rb_node *node;

 	/*
 	 * Find the slot with the lowest gfn that can possibly intersect with
 	 * the range, so we'll ideally have slot start <= range start
 	 */
-	node = kvm_memslots_gfn_upper_bound(slots, start);
+	node = NULL;
+	for (tmp = slots->gfn_tree.rb_node; tmp; ) {
+
+		slot = container_of(node, struct kvm_memory_slot, gfn_node[idx]);
+		if (gfn < slot->base_gfn) {
+			node = tmp;
+			tmp = tmp->rb_left;
+		} else {
+			tmp = tmp->rb_right;
+		}
+	}
+
 	if (node) {
 		struct rb_node *pnode;

@@ -882,12 +875,16 @@ struct rb_node *kvm_for_each_in_gfn_first(struct kvm_memslots *slots, gfn_t star
 	return node;
 }

-static inline
-bool kvm_for_each_in_gfn_no_more(struct kvm_memslots *slots, struct rb_node *node, gfn_t end)
+static inline bool kvm_is_last_node(struct kvm_memslots *slots,
+				    struct rb_node *node, gfn_t end)
 {
 	struct kvm_memory_slot *memslot;

-	memslot = container_of(node, struct kvm_memory_slot, gfn_node[slots->node_idx]);
+	if (!node)
+		return true;
+
+	memslot = container_of(node, struct kvm_memory_slot,
+			       gfn_node[slots->node_idx]);

 	/*
 	 * If this slot starts beyond or at the end of the range so does
@@ -896,11 +893,16 @@ bool kvm_for_each_in_gfn_no_more(struct kvm_memslots *slots, struct rb_node *nod
 	return memslot->base_gfn >= end;
 }

+static inline bool kvm_get_next_node(struct rb_node *node)
+{
+	return rb_next(node)
+}
+
 /* Iterate over each memslot *possibly* intersecting [start, end) range */
 #define kvm_for_each_memslot_in_gfn_range(node, slots, start, end)	\
-	for (node = kvm_for_each_in_gfn_first(slots, start);		\
-	     node && !kvm_for_each_in_gfn_no_more(slots, node, end);	\
-	     node = rb_next(node))					\
+	for (node = kvm_get_first_node(slots, start);			\
+	     !kvm_is_last_node(slots, node, end);			\
+	     node = kvm_get_next_node(node))				\

 /*
  * KVM_SET_USER_MEMORY_REGION ioctl allows the following operations:
--

Maciej S. Szmigiero Oct. 21, 2021, 2:16 p.m. UTC | #2

On 21.10.2021 01:47, Sean Christopherson wrote:
> On Mon, Sep 20, 2021, Maciej S. Szmigiero wrote:
> 
> Some mechanical comments while they're on my mind, I'll get back to a full review
> tomorrow.
> 
>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>> index 6433efff447a..9ae5f7341cf5 100644
>> --- a/include/linux/kvm_host.h
>> +++ b/include/linux/kvm_host.h
>> @@ -833,6 +833,75 @@ struct kvm_memory_slot *id_to_memslot(struct kvm_memslots *slots, int id)
>>   	return NULL;
>>   }
>>   
>> +static inline
>> +struct rb_node *kvm_memslots_gfn_upper_bound(struct kvm_memslots *slots, gfn_t gfn)
> 
> Function attributes should go on the same line as the function unless there's a
> really good reason to do otherwise.

Here the reason was a long line length, which was 84 characters even with
function attributes moved to a separate line.

include/linux/kvm_host.h contains a lot of helpers written in a similar
style:
> static inline gfn_t
> hva_to_gfn_memslot(unsigned long hva, struct kvm_memory_slot *slot)

> In this case, I would honestly just drop the helper.  It's really hard to express
> what this function does in a name that isn't absurdly long, and there's exactly
> one user at the end of the series.

The "upper bound" is a common name for a binary search operation that
finds the first node that has its key strictly greater than the searched key.

It can be integrated into its caller but I would leave a comment there
describing what kind of operation that block of code does to aid in
understanding the code.

Although, to be honest, I don't quite get the reason for doing this
considering that you want to put a single "rb_next()" call into its own
helper for clarity below.

> https://lkml.kernel.org/r/20210930192417.1332877-1-keescook@chromium.org
> 
>> +{
>> +	int idx = slots->node_idx;
>> +	struct rb_node *node, *result = NULL;
>> +
>> +	for (node = slots->gfn_tree.rb_node; node; ) {
>> +		struct kvm_memory_slot *slot;
> 
> My personal preference is to put declarations outside of the for loop.  I find it
> easier to read, it's harder to have shadowing issues if all variables are declared
> at the top, especially when using relatively generic names.

Will do.

>> +
>> +		slot = container_of(node, struct kvm_memory_slot, gfn_node[idx]);
>> +		if (gfn < slot->base_gfn) {
>> +			result = node;
>> +			node = node->rb_left;
>> +		} else
> 
> Needs braces since the "if" has braces.

Will add them.

>> +			node = node->rb_right;
>> +	}
>> +
>> +	return result;
>> +}
>> +
>> +static inline
>> +struct rb_node *kvm_for_each_in_gfn_first(struct kvm_memslots *slots, gfn_t start)
> 
> The kvm_for_each_in_gfn prefix is _really_ confusing.  I get that these are all
> helpers for "kvm_for_each_memslot...", but it's hard not to think these are all
> iterators on their own.  I would gladly sacrifice namespacing for readability in
> this case.

"kvm_for_each_memslot_in_gfn_range" was your proposed name here:
https://lore.kernel.org/kvm/YK6GWUP107i5KAJo@google.com/

But no problem renaming it.

> I also wouldn't worry about capturing the details.  For most folks reading this
> code, the important part is understanding the control flow of
> kvm_for_each_memslot_in_gfn_range().  Capturing the under-the-hood details in the
> name isn't a priority since anyone modifying this code is going to have to do a
> lot of staring no matter what :-)

It's even better if somebody modifying complex code has to read it carefully
(within reason, obviously).

>> +static inline
>> +bool kvm_for_each_in_gfn_no_more(struct kvm_memslots *slots, struct rb_node *node, gfn_t end)
>> +{
>> +	struct kvm_memory_slot *memslot;
>> +
>> +	memslot = container_of(node, struct kvm_memory_slot, gfn_node[slots->node_idx]);
>> +
>> +	/*
>> +	 * If this slot starts beyond or at the end of the range so does
>> +	 * every next one
>> +	 */
>> +	return memslot->base_gfn >= end;
>> +}
>> +
>> +/* Iterate over each memslot *possibly* intersecting [start, end) range */
>> +#define kvm_for_each_memslot_in_gfn_range(node, slots, start, end)	\
>> +	for (node = kvm_for_each_in_gfn_first(slots, start);		\
>> +	     node && !kvm_for_each_in_gfn_no_more(slots, node, end);	\
> 
> I think it makes sense to move the NULL check into the validation helper?  We had
> a similar case in KVM's legacy MMU where a "null" check was left to the caller,
> and it ended up with a bunch of redundant and confusing code.  I don't see that
> happening here, but at the same time it's odd for the validator to not sanity
> check @node.
>

Will do.
  
>> +	     node = rb_next(node))					\
> 
> It's silly, but I'd add a wrapper for this one, just to make it easy to follow
> the control flow.
> 
> Maybe this as delta?  I'm definitely not set on the names, was just trying to
> find something that's short and to the point.

Thanks for the proposed patch, I added some comments inline below.

> ---
>   include/linux/kvm_host.h | 60 +++++++++++++++++++++-------------------
>   1 file changed, 31 insertions(+), 29 deletions(-)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 9ae5f7341cf5..a88bd5d9e4aa 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -833,36 +833,29 @@ struct kvm_memory_slot *id_to_memslot(struct kvm_memslots *slots, int id)
>   	return NULL;
>   }
> 
> -static inline
> -struct rb_node *kvm_memslots_gfn_upper_bound(struct kvm_memslots *slots, gfn_t gfn)
> +static inline struct rb_node *kvm_get_first_node(struct kvm_memslots *slots,
> +						 gfn_t start)
>   {
> +	struct kvm_memory_slot *slot;
> +	struct rb_node *node, *tmp;
>   	int idx = slots->node_idx;
> -	struct rb_node *node, *result = NULL;
> -
> -	for (node = slots->gfn_tree.rb_node; node; ) {
> -		struct kvm_memory_slot *slot;
> -
> -		slot = container_of(node, struct kvm_memory_slot, gfn_node[idx]);
> -		if (gfn < slot->base_gfn) {
> -			result = node;
> -			node = node->rb_left;
> -		} else
> -			node = node->rb_right;
> -	}
> -
> -	return result;
> -}
> -
> -static inline
> -struct rb_node *kvm_for_each_in_gfn_first(struct kvm_memslots *slots, gfn_t start)
> -{
> -	struct rb_node *node;
> 
>   	/*
>   	 * Find the slot with the lowest gfn that can possibly intersect with
>   	 * the range, so we'll ideally have slot start <= range start
>   	 */
> -	node = kvm_memslots_gfn_upper_bound(slots, start);
> +	node = NULL;
> +	for (tmp = slots->gfn_tree.rb_node; tmp; ) {
> +
> +		slot = container_of(node, struct kvm_memory_slot, gfn_node[idx]);

Here -------------------------------^ should be "tmp", not "node" since
you renamed "node" into "tmp" and "result" into "node" while integrating
this function into its caller.

> +		if (gfn < slot->base_gfn) {
> +			node = tmp;
> +			tmp = tmp->rb_left;
> +		} else {
> +			tmp = tmp->rb_right;
> +		}
> +	}
> +
>   	if (node) {
>   		struct rb_node *pnode;
> 
> @@ -882,12 +875,16 @@ struct rb_node *kvm_for_each_in_gfn_first(struct kvm_memslots *slots, gfn_t star
>   	return node;
>   }
> 
> -static inline
> -bool kvm_for_each_in_gfn_no_more(struct kvm_memslots *slots, struct rb_node *node, gfn_t end)
> +static inline bool kvm_is_last_node(struct kvm_memslots *slots,
> +				    struct rb_node *node, gfn_t end)

kvm_is_last_node() is a bit misleading since this function is supposed
to return true even on the last node, only returning false one node past
the last one (or when the tree runs out of nodes).

>   {
>   	struct kvm_memory_slot *memslot;
> 
> -	memslot = container_of(node, struct kvm_memory_slot, gfn_node[slots->node_idx]);
> +	if (!node)
> +		return true;
> +
> +	memslot = container_of(node, struct kvm_memory_slot,
> +			       gfn_node[slots->node_idx]);

You previously un-wrapped such lines, like for example in
https://lore.kernel.org/kvm/YK2GjzkWvjBcCFxn@google.com/ :
>> +		slot = container_of(node, struct kvm_memory_slot,
>> +				    gfn_node[idxactive]);
> 
> With 'idx', this can go on a single line.  It runs over by two chars, but the 80
> char limit is a soft limit, and IMO avoiding line breaks for things like this
> improves readability.


> 
>   	/*
>   	 * If this slot starts beyond or at the end of the range so does
> @@ -896,11 +893,16 @@ bool kvm_for_each_in_gfn_no_more(struct kvm_memslots *slots, struct rb_node *nod
>   	return memslot->base_gfn >= end;
>   }
> 
> +static inline bool kvm_get_next_node(struct rb_node *node)
> +{
> +	return rb_next(node)
> +}
> +
>   /* Iterate over each memslot *possibly* intersecting [start, end) range */
>   #define kvm_for_each_memslot_in_gfn_range(node, slots, start, end)	\
> -	for (node = kvm_for_each_in_gfn_first(slots, start);		\
> -	     node && !kvm_for_each_in_gfn_no_more(slots, node, end);	\
> -	     node = rb_next(node))					\
> +	for (node = kvm_get_first_node(slots, start);			\
> +	     !kvm_is_last_node(slots, node, end);			\
> +	     node = kvm_get_next_node(node))				\
> 
>   /*
>    * KVM_SET_USER_MEMORY_REGION ioctl allows the following operations:
> --
> 

Thanks,
Maciej

Sean Christopherson Oct. 21, 2021, 4:30 p.m. UTC | #3

On Thu, Oct 21, 2021, Maciej S. Szmigiero wrote:
> On 21.10.2021 01:47, Sean Christopherson wrote:
> > In this case, I would honestly just drop the helper.  It's really hard to express
> > what this function does in a name that isn't absurdly long, and there's exactly
> > one user at the end of the series.
> 
> The "upper bound" is a common name for a binary search operation that
> finds the first node that has its key strictly greater than the searched key.

Ah, that I did not know (obviously).  But I suspect that detail will be lost on
other readers as well, even if they are familiar with the terminology.

> It can be integrated into its caller but I would leave a comment there
> describing what kind of operation that block of code does to aid in
> understanding the code.

Yeah, completely agree a comment would be wonderful.

> Although, to be honest, I don't quite get the reason for doing this
> considering that you want to put a single "rb_next()" call into its own
> helper for clarity below.

The goal is to make the macro itself easy to understand, even if the reader may
not understand the underlying details.  The bare rb_next() forces the reader to
pause to think about exactly what "node" is, and perhaps even dive into the code
for the other helpers.

With something like this, a reader that doesn't know the memslots details can
get a good idea of the basic gist of the macro without having to even know the
type of "node".  Obviously someone writing code will need to know the type, but
for readers bouncing around code it's a detail they don't need to know.

#define kvm_for_each_memslot_in_gfn_range(node, slots, start, end)	\
	for (node = kvm_get_first_node(slots, start);			\
	     !kvm_is_valid_node(slots, node, end);			\
	     node = kvm_get_next_node(node))

Hmm, on that point, having the caller do

	memslot = container_of(node, struct kvm_memory_slot, gfn_node[idx]);

is more than a bit odd, and as is the case with the bare rb_next(), bleeds
implementation details into code that really doesn't care about implementation
details.  Eww, and looking closer, the caller also needs to grab slots->node_idx.

So while I would love to avoid an opaque iterator, adding one would be a net
positive in this case.  E.g.

/* Iterator used for walking memslots that overlap a gfn range. */
struct kvm_memslot_iterator iter {
        struct rb_node *node;
        struct kvm_memory_slot *memslot;
        struct kvm_memory_slots *slots;
	gfn_t start;
	gfn_t end;
} 

static inline void kvm_memslot_iter_start(struct kvm_memslot_iter *iter,
					  struct kvm_memslots *slots,
					  gfn_t start, gfn_t end)
{
	...
}

static inline bool kvm_memslot_iter_is_valid(struct kvm_memslot_iter *iter)
{
	/*
	 * If this slot starts beyond or at the end of the range so does
	 * every next one
	 */
	return iter->node && iter->memslot->base_gfn < end;
}

static inline void kvm_memslot_iter_next(struct kvm_memslot_iter *iter)
{
	iter->node = rb_next(iter->node);

	if (!iter->node)
		return;

	iter->memslot = container_of(iter->node, struct kvm_memory_slot,
				     gfn_node[iter->slots->node_idx]);
}

/* Iterate over each memslot *possibly* intersecting [start, end) range */
#define kvm_for_each_memslot_in_gfn_range(iter, node, slots, start, end) \
	for (kvm_memslot_iter_start(iter, node, slots, start, end);	 \
	     kvm_memslot_iter_is_valid(iter);				 \
	     kvm_memslot_iter_next(node))				 \

Ugh, this got me looking at kvm_zap_gfn_range(), and that thing is trainwreck.
There are three calls kvm_flush_remote_tlbs_with_address(), two of which should
be unnecessary, but become necessary because the last one is broken.  *sigh*

That'd also be a good excuse to extract the rmap loop to a separate helper.  Then
you don't need to constantly juggle the 80 char limit and variable collisions
while you're modifying this mess.  I'll post the attached patches separately
since the first one (two?) should go into 5.15.  They're compile tested only
at this point, but hopefully I've had enough coffee and they're safe to base
this series on top (note, they're based on kvm/queue, commit 73f122c4f06f ("KVM:
cleanup allocation of rmaps and page tracking data").

> > The kvm_for_each_in_gfn prefix is _really_ confusing.  I get that these are all
> > helpers for "kvm_for_each_memslot...", but it's hard not to think these are all
> > iterators on their own.  I would gladly sacrifice namespacing for readability in
> > this case.
> 
> "kvm_for_each_memslot_in_gfn_range" was your proposed name here:
> https://lore.kernel.org/kvm/YK6GWUP107i5KAJo@google.com/
> 
> But no problem renaming it.

Oh, I was commenting on the inner helpers.  The macro name itself is great. ;-)

> > @@ -882,12 +875,16 @@ struct rb_node *kvm_for_each_in_gfn_first(struct kvm_memslots *slots, gfn_t star
> >   	return node;
> >   }
> > 
> > -static inline
> > -bool kvm_for_each_in_gfn_no_more(struct kvm_memslots *slots, struct rb_node *node, gfn_t end)
> > +static inline bool kvm_is_last_node(struct kvm_memslots *slots,
> > +				    struct rb_node *node, gfn_t end)
> 
> kvm_is_last_node() is a bit misleading since this function is supposed
> to return true even on the last node, only returning false one node past
> the last one (or when the tree runs out of nodes).

Good point.  I didn't love the name when I suggested either.  What about
kvm_is_valid_node()?

> >   {
> >   	struct kvm_memory_slot *memslot;
> > 
> > -	memslot = container_of(node, struct kvm_memory_slot, gfn_node[slots->node_idx]);
> > +	if (!node)
> > +		return true;
> > +
> > +	memslot = container_of(node, struct kvm_memory_slot,
> > +			       gfn_node[slots->node_idx]);
> 
> You previously un-wrapped such lines, like for example in
> https://lore.kernel.org/kvm/YK2GjzkWvjBcCFxn@google.com/ :

Heh, yes, the balance between "too long" and "hard to read" is subjective.  The
ones I usually let run over are cases where it's a short word on the end, the
overrun is only a couple chars, the statement is the sole line of an if/else
statement, there's a null/validity check immediately following etc...

All that said, I don't have a strong opinion on this one, I'm a-ok if you want to
let it run over.

> > > +		slot = container_of(node, struct kvm_memory_slot,
> > > +				    gfn_node[idxactive]);
> > 
> > With 'idx', this can go on a single line.  It runs over by two chars, but the 80
> > char limit is a soft limit, and IMO avoiding line breaks for things like this
> > improves readability.
> 
> 
> > 
> >   	/*
> >   	 * If this slot starts beyond or at the end of the range so does
> > @@ -896,11 +893,16 @@ bool kvm_for_each_in_gfn_no_more(struct kvm_memslots *slots, struct rb_node *nod
> >   	return memslot->base_gfn >= end;
> >   }
> > 
> > +static inline bool kvm_get_next_node(struct rb_node *node)
> > +{
> > +	return rb_next(node)
> > +}
> > +
> >   /* Iterate over each memslot *possibly* intersecting [start, end) range */
> >   #define kvm_for_each_memslot_in_gfn_range(node, slots, start, end)	\
> > -	for (node = kvm_for_each_in_gfn_first(slots, start);		\
> > -	     node && !kvm_for_each_in_gfn_no_more(slots, node, end);	\
> > -	     node = rb_next(node))					\
> > +	for (node = kvm_get_first_node(slots, start);			\
> > +	     !kvm_is_last_node(slots, node, end);			\
> > +	     node = kvm_get_next_node(node))				\
> > 
> >   /*
> >    * KVM_SET_USER_MEMORY_REGION ioctl allows the following operations:
> > --
> > 
> 
> Thanks,
> Maciej

Maciej S. Szmigiero Oct. 21, 2021, 9:44 p.m. UTC | #4

On 21.10.2021 18:30, Sean Christopherson wrote:
> On Thu, Oct 21, 2021, Maciej S. Szmigiero wrote:
>> On 21.10.2021 01:47, Sean Christopherson wrote:
>>> In this case, I would honestly just drop the helper.  It's really hard to express
>>> what this function does in a name that isn't absurdly long, and there's exactly
>>> one user at the end of the series.
>>
>> The "upper bound" is a common name for a binary search operation that
>> finds the first node that has its key strictly greater than the searched key.
> 
> Ah, that I did not know (obviously).  But I suspect that detail will be lost on
> other readers as well, even if they are familiar with the terminology.
> 
>> It can be integrated into its caller but I would leave a comment there
>> describing what kind of operation that block of code does to aid in
>> understanding the code.
> 
> Yeah, completely agree a comment would be wonderful.

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index a05e581ef210..f2859988d630 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5724,18 +5724,25 @@  void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
 	int i;
 	bool flush = false;
 
+	if (WARN_ON_ONCE(gfn_end <= gfn_start))
+		return;
+
 	write_lock(&kvm->mmu_lock);
 
 	kvm_inc_notifier_count(kvm, gfn_start, gfn_end);
 
 	if (kvm_memslots_have_rmaps(kvm)) {
 		for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
-			int bkt;
+			int idx;
+			struct rb_node *node;
 
 			slots = __kvm_memslots(kvm, i);
-			kvm_for_each_memslot(memslot, bkt, slots) {
+			idx = slots->node_idx;
+
+			kvm_for_each_memslot_in_gfn_range(node, slots, gfn_start, gfn_end) {
 				gfn_t start, end;
 
+				memslot = container_of(node, struct kvm_memory_slot, gfn_node[idx]);
 				start = max(gfn_start, memslot->base_gfn);
 				end = min(gfn_end, memslot->base_gfn + memslot->npages);
 				if (start >= end)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 6433efff447a..9ae5f7341cf5 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -833,6 +833,75 @@  struct kvm_memory_slot *id_to_memslot(struct kvm_memslots *slots, int id)
 	return NULL;
 }
 
+static inline
+struct rb_node *kvm_memslots_gfn_upper_bound(struct kvm_memslots *slots, gfn_t gfn)
+{
+	int idx = slots->node_idx;
+	struct rb_node *node, *result = NULL;
+
+	for (node = slots->gfn_tree.rb_node; node; ) {
+		struct kvm_memory_slot *slot;
+
+		slot = container_of(node, struct kvm_memory_slot, gfn_node[idx]);
+		if (gfn < slot->base_gfn) {
+			result = node;
+			node = node->rb_left;
+		} else
+			node = node->rb_right;
+	}
+
+	return result;
+}
+
+static inline
+struct rb_node *kvm_for_each_in_gfn_first(struct kvm_memslots *slots, gfn_t start)
+{
+	struct rb_node *node;
+
+	/*
+	 * Find the slot with the lowest gfn that can possibly intersect with
+	 * the range, so we'll ideally have slot start <= range start
+	 */
+	node = kvm_memslots_gfn_upper_bound(slots, start);
+	if (node) {
+		struct rb_node *pnode;
+
+		/*
+		 * A NULL previous node means that the very first slot
+		 * already has a higher start gfn.
+		 * In this case slot start > range start.
+		 */
+		pnode = rb_prev(node);
+		if (pnode)
+			node = pnode;
+	} else {
+		/* a NULL node below means no slots */
+		node = rb_last(&slots->gfn_tree);
+	}
+
+	return node;
+}
+
+static inline
+bool kvm_for_each_in_gfn_no_more(struct kvm_memslots *slots, struct rb_node *node, gfn_t end)
+{
+	struct kvm_memory_slot *memslot;
+
+	memslot = container_of(node, struct kvm_memory_slot, gfn_node[slots->node_idx]);
+
+	/*
+	 * If this slot starts beyond or at the end of the range so does
+	 * every next one
+	 */
+	return memslot->base_gfn >= end;
+}
+
+/* Iterate over each memslot *possibly* intersecting [start, end) range */
+#define kvm_for_each_memslot_in_gfn_range(node, slots, start, end)	\
+	for (node = kvm_for_each_in_gfn_first(slots, start);		\
+	     node && !kvm_for_each_in_gfn_no_more(slots, node, end);	\
+	     node = rb_next(node))					\
+
 /*
  * KVM_SET_USER_MEMORY_REGION ioctl allows the following operations:
  * - create a new memory slot

[v5,12/13] KVM: Optimize gfn lookup in kvm_zap_gfn_range()

Commit Message

Comments

Patch