diff mbox series

[v2,1/5] KVM: x86: Introduce KVM_REQ_RING_SOFT_FULL

Message ID 20220916045135.154505-2-gshan@redhat.com (mailing list archive)
State New
Headers show
Series KVM: arm64: Enable ring-based dirty memory tracking | expand

Commit Message

Gavin Shan Sept. 16, 2022, 4:51 a.m. UTC
This adds KVM_REQ_RING_SOFT_FULL, which is raised when the dirty
ring of the specific VCPU becomes softly full in kvm_dirty_ring_push().
The VCPU is enforced to exit when the request is raised and its
dirty ring is softly full on its entrance.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 arch/x86/kvm/x86.c       | 5 +++--
 include/linux/kvm_host.h | 1 +
 virt/kvm/dirty_ring.c    | 4 ++++
 3 files changed, 8 insertions(+), 2 deletions(-)

Comments

Peter Xu Sept. 16, 2022, 6:09 p.m. UTC | #1
On Fri, Sep 16, 2022 at 12:51:31PM +0800, Gavin Shan wrote:
> This adds KVM_REQ_RING_SOFT_FULL, which is raised when the dirty
> ring of the specific VCPU becomes softly full in kvm_dirty_ring_push().
> The VCPU is enforced to exit when the request is raised and its
> dirty ring is softly full on its entrance.
> 
> Suggested-by: Marc Zyngier <maz@kernel.org>
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>  arch/x86/kvm/x86.c       | 5 +++--
>  include/linux/kvm_host.h | 1 +
>  virt/kvm/dirty_ring.c    | 4 ++++
>  3 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 43a6a7efc6ec..7f368f59f033 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10265,8 +10265,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  	bool req_immediate_exit = false;
>  
>  	/* Forbid vmenter if vcpu dirty ring is soft-full */
> -	if (unlikely(vcpu->kvm->dirty_ring_size &&
> -		     kvm_dirty_ring_soft_full(&vcpu->dirty_ring))) {
> +	if (kvm_check_request(KVM_REQ_RING_SOFT_FULL, vcpu) &&
> +	    kvm_dirty_ring_soft_full(&vcpu->dirty_ring)) {
> +		kvm_make_request(KVM_REQ_RING_SOFT_FULL, vcpu);
>  		vcpu->run->exit_reason = KVM_EXIT_DIRTY_RING_FULL;
>  		trace_kvm_dirty_ring_exit(vcpu);
>  		r = 0;

As commented previously - can we use kvm_test_request() instead? because we
don't want to unconditionally clear the bit.  Instead of making the request
again, we can clear request only if !full.

We can also safely move this into the block of below kvm_request_pending()
as Marc used to suggest.

To explicitly use kvm_clear_request(), we may need to be careful on the
memory barriers.  I'm wondering whether we should have moved
smp_mb__after_atomic() into kvm_clear_request() because kvm_clear_request()
is used outside kvm_check_request() and IIUC all the call sites should
better have that barrier too to be safe.

Side note: when I read the code around I also see some mis-use of clear
request where it can be omitted, e.g.:

		if (kvm_check_request(KVM_REQ_UNHALT, vcpu)) {
			kvm_clear_request(KVM_REQ_UNHALT, vcpu);
			vcpu->run->exit_reason = KVM_EXIT_IRQ_WINDOW_OPEN;
		}

Maybe it's a sign of bad naming, so we should renamed kvm_check_request()
to kvm_test_clear_request() too to show that clearing after that is not
needed?

Thanks,
Marc Zyngier Sept. 18, 2022, 9 a.m. UTC | #2
On Fri, 16 Sep 2022 19:09:52 +0100,
Peter Xu <peterx@redhat.com> wrote:
> 
> On Fri, Sep 16, 2022 at 12:51:31PM +0800, Gavin Shan wrote:
> > This adds KVM_REQ_RING_SOFT_FULL, which is raised when the dirty
> > ring of the specific VCPU becomes softly full in kvm_dirty_ring_push().
> > The VCPU is enforced to exit when the request is raised and its
> > dirty ring is softly full on its entrance.
> > 
> > Suggested-by: Marc Zyngier <maz@kernel.org>
> > Signed-off-by: Gavin Shan <gshan@redhat.com>
> > ---
> >  arch/x86/kvm/x86.c       | 5 +++--
> >  include/linux/kvm_host.h | 1 +
> >  virt/kvm/dirty_ring.c    | 4 ++++
> >  3 files changed, 8 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 43a6a7efc6ec..7f368f59f033 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -10265,8 +10265,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >  	bool req_immediate_exit = false;
> >  
> >  	/* Forbid vmenter if vcpu dirty ring is soft-full */
> > -	if (unlikely(vcpu->kvm->dirty_ring_size &&
> > -		     kvm_dirty_ring_soft_full(&vcpu->dirty_ring))) {
> > +	if (kvm_check_request(KVM_REQ_RING_SOFT_FULL, vcpu) &&
> > +	    kvm_dirty_ring_soft_full(&vcpu->dirty_ring)) {
> > +		kvm_make_request(KVM_REQ_RING_SOFT_FULL, vcpu);
> >  		vcpu->run->exit_reason = KVM_EXIT_DIRTY_RING_FULL;
> >  		trace_kvm_dirty_ring_exit(vcpu);
> >  		r = 0;
> 
> As commented previously - can we use kvm_test_request() instead? because we
> don't want to unconditionally clear the bit.  Instead of making the request
> again, we can clear request only if !full.

I have the feeling that this is a micro-optimisation that won't lead
to much benefit in practice. You already have the cache line, just not
in exclusive mode, and given that this is per-vcpu, you'd only see the
cost if someone else is setting a request to this vcpu while
evaluating the local requests.

And now you need extra barriers...

Also, can we please refrain from changing things without data showing
that this actually is worse than what we had before? The point below
makes me think that this is actually beneficial as is.

> We can also safely move this into the block of below kvm_request_pending()
> as Marc used to suggest.

This, on the other hand, makes sure that we share the cost across all
requests. Requests should be extremely rare anyway (and if they
aren't, you have a whole lot of performance issues on your hands
anyway).

> 
> To explicitly use kvm_clear_request(), we may need to be careful on the
> memory barriers.  I'm wondering whether we should have moved
> smp_mb__after_atomic() into kvm_clear_request() because kvm_clear_request()
> is used outside kvm_check_request() and IIUC all the call sites should
> better have that barrier too to be safe.
>
> Side note: when I read the code around I also see some mis-use of clear
> request where it can be omitted, e.g.:
> 
> 		if (kvm_check_request(KVM_REQ_UNHALT, vcpu)) {
> 			kvm_clear_request(KVM_REQ_UNHALT, vcpu);
> 			vcpu->run->exit_reason = KVM_EXIT_IRQ_WINDOW_OPEN;
> 		}
> 
> Maybe it's a sign of bad naming, so we should renamed kvm_check_request()
> to kvm_test_clear_request() too to show that clearing after that is not
> needed?

Yeah, this kvm_clear_request() is superfluous. But this is rather well
documented, for once, and I don't think we should repaint it based on
a sample of one.

Thanks,

	M.
Gavin Shan Sept. 18, 2022, 11:58 p.m. UTC | #3
On 9/18/22 7:00 PM, Marc Zyngier wrote:
> On Fri, 16 Sep 2022 19:09:52 +0100,
> Peter Xu <peterx@redhat.com> wrote:
>>
>> On Fri, Sep 16, 2022 at 12:51:31PM +0800, Gavin Shan wrote:
>>> This adds KVM_REQ_RING_SOFT_FULL, which is raised when the dirty
>>> ring of the specific VCPU becomes softly full in kvm_dirty_ring_push().
>>> The VCPU is enforced to exit when the request is raised and its
>>> dirty ring is softly full on its entrance.
>>>
>>> Suggested-by: Marc Zyngier <maz@kernel.org>
>>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>>> ---
>>>   arch/x86/kvm/x86.c       | 5 +++--
>>>   include/linux/kvm_host.h | 1 +
>>>   virt/kvm/dirty_ring.c    | 4 ++++
>>>   3 files changed, 8 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 43a6a7efc6ec..7f368f59f033 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -10265,8 +10265,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>   	bool req_immediate_exit = false;
>>>   
>>>   	/* Forbid vmenter if vcpu dirty ring is soft-full */
>>> -	if (unlikely(vcpu->kvm->dirty_ring_size &&
>>> -		     kvm_dirty_ring_soft_full(&vcpu->dirty_ring))) {
>>> +	if (kvm_check_request(KVM_REQ_RING_SOFT_FULL, vcpu) &&
>>> +	    kvm_dirty_ring_soft_full(&vcpu->dirty_ring)) {
>>> +		kvm_make_request(KVM_REQ_RING_SOFT_FULL, vcpu);
>>>   		vcpu->run->exit_reason = KVM_EXIT_DIRTY_RING_FULL;
>>>   		trace_kvm_dirty_ring_exit(vcpu);
>>>   		r = 0;
>>
>> As commented previously - can we use kvm_test_request() instead? because we
>> don't want to unconditionally clear the bit.  Instead of making the request
>> again, we can clear request only if !full.
> 
> I have the feeling that this is a micro-optimisation that won't lead
> to much benefit in practice. You already have the cache line, just not
> in exclusive mode, and given that this is per-vcpu, you'd only see the
> cost if someone else is setting a request to this vcpu while
> evaluating the local requests.
> 
> And now you need extra barriers...
> 
> Also, can we please refrain from changing things without data showing
> that this actually is worse than what we had before? The point below
> makes me think that this is actually beneficial as is.
> 

I think Marc's explanation makes sense. It won't make difference in terms
of performance. We need to explicitly handle barrier when kvm_test_request()
is used. So I prefer to keep the code if Peter agrees.

>> We can also safely move this into the block of below kvm_request_pending()
>> as Marc used to suggest.
> 
> This, on the other hand, makes sure that we share the cost across all
> requests. Requests should be extremely rare anyway (and if they
> aren't, you have a whole lot of performance issues on your hands
> anyway).
> 

Yeah, We shouldn't have too much requests. I missed the comment from Marc
to move this chunk to kvm_request_pending(). I will fix it in v3.

>>
>> To explicitly use kvm_clear_request(), we may need to be careful on the
>> memory barriers.  I'm wondering whether we should have moved
>> smp_mb__after_atomic() into kvm_clear_request() because kvm_clear_request()
>> is used outside kvm_check_request() and IIUC all the call sites should
>> better have that barrier too to be safe.
>>
>> Side note: when I read the code around I also see some mis-use of clear
>> request where it can be omitted, e.g.:
>>
>> 		if (kvm_check_request(KVM_REQ_UNHALT, vcpu)) {
>> 			kvm_clear_request(KVM_REQ_UNHALT, vcpu);
>> 			vcpu->run->exit_reason = KVM_EXIT_IRQ_WINDOW_OPEN;
>> 		}
>>
>> Maybe it's a sign of bad naming, so we should renamed kvm_check_request()
>> to kvm_test_clear_request() too to show that clearing after that is not
>> needed?
> 
> Yeah, this kvm_clear_request() is superfluous. But this is rather well
> documented, for once, and I don't think we should repaint it based on
> a sample of one.
> 

Yeah, I think Peter is correct that smp_mb__after_atomic() would be
part of kvm_clear_request(). Otherwise, the following two cases aren't
in same order:

       // kvm_check_request()             // test and clear
       kvm_test_request()                 kvm_test_request()
       kvm_clear_request()                kvm_clear_request()
       smp_mb__after_atomic()

Thanks,
Gavin
Marc Zyngier Sept. 19, 2022, 9:22 a.m. UTC | #4
On Mon, 19 Sep 2022 00:58:10 +0100,
Gavin Shan <gshan@redhat.com> wrote:
> 
> On 9/18/22 7:00 PM, Marc Zyngier wrote:
> > On Fri, 16 Sep 2022 19:09:52 +0100,
> > Peter Xu <peterx@redhat.com> wrote:
> >> 
> >> On Fri, Sep 16, 2022 at 12:51:31PM +0800, Gavin Shan wrote:
> >>> This adds KVM_REQ_RING_SOFT_FULL, which is raised when the dirty
> >>> ring of the specific VCPU becomes softly full in kvm_dirty_ring_push().
> >>> The VCPU is enforced to exit when the request is raised and its
> >>> dirty ring is softly full on its entrance.
> >>> 
> >>> Suggested-by: Marc Zyngier <maz@kernel.org>
> >>> Signed-off-by: Gavin Shan <gshan@redhat.com>
> >>> ---
> >>>   arch/x86/kvm/x86.c       | 5 +++--
> >>>   include/linux/kvm_host.h | 1 +
> >>>   virt/kvm/dirty_ring.c    | 4 ++++
> >>>   3 files changed, 8 insertions(+), 2 deletions(-)
> >>> 
> >>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >>> index 43a6a7efc6ec..7f368f59f033 100644
> >>> --- a/arch/x86/kvm/x86.c
> >>> +++ b/arch/x86/kvm/x86.c
> >>> @@ -10265,8 +10265,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >>>   	bool req_immediate_exit = false;
> >>>     	/* Forbid vmenter if vcpu dirty ring is soft-full */
> >>> -	if (unlikely(vcpu->kvm->dirty_ring_size &&
> >>> -		     kvm_dirty_ring_soft_full(&vcpu->dirty_ring))) {
> >>> +	if (kvm_check_request(KVM_REQ_RING_SOFT_FULL, vcpu) &&
> >>> +	    kvm_dirty_ring_soft_full(&vcpu->dirty_ring)) {
> >>> +		kvm_make_request(KVM_REQ_RING_SOFT_FULL, vcpu);
> >>>   		vcpu->run->exit_reason = KVM_EXIT_DIRTY_RING_FULL;
> >>>   		trace_kvm_dirty_ring_exit(vcpu);
> >>>   		r = 0;
> >> 
> >> As commented previously - can we use kvm_test_request() instead? because we
> >> don't want to unconditionally clear the bit.  Instead of making the request
> >> again, we can clear request only if !full.
> > 
> > I have the feeling that this is a micro-optimisation that won't lead
> > to much benefit in practice. You already have the cache line, just not
> > in exclusive mode, and given that this is per-vcpu, you'd only see the
> > cost if someone else is setting a request to this vcpu while
> > evaluating the local requests.
> > 
> > And now you need extra barriers...
> > 
> > Also, can we please refrain from changing things without data showing
> > that this actually is worse than what we had before? The point below
> > makes me think that this is actually beneficial as is.
> > 
> 
> I think Marc's explanation makes sense. It won't make difference in terms
> of performance. We need to explicitly handle barrier when kvm_test_request()
> is used. So I prefer to keep the code if Peter agrees.
> 
> >> We can also safely move this into the block of below kvm_request_pending()
> >> as Marc used to suggest.
> > 
> > This, on the other hand, makes sure that we share the cost across all
> > requests. Requests should be extremely rare anyway (and if they
> > aren't, you have a whole lot of performance issues on your hands
> > anyway).
> > 
> 
> Yeah, We shouldn't have too much requests. I missed the comment from Marc
> to move this chunk to kvm_request_pending(). I will fix it in v3.
> 
> >> 
> >> To explicitly use kvm_clear_request(), we may need to be careful on the
> >> memory barriers.  I'm wondering whether we should have moved
> >> smp_mb__after_atomic() into kvm_clear_request() because kvm_clear_request()
> >> is used outside kvm_check_request() and IIUC all the call sites should
> >> better have that barrier too to be safe.
> >> 
> >> Side note: when I read the code around I also see some mis-use of clear
> >> request where it can be omitted, e.g.:
> >> 
> >> 		if (kvm_check_request(KVM_REQ_UNHALT, vcpu)) {
> >> 			kvm_clear_request(KVM_REQ_UNHALT, vcpu);
> >> 			vcpu->run->exit_reason = KVM_EXIT_IRQ_WINDOW_OPEN;
> >> 		}
> >> 
> >> Maybe it's a sign of bad naming, so we should renamed kvm_check_request()
> >> to kvm_test_clear_request() too to show that clearing after that is not
> >> needed?
> > 
> > Yeah, this kvm_clear_request() is superfluous. But this is rather well
> > documented, for once, and I don't think we should repaint it based on
> > a sample of one.
> > 
> 
> Yeah, I think Peter is correct that smp_mb__after_atomic() would be
> part of kvm_clear_request(). Otherwise, the following two cases aren't
> in same order:
> 
>       // kvm_check_request()             // test and clear
>       kvm_test_request()                 kvm_test_request()
>       kvm_clear_request()                kvm_clear_request()
>       smp_mb__after_atomic()

[fixing Drew's email, again]

Oh, I totally agree that a standalone use of kvm_clear_request() must
come with a barrier. It is just that without additional data, it isn't
obvious to me that there is any need for kvm_clear_request() itself to
have the barrier.

In a number of cases, kvm_clear_request() is used on in the context of
the same vcpu, and this should be enough to ensure visibility (for
example, I don't think kvm_vcpu_wfi() should require this barrier).

But maybe I'm missing something.

Thanks,

	M.
Peter Xu Sept. 19, 2022, 4:17 p.m. UTC | #5
On Mon, Sep 19, 2022 at 09:58:10AM +1000, Gavin Shan wrote:
> I think Marc's explanation makes sense. It won't make difference in terms
> of performance. We need to explicitly handle barrier when kvm_test_request()
> is used. So I prefer to keep the code if Peter agrees.

No strong opinion here. I keep thinking clear+set look awkward even if it's
unlikely path to trigger (ring should be recycled when reaching here for
any sane user app), but if it's already 2:1 then please go ahead. :)

Thanks,
diff mbox series

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 43a6a7efc6ec..7f368f59f033 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10265,8 +10265,9 @@  static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	bool req_immediate_exit = false;
 
 	/* Forbid vmenter if vcpu dirty ring is soft-full */
-	if (unlikely(vcpu->kvm->dirty_ring_size &&
-		     kvm_dirty_ring_soft_full(&vcpu->dirty_ring))) {
+	if (kvm_check_request(KVM_REQ_RING_SOFT_FULL, vcpu) &&
+	    kvm_dirty_ring_soft_full(&vcpu->dirty_ring)) {
+		kvm_make_request(KVM_REQ_RING_SOFT_FULL, vcpu);
 		vcpu->run->exit_reason = KVM_EXIT_DIRTY_RING_FULL;
 		trace_kvm_dirty_ring_exit(vcpu);
 		r = 0;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f4519d3689e1..53fa3134fee0 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -157,6 +157,7 @@  static inline bool is_error_page(struct page *page)
 #define KVM_REQ_VM_DEAD           (1 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQ_UNBLOCK           2
 #define KVM_REQ_UNHALT            3
+#define KVM_REQ_RING_SOFT_FULL    4
 #define KVM_REQUEST_ARCH_BASE     8
 
 /*
diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c
index f4c2a6eb1666..f0e49937bc9e 100644
--- a/virt/kvm/dirty_ring.c
+++ b/virt/kvm/dirty_ring.c
@@ -149,6 +149,7 @@  int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring)
 
 void kvm_dirty_ring_push(struct kvm_dirty_ring *ring, u32 slot, u64 offset)
 {
+	struct kvm_vcpu *vcpu = container_of(ring, struct kvm_vcpu, dirty_ring);
 	struct kvm_dirty_gfn *entry;
 
 	/* It should never get full */
@@ -166,6 +167,9 @@  void kvm_dirty_ring_push(struct kvm_dirty_ring *ring, u32 slot, u64 offset)
 	kvm_dirty_gfn_set_dirtied(entry);
 	ring->dirty_index++;
 	trace_kvm_dirty_ring_push(ring, slot, offset);
+
+	if (kvm_dirty_ring_soft_full(ring))
+		kvm_make_request(KVM_REQ_RING_SOFT_FULL, vcpu);
 }
 
 struct page *kvm_dirty_ring_get_page(struct kvm_dirty_ring *ring, u32 offset)