diff mbox series

[V5,15/15] x86/kvm: Use generic xfer to guest work function

Message ID 20200722220520.979724969@linutronix.de (mailing list archive)
State New, archived
Headers show
Series entry, x86, kvm: Generic entry/exit functionality for host and guest | expand

Commit Message

Thomas Gleixner July 22, 2020, 10 p.m. UTC
From: Thomas Gleixner <tglx@linutronix.de>

Use the generic infrastructure to check for and handle pending work before
transitioning into guest mode.

This now handles TIF_NOTIFY_RESUME as well which was ignored so
far. Handling it is important as this covers task work and task work will
be used to offload the heavy lifting of POSIX CPU timers to thread context.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V5: Rename exit -> xfer
---
 arch/x86/kvm/Kconfig   |    1 +
 arch/x86/kvm/vmx/vmx.c |   11 +++++------
 arch/x86/kvm/x86.c     |   15 ++++++---------
 3 files changed, 12 insertions(+), 15 deletions(-)

Comments

Sean Christopherson July 24, 2020, 12:17 a.m. UTC | #1
On Thu, Jul 23, 2020 at 12:00:09AM +0200, Thomas Gleixner wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Use the generic infrastructure to check for and handle pending work before
> transitioning into guest mode.
> 
> This now handles TIF_NOTIFY_RESUME as well which was ignored so
> far. Handling it is important as this covers task work and task work will
> be used to offload the heavy lifting of POSIX CPU timers to thread context.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V5: Rename exit -> xfer
> ---

One nit/question below (though it's really about patch 5).

Reviewed-and-tested-by: Sean Christopherson <sean.j.christopherson@intel.com>

> @@ -8676,15 +8677,11 @@ static int vcpu_run(struct kvm_vcpu *vcp
>  			break;
>  		}
>  
> -		if (signal_pending(current)) {
> -			r = -EINTR;
> -			vcpu->run->exit_reason = KVM_EXIT_INTR;
> -			++vcpu->stat.signal_exits;
> -			break;
> -		}
> -		if (need_resched()) {
> +		if (xfer_to_guest_mode_work_pending()) {
>  			srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
> -			cond_resched();
> +			r = xfer_to_guest_mode(vcpu);

Any reason not to call this xfer_to_guest_mode_work()?  Or handle_work(),
do_work(), etc...  Without the "work" part, it looks like a function that
should be invoked unconditionally.  It's obvious that's not the case if
one looks at the implementation, but it's a bit confusing on the KVM side
of things.

> +			if (r)
> +				return r;
>  			vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
>  		}
>  	}
>
Thomas Gleixner July 24, 2020, 12:46 a.m. UTC | #2
Sean,

Sean Christopherson <sean.j.christopherson@intel.com> writes:
> On Thu, Jul 23, 2020 at 12:00:09AM +0200, Thomas Gleixner wrote:
>> +		if (xfer_to_guest_mode_work_pending()) {
>>  			srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
>> -			cond_resched();
>> +			r = xfer_to_guest_mode(vcpu);
>
> Any reason not to call this xfer_to_guest_mode_work()?  Or handle_work(),
> do_work(), etc...  Without the "work" part, it looks like a function that
> should be invoked unconditionally.  It's obvious that's not the case if
> one looks at the implementation, but it's a bit confusing on the KVM side
> of things.

The reason is probably lazyness. The original approach was to have this
as close as possible to user entry/exit but with the recent changes
vs. instrumentation and RCU this is not longer the case.

I really want to keep the notion of transitioning in the function name,
so xfer_to_guest_mode_handle_work() makes a lot of sense.

I'll change that before merging the lot into the tip tree if your
Reviewed-by still stands with that change made w/o reposting.

Thanks,

        tglx
Sean Christopherson July 24, 2020, 12:55 a.m. UTC | #3
On Fri, Jul 24, 2020 at 02:46:26AM +0200, Thomas Gleixner wrote:
> Sean,
> 
> Sean Christopherson <sean.j.christopherson@intel.com> writes:
> > On Thu, Jul 23, 2020 at 12:00:09AM +0200, Thomas Gleixner wrote:
> >> +		if (xfer_to_guest_mode_work_pending()) {
> >>  			srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
> >> -			cond_resched();
> >> +			r = xfer_to_guest_mode(vcpu);
> >
> > Any reason not to call this xfer_to_guest_mode_work()?  Or handle_work(),
> > do_work(), etc...  Without the "work" part, it looks like a function that
> > should be invoked unconditionally.  It's obvious that's not the case if
> > one looks at the implementation, but it's a bit confusing on the KVM side
> > of things.
> 
> The reason is probably lazyness. The original approach was to have this
> as close as possible to user entry/exit but with the recent changes
> vs. instrumentation and RCU this is not longer the case.
> 
> I really want to keep the notion of transitioning in the function name,
> so xfer_to_guest_mode_handle_work() makes a lot of sense.
> 
> I'll change that before merging the lot into the tip tree if your
> Reviewed-by still stands with that change made w/o reposting.

Ya, works for me.
Ingo Molnar July 24, 2020, 2:24 p.m. UTC | #4
* Thomas Gleixner <tglx@linutronix.de> wrote:

> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Use the generic infrastructure to check for and handle pending work before
> transitioning into guest mode.
> 
> This now handles TIF_NOTIFY_RESUME as well which was ignored so
> far. Handling it is important as this covers task work and task work will
> be used to offload the heavy lifting of POSIX CPU timers to thread context.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V5: Rename exit -> xfer
> ---
>  arch/x86/kvm/Kconfig   |    1 +
>  arch/x86/kvm/vmx/vmx.c |   11 +++++------
>  arch/x86/kvm/x86.c     |   15 ++++++---------
>  3 files changed, 12 insertions(+), 15 deletions(-)
> 
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -42,6 +42,7 @@ config KVM
>  	select HAVE_KVM_MSI
>  	select HAVE_KVM_CPU_RELAX_INTERCEPT
>  	select HAVE_KVM_NO_POLL
> +	select KVM_XFER_TO_GUEST_WORK
>  	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
>  	select KVM_VFIO
>  	select SRCU
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -27,6 +27,7 @@
>  #include <linux/slab.h>
>  #include <linux/tboot.h>
>  #include <linux/trace_events.h>
> +#include <linux/entry-kvm.h>
>  
>  #include <asm/apic.h>
>  #include <asm/asm.h>
> @@ -5376,14 +5377,12 @@ static int handle_invalid_guest_state(st

>  
>  	return 1;
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -56,6 +56,7 @@
>  #include <linux/sched/stat.h>
>  #include <linux/sched/isolation.h>
>  #include <linux/mem_encrypt.h>
> +#include <linux/entry-kvm.h>
>  
>  #include <trace/events/kvm.h>
>  
> @@ -1585,7 +1586,7 @@ EXPORT_SYMBOL_GPL(kvm_emulate_wrmsr);
>  bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu)
>  {
>  	return vcpu->mode == EXITING_GUEST_MODE || kvm_request_pending(vcpu) ||
> -		need_resched() || signal_pending(current);
> +		xfer_to_guest_mode_work_pending();
>  }
>  EXPORT_SYMBOL_GPL(kvm_vcpu_exit_request);
>  
> @@ -8676,15 +8677,11 @@ static int vcpu_run(struct kvm_vcpu *vcp
>  			break;
>  		}
>  
> -		if (signal_pending(current)) {
> -			r = -EINTR;
> -			vcpu->run->exit_reason = KVM_EXIT_INTR;
> -			++vcpu->stat.signal_exits;
> -			break;
> -		}
> -		if (need_resched()) {
> +		if (xfer_to_guest_mode_work_pending()) {
>  			srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
> -			cond_resched();
> +			r = xfer_to_guest_mode(vcpu);
> +			if (r)
> +				return r;
>  			vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
>  		}
>  	}

So this chunk replaces vcpu_run()'s cond_resched() call with a call to 
xfer_to_guest_mode(), which checks NEED_RESCHED & acts upon it, among 
other things.

But:

>  		}
>  
>  		/*
> -		 * Note, return 1 and not 0, vcpu_run() is responsible for
> -		 * morphing the pending signal into the proper return code.
> +		 * Note, return 1 and not 0, vcpu_run() will invoke
> +		 * xfer_to_guest_mode() which will create a proper return
> +		 * code.
>  		 */
> -		if (signal_pending(current))
> +		if (__xfer_to_guest_mode_work_pending())
>  			return 1;
> -
> -		if (need_resched())
> -			schedule();
>  	}

AFAICS this chunk removes a conditional reschedule point from 
handle_invalid_guest_state() and replaces it with 
__xfer_to_guest_mode_work_pending().

But __xfer_to_guest_mode_work_pending() doesn't do the cond-resched of 
the full xfer_to_guest_mode_work() function - so we essentially lose a 
cond_resched() here.

Is this side effect intended, was the cond_resched() superfluous?

The logic is quite a maze, and the KVM code was missing a few of the 
state checks, so maybe this is all for the better - just wanted to 
mention it, in case it matters.

Thanks,

	Ingo
Thomas Gleixner July 24, 2020, 7:08 p.m. UTC | #5
Ingo Molnar <mingo@kernel.org> writes:
> * Thomas Gleixner <tglx@linutronix.de> wrote:
>>  		/*
>> -		 * Note, return 1 and not 0, vcpu_run() is responsible for
>> -		 * morphing the pending signal into the proper return code.
>> +		 * Note, return 1 and not 0, vcpu_run() will invoke
>> +		 * xfer_to_guest_mode() which will create a proper return
>> +		 * code.
>>  		 */
>> -		if (signal_pending(current))
>> +		if (__xfer_to_guest_mode_work_pending())
>>  			return 1;
>> -
>> -		if (need_resched())
>> -			schedule();
>>  	}
>
> AFAICS this chunk removes a conditional reschedule point from 
> handle_invalid_guest_state() and replaces it with 
> __xfer_to_guest_mode_work_pending().
>
> But __xfer_to_guest_mode_work_pending() doesn't do the cond-resched of 
> the full xfer_to_guest_mode_work() function - so we essentially lose a 
> cond_resched() here.
>
> Is this side effect intended, was the cond_resched() superfluous?

It makes the thing drop back to the outer loop for any pending work not
only for signals. That avoids having yet another thing to worry about.

Thanks,

        tglx
diff mbox series

Patch

--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -42,6 +42,7 @@  config KVM
 	select HAVE_KVM_MSI
 	select HAVE_KVM_CPU_RELAX_INTERCEPT
 	select HAVE_KVM_NO_POLL
+	select KVM_XFER_TO_GUEST_WORK
 	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
 	select KVM_VFIO
 	select SRCU
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -27,6 +27,7 @@ 
 #include <linux/slab.h>
 #include <linux/tboot.h>
 #include <linux/trace_events.h>
+#include <linux/entry-kvm.h>
 
 #include <asm/apic.h>
 #include <asm/asm.h>
@@ -5376,14 +5377,12 @@  static int handle_invalid_guest_state(st
 		}
 
 		/*
-		 * Note, return 1 and not 0, vcpu_run() is responsible for
-		 * morphing the pending signal into the proper return code.
+		 * Note, return 1 and not 0, vcpu_run() will invoke
+		 * xfer_to_guest_mode() which will create a proper return
+		 * code.
 		 */
-		if (signal_pending(current))
+		if (__xfer_to_guest_mode_work_pending())
 			return 1;
-
-		if (need_resched())
-			schedule();
 	}
 
 	return 1;
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -56,6 +56,7 @@ 
 #include <linux/sched/stat.h>
 #include <linux/sched/isolation.h>
 #include <linux/mem_encrypt.h>
+#include <linux/entry-kvm.h>
 
 #include <trace/events/kvm.h>
 
@@ -1585,7 +1586,7 @@  EXPORT_SYMBOL_GPL(kvm_emulate_wrmsr);
 bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu)
 {
 	return vcpu->mode == EXITING_GUEST_MODE || kvm_request_pending(vcpu) ||
-		need_resched() || signal_pending(current);
+		xfer_to_guest_mode_work_pending();
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_exit_request);
 
@@ -8676,15 +8677,11 @@  static int vcpu_run(struct kvm_vcpu *vcp
 			break;
 		}
 
-		if (signal_pending(current)) {
-			r = -EINTR;
-			vcpu->run->exit_reason = KVM_EXIT_INTR;
-			++vcpu->stat.signal_exits;
-			break;
-		}
-		if (need_resched()) {
+		if (xfer_to_guest_mode_work_pending()) {
 			srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
-			cond_resched();
+			r = xfer_to_guest_mode(vcpu);
+			if (r)
+				return r;
 			vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
 		}
 	}