Message ID: 20200722220520.979724969@linutronix.de (mailing list archive)
State:      New, archived
Series:     entry, x86, kvm: Generic entry/exit functionality for host and guest
On Thu, Jul 23, 2020 at 12:00:09AM +0200, Thomas Gleixner wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
>
> Use the generic infrastructure to check for and handle pending work before
> transitioning into guest mode.
>
> This now handles TIF_NOTIFY_RESUME as well which was ignored so
> far. Handling it is important as this covers task work and task work will
> be used to offload the heavy lifting of POSIX CPU timers to thread context.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V5: Rename exit -> xfer
> ---

One nit/question below (though it's really about patch 5).

Reviewed-and-tested-by: Sean Christopherson <sean.j.christopherson@intel.com>

> @@ -8676,15 +8677,11 @@ static int vcpu_run(struct kvm_vcpu *vcp
>  			break;
>  		}
>
> -		if (signal_pending(current)) {
> -			r = -EINTR;
> -			vcpu->run->exit_reason = KVM_EXIT_INTR;
> -			++vcpu->stat.signal_exits;
> -			break;
> -		}
> -		if (need_resched()) {
> +		if (xfer_to_guest_mode_work_pending()) {
>  			srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
> -			cond_resched();
> +			r = xfer_to_guest_mode(vcpu);

Any reason not to call this xfer_to_guest_mode_work()? Or handle_work(),
do_work(), etc... Without the "work" part, it looks like a function that
should be invoked unconditionally. It's obvious that's not the case if
one looks at the implementation, but it's a bit confusing on the KVM side
of things.

> +			if (r)
> +				return r;
>  			vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
>  		}
>  	}
Sean,

Sean Christopherson <sean.j.christopherson@intel.com> writes:
> On Thu, Jul 23, 2020 at 12:00:09AM +0200, Thomas Gleixner wrote:
>> +		if (xfer_to_guest_mode_work_pending()) {
>>  			srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
>> -			cond_resched();
>> +			r = xfer_to_guest_mode(vcpu);
>
> Any reason not to call this xfer_to_guest_mode_work()? Or handle_work(),
> do_work(), etc... Without the "work" part, it looks like a function that
> should be invoked unconditionally. It's obvious that's not the case if
> one looks at the implementation, but it's a bit confusing on the KVM side
> of things.

The reason is probably laziness. The original approach was to have this
as close as possible to user entry/exit, but with the recent changes
vs. instrumentation and RCU this is no longer the case.

I really want to keep the notion of transitioning in the function name,
so xfer_to_guest_mode_handle_work() makes a lot of sense.

I'll change that before merging the lot into the tip tree, if your
Reviewed-by still stands with that change made w/o reposting.

Thanks,

        tglx
On Fri, Jul 24, 2020 at 02:46:26AM +0200, Thomas Gleixner wrote:
> Sean,
>
> Sean Christopherson <sean.j.christopherson@intel.com> writes:
> > On Thu, Jul 23, 2020 at 12:00:09AM +0200, Thomas Gleixner wrote:
> >> +		if (xfer_to_guest_mode_work_pending()) {
> >>  			srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
> >> -			cond_resched();
> >> +			r = xfer_to_guest_mode(vcpu);
> >
> > Any reason not to call this xfer_to_guest_mode_work()? Or handle_work(),
> > do_work(), etc... Without the "work" part, it looks like a function that
> > should be invoked unconditionally. It's obvious that's not the case if
> > one looks at the implementation, but it's a bit confusing on the KVM side
> > of things.
>
> The reason is probably laziness. The original approach was to have this
> as close as possible to user entry/exit, but with the recent changes
> vs. instrumentation and RCU this is no longer the case.
>
> I really want to keep the notion of transitioning in the function name,
> so xfer_to_guest_mode_handle_work() makes a lot of sense.
>
> I'll change that before merging the lot into the tip tree, if your
> Reviewed-by still stands with that change made w/o reposting.

Ya, works for me.
* Thomas Gleixner <tglx@linutronix.de> wrote:

> From: Thomas Gleixner <tglx@linutronix.de>
>
> Use the generic infrastructure to check for and handle pending work before
> transitioning into guest mode.
>
> This now handles TIF_NOTIFY_RESUME as well which was ignored so
> far. Handling it is important as this covers task work and task work will
> be used to offload the heavy lifting of POSIX CPU timers to thread context.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V5: Rename exit -> xfer
> ---
>  arch/x86/kvm/Kconfig   |    1 +
>  arch/x86/kvm/vmx/vmx.c |   11 +++++------
>  arch/x86/kvm/x86.c     |   15 ++++++---------
>  3 files changed, 12 insertions(+), 15 deletions(-)
>
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -42,6 +42,7 @@ config KVM
>  	select HAVE_KVM_MSI
>  	select HAVE_KVM_CPU_RELAX_INTERCEPT
>  	select HAVE_KVM_NO_POLL
> +	select KVM_XFER_TO_GUEST_WORK
>  	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
>  	select KVM_VFIO
>  	select SRCU
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -27,6 +27,7 @@
>  #include <linux/slab.h>
>  #include <linux/tboot.h>
>  #include <linux/trace_events.h>
> +#include <linux/entry-kvm.h>
>
>  #include <asm/apic.h>
>  #include <asm/asm.h>
> @@ -5376,14 +5377,12 @@ static int handle_invalid_guest_state(st
>
>  	return 1;
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -56,6 +56,7 @@
>  #include <linux/sched/stat.h>
>  #include <linux/sched/isolation.h>
>  #include <linux/mem_encrypt.h>
> +#include <linux/entry-kvm.h>
>
>  #include <trace/events/kvm.h>
>
> @@ -1585,7 +1586,7 @@ EXPORT_SYMBOL_GPL(kvm_emulate_wrmsr);
>  bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu)
>  {
>  	return vcpu->mode == EXITING_GUEST_MODE || kvm_request_pending(vcpu) ||
> -		need_resched() || signal_pending(current);
> +		xfer_to_guest_mode_work_pending();
>  }
>  EXPORT_SYMBOL_GPL(kvm_vcpu_exit_request);
>
> @@ -8676,15 +8677,11 @@ static int vcpu_run(struct kvm_vcpu *vcp
>  			break;
>  		}
>
> -		if (signal_pending(current)) {
> -			r = -EINTR;
> -			vcpu->run->exit_reason = KVM_EXIT_INTR;
> -			++vcpu->stat.signal_exits;
> -			break;
> -		}
> -		if (need_resched()) {
> +		if (xfer_to_guest_mode_work_pending()) {
>  			srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
> -			cond_resched();
> +			r = xfer_to_guest_mode(vcpu);
> +			if (r)
> +				return r;
>  			vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
>  		}
>  	}

So this chunk replaces vcpu_run()'s cond_resched() call with a call to
xfer_to_guest_mode(), which checks NEED_RESCHED & acts upon it, among
other things. But:

>  	}
>
>  	/*
> -	 * Note, return 1 and not 0, vcpu_run() is responsible for
> -	 * morphing the pending signal into the proper return code.
> +	 * Note, return 1 and not 0, vcpu_run() will invoke
> +	 * xfer_to_guest_mode() which will create a proper return
> +	 * code.
>  	 */
> -	if (signal_pending(current))
> +	if (__xfer_to_guest_mode_work_pending())
>  		return 1;
> -
> -	if (need_resched())
> -		schedule();
>  }

AFAICS this chunk removes a conditional reschedule point from
handle_invalid_guest_state() and replaces it with
__xfer_to_guest_mode_work_pending().

But __xfer_to_guest_mode_work_pending() doesn't do the cond-resched of
the full xfer_to_guest_mode_work() function - so we essentially lose a
cond_resched() here.

Is this side effect intended, was the cond_resched() superfluous?

The logic is quite a maze, and the KVM code was missing a few of the
state checks, so maybe this is all for the better - just wanted to
mention it, in case it matters.

Thanks,

	Ingo
Ingo Molnar <mingo@kernel.org> writes:
> * Thomas Gleixner <tglx@linutronix.de> wrote:
>>  	/*
>> -	 * Note, return 1 and not 0, vcpu_run() is responsible for
>> -	 * morphing the pending signal into the proper return code.
>> +	 * Note, return 1 and not 0, vcpu_run() will invoke
>> +	 * xfer_to_guest_mode() which will create a proper return
>> +	 * code.
>>  	 */
>> -	if (signal_pending(current))
>> +	if (__xfer_to_guest_mode_work_pending())
>>  		return 1;
>> -
>> -	if (need_resched())
>> -		schedule();
>>  }
>
> AFAICS this chunk removes a conditional reschedule point from
> handle_invalid_guest_state() and replaces it with
> __xfer_to_guest_mode_work_pending().
>
> But __xfer_to_guest_mode_work_pending() doesn't do the cond-resched of
> the full xfer_to_guest_mode_work() function - so we essentially lose a
> cond_resched() here.
>
> Is this side effect intended, was the cond_resched() superfluous?

It makes the thing drop back to the outer loop for any pending work,
not only for signals. That avoids having yet another thing to worry
about.

Thanks,

        tglx
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -42,6 +42,7 @@ config KVM
 	select HAVE_KVM_MSI
 	select HAVE_KVM_CPU_RELAX_INTERCEPT
 	select HAVE_KVM_NO_POLL
+	select KVM_XFER_TO_GUEST_WORK
 	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
 	select KVM_VFIO
 	select SRCU
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -27,6 +27,7 @@
 #include <linux/slab.h>
 #include <linux/tboot.h>
 #include <linux/trace_events.h>
+#include <linux/entry-kvm.h>
 
 #include <asm/apic.h>
 #include <asm/asm.h>
@@ -5376,14 +5377,12 @@ static int handle_invalid_guest_state(st
 		}
 
 		/*
-		 * Note, return 1 and not 0, vcpu_run() is responsible for
-		 * morphing the pending signal into the proper return code.
+		 * Note, return 1 and not 0, vcpu_run() will invoke
+		 * xfer_to_guest_mode() which will create a proper return
+		 * code.
 		 */
-		if (signal_pending(current))
+		if (__xfer_to_guest_mode_work_pending())
 			return 1;
-
-		if (need_resched())
-			schedule();
 	}
 
 	return 1;
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -56,6 +56,7 @@
 #include <linux/sched/stat.h>
 #include <linux/sched/isolation.h>
 #include <linux/mem_encrypt.h>
+#include <linux/entry-kvm.h>
 
 #include <trace/events/kvm.h>
 
@@ -1585,7 +1586,7 @@ EXPORT_SYMBOL_GPL(kvm_emulate_wrmsr);
 bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu)
 {
 	return vcpu->mode == EXITING_GUEST_MODE || kvm_request_pending(vcpu) ||
-		need_resched() || signal_pending(current);
+		xfer_to_guest_mode_work_pending();
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_exit_request);
 
@@ -8676,15 +8677,11 @@ static int vcpu_run(struct kvm_vcpu *vcp
 			break;
 		}
 
-		if (signal_pending(current)) {
-			r = -EINTR;
-			vcpu->run->exit_reason = KVM_EXIT_INTR;
-			++vcpu->stat.signal_exits;
-			break;
-		}
-		if (need_resched()) {
+		if (xfer_to_guest_mode_work_pending()) {
 			srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
-			cond_resched();
+			r = xfer_to_guest_mode(vcpu);
+			if (r)
+				return r;
 			vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
 		}
 	}