diff mbox series

[v3,4/9] KVM: lapic: Allow user to disable auto-tuning of timer advancement

Message ID 20190416203248.29429-5-sean.j.christopherson@intel.com (mailing list archive)
State New, archived
Headers show
Series KVM: lapic: Fix a variety of timer adv issues | expand

Commit Message

Sean Christopherson April 16, 2019, 8:32 p.m. UTC
Add a new KVM param, lapic_timer_advance_autotune, to allow userspace
to disable auto-tuning the timer advancement, e.g. to manually tune the
delay.

Note, this does not completely restore the original behavior of
lapic_timer_advance_ns.  Prior to tracking the advancement per vCPU,
which is necessary to support autotuning, userspace could adjust
lapic_timer_advance_ns for *running* vCPU.  With per-vCPU tracking, the
module params are snapshotted at vCPU creation, i.e. applying a new
advancement effectively requires restarting a VM.

Dynamically updating a running vCPU is possible, e.g. a helper could be
added to retrieve the desired delay, choosing between the global module
param and the per-VCPU value depending on whether or not auto-tuning is
(globally) enabled, but introduces a great deal of complexity.  The
wrapper itself is not complex, but understanding and documenting the
effects of dynamically toggling auto-tuning and/or adjusting the timer
advancement is nigh impossible since the behavior would be dependent on
KVM's implementation as well as compiler optimizations.  In other words,
providing stable behavior would require extremely careful consideration
now and in the future.

Given that the expected use of a manually-tuned timer advancement is to
"tune once, run many", use the vastly simpler approach of recognizing
changes to the module params only when creating a new vCPU.

Fixes: 3b8a5df6c4dc6 ("KVM: LAPIC: Tune lapic_timer_advance_ns automatically")
Cc: Liran Alon <liran.alon@oracle.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: stable@vger.kernel.org
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/lapic.c | 4 +++-
 arch/x86/kvm/lapic.h | 3 ++-
 arch/x86/kvm/x86.c   | 6 +++++-
 3 files changed, 10 insertions(+), 3 deletions(-)

Comments

Paolo Bonzini April 17, 2019, 12:59 p.m. UTC | #1
On 16/04/19 22:32, Sean Christopherson wrote:
> +static bool __read_mostly lapic_timer_advance_autotune = true;
> +module_param(lapic_timer_advance_autotune, bool, S_IRUGO | S_IWUSR);
> +

Could we just have lapic_timer_advance_ns = 0 mean "adaptive tuning,
starting from a default of 1000" and anything else mean "no tuning, just
use this value"?

Paolo
Sean Christopherson April 17, 2019, 3:22 p.m. UTC | #2
On Wed, Apr 17, 2019 at 02:59:43PM +0200, Paolo Bonzini wrote:
> On 16/04/19 22:32, Sean Christopherson wrote:
> > +static bool __read_mostly lapic_timer_advance_autotune = true;
> > +module_param(lapic_timer_advance_autotune, bool, S_IRUGO | S_IWUSR);
> > +
> 
> Could we just have lapic_timer_advance_ns = 0 mean "adaptive tuning,
> starting from a default of 1000" and anything else mean "no tuning, just
> use this value"?

Originally I implemented a variation of that, except I disabled autotuning
if lapic_timer_advance_ns was set to anything other than the default.
Liran requested a separate param since the behavior was admittedly a bit
sneaky, and I didn't have a strong opinion either way, so here we are.

I think 'lapic_timer_advance_ns = 0' should be used to disable the
advancement.  What if we make the param a signed int and have '-1' select
the "adaptive tuning ..." option?  That could also be the default.  It
saves a param and is hopefully more intuitive than what I originally
proposed.
diff mbox series

Patch

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 3cd62edf8084..37d3489f68c9 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2263,7 +2263,8 @@  static enum hrtimer_restart apic_timer_fn(struct hrtimer *data)
 		return HRTIMER_NORESTART;
 }
 
-int kvm_create_lapic(struct kvm_vcpu *vcpu, u32 timer_advance_ns)
+int kvm_create_lapic(struct kvm_vcpu *vcpu, u32 timer_advance_ns,
+		     bool timer_advance_autotune)
 {
 	struct kvm_lapic *apic;
 
@@ -2288,6 +2289,7 @@  int kvm_create_lapic(struct kvm_vcpu *vcpu, u32 timer_advance_ns)
 		     HRTIMER_MODE_ABS_PINNED);
 	apic->lapic_timer.timer.function = apic_timer_fn;
 	apic->lapic_timer.timer_advance_ns = timer_advance_ns;
+	apic->lapic_timer.timer_advance_adjust_done = !timer_advance_autotune;
 
 	/*
 	 * APIC is created enabled. This will prevent kvm_lapic_set_base from
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 3e97f8a68967..7fa1aed02c14 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -64,7 +64,8 @@  struct kvm_lapic {
 
 struct dest_map;
 
-int kvm_create_lapic(struct kvm_vcpu *vcpu, u32 timer_advance_ns);
+int kvm_create_lapic(struct kvm_vcpu *vcpu, u32 timer_advance_ns,
+		     bool timer_advance_autotune);
 void kvm_free_lapic(struct kvm_vcpu *vcpu);
 
 int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b303a21a2bc2..60d80e354f62 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -140,6 +140,9 @@  module_param(tsc_tolerance_ppm, uint, S_IRUGO | S_IWUSR);
 static u32 __read_mostly lapic_timer_advance_ns = 1000;
 module_param(lapic_timer_advance_ns, uint, S_IRUGO | S_IWUSR);
 
+static bool __read_mostly lapic_timer_advance_autotune = true;
+module_param(lapic_timer_advance_autotune, bool, S_IRUGO | S_IWUSR);
+
 static bool __read_mostly vector_hashing = true;
 module_param(vector_hashing, bool, S_IRUGO);
 
@@ -9078,7 +9081,8 @@  int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 
 	if (irqchip_in_kernel(vcpu->kvm)) {
 		vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu);
-		r = kvm_create_lapic(vcpu, lapic_timer_advance_ns);
+		r = kvm_create_lapic(vcpu, lapic_timer_advance_ns,
+				     lapic_timer_advance_autotune);
 		if (r < 0)
 			goto fail_mmu_destroy;
 	} else