[v2] x86: kvmguest: use TSC clocksource if invariant TSC is exposed

Message ID 20190104175412.GA31736@amt.cnet (mailing list archive)
State New, archived

Commit Message

Marcelo Tosatti Jan. 4, 2019, 5:54 p.m. UTC
The invariant TSC bit has the following meaning:

"The time stamp counter in newer processors may support an enhancement,
referred to as invariant TSC. Processor's support for invariant TSC
is indicated by CPUID.80000007H:EDX[8]. The invariant TSC will run
at a constant rate in all ACPI P-, C-, and T-states. This is the
architectural behavior moving forward. On processors with invariant TSC
support, the OS may use the TSC for wall clock timer services (instead
of ACPI or HPET timers). TSC reads are much more efficient and do not
incur the overhead associated with a ring transition or access to a
platform resource."
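
For illustration, the CPUID bit named in the quote above can be tested from user space. This is a minimal sketch, not part of the patch: the function names are hypothetical, and it assumes a GCC or Clang toolchain that ships <cpuid.h>. On non-x86 builds the query simply reports false.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* CPUID.80000007H:EDX[8] is the invariant TSC flag. */
#define INVARIANT_TSC_BIT (UINT32_C(1) << 8)

/* Decode the flag from a raw EDX value (hypothetical helper name). */
bool edx_has_invariant_tsc(uint32_t edx)
{
	return (edx & INVARIANT_TSC_BIT) != 0;
}

/* Query the running CPU; false on non-x86 or if the leaf is unsupported. */
bool cpu_has_invariant_tsc(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid() returns 0 if the leaf exceeds the supported maximum. */
	if (!__get_cpuid(0x80000007, &eax, &ebx, &ecx, &edx))
		return false;
	return edx_has_invariant_tsc(edx);
#else
	return false;
#endif
}
```

Inside a guest, the result reflects what the hypervisor chooses to expose, which may differ from the host CPU.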

IOW, the TSC does not change frequency. In such a case, and with
TSC scaling hardware available to handle migration, it is possible
to use the TSC clocksource directly, whose system calls are
faster.

Reduce the rating of kvmclock clocksource to allow TSC clocksource
to be the default if invariant TSC is exposed.
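
Why lowering the rating is enough can be sketched with a simplified user-space model of "highest rating wins". The struct and function names below are hypothetical stand-ins, not kernel API. The ratings of 400 for kvm-clock and 300 for tsc are assumptions based on mainline around the time of this patch; only 299 comes from the patch itself.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct clocksource (assumption). */
struct clocksource_model {
	const char *name;
	int rating;
};

/*
 * Return the highest-rated entry, or NULL for an empty list.
 * On a tie the earlier entry is kept.
 */
const struct clocksource_model *
pick_best_clocksource(const struct clocksource_model *cs, size_t n)
{
	const struct clocksource_model *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!best || cs[i].rating > best->rating)
			best = &cs[i];
	}
	return best;
}
```

With { "kvm-clock", 400 } and { "tsc", 300 } the model picks kvm-clock; dropping kvm-clock to 299 makes tsc the winner, which is the effect the patch relies on.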

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

v2: Use feature bits and tsc_unstable() check (Sean Christopherson)

Comments

Marcelo Tosatti Feb. 14, 2019, 11:14 a.m. UTC | #1
Ping?

On Fri, Jan 04, 2019 at 03:54:14PM -0200, Marcelo Tosatti wrote:
> 
> 
> The invariant TSC bit has the following meaning:
> 
> "The time stamp counter in newer processors may support an enhancement,
> referred to as invariant TSC. Processor's support for invariant TSC
> is indicated by CPUID.80000007H:EDX[8]. The invariant TSC will run
> at a constant rate in all ACPI P-, C-, and T-states. This is the
> architectural behavior moving forward. On processors with invariant TSC
> support, the OS may use the TSC for wall clock timer services (instead
> of ACPI or HPET timers). TSC reads are much more efficient and do not
> incur the overhead associated with a ring transition or access to a
> platform resource."
> 
> IOW, the TSC does not change frequency. In such a case, and with
> TSC scaling hardware available to handle migration, it is possible
> to use the TSC clocksource directly, whose system calls are
> faster.
> 
> Reduce the rating of kvmclock clocksource to allow TSC clocksource
> to be the default if invariant TSC is exposed.
> 
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> 
> v2: Use feature bits and tsc_unstable() check (Sean Christopherson)
> 
> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> index 30084ec..a14601c 100644
> --- a/arch/x86/kernel/kvmclock.c
> +++ b/arch/x86/kernel/kvmclock.c
> @@ -368,6 +368,20 @@ void __init kvmclock_init(void)
>  	machine_ops.crash_shutdown  = kvm_crash_shutdown;
>  #endif
>  	kvm_get_preset_lpj();
> +
> +	/*
> +	 * X86_FEATURE_NONSTOP_TSC is TSC runs at constant rate
> +	 * with P/T states and does not stop in deep C-states.
> +	 *
> +	 * Invariant TSC exposed by host means kvmclock is not necessary:
> +	 * can use TSC as clocksource.
> +	 *
> +	 */
> +	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
> +	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
> +	    !check_tsc_unstable())
> +		kvm_clock.rating = 299;
> +
>  	clocksource_register_hz(&kvm_clock, NSEC_PER_SEC);
>  	pv_info.name = "KVM";
>  }
>
Paolo Bonzini Feb. 14, 2019, 12:33 p.m. UTC | #2
On 04/01/19 18:54, Marcelo Tosatti wrote:
> 
> 
> The invariant TSC bit has the following meaning:
> 
> "The time stamp counter in newer processors may support an enhancement,
> referred to as invariant TSC. Processor's support for invariant TSC
> is indicated by CPUID.80000007H:EDX[8]. The invariant TSC will run
> at a constant rate in all ACPI P-, C-, and T-states. This is the
> architectural behavior moving forward. On processors with invariant TSC
> support, the OS may use the TSC for wall clock timer services (instead
> of ACPI or HPET timers). TSC reads are much more efficient and do not
> incur the overhead associated with a ring transition or access to a
> platform resource."
> 
> IOW, the TSC does not change frequency. In such a case, and with
> TSC scaling hardware available to handle migration, it is possible
> to use the TSC clocksource directly, whose system calls are
> faster.
> 
> Reduce the rating of kvmclock clocksource to allow TSC clocksource
> to be the default if invariant TSC is exposed.
> 
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> 
> v2: Use feature bits and tsc_unstable() check (Sean Christopherson)
> 
> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> index 30084ec..a14601c 100644
> --- a/arch/x86/kernel/kvmclock.c
> +++ b/arch/x86/kernel/kvmclock.c
> @@ -368,6 +368,20 @@ void __init kvmclock_init(void)
>  	machine_ops.crash_shutdown  = kvm_crash_shutdown;
>  #endif
>  	kvm_get_preset_lpj();
> +
> +	/*
> +	 * X86_FEATURE_NONSTOP_TSC is TSC runs at constant rate
> +	 * with P/T states and does not stop in deep C-states.
> +	 *
> +	 * Invariant TSC exposed by host means kvmclock is not necessary:
> +	 * can use TSC as clocksource.
> +	 *
> +	 */
> +	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
> +	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
> +	    !check_tsc_unstable())
> +		kvm_clock.rating = 299;
> +
>  	clocksource_register_hz(&kvm_clock, NSEC_PER_SEC);
>  	pv_info.name = "KVM";
>  }
> 
> 

Thanks, queued for 5.1.

Paolo

Patch

diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 30084ec..a14601c 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -368,6 +368,20 @@  void __init kvmclock_init(void)
 	machine_ops.crash_shutdown  = kvm_crash_shutdown;
 #endif
 	kvm_get_preset_lpj();
+
+	/*
+	 * X86_FEATURE_CONSTANT_TSC: TSC runs at a constant rate across P/T
+	 * states. X86_FEATURE_NONSTOP_TSC: TSC does not stop in deep C-states.
+	 *
+	 * Invariant TSC exposed by host means kvmclock is not necessary:
+	 * can use TSC as clocksource.
+	 *
+	 */
+	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
+	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
+	    !check_tsc_unstable())
+		kvm_clock.rating = 299;
+
 	clocksource_register_hz(&kvm_clock, NSEC_PER_SEC);
 	pv_info.name = "KVM";
 }
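
One way to see which clocksource a guest ended up with is to read the standard sysfs file. The helper below is a rough sketch with a hypothetical name. It takes the path as a parameter so it can be tried against any file, and the macro holds the usual Linux location.

```c
#include <stdio.h>
#include <string.h>

/* Usual sysfs location of the active clocksource on Linux. */
#define CURRENT_CLOCKSOURCE_PATH \
	"/sys/devices/system/clocksource/clocksource0/current_clocksource"

/*
 * Read the first line of `path` into `buf` and strip the trailing newline.
 * Returns 0 on success, -1 on failure (hypothetical helper).
 */
int read_clocksource_name(const char *path, char *buf, size_t len)
{
	FILE *f;

	if (len == 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

Calling read_clocksource_name(CURRENT_CLOCKSOURCE_PATH, buf, sizeof(buf)) in a guest where the condition in the patch holds would be expected to yield "tsc" rather than "kvm-clock", assuming nothing else overrides the selection.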