diff mbox series

[v15,10/13] tsc: Upgrade TSC clocksource rating

Message ID 20241203090045.942078-11-nikunj@amd.com (mailing list archive)
State New, archived
Headers show
Series Add Secure TSC support for SNP guests | expand

Commit Message

Nikunj A Dadhania Dec. 3, 2024, 9 a.m. UTC
In virtualized environments running on modern CPUs, the underlying
platforms guarantees to have a stable, always running TSC, i.e. that the
TSC is a superior timesource as compared to other clock sources (such as
kvmclock, HPET, ACPI timer, APIC, etc.).

Upgrade the rating of the early and regular clock source to prefer TSC over
other clock sources when TSC is invariant, non-stop and stable.

Cc: Alexey Makhalov <alexey.makhalov@broadcom.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
 arch/x86/kernel/tsc.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

Comments

Borislav Petkov Dec. 30, 2024, 11:36 a.m. UTC | #1
On Tue, Dec 03, 2024 at 02:30:42PM +0530, Nikunj A Dadhania wrote:
> In virtualized environments running on modern CPUs, the underlying
> platforms guarantees to have a stable, always running TSC, i.e. that the
> TSC is a superior timesource as compared to other clock sources (such as
> kvmclock, HPET, ACPI timer, APIC, etc.).

That paragraph sounds like marketing fluff and can't be more far away from the
truth. We still can't have a stable clocksource in the currently ending 2024
without someone f*cking with it.

> Upgrade the rating of the early and regular clock source to prefer TSC over
> other clock sources when TSC is invariant, non-stop and stable.

I don't think so...

Have you read all that gunk in check_system_tsc_reliable() and the commits
touching that logic which disables the TSC clocksource watchdog caused by all
the hw crap that is being produced?

> Cc: Alexey Makhalov <alexey.makhalov@broadcom.com>
> Cc: Juergen Gross <jgross@suse.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> ---
>  arch/x86/kernel/tsc.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index c0eef924b84e..900edcde0c9e 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -1265,6 +1265,21 @@ static void __init check_system_tsc_reliable(void)
>  		tsc_disable_clocksource_watchdog();
>  }
>  
> +static void __init upgrade_clock_rating(struct clocksource *tsc_early,
> +					struct clocksource *tsc)
> +{
> +	/*
> +	 * Upgrade the clock rating for TSC early and regular clocksource when
> +	 * the underlying platform provides non-stop, invaraint and stable TSC.

Unknown word [invaraint] in comment.
Suggestions: ['invariant', 'inerrant', 'invent', 'intranet', 'infant', 'unvaried', 'informant', 'ingrained', 'entrant', 'inherent', 'unafraid', 'infuriate', 'inferring', 'univalent', 'infringe', 'infringed', 'infuriating']

Spellchecker pls.

> +	 */
> +	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
> +	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&

check_for_deprecated_apis: WARNING: arch/x86/kernel/tsc.c:1275: Do not use boot_cpu_has() - use cpu_feature_enabled() instead.
check_for_deprecated_apis: WARNING: arch/x86/kernel/tsc.c:1276: Do not use boot_cpu_has() - use cpu_feature_enabled() instead.

> +	    !tsc_unstable) {
> +		tsc_early->rating = 449;
> +		tsc->rating = 450;
> +	}
> +}
> +
>  /*
>   * Make an educated guess if the TSC is trustworthy and synchronized
>   * over all CPUs.
> @@ -1566,6 +1581,8 @@ void __init tsc_init(void)
>  	if (tsc_clocksource_reliable || no_tsc_watchdog)
>  		tsc_disable_clocksource_watchdog();
>  
> +	upgrade_clock_rating(&clocksource_tsc_early, &clocksource_tsc);
> +
>  	clocksource_register_khz(&clocksource_tsc_early, tsc_khz);
>  	detect_art();
>  }
> -- 
> 2.34.1
>
Nikunj A Dadhania Jan. 2, 2025, 5:20 a.m. UTC | #2
On 12/30/2024 5:06 PM, Borislav Petkov wrote:
> On Tue, Dec 03, 2024 at 02:30:42PM +0530, Nikunj A Dadhania wrote:

> 
>> Upgrade the rating of the early and regular clock source to prefer TSC over
>> other clock sources when TSC is invariant, non-stop and stable.
> 
> I don't think so...


This is what was suggested by tglx: 

"So if you know you want TSC to be selected, then upgrade the rating of
 both the early and the regular TSC clocksource and be done with it."

https://lore.kernel.org/lkml/87setgzvn5.ffs@tglx/

Regards,
Nikunj
Borislav Petkov Jan. 2, 2025, 9:32 a.m. UTC | #3
On Thu, Jan 02, 2025 at 10:50:53AM +0530, Nikunj A. Dadhania wrote:
> This is what was suggested by tglx: 
> 
> "So if you know you want TSC to be selected, then upgrade the rating of
>  both the early and the regular TSC clocksource and be done with it."

I highly doubt that he saw what you have now:

Your commit message is talking about virtualized environments but your diff is
doing a global, unconditional change which affects *everything*.

If we are going to do this, we need to put this at least through the
meatgrinder and run it on *everything* that's out there and see what falls out
before we make it the default.

But if tglx thinks this is fine, I won't object - I've uttered my scepticism
already.
Nikunj A Dadhania Jan. 3, 2025, 10:09 a.m. UTC | #4
On 1/2/2025 3:02 PM, Borislav Petkov wrote:
> On Thu, Jan 02, 2025 at 10:50:53AM +0530, Nikunj A. Dadhania wrote:
>> This is what was suggested by tglx: 
>>
>> "So if you know you want TSC to be selected, then upgrade the rating of
>>  both the early and the regular TSC clocksource and be done with it."
> 
> I highly doubt that he saw what you have now:
> 
> Your commit message is talking about virtualized environments but your diff is
> doing a global, unconditional change which affects *everything*.

Right, let me limit this only to virtualized environments as part of 
CONFIG_PARAVIRT.

Subject: [PATCH] x86/tsc: Upgrade TSC clocksource rating for guests

Hypervisor platform setup (x86_hyper_init::init_platform) routines register 
their own PV clock sources (KVM, HyperV, and Xen) at different clock ratings
resulting in selection of PV clock source even though a stable TSC clock
source is available. Upgrade the clock rating of the TSC early and
regular clock source to prefer TSC over PV clock sources when TSC is
invariant, non-stop and stable

Cc: Alexey Makhalov <alexey.makhalov@broadcom.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---
 arch/x86/kernel/tsc.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 34dec0b72ea8..5c6831a42889 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -274,10 +274,31 @@ bool using_native_sched_clock(void)
 {
 	return static_call_query(pv_sched_clock) == native_sched_clock;
 }
+
+/*
+ * Upgrade the clock rating for TSC early and regular clocksource when the
+ * underlying platform provides non-stop, invariant, and stable TSC. TSC
+ * early/regular clocksource will be preferred over other para-virtualized clock
+ * sources.
+ */
+static void __init upgrade_clock_rating(struct clocksource *tsc_early,
+					struct clocksource *tsc)
+{
+	if (cpu_feature_enabled(X86_FEATURE_HYPERVISOR) &&
+	    cpu_feature_enabled(X86_FEATURE_CONSTANT_TSC) &&
+	    cpu_feature_enabled(X86_FEATURE_NONSTOP_TSC) &&
+	    !tsc_unstable) {
+		tsc_early->rating = 449;
+		tsc->rating = 450;
+	}
+}
 #else
 u64 sched_clock_noinstr(void) __attribute__((alias("native_sched_clock")));
 
 bool using_native_sched_clock(void) { return true; }
+
+static void __init upgrade_clock_rating(struct clocksource *tsc_early,
+					struct clocksource *tsc) { }
 #endif
 
 notrace u64 sched_clock(void)
@@ -1564,6 +1585,8 @@ void __init tsc_init(void)
 	if (tsc_clocksource_reliable || no_tsc_watchdog)
 		tsc_disable_clocksource_watchdog();
 
+	upgrade_clock_rating(&clocksource_tsc_early, &clocksource_tsc);
+
 	clocksource_register_khz(&clocksource_tsc_early, tsc_khz);
 	detect_art();
 }
Borislav Petkov Jan. 3, 2025, 12:06 p.m. UTC | #5
On Fri, Jan 03, 2025 at 03:39:56PM +0530, Nikunj A. Dadhania wrote:
> Right, let me limit this only to virtualized environments as part of 
> CONFIG_PARAVIRT.

That's not what you do below - you check whether you're running as a guest.

And that's not sufficient as I just said. I think the only somewhat reliable
thing you can do is when you're running as a STSC guest.
Nikunj A Dadhania Jan. 3, 2025, 2:03 p.m. UTC | #6
On 1/3/2025 5:36 PM, Borislav Petkov wrote:
> On Fri, Jan 03, 2025 at 03:39:56PM +0530, Nikunj A. Dadhania wrote:
>> Right, let me limit this only to virtualized environments as part of 
>> CONFIG_PARAVIRT.
> 
> That's not what you do below - you check whether you're running as a guest.
> 
> And that's not sufficient as I just said. I think the only somewhat reliable
> thing you can do is when you're running as a STSC guest.

This is for all guest running with TSC that is constant, non-stop and 
stable[1] (and detailed response here [2]), STSC is one of them.

Thanks
Nikunj

1. https://lore.kernel.org/kvm/ZuR2t1QrBpPc1Sz2@google.com/
2. https://lore.kernel.org/all/8c3cc8a5-1b85-48b3-9361-523f27fb312a@amd.com/
diff mbox series

Patch

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index c0eef924b84e..900edcde0c9e 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1265,6 +1265,21 @@  static void __init check_system_tsc_reliable(void)
 		tsc_disable_clocksource_watchdog();
 }
 
+static void __init upgrade_clock_rating(struct clocksource *tsc_early,
+					struct clocksource *tsc)
+{
+	/*
+	 * Upgrade the clock rating for TSC early and regular clocksource when
+	 * the underlying platform provides non-stop, invaraint and stable TSC.
+	 */
+	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
+	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
+	    !tsc_unstable) {
+		tsc_early->rating = 449;
+		tsc->rating = 450;
+	}
+}
+
 /*
  * Make an educated guess if the TSC is trustworthy and synchronized
  * over all CPUs.
@@ -1566,6 +1581,8 @@  void __init tsc_init(void)
 	if (tsc_clocksource_reliable || no_tsc_watchdog)
 		tsc_disable_clocksource_watchdog();
 
+	upgrade_clock_rating(&clocksource_tsc_early, &clocksource_tsc);
+
 	clocksource_register_khz(&clocksource_tsc_early, tsc_khz);
 	detect_art();
 }