[v2] x86/tsx: fix KVM guest live migration for tsx=on

Message ID 20220411200703.48654-1-jon@nutanix.com (mailing list archive)
State New, archived

Commit Message

Jon Kohler April 11, 2022, 8:07 p.m. UTC
Move automatic disablement for TSX microcode deprecation from tsx_init() to
x86_get_tsx_auto_mode(), such that systems with tsx=on will continue to
see the TSX CPU features (HLE, RTM) even on updated microcode.

KVM live migration could be broken on 5.14+ by commit 293649307ef9
("x86/tsx: Clear CPUID bits when TSX always force aborts"). Consider the
following scenario:

1. KVM hosts clustered in a live migration capable setup.
2. KVM guests have TSX CPU features HLE and/or RTM presented.
3. One of the following three maintenance events occurs:
3a. An existing host running kernel >= 5.14 in the pool is updated with
    the new microcode.
3b. A new host running kernel >= 5.14 is commissioned that already has the
    microcode update preloaded.
3c. All hosts are running kernel < 5.14 with microcode update already
    loaded and one existing host gets updated to kernel >= 5.14.
4. After the maintenance event, the impacted host no longer exposes HLE
   and RTM, so guests that were started with TSX features might fail to
   live migrate to it.
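
The failure in step 4 comes down to a subset check on CPUID feature bits.
The following is a minimal sketch of that check, not KVM or QEMU code:
can_migrate() is a hypothetical helper, and the only hardware facts it
relies on are that HLE and RTM are enumerated in CPUID.(EAX=7,ECX=0):EBX
bits 4 and 11.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EBX feature bits for TSX. */
#define CPUID7_EBX_HLE (UINT32_C(1) << 4)
#define CPUID7_EBX_RTM (UINT32_C(1) << 11)

/*
 * Hypothetical helper (assumption, not an existing API): a guest can move
 * to a destination host only if every feature bit the guest was started
 * with is still offered by that host.
 */
static bool can_migrate(uint32_t guest_ebx, uint32_t dest_host_ebx)
{
	return (guest_ebx & ~dest_host_ebx) == 0;
}
```

A guest started with HLE and RTM passes this check against a host that
still enumerates both bits, and fails it against a host whose kernel has
cleared them.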

Users using tsx=on or CONFIG_X86_INTEL_TSX_MODE_ON should always see
HLE and RTM on capable Intel SKUs, even if microcode has been clubbed to
prevent functionality.

Users using tsx=auto or CONFIG_X86_INTEL_TSX_MODE_AUTO get to roll the
dice with whatever the kernel believes the appropriate default is, which
includes the feature disappearing after a kernel and/or microcode update.
These users should consider masking HLE and RTM at a higher control plane
level, e.g. qemu or libvirt, such that guests on TSX enabled systems do not
see HLE/RTM and therefore do not enable TAA mitigation.
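
As a rough model of what such masking does, the sketch below strips the
two TSX bits from the CPUID.(EAX=7,ECX=0):EBX value a guest would be
shown. mask_tsx() is a hypothetical helper for illustration only; with
QEMU the same effect is typically requested through CPU properties such
as hle=off,rtm=off on the -cpu option.

```c
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EBX feature bits for TSX. */
#define CPUID7_EBX_HLE (UINT32_C(1) << 4)
#define CPUID7_EBX_RTM (UINT32_C(1) << 11)

/*
 * Hypothetical helper (assumption, not an existing API): return the EBX
 * value to present to a guest with HLE and RTM hidden, leaving all other
 * feature bits untouched.
 */
static uint32_t mask_tsx(uint32_t host_ebx)
{
	return host_ebx & ~(CPUID7_EBX_HLE | CPUID7_EBX_RTM);
}
```

A guest configured this way never depends on HLE or RTM, so it stays
migratable whether or not a given host still enumerates them.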

Fixes: 293649307ef9 ("x86/tsx: Clear CPUID bits when TSX always force aborts")

Signed-off-by: Jon Kohler <jon@nutanix.com>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Neelima Krishnan <neelima.krishnan@intel.com>
Cc: kvm@vger.kernel.org <kvm@vger.kernel.org>
---
v1 -> v2:
 - Addressed comments on approach from Dave.

 arch/x86/kernel/cpu/tsx.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

--
2.30.1 (Apple Git-130)

Comments

Pawan Gupta April 12, 2022, 7:55 p.m. UTC | #1
On Mon, Apr 11, 2022 at 04:07:01PM -0400, Jon Kohler wrote:
>Move automatic disablement for TSX microcode deprecation from tsx_init() to
>x86_get_tsx_auto_mode(), such that systems with tsx=on will continue to
>see the TSX CPU features (HLE, RTM) even on updated microcode.

This patch needs to be based on recent changes in TSX handling (due to
Feb 2022 microcode update). These patches were recently merged in tip
tree:

   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/urgent

Specifically these patches:

   x86/tsx: Use MSR_TSX_CTRL to clear CPUID bits [1]
   x86/tsx: Disable TSX development mode at boot [2]

Thanks,
Pawan

[1] https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=258f3b8c3210b03386e4ad92b4bd8652b5c1beb3
[2] https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=400331f8ffa3bec5c561417e5eec6848464e9160

Pawan Gupta April 12, 2022, 8:54 p.m. UTC | #2
On Mon, Apr 11, 2022 at 04:07:01PM -0400, Jon Kohler wrote:
>Move automatic disablement for TSX microcode deprecation from tsx_init() to
>x86_get_tsx_auto_mode(), such that systems with tsx=on will continue to
>see the TSX CPU features (HLE, RTM) even on updated microcode.
>
>KVM live migration could be broken on 5.14+ by commit 293649307ef9
>("x86/tsx: Clear CPUID bits when TSX always force aborts"). Consider the
>following scenario:
>
>1. KVM hosts clustered in a live migration capable setup.
>2. KVM guests have TSX CPU features HLE and/or RTM presented.
>3. One of the following three maintenance events occurs:
>3a. An existing host running kernel >= 5.14 in the pool is updated with
>    the new microcode.
>3b. A new host running kernel >= 5.14 is commissioned that already has the
>    microcode update preloaded.
>3c. All hosts are running kernel < 5.14 with microcode update already
>    loaded and one existing host gets updated to kernel >= 5.14.
>4. After maintenance event, the impacted host will not have HLE and RTM
>   exposed, and live migrations with guests with TSX features might not
>   migrate.

Which part was this reproduced on? AFAIK server parts (except for some
Intel Xeon E3s) did not get such a microcode update.

Thanks,
Pawan

Patch

diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c
index 9c7a5f049292..4b701fa64869 100644
--- a/arch/x86/kernel/cpu/tsx.c
+++ b/arch/x86/kernel/cpu/tsx.c
@@ -78,6 +78,10 @@  static bool __init tsx_ctrl_is_supported(void)

 static enum tsx_ctrl_states x86_get_tsx_auto_mode(void)
 {
+	if (boot_cpu_has(X86_FEATURE_RTM_ALWAYS_ABORT) &&
+	    boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT))
+		return TSX_CTRL_RTM_ALWAYS_ABORT;
+
 	if (boot_cpu_has_bug(X86_BUG_TAA))
 		return TSX_CTRL_DISABLE;

@@ -105,21 +109,6 @@  void __init tsx_init(void)
 	char arg[5] = {};
 	int ret;

-	/*
-	 * Hardware will always abort a TSX transaction if both CPUID bits
-	 * RTM_ALWAYS_ABORT and TSX_FORCE_ABORT are set. In this case, it is
-	 * better not to enumerate CPUID.RTM and CPUID.HLE bits. Clear them
-	 * here.
-	 */
-	if (boot_cpu_has(X86_FEATURE_RTM_ALWAYS_ABORT) &&
-	    boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT)) {
-		tsx_ctrl_state = TSX_CTRL_RTM_ALWAYS_ABORT;
-		tsx_clear_cpuid();
-		setup_clear_cpu_cap(X86_FEATURE_RTM);
-		setup_clear_cpu_cap(X86_FEATURE_HLE);
-		return;
-	}
-
 	if (!tsx_ctrl_is_supported()) {
 		tsx_ctrl_state = TSX_CTRL_NOT_SUPPORTED;
 		return;
@@ -173,5 +162,16 @@  void __init tsx_init(void)
 		 */
 		setup_force_cpu_cap(X86_FEATURE_RTM);
 		setup_force_cpu_cap(X86_FEATURE_HLE);
+	} else if (tsx_ctrl_state == TSX_CTRL_RTM_ALWAYS_ABORT) {
+
+		/*
+		 * Hardware will always abort a TSX transaction if both CPUID bits
+		 * RTM_ALWAYS_ABORT and TSX_FORCE_ABORT are set. In this case, it is
+		 * better not to enumerate CPUID.RTM and CPUID.HLE bits. Clear them
+		 * here.
+		 */
+		tsx_clear_cpuid();
+		setup_clear_cpu_cap(X86_FEATURE_RTM);
+		setup_clear_cpu_cap(X86_FEATURE_HLE);
 	}
 }
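
The control flow this patch aims for can be sketched as a small
standalone model. This is not the kernel code: the enum, struct and
function names are hypothetical stand-ins, the final "return TSX_ON" in
auto_mode() is assumed because that part of x86_get_tsx_auto_mode() is
outside the hunk, and any string other than "on" or "off" is treated as
"auto" purely to keep the sketch short.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's tsx_ctrl_states values. */
enum tsx_state { TSX_OFF, TSX_ON, TSX_ALWAYS_ABORT };

/* Hypothetical model of the CPU flags the patch tests. */
struct cpu_model {
	bool rtm_always_abort;	/* X86_FEATURE_RTM_ALWAYS_ABORT */
	bool tsx_force_abort;	/* X86_FEATURE_TSX_FORCE_ABORT */
	bool bug_taa;		/* X86_BUG_TAA */
};

/* Mirrors the patched x86_get_tsx_auto_mode(): always-abort wins first. */
static enum tsx_state auto_mode(const struct cpu_model *cpu)
{
	if (cpu->rtm_always_abort && cpu->tsx_force_abort)
		return TSX_ALWAYS_ABORT;
	if (cpu->bug_taa)
		return TSX_OFF;
	return TSX_ON;	/* assumed default, not visible in the hunk */
}

/* An explicit "on" or "off" is honored; everything else defers to auto. */
static enum tsx_state pick_state(const char *arg, const struct cpu_model *cpu)
{
	if (strcmp(arg, "on") == 0)
		return TSX_ON;
	if (strcmp(arg, "off") == 0)
		return TSX_OFF;
	return auto_mode(cpu);
}

/* HLE and RTM stay enumerated only when TSX ends up enabled. */
static bool cpuid_exposes_tsx(enum tsx_state state)
{
	return state == TSX_ON;
}
```

In this model a host with both always-abort bits set keeps HLE and RTM
visible under tsx=on, while tsx=auto on the same host selects the
always-abort state and hides them.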