[v13,04/22] x86/cpu: Detect TDX partial write machine check erratum

Message ID b089f93223958c168b5abd8eef0f810d616adb99.1692962263.git.kai.huang@intel.com (mailing list archive)
State New, archived
Series TDX host kernel support

Commit Message

Huang, Kai Aug. 25, 2023, 12:14 p.m. UTC
TDX memory has integrity and confidentiality protections.  Violations of
this integrity protection are supposed to only affect TDX operations and
are never supposed to affect the host kernel itself.  In other words,
the host kernel should never, itself, see machine checks induced by the
TDX integrity hardware.

Alas, the first few generations of TDX hardware have an erratum.  A
partial write to a TDX private memory cacheline will silently "poison"
the line.  Subsequent reads will consume the poison and generate a
machine check.  According to the TDX hardware spec, neither of these
things should have happened.

Virtually all kernel memory access operations happen in full
cachelines.  In practice, writing a "byte" of memory usually reads a 64
byte cacheline of memory, modifies it, then writes the whole line back.
Those operations do not trigger this problem.

This problem is triggered by "partial" writes where a write transaction
of less than a cacheline lands at the memory controller.  The CPU does
these via non-temporal write instructions (like MOVNTI), or through
UC/WC memory mappings.  The issue can also be triggered away from the
CPU by devices doing partial writes via DMA.
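
As an illustration only, the difference between the two kinds of store
can be sketched in user-space C.  This is not part of the patch: the
helper names store_cached() and store_nontemporal() are hypothetical,
and the sketch assumes an x86 compiler with SSE2 enabled (the default on
x86-64).  It only shows which instruction class is meant by "partial
write"; it does not touch TDX private memory and cannot trigger the
erratum.

```c
#include <assert.h>
#include <stddef.h>
#include <emmintrin.h>	/* SSE2 intrinsics: _mm_stream_si32(), _mm_sfence() */

/*
 * Ordinary store to write-back memory.  The CPU normally brings the
 * whole 64-byte cacheline into its cache, modifies it there, and later
 * writes the full line back.
 */
static void store_cached(int *p, int v)
{
	assert(p != NULL);
	*p = v;
}

/*
 * Non-temporal store (hypothetical helper).  A lone 4-byte streaming
 * store like this can reach the memory controller as a write of less
 * than a full cacheline, which is the "partial write" case.
 */
static void store_nontemporal(int *p, int v)
{
	assert(p != NULL);
	_mm_stream_si32(p, v);	/* compiles to MOVNTI on x86 */
	_mm_sfence();		/* order the streaming store before later stores */
}
```

Both helpers leave the same value in memory; they differ only in how the
write travels through the cache hierarchy.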

With this erratum, there are additional things that need to be done.  Similar
to other CPU bugs, use a CPU bug bit to indicate this erratum, and
detect this erratum during early boot.  Note this bug reflects the
hardware thus it is detected regardless of whether the kernel is built
with TDX support or not.
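
A minimal user-space model of the "set a bug bit at early boot, test it
later" pattern might look like the following.  Everything here is a
simplified stand-in, not kernel code: enum cpu_model, force_cpu_bug()
and cpu_has_bug() are hypothetical names that mimic the roles of
c->x86_model, setup_force_cpu_bug() and a later bug-bit check, and the
single 32-bit word is an assumption made to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's CPU model numbers. */
enum cpu_model {
	MODEL_SAPPHIRERAPIDS_X,
	MODEL_EMERALDRAPIDS_X,
	MODEL_OTHER,
};

/* Bit index chosen to mirror X86_BUG(30); the storage is simplified. */
#define BUG_TDX_PW_MCE	30u

static uint32_t bug_bits;

static void force_cpu_bug(unsigned int bit)
{
	assert(bit < 32);
	bug_bits |= UINT32_C(1) << bit;
}

static bool cpu_has_bug(unsigned int bit)
{
	assert(bit < 32);
	return (bug_bits >> bit) & 1u;
}

/* Detection depends only on the CPU model, not on build configuration. */
static void check_tdx_erratum(enum cpu_model model)
{
	switch (model) {
	case MODEL_SAPPHIRERAPIDS_X:
	case MODEL_EMERALDRAPIDS_X:
		force_cpu_bug(BUG_TDX_PW_MCE);
		break;
	default:
		break;
	}
}
```

Later code would then branch on cpu_has_bug(BUG_TDX_PW_MCE) instead of
repeating the model list at each site.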

Signed-off-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
---

v12 -> v13:
 - Added David's tag.

v11 -> v12:
 - Added Kirill's tag
 - Changed to detect the erratum in early_init_intel() (Kirill)

v10 -> v11:
 - New patch


---
 arch/x86/include/asm/cpufeatures.h |  1 +
 arch/x86/kernel/cpu/intel.c        | 17 +++++++++++++++++
 2 files changed, 18 insertions(+)

Comments

Dave Hansen Sept. 8, 2023, 3:22 p.m. UTC | #1
On 8/25/23 05:14, Kai Huang wrote:
> TDX memory has integrity and confidentiality protections.  Violations of
> this integrity protection are supposed to only affect TDX operations and
> are never supposed to affect the host kernel itself.  In other words,
> the host kernel should never, itself, see machine checks induced by the
> TDX integrity hardware.

This is missing one thing: alluding to how this will be used.  We might
do that by saying: "To prepare for _____, add ______."

But that's a minor nit.

...
> Signed-off-by: Kai Huang <kai.huang@intel.com>
> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Reviewed-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>

Huang, Kai Sept. 11, 2023, 12:39 p.m. UTC | #2
On Fri, 2023-09-08 at 08:22 -0700, Dave Hansen wrote:
> On 8/25/23 05:14, Kai Huang wrote:
> > TDX memory has integrity and confidentiality protections.  Violations of
> > this integrity protection are supposed to only affect TDX operations and
> > are never supposed to affect the host kernel itself.  In other words,
> > the host kernel should never, itself, see machine checks induced by the
> > TDX integrity hardware.
> 
> This is missing one thing: alluding to how this will be used.  We might
> do that by saying: "To prepare for _____, add ______."
> 
> But that's a minor nit.

Thanks for the suggestion.

I thought I had mentioned this at the end of the changelog:

	With this erratum, there are additional things that need to be done.
	Similar to other CPU bugs, use a CPU bug bit to indicate this
	erratum ...

Perhaps it's not clear.  How about below?

	With this erratum, there are additional things that need to be done
	around kexec() and the machine check handler.  To prepare for those
	changes, add a CPU bug bit to indicate this erratum.  Note this bug
	reflects the hardware thus it is detected regardless of whether the
	kernel is built with TDX support or not.
> 
> ...
> > Signed-off-by: Kai Huang <kai.huang@intel.com>
> > Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > Reviewed-by: David Hildenbrand <david@redhat.com>
> 
> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>

Thanks!

Patch

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index cb8ca46213be..dc8701f8d88b 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -483,5 +483,6 @@ 
 #define X86_BUG_RETBLEED		X86_BUG(27) /* CPU is affected by RETBleed */
 #define X86_BUG_EIBRS_PBRSB		X86_BUG(28) /* EIBRS is vulnerable to Post Barrier RSB Predictions */
 #define X86_BUG_SMT_RSB			X86_BUG(29) /* CPU is vulnerable to Cross-Thread Return Address Predictions */
+#define X86_BUG_TDX_PW_MCE		X86_BUG(30) /* CPU may incur #MC if non-TD software does partial write to TDX private memory */
 
 #endif /* _ASM_X86_CPUFEATURES_H */
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 1c4639588ff9..e6c3107adc15 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -358,6 +358,21 @@  int intel_microcode_sanity_check(void *mc, bool print_err, int hdr_type)
 }
 EXPORT_SYMBOL_GPL(intel_microcode_sanity_check);
 
+static void check_tdx_erratum(struct cpuinfo_x86 *c)
+{
+	/*
+	 * These CPUs have an erratum.  A partial write from non-TD
+	 * software (e.g. via MOVNTI variants or UC/WC mapping) to TDX
+	 * private memory poisons that memory, and a subsequent read of
+	 * that memory triggers #MC.
+	 */
+	switch (c->x86_model) {
+	case INTEL_FAM6_SAPPHIRERAPIDS_X:
+	case INTEL_FAM6_EMERALDRAPIDS_X:
+		setup_force_cpu_bug(X86_BUG_TDX_PW_MCE);
+	}
+}
+
 static void early_init_intel(struct cpuinfo_x86 *c)
 {
 	u64 misc_enable;
@@ -509,6 +524,8 @@  static void early_init_intel(struct cpuinfo_x86 *c)
 	 */
 	if (detect_extended_topology_early(c) < 0)
 		detect_ht_early(c);
+
+	check_tdx_erratum(c);
 }
 
 static void bsp_init_intel(struct cpuinfo_x86 *c)