Message ID | 9fd0360dd914d93dab357d16b46b4290e6119d30.1673123823.git.demi@invisiblethingslab.com (mailing list archive) |
---|---|
State | New, archived |
Series | Make PAT handling less brittle |
On 07.01.2023 23:07, Demi Marie Obenour wrote:
> --- a/xen/arch/x86/Kconfig
> +++ b/xen/arch/x86/Kconfig
> @@ -227,6 +227,39 @@ config XEN_ALIGN_2M
>
>  endchoice
>
> +config LINUX_PAT
> +	bool "Use Linux's PAT instead of Xen's default"
> +	help
> +	  Use Linux's Page Attribute Table instead of the default Xen value.
> +
> +	  The Page Attribute Table (PAT) maps three bits in the page table entry
> +	  to the actual cacheability used by the processor. Many Intel
> +	  integrated GPUs have errata (bugs) that cause CPU access to GPU memory
> +	  to ignore the topmost bit. When using Xen's default PAT, this results
> +	  in caches not being flushed and incorrect images being displayed. The
> +	  default PAT used by Linux does not cause this problem.
> +
> +	  If you say Y here, you will be able to use Intel integrated GPUs that
> +	  are attached to your Linux dom0 or other Linux PV guests. However,
> +	  you will not be able to use non-Linux OSs in dom0, and attaching a PCI
> +	  device to a non-Linux PV guest will result in unpredictable guest
> +	  behavior. If you say N here, you will be able to use a non-Linux
> +	  dom0, and will be able to attach PCI devices to non-Linux PV guests.
> +
> +	  Note that saving a PV guest with an assigned PCI device on a machine
> +	  with one PAT and restoring it on a machine with a different PAT won't
> +	  work: the resulting guest may boot and even appear to work, but caches
> +	  will not be flushed when needed, with unpredictable results. HVM
> +	  (including PVH and PVHVM) guests and guests without assigned PCI
> +	  devices do not care what PAT Xen uses, and migration (even live)
> +	  between hypervisors with different PATs will work fine. Guests using
> +	  PV Shim care about the PAT used by the PV Shim firmware, not the
> +	  host’s PAT. Also, non-default PAT values are incompatible with the
> +	  (deprecated) qemu-traditional stubdomain.
> +
> +	  Say Y if you are building a hypervisor for a Linux distribution that
> +	  supports Intel iGPUs. Say N otherwise.

I'm not convinced we want this; if other maintainers think differently,
then I don't mean to stand in the way though. If so, however,
- the above likely wants guarding by EXPERT and/or UNSUPPORTED
- the support status of using this setting wants to be made crystal clear,
  perhaps by an addition to ./SUPPORT.md.

Jan

On Mon, Jan 09, 2023 at 12:37:34PM +0100, Jan Beulich wrote:
> On 07.01.2023 23:07, Demi Marie Obenour wrote:
> > --- a/xen/arch/x86/Kconfig
> > +++ b/xen/arch/x86/Kconfig
> > @@ -227,6 +227,39 @@ config XEN_ALIGN_2M
> >
> >  endchoice
> >
> > +config LINUX_PAT
> > +	bool "Use Linux's PAT instead of Xen's default"
> > +	help
> > +	  Use Linux's Page Attribute Table instead of the default Xen value.
> > +
> > +	  The Page Attribute Table (PAT) maps three bits in the page table entry
> > +	  to the actual cacheability used by the processor. Many Intel
> > +	  integrated GPUs have errata (bugs) that cause CPU access to GPU memory
> > +	  to ignore the topmost bit. When using Xen's default PAT, this results
> > +	  in caches not being flushed and incorrect images being displayed. The
> > +	  default PAT used by Linux does not cause this problem.
> > +
> > +	  If you say Y here, you will be able to use Intel integrated GPUs that
> > +	  are attached to your Linux dom0 or other Linux PV guests. However,
> > +	  you will not be able to use non-Linux OSs in dom0, and attaching a PCI
> > +	  device to a non-Linux PV guest will result in unpredictable guest
> > +	  behavior. If you say N here, you will be able to use a non-Linux
> > +	  dom0, and will be able to attach PCI devices to non-Linux PV guests.
> > +
> > +	  Note that saving a PV guest with an assigned PCI device on a machine
> > +	  with one PAT and restoring it on a machine with a different PAT won't
> > +	  work: the resulting guest may boot and even appear to work, but caches
> > +	  will not be flushed when needed, with unpredictable results. HVM
> > +	  (including PVH and PVHVM) guests and guests without assigned PCI
> > +	  devices do not care what PAT Xen uses, and migration (even live)
> > +	  between hypervisors with different PATs will work fine. Guests using
> > +	  PV Shim care about the PAT used by the PV Shim firmware, not the
> > +	  host’s PAT. Also, non-default PAT values are incompatible with the
> > +	  (deprecated) qemu-traditional stubdomain.
> > +
> > +	  Say Y if you are building a hypervisor for a Linux distribution that
> > +	  supports Intel iGPUs. Say N otherwise.
>
> I'm not convinced we want this; if other maintainers think differently,
> then I don't mean to stand in the way though. If so, however,
> - the above likely wants guarding by EXPERT and/or UNSUPPORTED

I considered this, but decided against it. Recent Intel iGPUs are
simply incompatible with Xen’s default PAT, so anyone wanting to use Xen
in a desktop environment must say Y here. Guarding this with EXPERT or
UNSUPPORTED will not prevent distribution maintainers from enabling it,
because the alternative is building a hypervisor that does not support
the hardware their users actually have. Qubes OS is *already* shipping
a patch to use Linux’s PAT, so you don’t need to worry that this code
will go untested. And if there was a vulnerability that requires
CONFIG_LINUX_PAT=y, I’d rather it not be dropped on Qubes users as a
0day.

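
To make the incompatibility concrete, here is a minimal C sketch of the failure mode the help text describes. It assumes a simplified model in which an affected access simply drops the topmost (PAT) bit of the 3-bit index; the real errata may be more subtle. The Xen layout is derived from the 0x050100070406 constant in the patch's BUILD_BUG_ON, the Linux layout from the new XEN_MSR_PAT definition, and all function and table names are hypothetical, not Xen code.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural cache-attribute bits of a 4 KiB x86 page table entry. */
#define PTE_PWT (1u << 3)
#define PTE_PCD (1u << 4)
#define PTE_PAT (1u << 7)

/* x86 memory type encodings as stored in IA32_PAT entries. */
enum { MT_UC = 0, MT_WC = 1, MT_WT = 4, MT_WP = 5, MT_WB = 6, MT_UCM = 7 };

/* PAT layouts indexed 0..7 (names are local to this sketch). */
static const uint8_t xen_pat[8] = {
    MT_WB, MT_WT, MT_UCM, MT_UC, MT_WC, MT_WP, MT_UC, MT_UC
};
static const uint8_t linux_pat[8] = {
    MT_WB, MT_WC, MT_UCM, MT_UC, MT_WB, MT_WP, MT_UCM, MT_WT
};

/* Combine the PAT, PCD and PWT bits into the 3-bit PAT index. */
static unsigned int pat_index(uint32_t pte_flags)
{
    return ((pte_flags & PTE_PAT) ? 4u : 0u) |
           ((pte_flags & PTE_PCD) ? 2u : 0u) |
           ((pte_flags & PTE_PWT) ? 1u : 0u);
}

/* Memory type in effect when the top index bit is (by assumption) ignored. */
static uint8_t effective_type(const uint8_t pat[8], unsigned int idx,
                              int drops_top_bit)
{
    assert(idx < 8);
    if (drops_top_bit)
        idx &= 3u;
    return pat[idx];
}

/* Lowest index whose entry holds the requested type, or -1 if absent. */
static int first_index_of(const uint8_t pat[8], uint8_t type)
{
    for (int i = 0; i < 8; i++)
        if (pat[i] == type)
            return i;
    return -1;
}
```

Under this model, a write-combining mapping uses index 4 with Xen's layout, which collapses to index 0 (write-back) once the top bit is dropped. With Linux's layout, write-combining sits at index 1 and is unaffected.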
diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 6a7825f4ba3c98e0496415123fde79ee62f771fa..18efccedfd08873cd169a54825b0ba4256a12942 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -227,6 +227,39 @@ config XEN_ALIGN_2M
 
 endchoice
 
+config LINUX_PAT
+	bool "Use Linux's PAT instead of Xen's default"
+	help
+	  Use Linux's Page Attribute Table instead of the default Xen value.
+
+	  The Page Attribute Table (PAT) maps three bits in the page table entry
+	  to the actual cacheability used by the processor. Many Intel
+	  integrated GPUs have errata (bugs) that cause CPU access to GPU memory
+	  to ignore the topmost bit. When using Xen's default PAT, this results
+	  in caches not being flushed and incorrect images being displayed. The
+	  default PAT used by Linux does not cause this problem.
+
+	  If you say Y here, you will be able to use Intel integrated GPUs that
+	  are attached to your Linux dom0 or other Linux PV guests. However,
+	  you will not be able to use non-Linux OSs in dom0, and attaching a PCI
+	  device to a non-Linux PV guest will result in unpredictable guest
+	  behavior. If you say N here, you will be able to use a non-Linux
+	  dom0, and will be able to attach PCI devices to non-Linux PV guests.
+
+	  Note that saving a PV guest with an assigned PCI device on a machine
+	  with one PAT and restoring it on a machine with a different PAT won't
+	  work: the resulting guest may boot and even appear to work, but caches
+	  will not be flushed when needed, with unpredictable results. HVM
+	  (including PVH and PVHVM) guests and guests without assigned PCI
+	  devices do not care what PAT Xen uses, and migration (even live)
+	  between hypervisors with different PATs will work fine. Guests using
+	  PV Shim care about the PAT used by the PV Shim firmware, not the
+	  host’s PAT. Also, non-default PAT values are incompatible with the
+	  (deprecated) qemu-traditional stubdomain.
+
+	  Say Y if you are building a hypervisor for a Linux distribution that
+	  supports Intel iGPUs. Say N otherwise.
+
 config X2APIC_PHYSICAL
 	bool "x2APIC Physical Destination mode"
 	help
diff --git a/xen/arch/x86/include/asm/page.h b/xen/arch/x86/include/asm/page.h
index c7d77ab2901aa5bdb03a719af810c6f8d8ba9d4e..03839eb2b78517332663daad2089677d7000852c 100644
--- a/xen/arch/x86/include/asm/page.h
+++ b/xen/arch/x86/include/asm/page.h
@@ -331,6 +331,7 @@ void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t);
 
 #define PAGE_CACHE_ATTRS (_PAGE_PAT | _PAGE_PCD | _PAGE_PWT)
 
+#ifndef CONFIG_LINUX_PAT
 /* Memory types, encoded under Xen's choice of MSR_PAT. */
 #define _PAGE_WB         (                                0)
 #define _PAGE_WT         (                        _PAGE_PWT)
@@ -340,6 +341,17 @@ void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t);
 #define _PAGE_WP         (_PAGE_PAT |             _PAGE_PWT)
 #define _PAGE_RSVD_1     (_PAGE_PAT | _PAGE_PCD            )
 #define _PAGE_RSVD_2     (_PAGE_PAT | _PAGE_PCD | _PAGE_PWT)
+#else
+/* Memory types, encoded under Linux's choice of MSR_PAT. */
+#define _PAGE_WB         (                                0)
+#define _PAGE_WC         (                        _PAGE_PWT)
+#define _PAGE_UCM        (            _PAGE_PCD            )
+#define _PAGE_UC         (            _PAGE_PCD | _PAGE_PWT)
+#define _PAGE_RSVD_1     (_PAGE_PAT                        )
+#define _PAGE_WP         (_PAGE_PAT |             _PAGE_PWT)
+#define _PAGE_RSVD_2     (_PAGE_PAT | _PAGE_PCD            )
+#define _PAGE_WT         (_PAGE_PAT | _PAGE_PCD | _PAGE_PWT)
+#endif
 
 /*
  * Debug option: Ensure that granted mappings are not implicitly unmapped.
diff --git a/xen/arch/x86/include/asm/processor.h b/xen/arch/x86/include/asm/processor.h
index 60b902060914584957db8afa5c7c1e6abdad4d13..413b59ab284990cca192fa1dc44b437f58bd282f 100644
--- a/xen/arch/x86/include/asm/processor.h
+++ b/xen/arch/x86/include/asm/processor.h
@@ -92,6 +92,20 @@
                           X86_EFLAGS_NT|X86_EFLAGS_DF|X86_EFLAGS_IF|    \
                           X86_EFLAGS_TF)
 
+#ifdef CONFIG_LINUX_PAT
+/*
+ * Host IA32_CR_PAT value to cover all memory types. This is not the default
+ * MSR_PAT value, but is the same as the one used by Linux.
+ */
+#define XEN_MSR_PAT ((_AC(X86_MT_WB,  ULL) << 0x00) | \
+                     (_AC(X86_MT_WC,  ULL) << 0x08) | \
+                     (_AC(X86_MT_UCM, ULL) << 0x10) | \
+                     (_AC(X86_MT_UC,  ULL) << 0x18) | \
+                     (_AC(X86_MT_WB,  ULL) << 0x20) | \
+                     (_AC(X86_MT_WP,  ULL) << 0x28) | \
+                     (_AC(X86_MT_UCM, ULL) << 0x30) | \
+                     (_AC(X86_MT_WT,  ULL) << 0x38))
+#else
 /*
  * Host IA32_CR_PAT value to cover all memory types. This is not the default
  * MSR_PAT value, and is an ABI with PV guests.
@@ -104,6 +118,7 @@
                      (_AC(X86_MT_WP,  ULL) << 0x28) | \
                      (_AC(X86_MT_UC,  ULL) << 0x30) | \
                      (_AC(X86_MT_UC,  ULL) << 0x38))
+#endif
 
 #ifndef __ASSEMBLY__
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index d69e9bea6c30bc782ab4c331f42502f6e61a028a..042c1875a02092a3f19c293003ef12209d88a450 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -6407,6 +6407,7 @@ unsigned long get_upper_mfn_bound(void)
 
 static void __init __maybe_unused build_assertions(void)
 {
+#ifndef CONFIG_LINUX_PAT
     /*
      * If this trips, any guests that blindly rely on the public API in xen.h
      * (instead of reading the PAT from Xen, as Linux 3.19+ does) will be
@@ -6414,6 +6415,7 @@ static void __init __maybe_unused build_assertions(void)
      * using different PATs will not work.
      */
     BUILD_BUG_ON(XEN_MSR_PAT != 0x050100070406ULL);
+#endif
 
     /*
      * _PAGE_WB must be zero. Linux PV guests assume that _PAGE_WB will be

Due to hardware errata, Intel integrated GPUs are incompatible with
Xen's PAT. Using Linux's PAT is a workaround for this flaw.

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
---
 xen/arch/x86/Kconfig                 | 33 ++++++++++++++++++++++++++++
 xen/arch/x86/include/asm/page.h      | 12 ++++++++++
 xen/arch/x86/include/asm/processor.h | 15 +++++++++++++
 xen/arch/x86/mm.c                    |  2 ++
 4 files changed, 62 insertions(+)
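
As a cross-check of the two XEN_MSR_PAT definitions, the following C sketch packs eight PAT entries into a 64-bit MSR value, one entry per byte, the same way the macro shifts each X86_MT_* constant by a multiple of 8. The helper names (pack_pat, pat_entry) and the layout arrays are hypothetical. The Xen layout is inferred from the 0x050100070406 constant in the BUILD_BUG_ON, because the diff context only shows its last three entries.

```c
#include <assert.h>
#include <stdint.h>

/* x86 memory type encodings as stored in IA32_PAT entries. */
enum { MT_UC = 0, MT_WC = 1, MT_WT = 4, MT_WP = 5, MT_WB = 6, MT_UCM = 7 };

/* Entry 0 first; names are local to this sketch. */
static const uint8_t xen_layout[8] = {
    MT_WB, MT_WT, MT_UCM, MT_UC, MT_WC, MT_WP, MT_UC, MT_UC
};
static const uint8_t linux_layout[8] = {
    MT_WB, MT_WC, MT_UCM, MT_UC, MT_WB, MT_WP, MT_UCM, MT_WT
};

/* Pack eight PAT entries into an MSR value, entry i in byte i. */
static uint64_t pack_pat(const uint8_t entries[8])
{
    uint64_t msr = 0;

    for (unsigned int i = 0; i < 8; i++)
        msr |= (uint64_t)entries[i] << (8 * i);
    return msr;
}

/* Extract entry idx (0..7); only the low 3 bits of each byte are used. */
static uint8_t pat_entry(uint64_t msr, unsigned int idx)
{
    assert(idx < 8);
    return (uint8_t)((msr >> (8 * idx)) & 0x07);
}
```

Packing xen_layout reproduces the constant checked by the existing BUILD_BUG_ON, while linux_layout packs to a different value, 0x0407050600070106. That difference is why the patch compiles the assertion out when CONFIG_LINUX_PAT is set.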