From patchwork Fri Aug 9 12:29:16 2019
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 11086381
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
Date: Fri, 9 Aug 2019 14:29:16 +0200
Subject: [Xen-devel] [PATCH v4 0/2] x86/xpti: Don't leak TSS-adjacent percpu data via Meltdown

From: Andrew Cooper

1: xen/link: Introduce .bss.percpu.page_aligned
2: x86/xpti: Don't leak TSS-adjacent percpu data via Meltdown

Addressing my own v3 review comments, albeit in a slightly extended form.

Jan

xen/link: Introduce .bss.percpu.page_aligned

Future changes are going to need to page align some percpu data.

Shuffle the exact link order of items within the BSS to give
.bss.percpu.page_aligned appropriate alignment, even on CPU0, which uses
.bss.percpu itself.

Insert explicit alignment such that the result is safe even with objects
shorter than a page in length.  The POINTER_ALIGN for __bss_end is to cover
the lack of SMP_CACHE_BYTES alignment, as the loops which zero the BSS use
pointer-sized stores on all architectures.

In addition, we need to be able to specify an alignment attribute to
__DEFINE_PER_CPU().
Rework it so the caller passes in all attributes, and adjust
DEFINE_PER_CPU{,_READ_MOSTLY}() to match.  This has the added bonus that it
is now possible to grep for .bss.percpu and find all the users.

Finally, introduce DEFINE_PER_CPU_PAGE_ALIGNED() which specifies the section
attribute and verifies the type's alignment.

Signed-off-by: Andrew Cooper

Make DEFINE_PER_CPU_PAGE_ALIGNED() verify the alignment rather than
specifying it.  It is the underlying type which should be suitably aligned.

Signed-off-by: Jan Beulich
Acked-by: Julien Grall
---
A sample build including the subsequent patch is now:

  ffff82d08092d000 B zero_page
  ffff82d08092e000 B per_cpu__init_tss
  ffff82d08092e000 B __per_cpu_start
  ffff82d08092f000 B per_cpu__cpupool
  ffff82d08092f008 b per_cpu__continue_info
  ffff82d08092f010 b per_cpu__grant_rwlock

which demonstrates the correct alignment of data in .bss.percpu even when
following a non-page-sized object in .bss.percpu.page_aligned.

v4:
 * Drop stray trailing ALIGN().  Make DEFINE_PER_CPU_PAGE_ALIGNED() verify
   the alignment rather than specifying it.

v3:
 * Insert explicit alignment.
 * Reduce __bss_end's alignment to just POINTER_ALIGN.

v2:
 * Rework __DEFINE_PER_CPU() to allow for further attributes to be passed.
 * Specify __aligned(PAGE_SIZE) as part of DEFINE_PER_CPU_PAGE_ALIGNED().

x86/xpti: Don't leak TSS-adjacent percpu data via Meltdown

The XPTI work restricted the visibility of most of memory, but missed a few
aspects when it came to the TSS.

Given that the TSS is just an object in percpu data, the 4k mapping for it
created in setup_cpu_root_pgt() maps adjacent percpu data, making it all
leakable via Meltdown, even when XPTI is in use.

Furthermore, no care is taken to check that the TSS doesn't cross a page
boundary.  As it turns out, struct tss_struct is aligned on its size, which
does prevent it straddling a page boundary.

Move the TSS into the page aligned percpu area, so no adjacent data can be
leaked.  Move the definition from setup.c to traps.c, which is a more
appropriate place for it to live.

Signed-off-by: Andrew Cooper

Introduce / use struct tss_page.

Signed-off-by: Jan Beulich
---
TBD: Especially with how the previous patch now works I'm unconvinced of the
     utility of the linker script alignment check.  It in particular doesn't
     check the property we're after in this patch, i.e. the fact that there's
     nothing else in the same page.

NB: Sadly get_per_cpu_var() can't also be used on the "left" side of a
    #define.
---
v4:
 * Introduce / use struct tss_page.

v3:
 * Drop the remark about CET.  It is no longer accurate in the latest version
   of the CET spec.

v2:
 * Rebase over changes to include __aligned() within
   DEFINE_PER_CPU_PAGE_ALIGNED().
 * Drop now-unused xen/percpu.h from setup.c.

--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -16,7 +16,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -100,8 +99,6 @@ unsigned long __read_mostly xen_phys_sta

 unsigned long __read_mostly xen_virt_end;

-DEFINE_PER_CPU(struct tss_struct, init_tss);
-
 char __section(".bss.stack_aligned") __aligned(STACK_SIZE)
     cpu0_stack[STACK_SIZE];

--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -108,6 +108,12 @@ idt_entry_t __section(".bss.page_aligned
 /* Pointer to the IDT of every CPU. */
 idt_entry_t *idt_tables[NR_CPUS] __read_mostly;

+/*
+ * The TSS is smaller than a page, but we give it a full page to avoid
+ * adjacent per-cpu data leaking via Meltdown when XPTI is in use.
+ */
+DEFINE_PER_CPU_PAGE_ALIGNED(struct tss_page, init_tss_page);
+
 bool (*ioemul_handle_quirk)(
     u8 opcode, char *io_emul_stub, struct cpu_user_regs *regs);

--- a/xen/arch/x86/xen.lds.S
+++ b/xen/arch/x86/xen.lds.S
@@ -368,6 +368,8 @@ ASSERT(IS_ALIGNED(__2M_rwdata_end, SEC
 ASSERT(IS_ALIGNED(cpu0_stack, STACK_SIZE), "cpu0_stack misaligned")

+ASSERT(IS_ALIGNED(per_cpu__init_tss_page, PAGE_SIZE), "per_cpu(init_tss) misaligned")
+
 ASSERT(IS_ALIGNED(__init_begin, PAGE_SIZE), "__init_begin misaligned")
 ASSERT(IS_ALIGNED(__init_end, PAGE_SIZE), "__init_end misaligned")

--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -411,7 +411,7 @@ static always_inline void __mwait(unsign
 #define IOBMP_BYTES             8192
 #define IOBMP_INVALID_OFFSET    0x8000

-struct __packed __cacheline_aligned tss_struct {
+struct __packed tss_struct {
     uint32_t :32;
     uint64_t rsp0, rsp1, rsp2;
     uint64_t :64;
@@ -425,6 +425,11 @@ struct __packed __cacheline_aligned tss_
     /* Pads the TSS to be cacheline-aligned (total size is 0x80). */
     uint8_t __cacheline_filler[24];
 };
+struct tss_page {
+    struct tss_struct __aligned(PAGE_SIZE) tss;
+};
+DECLARE_PER_CPU(struct tss_page, init_tss_page);
+#define per_cpu__init_tss get_per_cpu_var(init_tss_page.tss)

 #define IST_NONE 0UL
 #define IST_DF   1UL
@@ -463,7 +468,6 @@ static inline void disable_each_ist(idt_
 extern idt_entry_t idt_table[];
 extern idt_entry_t *idt_tables[];

-DECLARE_PER_CPU(struct tss_struct, init_tss);
 DECLARE_PER_CPU(root_pgentry_t *, root_pgt);

 extern void write_ptbase(struct vcpu *v);
--- a/xen/arch/arm/xen.lds.S
+++ b/xen/arch/arm/xen.lds.S
@@ -200,14 +200,16 @@ SECTIONS
        *(.bss.stack_aligned)
        . = ALIGN(PAGE_SIZE);
        *(.bss.page_aligned)
-       *(.bss)
-       . = ALIGN(SMP_CACHE_BYTES);
+       . = ALIGN(PAGE_SIZE);
        __per_cpu_start = .;
+       *(.bss.percpu.page_aligned)
        *(.bss.percpu)
        . = ALIGN(SMP_CACHE_BYTES);
        *(.bss.percpu.read_mostly)
        . = ALIGN(SMP_CACHE_BYTES);
        __per_cpu_data_end = .;
+       *(.bss)
+       . = ALIGN(POINTER_ALIGN);
        __bss_end = .;
    } :text
    _end = . ;
--- a/xen/arch/x86/xen.lds.S
+++ b/xen/arch/x86/xen.lds.S
@@ -292,14 +292,16 @@ SECTIONS
        __bss_start = .;
        *(.bss.stack_aligned)
        *(.bss.page_aligned*)
-       *(.bss)
-       . = ALIGN(SMP_CACHE_BYTES);
+       . = ALIGN(PAGE_SIZE);
        __per_cpu_start = .;
+       *(.bss.percpu.page_aligned)
        *(.bss.percpu)
        . = ALIGN(SMP_CACHE_BYTES);
        *(.bss.percpu.read_mostly)
        . = ALIGN(SMP_CACHE_BYTES);
        __per_cpu_data_end = .;
+       *(.bss)
+       . = ALIGN(POINTER_ALIGN);
        __bss_end = .;
    } :text
    _end = . ;
--- a/xen/include/asm-arm/percpu.h
+++ b/xen/include/asm-arm/percpu.h
@@ -10,10 +10,8 @@ extern char __per_cpu_start[], __per_cpu
 extern unsigned long __per_cpu_offset[NR_CPUS];
 void percpu_init_areas(void);

-/* Separate out the type, so (int[3], foo) works. */
-#define __DEFINE_PER_CPU(type, name, suffix) \
-    __section(".bss.percpu" #suffix)         \
-    __typeof__(type) per_cpu_##name
+#define __DEFINE_PER_CPU(attr, type, name) \
+    attr __typeof__(type) per_cpu_ ## name

 #define per_cpu(var, cpu) \
     (*RELOC_HIDE(&per_cpu__##var, __per_cpu_offset[cpu]))
--- a/xen/include/asm-x86/percpu.h
+++ b/xen/include/asm-x86/percpu.h
@@ -7,10 +7,8 @@ extern unsigned long __per_cpu_offset[NR
 void percpu_init_areas(void);
 #endif

-/* Separate out the type, so (int[3], foo) works. */
-#define __DEFINE_PER_CPU(type, name, suffix) \
-    __section(".bss.percpu" #suffix)         \
-    __typeof__(type) per_cpu_##name
+#define __DEFINE_PER_CPU(attr, type, name) \
+    attr __typeof__(type) per_cpu_ ## name

 /* var is in discarded region: offset to particular copy we want */
 #define per_cpu(var, cpu)  \
--- a/xen/include/xen/percpu.h
+++ b/xen/include/xen/percpu.h
@@ -9,9 +9,17 @@
  * The _##name concatenation is being used here to prevent 'name' from getting
  * macro expanded, while still allowing a per-architecture symbol name prefix.
  */
-#define DEFINE_PER_CPU(type, name) __DEFINE_PER_CPU(type, _##name, )
+#define DEFINE_PER_CPU(type, name) \
+    __DEFINE_PER_CPU(__section(".bss.percpu"), type, _ ## name)
+
+#define DEFINE_PER_CPU_PAGE_ALIGNED(type, name) \
+    typedef char name ## _chk_t[BUILD_BUG_ON_ZERO(__alignof(type) & \
+                                                  (PAGE_SIZE - 1))]; \
+    __DEFINE_PER_CPU(__section(".bss.percpu.page_aligned"), \
+                     type, _ ## name)
+
 #define DEFINE_PER_CPU_READ_MOSTLY(type, name) \
-    __DEFINE_PER_CPU(type, _##name, .read_mostly)
+    __DEFINE_PER_CPU(__section(".bss.percpu.read_mostly"), type, _ ## name)

 #define get_per_cpu_var(var)  (per_cpu__##var)
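
For reference, below is a minimal standalone sketch of the macro pattern the
series relies on: the caller of __DEFINE_PER_CPU() supplies the section
attribute, while DEFINE_PER_CPU_PAGE_ALIGNED() only verifies (via a
BUILD_BUG_ON_ZERO-style check) that the wrapped type is already page aligned,
as struct tss_page is by virtue of its __aligned(PAGE_SIZE) member.  This is
not part of the patches; it is a hosted userspace approximation using GCC
extensions, the demo_* names and the 104-byte TSS stand-in are illustrative
only, and Xen's __section() shorthand is spelled out as a raw attribute.

/* Standalone illustration only - not part of this series. */
#include <stdio.h>
#include <stddef.h>

#define PAGE_SIZE 4096

/*
 * Approximation of Xen's BUILD_BUG_ON_ZERO(): evaluates to 0 when 'cond' is
 * false, and triggers a negative-width-bitfield compile error otherwise.
 */
#define BUILD_BUG_ON_ZERO(cond) sizeof(struct { int : -!!(cond); })

/* The caller passes in all attributes, as after patch 1. */
#define __DEFINE_PER_CPU(attr, type, name) \
    attr __typeof__(type) per_cpu_ ## name

#define DEFINE_PER_CPU_PAGE_ALIGNED(type, name)                             \
    typedef char name ## _chk_t[BUILD_BUG_ON_ZERO(__alignof(type) &         \
                                                  (PAGE_SIZE - 1))];        \
    __DEFINE_PER_CPU(__attribute__((__section__(".bss.percpu.page_aligned"))), \
                     type, _ ## name)

/*
 * Stand-in for struct tss_page: the page alignment comes from the type
 * itself, not from the DEFINE_* macro, so the object cannot straddle a page.
 */
struct demo_tss_page {
    unsigned char tss[104] __attribute__((__aligned__(PAGE_SIZE)));
};

DEFINE_PER_CPU_PAGE_ALIGNED(struct demo_tss_page, demo_tss_page);

int main(void)
{
    printf("alignof(struct demo_tss_page) = %zu\n",
           (size_t)__alignof__(struct demo_tss_page));
    printf("&per_cpu__demo_tss_page %% PAGE_SIZE = %lu\n",
           (unsigned long)&per_cpu__demo_tss_page % PAGE_SIZE);
    return 0;
}

Built with gcc, this prints an alignment of 4096 and a zero remainder;
dropping the __aligned(PAGE_SIZE) from the member instead makes the
BUILD_BUG_ON_ZERO() check fail at compile time, which is the property
DEFINE_PER_CPU_PAGE_ALIGNED() is meant to enforce for init_tss_page.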