From patchwork Thu Mar 3 07:18:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 12767139
From: Zhenzhong Duan To: kvm@vger.kernel.org Cc: pbonzini@redhat.com, seanjc@google.com, yu.c.zhang@intel.com, zixuanwang@google.com, marcorr@google.com, jun.nakajima@intel.com, erdemaktas@google.com Subject: [kvm-unit-tests RFC PATCH 01/17] x86 TDX: Add support functions for TDX framework Date: Thu, 3 Mar 2022 15:18:51 +0800 Message-Id: <20220303071907.650203-2-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220303071907.650203-1-zhenzhong.duan@intel.com> References: <20220303071907.650203-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org

Port tdcall.S and tdx.c from the arch/x86/kernel directory of the TDX guest kernel source, simplified to keep only the code that is useful for the TDX kvm-unit-tests framework.

tdcall.S contains two low-level ABI functions: __tdx_module_call and __tdx_hypercall.

lib/x86/tdx.c contains wrapper functions that simulate various instructions through tdvmcall. Currently the following instructions are simulated:

  IO read/write
  MSR read/write
  cpuid
  hlt

Define a dummy is_tdx_guest() when TARGET_EFI is undefined, as this function will be used globally in the future.
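Note for reviewers, not part of the patch: the wrapper flow described above can be sketched on any host as follows. This is only an illustration under stated assumptions. mock_tdx_hypercall() and sketch_io_in() are hypothetical names, the mock replaces the real __tdx_hypercall() (which issues TDCALL), and the mock VMM behaviour (port 0x3fd reads back 0x20, every other access fails) is invented for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define EXIT_REASON_IO_INSTRUCTION 30	/* same value as in lib/x86/tdx.h */

/* Same field layout as struct tdx_hypercall_output in this patch. */
struct hypercall_output {
	uint64_t r10, r11, r12, r13, r14, r15;
};

/*
 * Hypothetical stand-in for __tdx_hypercall(). It pretends to be a VMM
 * that implements a single readable port (0x3fd, returning 0x20) and
 * rejects every other request by setting a non-zero r10.
 */
static uint64_t mock_tdx_hypercall(uint64_t type, uint64_t fn, uint64_t r12,
				   uint64_t r13, uint64_t r14, uint64_t r15,
				   struct hypercall_output *out)
{
	(void)type;
	(void)r15;
	assert(out != NULL);
	assert(fn == EXIT_REASON_IO_INSTRUCTION);
	assert(r12 >= 1 && r12 <= 4);	/* r12: access size in bytes */

	out->r10 = 1;			/* default: VMM reports an error */
	out->r11 = 0;
	if (r13 == 0 && r14 == 0x3fd) {	/* r13: 0 = read, r14: port */
		out->r10 = 0;
		out->r11 = 0x20;
	}
	return 0;			/* the TDCALL itself succeeded */
}

/* Mirrors tdx_io_in(): a failed read yields all ones. */
static unsigned int sketch_io_in(int size, int port)
{
	struct hypercall_output out;

	mock_tdx_hypercall(0, EXIT_REASON_IO_INSTRUCTION, size, 0, port, 0,
			   &out);

	return out.r10 ? UINT_MAX : (unsigned int)out.r11;
}
```

With this mock, sketch_io_in(1, 0x3fd) yields 0x20 and a read from any other port yields UINT_MAX, which is the same error convention tdx_io_in() uses.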
Signed-off-by: Zhenzhong Duan Reviewed-by: Yu Zhang --- lib/x86/asm/setup.h | 1 + lib/x86/setup.c | 10 ++ lib/x86/tdcall.S | 303 ++++++++++++++++++++++++++++++++++++++++++++ lib/x86/tdx.c | 276 ++++++++++++++++++++++++++++++++++++++++ lib/x86/tdx.h | 76 +++++++++++ x86/Makefile.common | 2 + 6 files changed, 668 insertions(+) create mode 100644 lib/x86/tdcall.S create mode 100644 lib/x86/tdx.c create mode 100644 lib/x86/tdx.h diff --git a/lib/x86/asm/setup.h b/lib/x86/asm/setup.h index dbfb2a22bc1b..c467a2e94861 100644 --- a/lib/x86/asm/setup.h +++ b/lib/x86/asm/setup.h @@ -15,5 +15,6 @@ unsigned long setup_tss(u8 *stacktop); efi_status_t setup_efi(efi_bootinfo_t *efi_bootinfo); void setup_5level_page_table(void); #endif /* TARGET_EFI */ +#include "x86/tdx.h" #endif /* _X86_ASM_SETUP_H_ */ diff --git a/lib/x86/setup.c b/lib/x86/setup.c index bbd34682b79e..fbcd188ebb8f 100644 --- a/lib/x86/setup.c +++ b/lib/x86/setup.c @@ -283,6 +283,16 @@ efi_status_t setup_efi(efi_bootinfo_t *efi_bootinfo) efi_status_t status; const char *phase; + /* + * TDVF support partial memory accept, accept remaining memory + * early so memory allocator can use it. 
+ */ + status = setup_tdx(); + if (status != EFI_SUCCESS && status != EFI_UNSUPPORTED) { + printf("INTEL TDX setup failed, error = 0x%lx\n", status); + return status; + } + status = setup_memory_allocator(efi_bootinfo); if (status != EFI_SUCCESS) { printf("Failed to set up memory allocator: "); diff --git a/lib/x86/tdcall.S b/lib/x86/tdcall.S new file mode 100644 index 000000000000..89133d211376 --- /dev/null +++ b/lib/x86/tdcall.S @@ -0,0 +1,303 @@ +/* + * Low level API for tdcall and tdvmcall + * + * Copyright (c) 2022, Intel Inc + * + * Authors: + * Zhenzhong Duan + * + * SPDX-License-Identifier: GPL-2.0 + */ + +#include + +#define ARG7_SP_OFFSET 0x08 + +#define TDX_MODULE_rcx 0x0 +#define TDX_MODULE_rdx 0x8 +#define TDX_MODULE_r8 0x10 +#define TDX_MODULE_r9 0x18 +#define TDX_MODULE_r10 0x20 +#define TDX_MODULE_r11 0x28 + +#define TDX_HYPERCALL_r10 0x0 +#define TDX_HYPERCALL_r11 0x8 +#define TDX_HYPERCALL_r12 0x10 +#define TDX_HYPERCALL_r13 0x18 +#define TDX_HYPERCALL_r14 0x20 +#define TDX_HYPERCALL_r15 0x28 + +/* + * Expose registers R10-R15 to VMM. It is passed via RCX register + * to the TDX Module, which will be used by the TDX module to + * identify the list of registers exposed to VMM. Each bit in this + * mask represents a register ID. Bit field details can be found + * in TDX GHCI specification. + */ +#define TDVMCALL_EXPOSE_REGS_MASK 0xfc00 + +/* + * TDX guests use the TDCALL instruction to make requests to the + * TDX module and hypercalls to the VMM. It is supported in + * Binutils >= 2.36. + */ +#define tdcall .byte 0x66,0x0f,0x01,0xcc + +/* HLT TDVMCALL sub-function ID */ +#define EXIT_REASON_HLT 12 + +/* + * __tdx_module_call() - Helper function used by TDX guests to request + * services from the TDX module (does not include VMM services). + * + * This function serves as a wrapper to move user call arguments to the + * correct registers as specified by TDCALL ABI and share it with the + * TDX module. 
If the TDCALL operation is successful and a valid + * "struct tdx_module_output" pointer is available (in "out" argument), + * output from the TDX module is saved to the memory specified in the + * "out" pointer. Also the status of the TDCALL operation is returned + * back to the user as a function return value. + * + *------------------------------------------------------------------------- + * TDCALL ABI: + *------------------------------------------------------------------------- + * Input Registers: + * + * RAX - TDCALL Leaf number. + * RCX,RDX,R8-R9 - TDCALL Leaf specific input registers. + * + * Output Registers: + * + * RAX - TDCALL instruction error code. + * RCX,RDX,R8-R11 - TDCALL Leaf specific output registers. + * + *------------------------------------------------------------------------- + * + * __tdx_module_call() function ABI: + * + * @fn (RDI) - TDCALL Leaf ID, moved to RAX + * @rcx (RSI) - Input parameter 1, moved to RCX + * @rdx (RDX) - Input parameter 2, moved to RDX + * @r8 (RCX) - Input parameter 3, moved to R8 + * @r9 (R8) - Input parameter 4, moved to R9 + * + * @out (R9) - struct tdx_module_output pointer + * stored temporarily in R12 (not + * shared with the TDX module). It + * can be NULL. + * + * Return status of TDCALL via RAX. + */ +.global __tdx_module_call +__tdx_module_call: + /* + * R12 will be used as temporary storage for + * struct tdx_module_output pointer. More + * details about struct tdx_module_output can + * be found in arch/x86/include/asm/tdx.h. Also + * note that registers R12-R15 are not used by + * TDCALL services supported by this helper + * function. + */ + + /* Callee saved, so preserve it */ + push %r12 + + /* + * Push output pointer to stack, after TDCALL operation, + * it will be fetched into R12 register. 
+ */ + push %r9 + + /* Mangle function call ABI into TDCALL ABI: */ + /* Move TDCALL Leaf ID to RAX */ + mov %rdi, %rax + /* Move input 4 to R9 */ + mov %r8, %r9 + /* Move input 3 to R8 */ + mov %rcx, %r8 + /* Move input 1 to RCX */ + mov %rsi, %rcx + /* Leave input param 2 in RDX */ + + tdcall + + /* Fetch output pointer from stack to R12 */ + pop %r12 + + /* Check for TDCALL success: 0 - Successful, otherwise failed */ + test %rax, %rax + jnz 1f + + /* + * __tdx_module_call() can be initiated without an output pointer. + * So, check if caller provided an output struct before storing + * output registers. + */ + test %r12, %r12 + jz 1f + + /* Copy TDCALL result registers to output struct: */ + movq %rcx, TDX_MODULE_rcx(%r12) + movq %rdx, TDX_MODULE_rdx(%r12) + movq %r8, TDX_MODULE_r8(%r12) + movq %r9, TDX_MODULE_r9(%r12) + movq %r10, TDX_MODULE_r10(%r12) + movq %r11, TDX_MODULE_r11(%r12) +1: + /* Restore the state of R12 register */ + pop %r12 + ret + +/* + * __tdx_hypercall() - Helper function used by TDX guests to request + * services from the VMM. All requests are made via the TDX module + * using TDCALL instruction. + * + * This function serves as a wrapper to move user call arguments to the + * correct registers as specified by TDCALL ABI and share it with VMM + * via the TDX module. After TDCALL operation, output from the VMM is + * saved to the memory specified in the "out" (struct tdx_hypercall_output) + * pointer. + * + *------------------------------------------------------------------------- + * TD VMCALL ABI: + *------------------------------------------------------------------------- + * + * Input Registers: + * + * RAX - TDCALL instruction leaf number (0 - TDG.VP.VMCALL) + * RCX - BITMAP which controls which part of TD Guest GPR + * is passed as-is to VMM and back. + * R10 - Set 0 to indicate TDCALL follows standard TDX ABI + * specification. Non zero value indicates vendor + * specific ABI. 
+ * R11 - VMCALL sub function number + * RBX, RBP, RDI, RSI - Used to pass VMCALL sub function specific arguments. + * R8-R9, R12-R15 - Same as above. + * + * Output Registers: + * + * RAX - TDCALL instruction status (Not related to hypercall + * output). + * R10 - Hypercall output error code. + * R11-R15 - Hypercall sub function specific output values. + * + *------------------------------------------------------------------------- + * + * __tdx_hypercall() function ABI: + * + * @type (RDI) - TD VMCALL type, moved to R10 + * @fn (RSI) - TD VMCALL sub function, moved to R11 + * @r12 (RDX) - Input parameter 1, moved to R12 + * @r13 (RCX) - Input parameter 2, moved to R13 + * @r14 (R8) - Input parameter 3, moved to R14 + * @r15 (R9) - Input parameter 4, moved to R15 + * + * @out (stack) - struct tdx_hypercall_output pointer (cannot be NULL) + * + * On successful completion, return TDCALL status or -EINVAL for invalid + * inputs. + */ +.globl __tdx_hypercall +__tdx_hypercall: + /* Move argument 7 from caller stack to RAX */ + movq ARG7_SP_OFFSET(%rsp), %rax + + /* Check if caller provided an output struct */ + test %rax, %rax + /* If out pointer is NULL, return -EINVAL */ + jz 1f + + /* Save callee-saved GPRs as mandated by the x86_64 ABI */ + push %r15 + push %r14 + push %r13 + push %r12 + + /* + * Save output pointer (rax) in stack, it will be used + * again when storing the output registers after TDCALL + * operation. 
+ */ + push %rax + + /* Mangle function call ABI into TDCALL ABI: */ + /* Set TDCALL leaf ID (TDVMCALL (0)) in RAX */ + xor %eax, %eax + /* Move TDVMCALL type (standard vs vendor) in R10 */ + mov %rdi, %r10 + /* Move TDVMCALL sub function id to R11 */ + mov %rsi, %r11 + /* Move input 1 to R12 */ + mov %rdx, %r12 + /* Move input 2 to R13 */ + mov %rcx, %r13 + /* Move input 3 to R14 */ + mov %r8, %r14 + /* Move input 4 to R15 */ + mov %r9, %r15 + + movl $TDVMCALL_EXPOSE_REGS_MASK, %ecx + + /* + * For the idle loop STI needs to be called directly before + * the TDCALL that enters idle (EXIT_REASON_HLT case). STI + * enables interrupts only one instruction later. If there + * are any instructions between the STI and the TDCALL for + * HLT then an interrupt could happen in that time, but the + * code would go back to sleep afterwards, which can cause + * longer delays. + * + * This leads to significant difference in network performance + * benchmarks. So add a special case for EXIT_REASON_HLT to + * trigger STI before TDCALL. But this change is not required + * for all HLT cases. So use R15 register value to identify the + * case which needs STI. So, if R11 is EXIT_REASON_HLT and R15 + * is 1, then call STI before TDCALL instruction. Note that R15 + * register is not required by TDCALL ABI when triggering the + * hypercall for EXIT_REASON_HLT case. So use it in software to + * select the STI case. 
+ */ + cmpl $EXIT_REASON_HLT, %r11d + jne skip_sti + cmpl $1, %r15d + jne skip_sti + /* Set R15 register to 0, it is unused in EXIT_REASON_HLT case */ + xor %r15, %r15 + sti +skip_sti: + tdcall + + /* Restore output pointer to R9 */ + pop %r9 + + /* Copy hypercall result registers to output struct: */ + movq %r10, TDX_HYPERCALL_r10(%r9) + movq %r11, TDX_HYPERCALL_r11(%r9) + movq %r12, TDX_HYPERCALL_r12(%r9) + movq %r13, TDX_HYPERCALL_r13(%r9) + movq %r14, TDX_HYPERCALL_r14(%r9) + movq %r15, TDX_HYPERCALL_r15(%r9) + + /* + * Zero out registers exposed to the VMM to avoid + * speculative execution with VMM-controlled values. + * This needs to include all registers present in + * TDVMCALL_EXPOSE_REGS_MASK (except R12-R15). + * R12-R15 context will be restored. + */ + xor %r10d, %r10d + xor %r11d, %r11d + + /* Restore callee-saved GPRs as mandated by the x86_64 ABI */ + pop %r12 + pop %r13 + pop %r14 + pop %r15 + + jmp 2f +1: + movq $-EINVAL, %rax +2: + retq diff --git a/lib/x86/tdx.c b/lib/x86/tdx.c new file mode 100644 index 000000000000..8308480105d6 --- /dev/null +++ b/lib/x86/tdx.c @@ -0,0 +1,276 @@ +/* + * TDX library + * + * Copyright (c) 2022, Intel Inc + * + * Authors: + * Zhenzhong Duan + * + * SPDX-License-Identifier: GPL-2.0 + */ + +#include "tdx.h" +#include "bitops.h" +#include "x86/processor.h" +#include "x86/smp.h" + +#define VE_IS_IO_OUT(exit_qual) (((exit_qual) & 8) ? 0 : 1) +#define VE_GET_IO_SIZE(exit_qual) (((exit_qual) & 7) + 1) +#define VE_GET_PORT_NUM(exit_qual) ((exit_qual) >> 16) +#define VE_IS_IO_STRING(exit_qual) ((exit_qual) & 16 ? 1 : 0) + +#define BUFSZ 2000 +#define serial_iobase 0x3f8 + +static struct spinlock tdx_puts_lock; + +/* + * Helper function used for making hypercall for "in" + * instruction. If IO is failed, it will return all 1s. 
+ */ +static inline unsigned int tdx_io_in(int size, int port) +{ + struct tdx_hypercall_output out; + + __tdx_hypercall(TDX_HYPERCALL_STANDARD, EXIT_REASON_IO_INSTRUCTION, + size, 0, port, 0, &out); + + return out.r10 ? UINT_MAX : out.r11; +} + +/* + * Helper function used for making hypercall for "out" + * instruction. + */ +static inline void tdx_io_out(int size, int port, u64 value) +{ + struct tdx_hypercall_output out; + + __tdx_hypercall(TDX_HYPERCALL_STANDARD, EXIT_REASON_IO_INSTRUCTION, + size, 1, port, value, &out); +} + +static void tdx_outb(u8 value, u32 port) +{ + tdx_io_out(sizeof(u8), port, value); +} + +static u8 tdx_inb(u32 port) +{ + return tdx_io_in(sizeof(u8), port); +} + +static void tdx_serial_outb(char ch) +{ + u8 lsr; + + do { + lsr = tdx_inb(serial_iobase + 0x05); + } while (!(lsr & 0x20)); + + tdx_outb(ch, serial_iobase + 0x00); +} + +static void tdx_puts(const char *buf) +{ + unsigned long len = strlen(buf); + unsigned long i; + + spin_lock(&tdx_puts_lock); + + /* No need to initialize serial port as TDVF has done that */ + for (i = 0; i < len; i++) + tdx_serial_outb(buf[i]); + + spin_unlock(&tdx_puts_lock); +} + +/* Used only in TDX arch code itself */ +static int tdx_printf(const char *fmt, ...) +{ + va_list va; + char buf[BUFSZ]; + int r; + + va_start(va, fmt); + r = vsnprintf(buf, sizeof(buf), fmt, va); + va_end(va); + tdx_puts(buf); + return r; +} + +bool is_tdx_guest(void) +{ + static int tdx_guest = -1; + struct cpuid c; + u32 sig[3]; + + if (tdx_guest >= 0) + goto done; + + if (cpuid(0).a < TDX_CPUID_LEAF_ID) { + tdx_guest = 0; + goto done; + } + + c = cpuid(TDX_CPUID_LEAF_ID); + sig[0] = c.b; + sig[1] = c.d; + sig[2] = c.c; + + tdx_guest = !memcmp("IntelTDX    ", sig, 12); + +done: + return !!tdx_guest; +} + +/* + * Wrapper for standard use of __tdx_hypercall with BUG_ON() check + * for TDCALL error.
+ */ +static inline u64 _tdx_hypercall(u64 fn, u64 r12, u64 r13, u64 r14, + u64 r15, struct tdx_hypercall_output *out) +{ + struct tdx_hypercall_output outl; + u64 err; + + /* __tdx_hypercall() does not accept NULL output pointer */ + if (!out) + out = &outl; + + err = __tdx_hypercall(TDX_HYPERCALL_STANDARD, fn, r12, r13, r14, + r15, out); + + /* Non zero return value indicates buggy TDX module, so panic */ + BUG_ON(err); + + if (out->r10) + tdx_printf("_tdx_hypercall err %lx %lx %lx %lx %lx %lx\n", + out->r10, out->r11, out->r12, out->r13, + out->r14, out->r15); + return out->r10; +} + +static bool _tdx_halt(const bool irq_disabled, const bool do_sti) +{ + u64 ret; + + /* + * Emulate HLT operation via hypercall. More info about ABI + * can be found in TDX Guest-Host-Communication Interface + * (GHCI), sec 3.8 TDG.VP.VMCALL. + * + * The VMM uses the "IRQ disabled" param to understand IRQ + * enabled status (RFLAGS.IF) of TD guest and determine + * whether or not it should schedule the halted vCPU if an + * IRQ becomes pending. E.g. if IRQs are disabled the VMM + * can keep the vCPU in virtual HLT, even if an IRQ is + * pending, without hanging/breaking the guest. + * + * do_sti parameter is used by __tdx_hypercall() to decide + * whether to call STI instruction before executing TDCALL + * instruction. + */ + ret = _tdx_hypercall(EXIT_REASON_HLT, irq_disabled, 0, 0, + do_sti, NULL); + return !ret; +} + +static bool tdx_read_msr(unsigned int msr, u64 *val) +{ + struct tdx_hypercall_output out; + u64 ret; + + /* + * Emulate the MSR read via hypercall. More info about ABI + * can be found in TDX Guest-Host-Communication Interface + * (GHCI), sec titled "TDG.VP.VMCALL". + */ + ret = _tdx_hypercall(EXIT_REASON_MSR_READ, msr, 0, 0, 0, &out); + + if (ret) + return false; + + *val = out.r11; + return true; +} + +static bool tdx_write_msr(unsigned int msr, unsigned int low, + unsigned int high) +{ + u64 ret; + + /* + * Emulate the MSR write via hypercall. 
More info about ABI + * can be found in TDX Guest-Host-Communication Interface + * (GHCI) sec titled "TDG.VP.VMCALL". + */ + ret = _tdx_hypercall(EXIT_REASON_MSR_WRITE, msr, + (u64)high << 32 | low, 0, 0, NULL); + + return !ret; +} + +static bool tdx_handle_cpuid(struct ex_regs *regs) +{ + struct tdx_hypercall_output out; + + /* + * Emulate CPUID instruction via hypercall. More info about + * ABI can be found in TDX Guest-Host-Communication Interface + * (GHCI), section titled "TDG.VP.VMCALL". + */ + if (_tdx_hypercall(EXIT_REASON_CPUID, regs->rax, regs->rcx, + 0, 0, &out)) + return false; + + /* + * As per TDX GHCI CPUID ABI, the r12-r15 registers contain the + * contents of the EAX, EBX, ECX and EDX registers after CPUID + * instruction execution. So copy the register contents back to + * ex_regs. + */ + regs->rax = out.r12; + regs->rbx = out.r13; + regs->rcx = out.r14; + regs->rdx = out.r15; + + return true; +} + +static bool tdx_handle_io(struct ex_regs *regs, u32 exit_qual) +{ + struct tdx_hypercall_output outh; + int out, size, port, ret; + bool string; + u64 mask; + + string = VE_IS_IO_STRING(exit_qual); + + /* I/O strings ops are unrolled at build time. */ + if (string) { + tdx_printf("string io isn't supported in #VE currently.\n"); + return false; + } + + out = VE_IS_IO_OUT(exit_qual); + size = VE_GET_IO_SIZE(exit_qual); + port = VE_GET_PORT_NUM(exit_qual); + /* GENMASK(h, l) is inclusive of bit h, so the top bit is 8 * size - 1 */ + mask = GENMASK(8 * size - 1, 0); + + ret = _tdx_hypercall(EXIT_REASON_IO_INSTRUCTION, + size, out, port, regs->rax, &outh); + if (!out) { + regs->rax &= ~mask; + regs->rax |= (ret ? UINT_MAX : outh.r11) & mask; + } + + return ret ?
false : true; +} + +efi_status_t setup_tdx(void) +{ + if (!is_tdx_guest()) + return EFI_UNSUPPORTED; + + return EFI_SUCCESS; +} diff --git a/lib/x86/tdx.h b/lib/x86/tdx.h new file mode 100644 index 000000000000..92ae5277b04d --- /dev/null +++ b/lib/x86/tdx.h @@ -0,0 +1,76 @@ +/* + * TDX library + * + * Copyright (c) 2022, Intel Inc + * + * Authors: + * Zhenzhong Duan + * + * SPDX-License-Identifier: GPL-2.0 + */ + +#ifndef _ASM_X86_TDX_H +#define _ASM_X86_TDX_H + +#ifdef TARGET_EFI + +#include "libcflat.h" +#include "limits.h" +#include "efi.h" + +#define BUG_ON(condition) do { if (condition) abort(); } while (0) + +#define TDX_CPUID_LEAF_ID 0x21 +#define TDX_HYPERCALL_STANDARD 0 + +#define EXIT_REASON_CPUID 10 +#define EXIT_REASON_HLT 12 +#define EXIT_REASON_IO_INSTRUCTION 30 +#define EXIT_REASON_MSR_READ 31 +#define EXIT_REASON_MSR_WRITE 32 + +/* + * Used in __tdx_module_call() helper function to gather the + * output registers' values of TDCALL instruction when requesting + * services from the TDX module. This is software only structure + * and not related to TDX module/VMM. + */ +struct tdx_module_output { + u64 rcx; + u64 rdx; + u64 r8; + u64 r9; + u64 r10; + u64 r11; +}; + +/* + * Used in __tdx_hypercall() helper function to gather the + * output registers' values of TDCALL instruction when requesting + * services from the VMM. This is software only structure + * and not related to TDX module/VMM. 
+ */ +struct tdx_hypercall_output { + u64 r10; + u64 r11; + u64 r12; + u64 r13; + u64 r14; + u64 r15; +}; + +bool is_tdx_guest(void); +efi_status_t setup_tdx(void); + +/* Helper function used to communicate with the TDX module */ +u64 __tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9, + struct tdx_module_output *out); + +/* Helper function used to request services from VMM */ +u64 __tdx_hypercall(u64 type, u64 fn, u64 r12, u64 r13, u64 r14, + u64 r15, struct tdx_hypercall_output *out); +#else +inline bool is_tdx_guest(void) { return false; } +#endif /* TARGET_EFI */ + +#endif /* _ASM_X86_TDX_H */ diff --git a/x86/Makefile.common b/x86/Makefile.common index ff02d9822321..8e2970b1cfc4 100644 --- a/x86/Makefile.common +++ b/x86/Makefile.common @@ -26,6 +26,8 @@ ifeq ($(TARGET_EFI),y) cflatobjs += lib/x86/amd_sev.o cflatobjs += lib/efi.o cflatobjs += x86/efi/reloc_x86_64.o +cflatobjs += lib/x86/tdcall.o +cflatobjs += lib/x86/tdx.o endif OBJDIRS += lib/x86 From patchwork Thu Mar 3 07:18:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 12767140 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D229CC4332F for ; Thu, 3 Mar 2022 07:27:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230190AbiCCH2K (ORCPT ); Thu, 3 Mar 2022 02:28:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36438 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230179AbiCCH2H (ORCPT ); Thu, 3 Mar 2022 02:28:07 -0500 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4690916C4C1 for ; Wed, 2 Mar 2022 23:27:22 -0800 (PST) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1646292442; x=1677828442; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=JnGB8u57LksBwRNfjYZMEXUeN/HS+eeDw+eaG5RTyKk=; b=WhpZY/fxeHALOqqzLB2l3zIn33bNxHI0FAgLp7waSh/9CyR2egZYcH+k DVeuVnixmm3RAj6cW+YWTVnMXNOFArPW7Evq0e1b7n54HKJ5SzfNCQSCI asNc2/Mxufobde1pHq0Bgc507Qfy+gpsnXr9KPZSAr/jrQH+h2Bjfa2A6 FeJnf53NxJg+80O06MTORZLbWKWIcPLNozmNfUYNrgo2+NZdMDwhuhHIB omlJzNZEOSEwdWuHHwpck/3+Lz/8OemPwvnrzEp5a4EzEk+GvTHcmDcMD P/COYifjE5hGDUuta2RvYI/Cn57wdOYBOybomz9UzIVgqMHghsfa0aAbC A==; X-IronPort-AV: E=McAfee;i="6200,9189,10274"; a="251176938" X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="251176938" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:22 -0800 X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="551631491" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.123]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:19 -0800 From: Zhenzhong Duan To: kvm@vger.kernel.org Cc: pbonzini@redhat.com, seanjc@google.com, yu.c.zhang@intel.com, zixuanwang@google.com, marcorr@google.com, jun.nakajima@intel.com, erdemaktas@google.com Subject: [kvm-unit-tests RFC PATCH 02/17] x86 TDX: Add #VE handler Date: Thu, 3 Mar 2022 15:18:52 +0800 Message-Id: <20220303071907.650203-3-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220303071907.650203-1-zhenzhong.duan@intel.com> References: <20220303071907.650203-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org

Executing some instructions in a TD guest triggers a #VE, and those instructions are simulated in the #VE handler. Add such a handler; it currently supports simulation of IO and MSR read/write, cpuid and hlt instructions.
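Note for reviewers, not part of the patch: the handler's control flow for one exit reason can be sketched standalone as below. This is a simplified illustration, assuming trimmed-down register and #VE info structures. sketch_regs, sketch_ve_info, mock_read_msr() and sketch_handle_ve() are hypothetical names, and the mock MSR backend (MSR 0x10 returns a fixed value) replaces the TDVMCALL to the VMM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXIT_REASON_MSR_READ 31	/* same value as in lib/x86/tdx.h */

/* Hypothetical, trimmed-down stand-ins for struct ex_regs / struct ve_info. */
struct sketch_regs {
	uint64_t rax, rcx, rdx, rip;
};

struct sketch_ve_info {
	uint64_t exit_reason;
	uint32_t instr_len;
};

/* Hypothetical MSR backend: only MSR 0x10 is readable in this mock. */
static bool mock_read_msr(uint32_t msr, uint64_t *val)
{
	assert(val != NULL);
	if (msr != 0x10)
		return false;
	*val = 0x1122334455667788ULL;
	return true;
}

static bool sketch_handle_ve(struct sketch_regs *regs,
			     const struct sketch_ve_info *ve)
{
	uint64_t val = 0;
	bool ret;

	switch (ve->exit_reason) {
	case EXIT_REASON_MSR_READ:
		ret = mock_read_msr((uint32_t)regs->rcx, &val);
		if (ret) {
			regs->rax = (uint32_t)val;	/* low half to EAX */
			regs->rdx = val >> 32;		/* high half to EDX */
		}
		break;
	default:
		return false;	/* unexpected #VE: leave RIP untouched */
	}

	/* Skip the emulated instruction only after successful handling. */
	if (ret)
		regs->rip += ve->instr_len;

	return ret;
}
```

The point of the sketch is the ordering: the result registers are updated first, and RIP is advanced by the instruction length only when emulation succeeded, so a failed emulation leaves the faulting instruction pointer in place.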
Signed-off-by: Zhenzhong Duan Reviewed-by: Yu Zhang --- lib/x86/desc.c | 3 ++ lib/x86/processor.h | 1 + lib/x86/tdx.c | 87 +++++++++++++++++++++++++++++++++++++++++++++ lib/x86/tdx.h | 17 +++++++++ 4 files changed, 108 insertions(+) diff --git a/lib/x86/desc.c b/lib/x86/desc.c index c2eb16e91fa1..b35274e44a8d 100644 --- a/lib/x86/desc.c +++ b/lib/x86/desc.c @@ -112,6 +112,7 @@ const char* exception_mnemonic(int vector) case 17: return "#AC"; case 18: return "#MC"; case 19: return "#XM"; + case 20: return "#VE"; default: return "#??"; } } @@ -227,6 +228,7 @@ EX(mf, 16); EX_E(ac, 17); EX(mc, 18); EX(xm, 19); +EX(ve, 20); EX_E(cp, 21); asm (".pushsection .text \n\t" @@ -273,6 +275,7 @@ static void *idt_handlers[32] = { [17] = &ac_fault, [18] = &mc_fault, [19] = &xm_fault, + [20] = &ve_fault, [21] = &cp_fault, }; diff --git a/lib/x86/processor.h b/lib/x86/processor.h index 117032a4895c..865269fd3857 100644 --- a/lib/x86/processor.h +++ b/lib/x86/processor.h @@ -28,6 +28,7 @@ #define GP_VECTOR 13 #define PF_VECTOR 14 #define AC_VECTOR 17 +#define VE_VECTOR 20 #define CP_VECTOR 21 #define X86_CR0_PE 0x00000001 diff --git a/lib/x86/tdx.c b/lib/x86/tdx.c index 8308480105d6..42ab25f47e57 100644 --- a/lib/x86/tdx.c +++ b/lib/x86/tdx.c @@ -267,10 +267,97 @@ static bool tdx_handle_io(struct ex_regs *regs, u32 exit_qual) return ret ? false : true; } +static bool tdx_get_ve_info(struct ve_info *ve) +{ + struct tdx_module_output out; + u64 ret; + + if (!ve) + return false; + + /* + * NMIs and machine checks are suppressed. Before this point any + * #VE is fatal. After this point (TDGETVEINFO call), NMIs and + * additional #VEs are permitted (but it is expected not to + * happen unless kernel panics). 
+ */ + ret = __tdx_module_call(TDX_GET_VEINFO, 0, 0, 0, 0, &out); + if (ret) + return false; + + ve->exit_reason = out.rcx; + ve->exit_qual = out.rdx; + ve->gla = out.r8; + ve->gpa = out.r9; + ve->instr_len = out.r10 & UINT_MAX; + ve->instr_info = out.r10 >> 32; + + return true; +} + +static bool tdx_handle_virtualization_exception(struct ex_regs *regs, + struct ve_info *ve) +{ + bool ret = true; + u64 val = ~0ULL; + bool do_sti; + + switch (ve->exit_reason) { + case EXIT_REASON_HLT: + do_sti = !!(regs->rflags & X86_EFLAGS_IF); + /* Bypass failed hlt is better than hang */ + if (!_tdx_halt(!do_sti, do_sti)) + tdx_printf("HLT instruction emulation failed\n"); + break; + case EXIT_REASON_MSR_READ: + ret = tdx_read_msr(regs->rcx, &val); + if (ret) { + regs->rax = (u32)val; + regs->rdx = val >> 32; + } + break; + case EXIT_REASON_MSR_WRITE: + ret = tdx_write_msr(regs->rcx, regs->rax, regs->rdx); + break; + case EXIT_REASON_CPUID: + ret = tdx_handle_cpuid(regs); + break; + case EXIT_REASON_IO_INSTRUCTION: + ret = tdx_handle_io(regs, ve->exit_qual); + break; + default: + tdx_printf("Unexpected #VE: %ld\n", ve->exit_reason); + return false; + } + + /* After successful #VE handling, move the IP */ + if (ret) + regs->rip += ve->instr_len; + + return ret; +} + +/* #VE exception handler. 
*/ +static void tdx_handle_ve(struct ex_regs *regs) +{ + struct ve_info ve; + + if (!tdx_get_ve_info(&ve)) { + tdx_printf("tdx_get_ve_info failed\n"); + return; + } + + tdx_handle_virtualization_exception(regs, &ve); +} + efi_status_t setup_tdx(void) { if (!is_tdx_guest()) return EFI_UNSUPPORTED; + handle_exception(20, tdx_handle_ve); + + printf("Initialized TDX.\n"); + return EFI_SUCCESS; } diff --git a/lib/x86/tdx.h b/lib/x86/tdx.h index 92ae5277b04d..68ddc136d1d9 100644 --- a/lib/x86/tdx.h +++ b/lib/x86/tdx.h @@ -29,6 +29,9 @@ #define EXIT_REASON_MSR_READ 31 #define EXIT_REASON_MSR_WRITE 32 +/* TDX Module call Leaf IDs */ +#define TDX_GET_VEINFO 3 + /* * Used in __tdx_module_call() helper function to gather the * output registers' values of TDCALL instruction when requesting @@ -59,6 +62,20 @@ struct tdx_hypercall_output { u64 r15; }; +/* + * Used by #VE exception handler to gather the #VE exception + * info from the TDX module. This is software only structure + * and not related to TDX module/VMM. 
+ */ +struct ve_info { + u64 exit_reason; + u64 exit_qual; + u64 gla; /* Guest Linear (virtual) Address */ + u64 gpa; /* Guest Physical Address */ + u32 instr_len; + u32 instr_info; +}; + bool is_tdx_guest(void); efi_status_t setup_tdx(void); From patchwork Thu Mar 3 07:18:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 12767141 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E15FAC35272 for ; Thu, 3 Mar 2022 07:27:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230225AbiCCH2N (ORCPT ); Thu, 3 Mar 2022 02:28:13 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36542 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230187AbiCCH2L (ORCPT ); Thu, 3 Mar 2022 02:28:11 -0500 Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 32F533A70A for ; Wed, 2 Mar 2022 23:27:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1646292445; x=1677828445; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=C7MaZPwrtSUd56gxI3d1JMTZIsW8wktBIoCkdfZMe94=; b=A23kT/96ztvHNT2pAw3zbhzvspBAD3ik8Q0SYqvwzSBta8K3c3j8yYm8 jZmg4TxAbL9kfvCyFG4xZ8/zAdc0gUfzBNf5Vrq3gLhyf1rex0uphK1sf 8g5/9ONYRwURFMyzqSEtjEqgHr4IfcrSMDtOwkuPxp04Yj9Yt+I81gCLU +nVEmkArH47STnN7KjqjRWDbrutHT0eKwoYVi3otkedMz8Pac6p1lMDLO QNyCP39b0OdR9LVdOtssK0CypRjOvUABYa/ccQ0TTpAnZ7efXXHwEw4WS wEyHrXiawClKy7JdOge9nBcQ+4Gkf9go6BebvK21gHU8y49JMVfZZFHb8 Q==; X-IronPort-AV: E=McAfee;i="6200,9189,10274"; a="314317841" X-IronPort-AV: E=Sophos;i="5.90,151,1643702400";
d="scan'208";a="314317841" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:24 -0800 X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="551631510" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.123]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:22 -0800 From: Zhenzhong Duan To: kvm@vger.kernel.org Cc: pbonzini@redhat.com, seanjc@google.com, yu.c.zhang@intel.com, zixuanwang@google.com, marcorr@google.com, jun.nakajima@intel.com, erdemaktas@google.com Subject: [kvm-unit-tests RFC PATCH 03/17] x86 TDX: Bypass APIC and enable x2APIC directly Date: Thu, 3 Mar 2022 15:18:53 +0800 Message-Id: <20220303071907.650203-4-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220303071907.650203-1-zhenzhong.duan@intel.com> References: <20220303071907.650203-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org

According to the TDX Architecture Specification, section 9.8 "Interrupt Handling and APIC Virtualization":

1. Guest TDs must use virtualized x2APIC mode. xAPIC mode (using memory-mapped APIC access) is not allowed.
2. Guest TD attempts to RDMSR or WRMSR the IA32_APIC_BASE MSR cause a #VE to the guest TD. The guest TD cannot disable the APIC.

Bypass xAPIC initialization and enable x2APIC directly. Set the software enable bit during x2APIC initialization. Use the uid/apicid mapping to get the apicid in setup_tss().

Initially I enabled x2APIC early so that apic_id() could be used. However, that causes problems for multiprocessor support: reading APIC_ID on an AP triggers a #VE, which requires the gdt/tss/idt to be initialized early, so setup_gdt_tss() would have to run early.
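Note for reviewers, not part of the patch: the constants involved in the x2APIC enable path can be checked with a small standalone sketch. The SKETCH_* macros and sketch_* helpers are hypothetical names defined locally for illustration. The values themselves (x2APIC MSR base 0x800, SPIV at register offset 0xF0, software enable at bit 8, x2APIC enable at bit 10 of IA32_APIC_BASE) are the architectural ones as I understand them.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_X2APIC_MSR_BASE	0x800u		/* first x2APIC MSR */
#define APIC_SPIV		0xF0u		/* spurious vector register offset */
#define SKETCH_SPIV_SW_ENABLE	(1u << 8)	/* APIC software enable bit */
#define SKETCH_APICBASE_EXTD	(1u << 10)	/* x2APIC enable in IA32_APIC_BASE */

/* In x2APIC mode the register at xAPIC offset "reg" is MSR 0x800 + reg / 16. */
static uint32_t sketch_x2apic_msr(uint32_t reg)
{
	assert(reg % 16 == 0);	/* APIC registers are 16-byte aligned */
	return SKETCH_X2APIC_MSR_BASE + reg / 16;
}

/* Value written to SPIV: software enable bit plus the spurious vector. */
static uint32_t sketch_spiv_value(uint8_t vector)
{
	return SKETCH_SPIV_SW_ENABLE | vector;
}

/* Low 32 bits of IA32_APIC_BASE with x2APIC mode switched on. */
static uint32_t sketch_enable_x2apic_bit(uint32_t apicbase_lo)
{
	return apicbase_lo | SKETCH_APICBASE_EXTD;
}
```

Under these assumptions, the 0x1ff written by x2apic_write(APIC_SPIV, 0x1ff) decomposes into the software enable bit plus spurious vector 0xff, and the write goes to MSR 0x80f.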
Signed-off-by: Zhenzhong Duan Reviewed-by: Yu Zhang --- lib/x86/apic.c | 4 ++++ lib/x86/setup.c | 10 +++++++--- 2 files changed, 11 insertions(+), 3 deletions(-) diff --git a/lib/x86/apic.c b/lib/x86/apic.c index da8f30134b22..84bfe98c58ff 100644 --- a/lib/x86/apic.c +++ b/lib/x86/apic.c @@ -147,6 +147,10 @@ int enable_x2apic(void) asm ("rdmsr" : "=a"(a), "=d"(d) : "c"(MSR_IA32_APICBASE)); a |= 1 << 10; asm ("wrmsr" : : "a"(a), "d"(d), "c"(MSR_IA32_APICBASE)); + + /* software APIC enabled bit is cleared after reset in TD-guest */ + x2apic_write(APIC_SPIV, 0x1ff); + apic_ops = &x2apic_ops; return 1; } else { diff --git a/lib/x86/setup.c b/lib/x86/setup.c index fbcd188ebb8f..e834fdfd290c 100644 --- a/lib/x86/setup.c +++ b/lib/x86/setup.c @@ -108,8 +108,9 @@ unsigned long setup_tss(u8 *stacktop) { u32 id; tss64_t *tss_entry; + static u32 cpus = 0; - id = apic_id(); + id = is_tdx_guest() ? id_map[cpus++] : apic_id(); /* Runtime address of current TSS */ tss_entry = &tss[id]; @@ -327,12 +328,15 @@ efi_status_t setup_efi(efi_bootinfo_t *efi_bootinfo) return status; } - reset_apic(); + /* xAPIC mode isn't allowed in TDX */ + if (!is_tdx_guest()) + reset_apic(); setup_gdt_tss(); setup_idt(); load_idt(); mask_pic_interrupts(); - enable_apic(); + if (!is_tdx_guest()) + enable_apic(); enable_x2apic(); smp_init(); setup_page_table(); From patchwork Thu Mar 3 07:18:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 12767142 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DDA0CC433EF for ; Thu, 3 Mar 2022 07:27:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230220AbiCCH2Q (ORCPT ); Thu, 3 Mar 2022 02:28:16 -0500 Received: from lindbergh.monkeyblade.net 
([23.128.96.19]:36628 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230226AbiCCH2N (ORCPT ); Thu, 3 Mar 2022 02:28:13 -0500 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3C20410DC for ; Wed, 2 Mar 2022 23:27:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1646292448; x=1677828448; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=q6Bijq9RqPmvXm9peANhjkv/SEQQ8PvTpgmj+xKkNM4=; b=goreraHv35Sw0GHWgMxokH2dcG/75Q4/kRex1K0xlFE/TffAAuMIP9fF /3wIbBoZY+6A+49YLYSkxHcHSRrd3S0HfWE6ynvLM8SU0Rl0V/IHlHT2m FXJSsyp5hXQvzqOiQWH9kjDnIsu/08kF7Uot/dzFfrqxHN7Wd7DStcVcy 5ua77q93I2j9DMgjN4T3TzDbF14bF0Umi1/oakgr9TEqybZEsUKUcjyR3 9pOEFS2Ej7GSAFScP/n+y1IRjchuemazaRxcb23zbE6gZagIxHPAFY3ZU dBg4Z7jQ9a5eMN0kiG4rhRG5Q5DwXp/iEeWgfqRLKkK3igx3re/3N7JC4 w==; X-IronPort-AV: E=McAfee;i="6200,9189,10274"; a="234214653" X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="234214653" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:27 -0800 X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="551631532" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.123]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:25 -0800 From: Zhenzhong Duan To: kvm@vger.kernel.org Cc: pbonzini@redhat.com, seanjc@google.com, yu.c.zhang@intel.com, zixuanwang@google.com, marcorr@google.com, jun.nakajima@intel.com, erdemaktas@google.com Subject: [kvm-unit-tests RFC PATCH 04/17] x86 TDX: Add exception table support Date: Thu, 3 Mar 2022 15:18:54 +0800 Message-Id: <20220303071907.650203-5-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220303071907.650203-1-zhenzhong.duan@intel.com> References: 
<20220303071907.650203-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The exception table is used to fix up after a faulting instruction. In the TDX scenario, some instructions trigger #VE and are simulated through tdvmcall. If the simulation fails, the instruction is treated as faulting and is looked up in the exception table for a fixup. Move struct ex_record and exception_table_[start|end] to lib/x86/desc.h, as they are general declarations and will be used by TDX. Signed-off-by: Zhenzhong Duan Reviewed-by: Yu Zhang --- lib/x86/desc.c | 7 ------- lib/x86/desc.h | 6 ++++++ lib/x86/tdx.c | 23 +++++++++++++++++++++++ 3 files changed, 29 insertions(+), 7 deletions(-) diff --git a/lib/x86/desc.c b/lib/x86/desc.c index b35274e44a8d..52eb4152385a 100644 --- a/lib/x86/desc.c +++ b/lib/x86/desc.c @@ -84,13 +84,6 @@ void set_idt_sel(int vec, u16 sel) e->selector = sel; } -struct ex_record { - unsigned long rip; - unsigned long handler; -}; - -extern struct ex_record exception_table_start, exception_table_end; - const char* exception_mnemonic(int vector) { switch(vector) { diff --git a/lib/x86/desc.h b/lib/x86/desc.h index ad6277ba600a..068ec2394df9 100644 --- a/lib/x86/desc.h +++ b/lib/x86/desc.h @@ -212,6 +212,12 @@ extern tss64_t tss[]; #endif extern gdt_entry_t gdt[]; +struct ex_record { + unsigned long rip; + unsigned long handler; +}; +extern struct ex_record exception_table_start, exception_table_end; + unsigned exception_vector(void); int write_cr4_checking(unsigned long val); unsigned exception_error_code(void); diff --git a/lib/x86/tdx.c b/lib/x86/tdx.c index 42ab25f47e57..62e0e2842822 100644 --- a/lib/x86/tdx.c +++ b/lib/x86/tdx.c @@ -267,6 +267,22 @@ static bool tdx_handle_io(struct ex_regs *regs, u32 exit_qual) return ret ? 
false : true; } +static bool tdx_check_exception_table(struct ex_regs *regs) +{ + struct ex_record *ex; + + for (ex = &exception_table_start; ex != &exception_table_end; ++ex) { + if (ex->rip == regs->rip) { + regs->rip = ex->handler; + return true; + } + } + unhandled_exception(regs, false); + + /* never reached */ + return false; +} + static bool tdx_get_ve_info(struct ve_info *ve) { struct tdx_module_output out; @@ -298,10 +314,15 @@ static bool tdx_get_ve_info(struct ve_info *ve) static bool tdx_handle_virtualization_exception(struct ex_regs *regs, struct ve_info *ve) { + unsigned int ex_val; bool ret = true; u64 val = ~0ULL; bool do_sti; + /* #VE exit_reason in bits 16-31 */ + ex_val = regs->vector | (ve->exit_reason << 16); + asm("mov %0, %%gs:4" : : "r"(ex_val)); + switch (ve->exit_reason) { case EXIT_REASON_HLT: do_sti = !!(regs->rflags & X86_EFLAGS_IF); @@ -333,6 +354,8 @@ static bool tdx_handle_virtualization_exception(struct ex_regs *regs, /* After successful #VE handling, move the IP */ if (ret) regs->rip += ve->instr_len; + else + ret = tdx_check_exception_table(regs); return ret; } From patchwork Thu Mar 3 07:18:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 12767143 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CB932C433F5 for ; Thu, 3 Mar 2022 07:28:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230318AbiCCH2n (ORCPT ); Thu, 3 Mar 2022 02:28:43 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37854 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230252AbiCCH2c (ORCPT ); Thu, 3 Mar 2022 02:28:32 -0500 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) 
by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E967E5FE9 for ; Wed, 2 Mar 2022 23:27:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1646292452; x=1677828452; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=pGbVvnH0+tW0IVP4OvryXS4dM2QPA9H1YCBKoJmLM34=; b=fB5P5kReeOunY2KtJ7Wt4vwZnMB/nvokAUYqxCe5QFgj43XBgYUPCVKC jaLrhq1+LPv6rLfGc7e04N+olHkTy/7q8wwzsbnlqM4bCJ6GvEjf+zVR0 LnVHhBSgow42wsbmode4wH4fNGjUqfjnjShz384pjuZcyrTKz0z7S6QRL xhftsolHaEOde4B8lX87CboZtA6f+ya2dzJD/DYdX1EaLYn0UVFp9fqPS KOqmad7fOgX61RBQWJ3UDasojjVDHuIAmRYOG1lKzLSJ1bRJvpFBcwrQz y9R4SuFKZyRKLdn9ttxk+22ObDGD18FGAniMwACfq0RNCvrYJIscSTyA5 g==; X-IronPort-AV: E=McAfee;i="6200,9189,10274"; a="252427533" X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="252427533" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:30 -0800 X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="551631561" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.123]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:27 -0800 From: Zhenzhong Duan To: kvm@vger.kernel.org Cc: pbonzini@redhat.com, seanjc@google.com, yu.c.zhang@intel.com, zixuanwang@google.com, marcorr@google.com, jun.nakajima@intel.com, erdemaktas@google.com Subject: [kvm-unit-tests RFC PATCH 05/17] x86 TDX: bypass wrmsr simulation on some specific MSRs Date: Thu, 3 Mar 2022 15:18:55 +0800 Message-Id: <20220303071907.650203-6-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220303071907.650203-1-zhenzhong.duan@intel.com> References: <20220303071907.650203-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org In TDX scenario, some MSRs are initialized with expected value and not expected to be changed 
in TD-guest. Writing to MSR_IA32_TSC, MSR_IA32_APICBASE or MSR_EFER in a TD-guest triggers #VE. In the #VE handler these MSR accesses are simulated with tdvmcall, but the current TDX host-side implementation bypasses them and returns failure. To let test cases that touch those MSRs run smoothly, skip the write to those MSRs in the #VE handler and treat it as if it had succeeded. Signed-off-by: Zhenzhong Duan Reviewed-by: Yu Zhang --- lib/x86/tdx.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/lib/x86/tdx.c b/lib/x86/tdx.c index 62e0e2842822..1fc8030c34fa 100644 --- a/lib/x86/tdx.c +++ b/lib/x86/tdx.c @@ -311,6 +311,18 @@ static bool tdx_get_ve_info(struct ve_info *ve) return true; } +static bool tdx_is_bypassed_msr(u32 index) +{ + switch (index) { + case MSR_IA32_TSC: + case MSR_IA32_APICBASE: + case MSR_EFER: + return true; + default: + return false; + } +} + static bool tdx_handle_virtualization_exception(struct ex_regs *regs, struct ve_info *ve) { @@ -338,7 +350,8 @@ static bool tdx_handle_virtualization_exception(struct ex_regs *regs, } break; case EXIT_REASON_MSR_WRITE: - ret = tdx_write_msr(regs->rcx, regs->rax, regs->rdx); + if (!tdx_is_bypassed_msr(regs->rcx)) + ret = tdx_write_msr(regs->rcx, regs->rax, regs->rdx); break; case EXIT_REASON_CPUID: ret = tdx_handle_cpuid(regs); From patchwork Thu Mar 3 07:18:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 12767154 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CA158C433F5 for ; Thu, 3 Mar 2022 07:28:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230284AbiCCH3C (ORCPT ); Thu, 3 Mar 2022 02:29:02 -0500 Received: from lindbergh.monkeyblade.net 
([23.128.96.19]:38324 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230248AbiCCH2l (ORCPT ); Thu, 3 Mar 2022 02:28:41 -0500 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C2F443881 for ; Wed, 2 Mar 2022 23:27:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1646292458; x=1677828458; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=689/bs1uE2tAQyNb7+lRjwYDAJMMQs0xZFnTOExqe3Q=; b=ByIvMriD+BPKuqHsUoKa3SwWG1H9oZ03Gu81SVK+G8CojGxsl43xXgX4 z5vq/cATs+cP0E4ribxCQDbq5uAL2CajDztPARrnjMoDtnGUEqSau1c7C HO/C9LLEhpiKFoLFmhK2JFf/Uuu4zKL6tM9mNsM0L3Sl0cQZLiDOqN9Q4 UW0a94ttd5g/1MpeqpLgp/Cpfv/JIT2dEq1pA0Q66e7guA/UFZRni9/0U H76mgVvthbVvmdQHo8JudE4+E9RQBsjgb/P5s3gbg0kwGsEdEscLyroF/ ijLLPS78ZuLiVfd0XpredhXApBnXaVzhr9ltY2+MEFjPObzgCn/nTfnLJ Q==; X-IronPort-AV: E=McAfee;i="6200,9189,10274"; a="252427542" X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="252427542" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:33 -0800 X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="551631592" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.123]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:30 -0800 From: Zhenzhong Duan To: kvm@vger.kernel.org Cc: pbonzini@redhat.com, seanjc@google.com, yu.c.zhang@intel.com, zixuanwang@google.com, marcorr@google.com, jun.nakajima@intel.com, erdemaktas@google.com Subject: [kvm-unit-tests RFC PATCH 06/17] x86 TDX: Simulate single step on #VE handled instruction Date: Thu, 3 Mar 2022 15:18:56 +0800 Message-Id: <20220303071907.650203-7-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220303071907.650203-1-zhenzhong.duan@intel.com> 
References: <20220303071907.650203-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org According to the TDX spec, specific instructions are simulated in the #VE handler, such as cpuid(0xb) and wrmsr(0x1a0). To avoid missing a single step on these instructions, we have to simulate #DB processing in the #VE handler. Move the declaration of do_handle_exception() to the header file, so it can be used in the #VE handler for #DB processing. Signed-off-by: Zhenzhong Duan Reviewed-by: Yu Zhang --- lib/x86/desc.c | 5 ----- lib/x86/desc.h | 4 ++++ lib/x86/tdx.c | 9 ++++++++- 3 files changed, 12 insertions(+), 6 deletions(-) diff --git a/lib/x86/desc.c b/lib/x86/desc.c index 52eb4152385a..78f4b6576888 100644 --- a/lib/x86/desc.c +++ b/lib/x86/desc.c @@ -51,11 +51,6 @@ struct descriptor_table_ptr gdt_descr = { .base = (unsigned long)gdt, }; -#ifndef __x86_64__ -__attribute__((regparm(1))) -#endif -void do_handle_exception(struct ex_regs *regs); - void set_idt_entry(int vec, void *addr, int dpl) { idt_entry_t *e = &boot_idt[vec]; diff --git a/lib/x86/desc.h b/lib/x86/desc.h index 068ec2394df9..2cd819574374 100644 --- a/lib/x86/desc.h +++ b/lib/x86/desc.h @@ -222,6 +222,10 @@ unsigned exception_vector(void); int write_cr4_checking(unsigned long val); unsigned exception_error_code(void); bool exception_rflags_rf(void); +#ifndef __x86_64__ +__attribute__((regparm(1))) +#endif +void do_handle_exception(struct ex_regs *regs); void set_idt_entry(int vec, void *addr, int dpl); void set_idt_sel(int vec, u16 sel); void set_idt_dpl(int vec, u16 dpl); diff --git a/lib/x86/tdx.c b/lib/x86/tdx.c index 1fc8030c34fa..2b2e3164be33 100644 --- a/lib/x86/tdx.c +++ b/lib/x86/tdx.c @@ -365,8 +365,15 @@ static bool tdx_handle_virtualization_exception(struct ex_regs *regs, } /* After successful #VE handling, move the IP */ - if (ret) + if (ret) { regs->rip += ve->instr_len; + /* Simulate single step on simulated instruction */ + if (regs->rflags & X86_EFLAGS_TF) { + 
regs->vector = DB_VECTOR; + write_dr6(read_dr6() | (1 << 14)); + do_handle_exception(regs); + } + } else ret = tdx_check_exception_table(regs); From patchwork Thu Mar 3 07:18:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 12767155 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D8772C433EF for ; Thu, 3 Mar 2022 07:28:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230250AbiCCH3E (ORCPT ); Thu, 3 Mar 2022 02:29:04 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38286 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230246AbiCCH2l (ORCPT ); Thu, 3 Mar 2022 02:28:41 -0500 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C24B72DCC for ; Wed, 2 Mar 2022 23:27:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1646292458; x=1677828458; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=efl/khJDj+x8rvRDLcUl9AoBWzvRZc0qmgZJdcK4ATM=; b=R98TA6d819IL5RdCHnuqDbTML+oS+T+407/1R1A0BtX2TljALsj13m9+ IZzgrYHynS9sQg7CFE3NN7m/SoH55exnnpfr9CGVWke8O9wJqDbg5p1H5 VkL2FRBsXqLB+MhJPcZDAu8yEAUaA9IzlHgy/jKY6trCg8S+ejh2gTWZl A68GysHZ1s/EYkOiEXWaJpuGZqep9xZbkyizaZptW/nK8U4HEpQArcAtt /AsEZOWJrEiCxuLq8Ri0Oozj61WchAXdfCXpDyIzQ7CE541uSe/s1N6ik MVqtIJcNERJnjxvVZUgNcDLuGsGfN3khYnoJeMfAOAY2pCN0wwAcqImJM w==; X-IronPort-AV: E=McAfee;i="6200,9189,10274"; a="251176971" X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="251176971" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga102.fm.intel.com with 
ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:36 -0800 X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="551631609" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.123]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:33 -0800 From: Zhenzhong Duan To: kvm@vger.kernel.org Cc: pbonzini@redhat.com, seanjc@google.com, yu.c.zhang@intel.com, zixuanwang@google.com, marcorr@google.com, jun.nakajima@intel.com, erdemaktas@google.com Subject: [kvm-unit-tests RFC PATCH 07/17] x86 TDX: Extend EFI run script to support TDX Date: Thu, 3 Mar 2022 15:18:57 +0800 Message-Id: <20220303071907.650203-8-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220303071907.650203-1-zhenzhong.duan@intel.com> References: <20220303071907.650203-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Currently the TDX framework is based on EFI support, and running a test case in a TDX environment requires special QEMU command-line parameters. Add an environment variable EFI_TDX. When set to "y", run the test case in a TDX-protected environment with the special QEMU parameters. Force "-cpu host" to be the last parameter, as QEMU currently doesn't support customizing CPU features for a TD guest. 
Signed-off-by: Zhenzhong Duan Reviewed-by: Yu Zhang --- x86/efi/README.md | 6 ++++++ x86/efi/run | 19 +++++++++++++++++++ 2 files changed, 25 insertions(+) diff --git a/x86/efi/README.md b/x86/efi/README.md index a39f509cd9aa..b6f1fc68b0f3 100644 --- a/x86/efi/README.md +++ b/x86/efi/README.md @@ -30,6 +30,12 @@ the env variable `EFI_UEFI`: EFI_UEFI=/path/to/OVMF.fd ./x86/efi/run ./x86/msr.efi +### Run test cases with UEFI in TDX environment + +To run a test case with UEFI and TDX enabled: + + EFI_TDX=y ./x86/efi/run ./x86/msr.efi + ## Code structure ### Code from GNU-EFI diff --git a/x86/efi/run b/x86/efi/run index ac368a59ba9f..2af0a303ea0e 100755 --- a/x86/efi/run +++ b/x86/efi/run @@ -18,6 +18,7 @@ source config.mak : "${EFI_TEST:=efi-tests}" : "${EFI_SMP:=1}" : "${EFI_CASE:=$(basename $1 .efi)}" +: "${EFI_TDX:=n}" if [ ! -f "$EFI_UEFI" ]; then echo "UEFI firmware not found: $EFI_UEFI" @@ -29,6 +30,24 @@ fi # Remove the TEST_CASE from $@ shift 1 +# TDX support -kernel QEMU parameter, could utilize the original way of +# verifying QEMU's configuration. CPU feature customization isn't supported +# in TDX currently, so pass through all the features with `-cpu host` +if [ "$EFI_TDX" == "y" ]; then + "$TEST_DIR/run" \ + -device loader,file="$EFI_UEFI",id=fd0 \ + -object tdx-guest,id=tdx0 \ + -machine q35,kvm-type=tdx,pic=no,kernel_irqchip=split,confidential-guest-support=tdx0 \ + -kernel "$EFI_SRC/$EFI_CASE.efi" \ + -net none \ + -nographic \ + -m 256 \ + "$@" \ + -cpu host + + exit $? 
+fi + if [ "$EFI_CASE" = "_NO_FILE_4Uhere_" ]; then EFI_CASE=dummy fi From patchwork Thu Mar 3 07:18:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 12767153 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6FD63C433FE for ; Thu, 3 Mar 2022 07:28:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230224AbiCCH3A (ORCPT ); Thu, 3 Mar 2022 02:29:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38362 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230296AbiCCH2m (ORCPT ); Thu, 3 Mar 2022 02:28:42 -0500 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4E196DFAC for ; Wed, 2 Mar 2022 23:27:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1646292461; x=1677828461; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=KcBiSiPAAMPxH979tDl3bCMRDnlV7SSRwfe7lwKEt24=; b=X6dzEqFMJELJEkcuTvkp2inytkSG8NGBRzr3EhH7Q9K5QMBaLElv5gbq oJjvGYWWwogRK9/3uO28JycOWC+g0ZlgM6vR7iIx3dm/56++H1YJ9Aj3h pct2ufiXACz0Sem6zb0lR9RbR6/G0FCfWth5A20RlG2Xr0AdGALr/NVpt RSqZknroG4EixEFAJ8yKQAgfMfRQHJHAdSvglPVSl+1HQXvYj+4Hwmnwc u0Ewm1Q+bUmA4+q/mQe199oKLb+L82Q1MqC+vx7KeYDkrY23CsVuGZBM6 SNnGiZ5bQezeflPLtaKpqy7JeNYcxl2mY7FVBmR/0qzuKwVEvOb+etTvP Q==; X-IronPort-AV: E=McAfee;i="6200,9189,10274"; a="251176977" X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="251176977" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:38 -0800 X-IronPort-AV: 
E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="551631636" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.123]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:36 -0800 From: Zhenzhong Duan To: kvm@vger.kernel.org Cc: pbonzini@redhat.com, seanjc@google.com, yu.c.zhang@intel.com, zixuanwang@google.com, marcorr@google.com, jun.nakajima@intel.com, erdemaktas@google.com Subject: [kvm-unit-tests RFC PATCH 08/17] x86 TDX: Add support for memory accept Date: Thu, 3 Mar 2022 15:18:58 +0800 Message-Id: <20220303071907.650203-9-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220303071907.650203-1-zhenzhong.duan@intel.com> References: <20220303071907.650203-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org TDVF accepts only part of the memory to optimize boot time and leaves accepting the remaining memory to the OS. Accept the remaining memory of type EFI_UNACCEPTED_MEMORY at bootup. Try 2M page accept first, even though hugepage memory isn't currently used in the QEMU command line. Export the functions below so they can be used by TDX-specific sub-tests in the future. 
tdx_shared_mask() tdx_hcall_gpa_intent() tdx_accept_memory() Signed-off-by: Zhenzhong Duan Reviewed-by: Yu Zhang --- lib/asm-generic/page.h | 7 ++- lib/linux/efi.h | 23 ++++++- lib/x86/setup.c | 2 +- lib/x86/tdx.c | 139 ++++++++++++++++++++++++++++++++++++++++- lib/x86/tdx.h | 20 +++++- 5 files changed, 185 insertions(+), 6 deletions(-) diff --git a/lib/asm-generic/page.h b/lib/asm-generic/page.h index 5ed086129657..56f539ac45eb 100644 --- a/lib/asm-generic/page.h +++ b/lib/asm-generic/page.h @@ -12,8 +12,11 @@ #include #define PAGE_SHIFT 12 -#define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT) -#define PAGE_MASK (~(PAGE_SIZE-1)) +#define PAGE_SIZE (_AC(1, UL) << PAGE_SHIFT) +#define PAGE_MASK (~(PAGE_SIZE - 1)) +#define PMD_SHIFT 21 +#define PMD_SIZE (_AC(1, UL) << PMD_SHIFT) +#define PMD_MASK (~(PMD_SIZE - 1)) #ifndef __ASSEMBLY__ diff --git a/lib/linux/efi.h b/lib/linux/efi.h index 455625aa155d..df9fa7974d87 100644 --- a/lib/linux/efi.h +++ b/lib/linux/efi.h @@ -96,7 +96,8 @@ typedef struct { #define EFI_MEMORY_MAPPED_IO_PORT_SPACE 12 #define EFI_PAL_CODE 13 #define EFI_PERSISTENT_MEMORY 14 -#define EFI_MAX_MEMORY_TYPE 15 +#define EFI_UNACCEPTED_MEMORY 15 +#define EFI_MAX_MEMORY_TYPE 16 /* Attribute values: */ #define EFI_MEMORY_UC ((u64)0x0000000000000001ULL) /* uncached */ @@ -416,6 +417,26 @@ struct efi_boot_memmap { unsigned long *buff_size; }; +/* + * efi_memdesc_ptr - get the n-th EFI memmap descriptor + * @map: the start of efi memmap + * @desc_size: the size of space for each EFI memmap descriptor + * @n: the index of efi memmap descriptor + * + * EFI boot service provides the GetMemoryMap() function to get a copy of the + * current memory map which is an array of memory descriptors, each of + * which describes a contiguous block of memory. It also gets the size of the + * map, and the size of each descriptor, etc. 
+ * + * Note that per section 6.2 of UEFI Spec 2.6 Errata A, the returned size of + * each descriptor might not be equal to sizeof(efi_memory_memdesc_t), + * since efi_memory_memdesc_t may be extended in the future. Thus the OS + * MUST use the returned size of the descriptor to find the start of each + * efi_memory_memdesc_t in the memory map array. + */ +#define efi_memdesc_ptr(map, desc_size, n) \ + ((efi_memory_desc_t *)((void *)(map) + ((n) * (desc_size)))) + #define efi_bs_call(func, ...) efi_system_table->boottime->func(__VA_ARGS__) #define efi_rs_call(func, ...) efi_system_table->runtime->func(__VA_ARGS__) diff --git a/lib/x86/setup.c b/lib/x86/setup.c index e834fdfd290c..29202478ae0d 100644 --- a/lib/x86/setup.c +++ b/lib/x86/setup.c @@ -288,7 +288,7 @@ efi_status_t setup_efi(efi_bootinfo_t *efi_bootinfo) * TDVF support partial memory accept, accept remaining memory * early so memory allocator can use it. */ - status = setup_tdx(); + status = setup_tdx(efi_bootinfo); if (status != EFI_SUCCESS && status != EFI_UNSUPPORTED) { printf("INTEL TDX setup failed, error = 0x%lx\n", status); return status; diff --git a/lib/x86/tdx.c b/lib/x86/tdx.c index 2b2e3164be33..b74c697353d9 100644 --- a/lib/x86/tdx.c +++ b/lib/x86/tdx.c @@ -10,9 +10,11 @@ */ #include "tdx.h" +#include "errno.h" #include "bitops.h" #include "x86/processor.h" #include "x86/smp.h" +#include "asm/page.h" #define VE_IS_IO_OUT(exit_qual) (((exit_qual) & 8) ? 
0 : 1) #define VE_GET_IO_SIZE(exit_qual) (((exit_qual) & 7) + 1) @@ -124,6 +126,42 @@ done: return !!tdx_guest; } +static struct { + unsigned int gpa_width; + unsigned long attributes; +} td_info; + +/* The highest bit of a guest physical address is the "sharing" bit */ +phys_addr_t tdx_shared_mask(void) +{ + return 1ULL << (td_info.gpa_width - 1); +} + +static void tdx_get_info(void) +{ + struct tdx_module_output out; + u64 ret; + + /* + * TDINFO TDX Module call is used to get the TD + * execution environment information like GPA + * width, number of available vcpus, debug mode + * information, etc. More details about the ABI + * can be found in TDX Guest-Host-Communication + * Interface (GHCI), sec 2.4.2 TDCALL [TDG.VP.INFO]. + */ + ret = __tdx_module_call(TDX_GET_INFO, 0, 0, 0, 0, &out); + + /* + * Non zero return means buggy TDX module (which is + * fatal). So raise a BUG(). + */ + BUG_ON(ret); + + td_info.gpa_width = out.rcx & GENMASK(5, 0); + td_info.attributes = out.rdx; +} + /* * Wrapper for standard use of __tdx_hypercall with BUG_ON() check * for TDCALL error. @@ -393,11 +431,110 @@ static void tdx_handle_ve(struct ex_regs *regs) tdx_handle_virtualization_exception(regs, &ve); } -efi_status_t setup_tdx(void) +static u64 tdx_accept_page(phys_addr_t gpa, bool page_2mb) +{ + /* + * Pass the page physical address and size (4KB|2MB) to the + * TDX module to accept the pending, private page. More info + * about ABI can be found in TDX Guest-Host-Communication + * Interface (GHCI), sec 2.4.7. + */ + + if (page_2mb) + gpa |= 1; + + return __tdx_module_call(TDX_ACCEPT_PAGE, gpa, 0, 0, 0, NULL); +} + +/* + * Inform the VMM of the guest's intent for this physical page: + * shared with the VMM or private to the guest. The VMM is + * expected to change its mapping of the page in response. 
+ */ +int tdx_hcall_gpa_intent(phys_addr_t start, phys_addr_t end, + enum tdx_map_type map_type) +{ + u64 ret = 0; + + if (map_type == TDX_MAP_SHARED) { + start |= tdx_shared_mask(); + end |= tdx_shared_mask(); + } + + /* + * Notify VMM about page mapping conversion. More info + * about ABI can be found in TDX Guest-Host-Communication + * Interface (GHCI), sec 3.2. + */ + ret = _tdx_hypercall(TDVMCALL_MAP_GPA, start, end - start, 0, 0, + NULL); + if (ret) + ret = -EIO; + + if (ret || map_type == TDX_MAP_SHARED) + return ret; + /* + * For shared->private conversion, accept the page using + * TDX_ACCEPT_PAGE TDX module call. + */ + while (start < end) { + /* Try 2M page accept first if possible */ + if (!(start & ~PMD_MASK) && end - start >= PMD_SIZE && + !tdx_accept_page(start, true)) { + start += PMD_SIZE; + continue; + } + + if (tdx_accept_page(start, false)) + return -EIO; + start += PAGE_SIZE; + } + + return 0; +} + +bool tdx_accept_memory(phys_addr_t start, phys_addr_t end) +{ + if (tdx_hcall_gpa_intent(start, end, TDX_MAP_PRIVATE)) { + tdx_printf("Accepting memory failed\n"); + return false; + } + return true; +} + +static bool tdx_accept_memory_regions(struct efi_boot_memmap *mem_map) +{ + unsigned long i, nr_desc = *mem_map->map_size / *mem_map->desc_size; + efi_memory_desc_t *d; + + for (i = 0; i < nr_desc; i++) { + d = efi_memdesc_ptr(*mem_map->map, *mem_map->desc_size, i); + + if (d->type == EFI_UNACCEPTED_MEMORY) { + if (d->phys_addr & ~PAGE_MASK) { + tdx_printf("WARN: EFI: unaligned base %lx\n", + d->phys_addr); + d->phys_addr &= PAGE_MASK; + } + if (!tdx_accept_memory(d->phys_addr, d->phys_addr + + PAGE_SIZE * d->num_pages)) + return false; + + d->type = EFI_CONVENTIONAL_MEMORY; + } + } + return true; +} + +efi_status_t setup_tdx(efi_bootinfo_t *efi_bootinfo) { if (!is_tdx_guest()) return EFI_UNSUPPORTED; + tdx_get_info(); + if (!tdx_accept_memory_regions(&efi_bootinfo->mem_map)) + return EFI_OUT_OF_RESOURCES; + handle_exception(20, tdx_handle_ve); 
printf("Initialized TDX.\n"); diff --git a/lib/x86/tdx.h b/lib/x86/tdx.h index 68ddc136d1d9..2f938038dc45 100644 --- a/lib/x86/tdx.h +++ b/lib/x86/tdx.h @@ -30,7 +30,12 @@ #define EXIT_REASON_MSR_WRITE 32 /* TDX Module call Leaf IDs */ +#define TDX_GET_INFO 1 #define TDX_GET_VEINFO 3 +#define TDX_ACCEPT_PAGE 6 + +/* TDX hypercall Leaf IDs */ +#define TDVMCALL_MAP_GPA 0x10001 /* * Used in __tdx_module_call() helper function to gather the @@ -76,8 +81,21 @@ struct ve_info { u32 instr_info; }; +/* + * Page mapping type enum. This is software construct not + * part of any hardware or VMM ABI. + */ +enum tdx_map_type { + TDX_MAP_PRIVATE, + TDX_MAP_SHARED, +}; + bool is_tdx_guest(void); -efi_status_t setup_tdx(void); +phys_addr_t tdx_shared_mask(void); +int tdx_hcall_gpa_intent(phys_addr_t start, phys_addr_t end, + enum tdx_map_type map_type); +bool tdx_accept_memory(phys_addr_t start, phys_addr_t end); +efi_status_t setup_tdx(efi_bootinfo_t *efi_bootinfo); /* Helper function used to communicate with the TDX module */ u64 __tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9, From patchwork Thu Mar 3 07:18:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 12767151 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0C65EC433FE for ; Thu, 3 Mar 2022 07:28:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230251AbiCCH26 (ORCPT ); Thu, 3 Mar 2022 02:28:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37800 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230266AbiCCH2m (ORCPT ); Thu, 3 Mar 2022 02:28:42 -0500 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by 
lindbergh.monkeyblade.net (Postfix) with ESMTPS id 32AEB1B7A9 for ; Wed, 2 Mar 2022 23:27:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1646292463; x=1677828463; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=aUWllZgcNcb3+4EQxRve/ekcY65dBqxxATMgqK0CxdU=; b=aXFJl4GROAczQdPmdUPEVWQm2fHgmKkpg/H+KQWocw0qn03qE0dnHQDv vqQXcDQxMvRaj82Xp8kJmjjO7fQGEd4aVZ4c+qF/SFuVFOYMSgZ7PKmrQ gkoHvv/1D39DbaND4fHnWMqq76fuUu63Nva6z7rMki5g5Qb132bwIHVIC BmwmMAxQKzW6BDB105zn1iLWkjtyfwxHJPz1BRDB9sCPWkFPI5412MmH1 rSKaxp/BMx25yLeTWYLcH3Zs02EuiRVt5cGBpr1h+FlDdes9yGDXPdifp PQrTFNE7NReOocP3Z+FsKf6woyx6eXcj8VRSuBWCzsBT+w/vtYtjn+mTP w==; X-IronPort-AV: E=McAfee;i="6200,9189,10274"; a="251176987" X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="251176987" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:41 -0800 X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="551631676" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.123]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:39 -0800 From: Zhenzhong Duan To: kvm@vger.kernel.org Cc: pbonzini@redhat.com, seanjc@google.com, yu.c.zhang@intel.com, zixuanwang@google.com, marcorr@google.com, jun.nakajima@intel.com, erdemaktas@google.com Subject: [kvm-unit-tests RFC PATCH 09/17] acpi: Add MADT table parse code Date: Thu, 3 Mar 2022 15:18:59 +0800 Message-Id: <20220303071907.650203-10-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220303071907.650203-1-zhenzhong.duan@intel.com> References: <20220303071907.650203-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Support LAPIC, X2APIC and WAKEUP sub-table, other sub-table are ignored for now. 
Also add a wakeup wrapping function which is used by TDX. The parsed result is stored in id_map[] and acpi_mp_wake_mailbox. Signed-off-by: Zhenzhong Duan Reviewed-by: Yu Zhang --- lib/x86/acpi.c | 171 ++++++++++++++++++++++++++++++++++++++++++++++++ lib/x86/acpi.h | 85 ++++++++++++++++++++++++ lib/x86/setup.c | 4 ++ 3 files changed, 260 insertions(+) diff --git a/lib/x86/acpi.c b/lib/x86/acpi.c index 1a82ced0b90f..6fb8ece7eabe 100644 --- a/lib/x86/acpi.c +++ b/lib/x86/acpi.c @@ -1,7 +1,12 @@ #include "libcflat.h" +#include "errno.h" #include "acpi.h" +#include "apic.h" +#include "asm/barrier.h" +#include "processor.h" #ifdef TARGET_EFI +unsigned char online_cpus[(MAX_TEST_CPUS + 7) / 8]; struct rsdp_descriptor *efi_rsdp = NULL; void set_efi_rsdp(struct rsdp_descriptor *rsdp) @@ -16,6 +21,172 @@ static struct rsdp_descriptor *get_rsdp(void) } return efi_rsdp; } + +struct acpi_madt_multiproc_wakeup_mailbox *acpi_mp_wake_mailbox; + +#define smp_store_release(p, val) \ +do { \ + barrier(); \ + WRITE_ONCE(*p, val); \ +} while (0) + +static inline bool test_bit(int nr, const void *addr) +{ + const u32 *p = (const u32 *)addr; + return ((1UL << (nr & 31)) & (p[nr >> 5])) != 0; +} + +int acpi_wakeup_cpu(int apicid, unsigned long start_ip) +{ + u8 timeout = 0xFF; + + /* + * According to the ACPI specification r6.4, sec 5.2.12.19, the + * mailbox-based wakeup mechanism cannot be used more than once + * for the same CPU, so skip sending wake commands to already + * awake CPU. + */ + if (test_bit(apicid, online_cpus)) { + printf("CPU already awake (APIC ID %x), skipping wakeup\n", + apicid); + return -EINVAL; + } + + /* + * Mailbox memory is shared between firmware and OS. Firmware will + * listen on mailbox command address, and once it receives the wakeup + * command, CPU associated with the given apicid will be booted. So, + * the value of apic_id and wakeup_vector has to be set before updating + * the wakeup command. 
So use smp_store_release to let the compiler know + * about it and preserve the order of writes. + */ + smp_store_release(&acpi_mp_wake_mailbox->apic_id, apicid); + smp_store_release(&acpi_mp_wake_mailbox->wakeup_vector, start_ip); + smp_store_release(&acpi_mp_wake_mailbox->command, + ACPI_MP_WAKE_COMMAND_WAKEUP); + + /* + * After writing wakeup command, wait for maximum timeout of 0xFF + * for firmware to reset the command address back to zero to indicate + * the successful reception of command. + * NOTE: 255 as timeout value is decided based on our experiments. + * + * XXX: Change the timeout once ACPI specification comes up with + * standard maximum timeout value. + */ + while (READ_ONCE(acpi_mp_wake_mailbox->command) && --timeout) + cpu_relax(); + + if (timeout) { + /* + * If the CPU wakeup process is successful, store the + * status in online_cpus to prevent re-wakeup + * requests. + */ + set_bit(apicid, online_cpus); + return 0; + } + + /* If timed out (timeout == 0), return error */ + return -EIO; +} + +static bool parse_madt_table(struct acpi_table *madt) +{ + u64 table_start = (unsigned long)madt + sizeof(struct acpi_table_madt); + u64 table_end = (unsigned long)madt + madt->length; + struct acpi_subtable_header *sub_table; + bool failed = false; + u32 uid, apic_id; + u8 enabled; + + while (table_start < table_end && !failed) { + struct acpi_madt_local_apic *processor; + struct acpi_madt_local_x2apic *processor2; + struct acpi_madt_multiproc_wakeup *mp_wake; + + sub_table = (struct acpi_subtable_header *)table_start; + + switch (sub_table->type) { + case ACPI_MADT_TYPE_LOCAL_APIC: + processor = (struct acpi_madt_local_apic *)sub_table; + + if (BAD_MADT_ENTRY(processor, table_end)) { + failed = true; + break; + } + + uid = processor->processor_id; + apic_id = processor->id; + enabled = processor->lapic_flags & ACPI_MADT_ENABLED; + + /* Ignore invalid ID */ + if (apic_id == 0xff) + break; + if (enabled) + id_map[uid] = apic_id; + + printf("apicid %x uid %x 
%s\n", apic_id, uid, + enabled ? "enabled" : "disabled"); + break; + + case ACPI_MADT_TYPE_LOCAL_X2APIC: + processor2 = (struct acpi_madt_local_x2apic *)sub_table; + + if (BAD_MADT_ENTRY(processor2, table_end)) { + failed = true; + break; + } + + uid = processor2->uid; + apic_id = processor2->local_apic_id; + enabled = processor2->lapic_flags & ACPI_MADT_ENABLED; + + /* Ignore invalid ID */ + if (apic_id == 0xffffffff) + break; + if (enabled) + id_map[uid] = apic_id; + + printf("x2apicid %x uid %x %s\n", apic_id, uid, + enabled ? "enabled" : "disabled"); + break; + case ACPI_MADT_TYPE_MULTIPROC_WAKEUP: + mp_wake = (struct acpi_madt_multiproc_wakeup *)sub_table; + + if (BAD_MADT_ENTRY(mp_wake, table_end)) { + failed = true; + break; + } + + if (acpi_mp_wake_mailbox) + printf("WARN: duplicate mailbox %lx\n", (u64)acpi_mp_wake_mailbox); + + acpi_mp_wake_mailbox = (void *)mp_wake->base_address; + printf("MP Wake (Mailbox version[%d] base_address[%lx])\n", + mp_wake->mailbox_version, mp_wake->base_address); + break; + default: + /* Ignored currently */ + break; + } + if (!failed) + table_start += sub_table->length; + } + + return !failed; +} + +bool parse_acpi_table(void) +{ + struct acpi_table *tb; + + tb = find_acpi_table_addr(MADT_SIGNATURE); + if (tb) + return parse_madt_table(tb); + + return false; +} #else static struct rsdp_descriptor *get_rsdp(void) { diff --git a/lib/x86/acpi.h b/lib/x86/acpi.h index 67ba3899b1d7..509d9b5bb0b4 100644 --- a/lib/x86/acpi.h +++ b/lib/x86/acpi.h @@ -10,6 +10,7 @@ #define RSDT_SIGNATURE ACPI_SIGNATURE('R','S','D','T') #define FACP_SIGNATURE ACPI_SIGNATURE('F','A','C','P') #define FACS_SIGNATURE ACPI_SIGNATURE('F','A','C','S') +#define MADT_SIGNATURE ACPI_SIGNATURE('A','P','I','C') #define ACPI_SIGNATURE_8BYTE(c1, c2, c3, c4, c5, c6, c7, c8) \ @@ -46,6 +47,88 @@ struct acpi_table { char data[0]; }; +/******************************************************************************* + * + * MADT - Multiple APIC Description Table + * 
Version 3 + * + ******************************************************************************/ + +struct acpi_table_madt { + ACPI_TABLE_HEADER_DEF + u32 address; /* Physical address of local APIC */ + u32 flags; +}; + +/* Generic subtable header (used in MADT, SRAT, etc.) */ + +struct acpi_subtable_header { + u8 type; + u8 length; +}; + +#define ACPI_MADT_TYPE_LOCAL_APIC 0 +#define ACPI_MADT_TYPE_LOCAL_X2APIC 9 +#define ACPI_MADT_TYPE_MULTIPROC_WAKEUP 16 + +#define BAD_MADT_ENTRY(entry, end) ( \ + (!entry) || (unsigned long)entry + sizeof(*entry) > end || \ + ((struct acpi_subtable_header *)entry)->length < sizeof(*entry)) + +/* + * MADT Subtables, correspond to Type in struct acpi_subtable_header + */ + +/* 0: Processor Local APIC */ + +struct acpi_madt_local_apic { + struct acpi_subtable_header header; + u8 processor_id; /* ACPI processor id */ + u8 id; /* Processor's local APIC id */ + u32 lapic_flags; +}; + +/* 9: Processor Local X2APIC (ACPI 4.0) */ + +struct acpi_madt_local_x2apic { + struct acpi_subtable_header header; + u16 reserved; /* reserved - must be zero */ + u32 local_apic_id; /* Processor x2APIC ID */ + u32 lapic_flags; + u32 uid; /* ACPI processor UID */ +}; + +/* 16: Multiprocessor wakeup (ACPI 6.4) */ + +struct acpi_madt_multiproc_wakeup { + struct acpi_subtable_header header; + u16 mailbox_version; + u32 reserved; /* reserved - must be zero */ + u64 base_address; +}; + +#define ACPI_MULTIPROC_WAKEUP_MB_OS_SIZE 2032 +#define ACPI_MULTIPROC_WAKEUP_MB_FIRMWARE_SIZE 2048 + +struct acpi_madt_multiproc_wakeup_mailbox { + u16 command; + u16 reserved; /* reserved - must be zero */ + u32 apic_id; + u64 wakeup_vector; + u8 reserved_os[ACPI_MULTIPROC_WAKEUP_MB_OS_SIZE]; /* reserved for OS use */ + u8 reserved_firmware[ACPI_MULTIPROC_WAKEUP_MB_FIRMWARE_SIZE]; /* reserved for firmware use */ +}; + +#define ACPI_MP_WAKE_COMMAND_WAKEUP 1 + +/* + * Common flags fields for MADT subtables + */ + +/* MADT Local APIC flags */ + +#define ACPI_MADT_ENABLED (1) /* 
00: Processor is usable if set */ + struct rsdt_descriptor_rev1 { ACPI_TABLE_HEADER_DEF u32 table_offset_entry[0]; @@ -108,5 +191,7 @@ struct facs_descriptor_rev1 void set_efi_rsdp(struct rsdp_descriptor *rsdp); void* find_acpi_table_addr(u32 sig); +int acpi_wakeup_cpu(int apicid, unsigned long start_ip); +bool parse_acpi_table(void); #endif diff --git a/lib/x86/setup.c b/lib/x86/setup.c index 29202478ae0d..63c4dbb25064 100644 --- a/lib/x86/setup.c +++ b/lib/x86/setup.c @@ -314,6 +314,10 @@ efi_status_t setup_efi(efi_bootinfo_t *efi_bootinfo) return status; } + /* Parse all acpi tables, currently only MADT table */ + if (!parse_acpi_table()) + return EFI_NOT_FOUND; + phase = "AMD SEV"; status = setup_amd_sev(); From patchwork Thu Mar 3 07:19:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 12767144 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 095E0C433EF for ; Thu, 3 Mar 2022 07:28:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229765AbiCCH2r (ORCPT ); Thu, 3 Mar 2022 02:28:47 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37736 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230310AbiCCH2m (ORCPT ); Thu, 3 Mar 2022 02:28:42 -0500 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 520CF21810 for ; Wed, 2 Mar 2022 23:27:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1646292465; x=1677828465; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=IDcjAjR45CZYQPp3bmPZspXevCk8rh2UmmPvZhgnrLI=; 
b=HEOgs6TzJCnDHcBHBAeTLm9G9rLsKc4ZakfibfwtXQ4SQXzBmfWCft+r bCKD3gXAyooTBaSR8W1yPhWXPYcqefdHdr2HLo/z9NHTJ/xtsIfcgcerF C6S1WWG/UVEvruHupetvn/OIZJDnZO8NxQO4Dn5i6bEsQlhScrL4kh5NC oD9ALCTQneZqy/h/cNaeuNjZ1yBjbv/hBkVz0HEMEJ1m6reQ5sFSY2tVz gEVcbueDZ4ArLvKPNaIQ0qszgLPUZGSIYnqEHIU9se8ee1bhJoW7AK+Lt lpKwwtjXLDld8/CwdWO+dnAq9XeOakYx2nlIPj+FQDrj46l7pdLfXLFNa w==; X-IronPort-AV: E=McAfee;i="6200,9189,10274"; a="251176996" X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="251176996" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:44 -0800 X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="551631696" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.123]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:41 -0800 From: Zhenzhong Duan To: kvm@vger.kernel.org Cc: pbonzini@redhat.com, seanjc@google.com, yu.c.zhang@intel.com, zixuanwang@google.com, marcorr@google.com, jun.nakajima@intel.com, erdemaktas@google.com Subject: [kvm-unit-tests RFC PATCH 10/17] x86 TDX: Add multi processor support Date: Thu, 3 Mar 2022 15:19:00 +0800 Message-Id: <20220303071907.650203-11-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220303071907.650203-1-zhenzhong.duan@intel.com> References: <20220303071907.650203-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org In a TD guest, multiprocessor bring-up differs from a normal guest. In a normal guest, the BSP sends a startup IPI to all APs, which then start in 16-bit real mode. In a TD guest, TDVF initializes the APs into 64-bit mode before handing control to the OS/bootloader. The OS enumerates the uid/apicid mapping through the MADT table and wakes up the APs one by one through the MP wakeup mechanism, so the entry code for APs is 64-bit. 
Although this targets TDX MP support, it is written with future integration with normal UEFI MP support in mind. Signed-off-by: Zhenzhong Duan Reviewed-by: Yu Zhang --- lib/x86/asm/setup.h | 1 + lib/x86/setup.c | 31 +++++++++++++++++++++++++++++++ lib/x86/tdx.c | 33 +++++++++++++++++++++++++++++++++ lib/x86/tdx.h | 2 ++ x86/efi/crt0-efi-x86_64.S | 12 +++++++++++- x86/efi/efistart64.S | 5 +++++ 6 files changed, 83 insertions(+), 1 deletion(-) diff --git a/lib/x86/asm/setup.h b/lib/x86/asm/setup.h index c467a2e94861..5e7aa2eb4332 100644 --- a/lib/x86/asm/setup.h +++ b/lib/x86/asm/setup.h @@ -14,6 +14,7 @@ unsigned long setup_tss(u8 *stacktop); efi_status_t setup_efi(efi_bootinfo_t *efi_bootinfo); void setup_5level_page_table(void); +void secondary_startup_64(void); #endif /* TARGET_EFI */ #include "x86/tdx.h" diff --git a/lib/x86/setup.c b/lib/x86/setup.c index 63c4dbb25064..7bf5d431f2a8 100644 --- a/lib/x86/setup.c +++ b/lib/x86/setup.c @@ -279,6 +279,35 @@ static void setup_gdt_tss(void) load_gdt_tss(tss_offset); } +static void setup_percpu_area(void) +{ + u64 rsp; + + asm volatile ("mov %%rsp, %0" : "=m"(rsp) :: "memory"); + + /* per cpu stack size is PAGE_SIZE */ + rsp &= ~((u64)PAGE_SIZE - 1); + wrmsr(MSR_GS_BASE, rsp); +} + +void secondary_startup_64(void) +{ + setup_gdt_tss(); + load_idt(); + setup_percpu_area(); + enable_x2apic(); + tdx_ap_init(); + + while (1) + safe_halt(); +} + +static void aps_init(void) +{ + if (is_tdx_guest()) + tdx_aps_init(); +} + efi_status_t setup_efi(efi_bootinfo_t *efi_bootinfo) { efi_status_t status; @@ -332,6 +361,7 @@ efi_status_t setup_efi(efi_bootinfo_t *efi_bootinfo) return status; } + setup_percpu_area(); /* xAPIC mode isn't allowed in TDX */ if (!is_tdx_guest()) reset_apic(); @@ -342,6 +372,7 @@ efi_status_t setup_efi(efi_bootinfo_t *efi_bootinfo) if (!is_tdx_guest()) enable_apic(); enable_x2apic(); + aps_init(); smp_init(); setup_page_table(); diff --git a/lib/x86/tdx.c b/lib/x86/tdx.c index 
b74c697353d9..4bd658b95028 100644 --- a/lib/x86/tdx.c +++ b/lib/x86/tdx.c @@ -12,9 +12,14 @@ #include "tdx.h" #include "errno.h" #include "bitops.h" +#include "atomic.h" +#include "fwcfg.h" +#include "x86/acpi.h" #include "x86/processor.h" #include "x86/smp.h" +#include "x86/apic.h" #include "asm/page.h" +#include "asm/barrier.h" #define VE_IS_IO_OUT(exit_qual) (((exit_qual) & 8) ? 0 : 1) #define VE_GET_IO_SIZE(exit_qual) (((exit_qual) & 7) + 1) @@ -541,3 +546,31 @@ efi_status_t setup_tdx(efi_bootinfo_t *efi_bootinfo) return EFI_SUCCESS; } + +static atomic_t cpu_online_count = {1}; +extern unsigned char online_cpus[(MAX_TEST_CPUS + 7) / 8]; +extern void ap_start64(void); + +/* TDX uses ACPI WAKE UP mechanism to wake up APs instead of SIPI */ +efi_status_t tdx_aps_init(void) +{ + u32 i, total_cpus = fwcfg_get_nb_cpus(); + + /* BSP is already online */ + set_bit(id_map[0], online_cpus); + + for (i = 1; i < total_cpus; i++) { + if (acpi_wakeup_cpu(id_map[i], (u64)ap_start64)) + return EFI_DEVICE_ERROR; + } + + while (atomic_read(&cpu_online_count) != total_cpus) + cpu_relax(); + + return EFI_SUCCESS; +} + +void tdx_ap_init(void) +{ + atomic_inc(&cpu_online_count); +} diff --git a/lib/x86/tdx.h b/lib/x86/tdx.h index 2f938038dc45..4b75fcec7367 100644 --- a/lib/x86/tdx.h +++ b/lib/x86/tdx.h @@ -96,6 +96,8 @@ int tdx_hcall_gpa_intent(phys_addr_t start, phys_addr_t end, enum tdx_map_type map_type); bool tdx_accept_memory(phys_addr_t start, phys_addr_t end); efi_status_t setup_tdx(efi_bootinfo_t *efi_bootinfo); +efi_status_t tdx_aps_init(void); +void tdx_ap_init(void); /* Helper function used to communicate with the TDX module */ u64 __tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9, diff --git a/x86/efi/crt0-efi-x86_64.S b/x86/efi/crt0-efi-x86_64.S index eaf165649591..f3309f6e35ef 100644 --- a/x86/efi/crt0-efi-x86_64.S +++ b/x86/efi/crt0-efi-x86_64.S @@ -41,6 +41,8 @@ .globl _start _start: + lea stacktop(%rip), %rsp + add %rsp, smp_stacktop(%rip) subq $8, %rsp 
pushq %rcx pushq %rdx @@ -61,12 +63,20 @@ _start: call efi_main addq $8, %rsp + .globl ap_start64 +ap_start64: + mov $-PAGE_SIZE, %rsp + lock xadd %rsp, smp_stacktop(%rip) + call secondary_startup_64 + .exit: ret + .data +smp_stacktop: .quad -PAGE_SIZE + // hand-craft a dummy .reloc section so EFI knows it's a relocatable executable: - .data dummy: .long 0 #define IMAGE_REL_ABSOLUTE 0 diff --git a/x86/efi/efistart64.S b/x86/efi/efistart64.S index 017abba85a68..648d047febb5 100644 --- a/x86/efi/efistart64.S +++ b/x86/efi/efistart64.S @@ -22,6 +22,11 @@ ptl4: . = . + PAGE_SIZE .align PAGE_SIZE +.globl stacktop + . = . + PAGE_SIZE * MAX_TEST_CPUS +stacktop: +.align PAGE_SIZE + .section .init .code64 .text From patchwork Thu Mar 3 07:19:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 12767145 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DDF57C433F5 for ; Thu, 3 Mar 2022 07:28:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230327AbiCCH2s (ORCPT ); Thu, 3 Mar 2022 02:28:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38338 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230241AbiCCH2o (ORCPT ); Thu, 3 Mar 2022 02:28:44 -0500 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DEE673DA5D for ; Wed, 2 Mar 2022 23:27:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1646292469; x=1677828469; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=spaXWCL9cELMrFJ3ShOrqOYTZM2I4jQAfEYKi891ioU=; 
b=hzWFeRRaMSseaDiR0Rc3NKNTUM3j71CILNCJNzFkXvqN9SNnTmn082NC 2yJnscDeg5/ebDm9YgljB0j0ugfaqH5LGwEJeEN4oe6T8R81aFU25OHPW ED0xG04tzQmWcNleKqKgEbvGROHWRvwEHpRgBWGUlktYPLFk2IWI6TVxe 6dVahqDreb54UM2lKPv3ef3w4xtFVMsXF7UveOuVYnUql/2q8kQC7swwk TCLB8u0JE1ZwjvQ8OBAESb2cD0tAFOuJooM0SaQFVSPNA7MFqm4ys7dQ9 51h1+bOtUGeurxir4LhgZ4Z8vrKRcPPSoraKjwsWpp9rzY7JW8KHKrIxx w==; X-IronPort-AV: E=McAfee;i="6200,9189,10274"; a="251177007" X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="251177007" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:47 -0800 X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="551631722" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.123]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:44 -0800 From: Zhenzhong Duan To: kvm@vger.kernel.org Cc: pbonzini@redhat.com, seanjc@google.com, yu.c.zhang@intel.com, zixuanwang@google.com, marcorr@google.com, jun.nakajima@intel.com, erdemaktas@google.com Subject: [kvm-unit-tests RFC PATCH 11/17] x86 TDX: Add a formal IPI handler Date: Thu, 3 Mar 2022 15:19:01 +0800 Message-Id: <20220303071907.650203-12-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220303071907.650203-1-zhenzhong.duan@intel.com> References: <20220303071907.650203-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The current IPI handler may corrupt the CPU context. This is not a big issue in a normal guest, as the AP only enables interrupts in the idle loop. But in a TD guest, the hlt instruction is simulated through tdvmcall in the #VE handler, so an IPI can corrupt the #VE context. Save and restore the CPU context in the IPI handler to avoid a crash. 
Signed-off-by: Zhenzhong Duan Reviewed-by: Yu Zhang --- lib/x86/smp.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/lib/x86/smp.c b/lib/x86/smp.c index 2ac0ef74f264..8a37143c6d78 100644 --- a/lib/x86/smp.c +++ b/lib/x86/smp.c @@ -39,12 +39,20 @@ static __attribute__((used)) void ipi(void) asm ( "ipi_entry: \n" - " call ipi \n" -#ifndef __x86_64__ - " iret" -#else - " iretq" +#ifdef __x86_64__ + "push %r15; push %r14; push %r13; push %r12 \n\t" + "push %r11; push %r10; push %r9; push %r8 \n\t" #endif + "push %"R "di; push %"R "si; push %"R "bp; \n\t" + "push %"R "bx; push %"R "dx; push %"R "cx; push %"R "ax \n\t" + "call ipi \n\t" + "pop %"R "ax; pop %"R "cx; pop %"R "dx; pop %"R "bx \n\t" + "pop %"R "bp; pop %"R "si; pop %"R "di \n\t" +#ifdef __x86_64__ + "pop %r8; pop %r9; pop %r10; pop %r11 \n\t" + "pop %r12; pop %r13; pop %r14; pop %r15 \n\t" +#endif + "iret"W" \n\t" ); int cpu_count(void) From patchwork Thu Mar 3 07:19:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 12767146 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8D365C4332F for ; Thu, 3 Mar 2022 07:28:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230239AbiCCH2u (ORCPT ); Thu, 3 Mar 2022 02:28:50 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37902 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230340AbiCCH2p (ORCPT ); Thu, 3 Mar 2022 02:28:45 -0500 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8AD113E5FC for ; Wed, 2 Mar 2022 23:27:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; 
d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1646292470; x=1677828470; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=8woeIU/iix6qyra7P6vahSx8k+pr4gm0x9nMEUBjbcQ=; b=AR0NJxYfayn2qfwiXvm9r6dhnH7t83O11nV2NfK40rA/MLEQ5Mi28nUf njtg6Vbbu1Dqusu37dBOYp9A/VTDL9/LKnCGhMNffXvKetniLrYPp2+G8 NumNObU+6T5+ljGHnFH9vyTfmxwKfNNlbbypWVxLqauL3FnRvXKVMhW6s NmFXnoM6ORKEj75AKtKRC1ZRWz/NTRiKxQiPef2y9BJZBYkeHYxXIdBgA +4hTtFALuLIuE32o4t2ERnIc3+re4b4oUdhHkxaio/KTcY/FmKFYthidv HE6cpx8eUryrmlKDr+dmjzbDMwZRiBXf769XJAY5zkTXWjjmOHDDcYK9l Q==; X-IronPort-AV: E=McAfee;i="6200,9189,10274"; a="251177017" X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="251177017" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:50 -0800 X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="551631745" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.123]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:47 -0800 From: Zhenzhong Duan To: kvm@vger.kernel.org Cc: pbonzini@redhat.com, seanjc@google.com, yu.c.zhang@intel.com, zixuanwang@google.com, marcorr@google.com, jun.nakajima@intel.com, erdemaktas@google.com Subject: [kvm-unit-tests RFC PATCH 12/17] x86 TDX: Enable lvl5 boot page table Date: Thu, 3 Mar 2022 15:19:02 +0800 Message-Id: <20220303071907.650203-13-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220303071907.650203-1-zhenzhong.duan@intel.com> References: <20220303071907.650203-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org TDVF enables a lvl5 page table before handing control to the OS/bootloader, while OVMF enables a lvl4 page table. Check CR4.X86_CR4_LA57 to decide which page table level to use, and initialize our own lvl5 page table in the TDX case. 
Move setup_page_table() before APs startup so that lvl5 page table is ready for APs. Refactor the common part of setting cr3 in a wrapper function load_page_table(). Signed-off-by: Zhenzhong Duan Reviewed-by: Yu Zhang --- lib/x86/setup.c | 22 +++++++++++++++++++--- x86/efi/efistart64.S | 5 +++++ 2 files changed, 24 insertions(+), 3 deletions(-) diff --git a/lib/x86/setup.c b/lib/x86/setup.c index 7bf5d431f2a8..3a60762494d6 100644 --- a/lib/x86/setup.c +++ b/lib/x86/setup.c @@ -231,10 +231,23 @@ static efi_status_t setup_rsdp(efi_bootinfo_t *efi_bootinfo) } /* Defined in cstart64.S or efistart64.S */ +extern u8 ptl5; extern u8 ptl4; extern u8 ptl3; extern u8 ptl2; +static void load_page_table(void) +{ + /* + * Load new page table based on the level of firmware provided page + * table. + */ + if (read_cr4() & X86_CR4_LA57) + write_cr3((ulong)&ptl5); + else + write_cr3((ulong)&ptl4); +} + static void setup_page_table(void) { pgd_t *curr_pt; @@ -247,6 +260,9 @@ static void setup_page_table(void) /* Set AMD SEV C-Bit for page table entries */ flags |= get_amd_sev_c_bit_mask(); + /* Level 5 */ + curr_pt = (pgd_t *)&ptl5; + curr_pt[0] = ((phys_addr_t)&ptl4) | flags; /* Level 4 */ curr_pt = (pgd_t *)&ptl4; curr_pt[0] = ((phys_addr_t)&ptl3) | flags; @@ -266,8 +282,7 @@ static void setup_page_table(void) setup_ghcb_pte((pgd_t *)&ptl4); } - /* Load 4-level page table */ - write_cr3((ulong)&ptl4); + load_page_table(); } static void setup_gdt_tss(void) @@ -297,6 +312,7 @@ void secondary_startup_64(void) setup_percpu_area(); enable_x2apic(); tdx_ap_init(); + load_page_table(); while (1) safe_halt(); @@ -372,9 +388,9 @@ efi_status_t setup_efi(efi_bootinfo_t *efi_bootinfo) if (!is_tdx_guest()) enable_apic(); enable_x2apic(); + setup_page_table(); aps_init(); smp_init(); - setup_page_table(); return EFI_SUCCESS; } diff --git a/x86/efi/efistart64.S b/x86/efi/efistart64.S index 648d047febb5..ef3db0110c3c 100644 --- a/x86/efi/efistart64.S +++ b/x86/efi/efistart64.S @@ -22,6 +22,11 @@ 
ptl4: . = . + PAGE_SIZE .align PAGE_SIZE +.globl ptl5 +ptl5: + . = . + PAGE_SIZE +.align PAGE_SIZE + .globl stacktop . = . + PAGE_SIZE * MAX_TEST_CPUS stacktop: From patchwork Thu Mar 3 07:19:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 12767147 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2956AC433F5 for ; Thu, 3 Mar 2022 07:28:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230252AbiCCH2v (ORCPT ); Thu, 3 Mar 2022 02:28:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38232 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230292AbiCCH2q (ORCPT ); Thu, 3 Mar 2022 02:28:46 -0500 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2251854BC4 for ; Wed, 2 Mar 2022 23:27:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1646292475; x=1677828475; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=TLIdi+0MXlVUApHZB/znGcCoPilvf3SXkpe6+A3cUbI=; b=Pd5goDtKJ+gcxma/F+q39fjW5W5K+xgxKr3mWzDfCmx6WUUx9RTNnkzh YbCOJsI4FxbciHHKmhl/maHaJzMCNacYxMnpO319IiwdZJKrk7nd4lvuE tHqHiXXiX55bpEAqH87Lgtd5q7BOcz9v0tDlcfS3ZVbmsGrC2PSLADtEF y1aY+jacC7hmZr4fTvUWxAYfC5b4+LZTYdj/9gN4MwKdo7srBGTYvdmVf CPHrKhUTR8xXI7AgC3W08M8AislKG1s/CKBR70li9jlVy5K4hkolZShmZ gA9jnpjP+B+BcyDtTpwgKeNWI5O6PeZCH8x9vc9DdeBI8yJNwi9xCA5on w==; X-IronPort-AV: E=McAfee;i="6200,9189,10274"; a="251177026" X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="251177026" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by 
fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:53 -0800 X-IronPort-AV: E=Sophos;i="5.90,151,1643702400"; d="scan'208";a="551631775" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.123]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Mar 2022 23:27:50 -0800 From: Zhenzhong Duan To: kvm@vger.kernel.org Cc: pbonzini@redhat.com, seanjc@google.com, yu.c.zhang@intel.com, zixuanwang@google.com, marcorr@google.com, jun.nakajima@intel.com, erdemaktas@google.com Subject: [kvm-unit-tests RFC PATCH 13/17] x86 TDX: Add lvl5 page table support to virtual memory Date: Thu, 3 Mar 2022 15:19:03 +0800 Message-Id: <20220303071907.650203-14-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220303071907.650203-1-zhenzhong.duan@intel.com> References: <20220303071907.650203-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Currently the TDX test case init stage sets up an initial lvl5 boot page table, but the VM code supports only a lvl4 page table. This mismatch makes test cases that require virtual memory crash. Make the changes below to support a lvl5 page table for virtual memory: 1. skip finding high memory 2. check X86_CR4_LA57 to decide whether to initialize a lvl5 or lvl4 page table 3. 
always set X86_CR0_NE for TDX test case Signed-off-by: Zhenzhong Duan Reviewed-by: Yu Zhang --- lib/x86/processor.h | 1 + lib/x86/setup.c | 5 +++++ lib/x86/vm.c | 14 ++++++++++++-- 3 files changed, 18 insertions(+), 2 deletions(-) diff --git a/lib/x86/processor.h b/lib/x86/processor.h index 865269fd3857..4deff9ebe044 100644 --- a/lib/x86/processor.h +++ b/lib/x86/processor.h @@ -35,6 +35,7 @@ #define X86_CR0_MP 0x00000002 #define X86_CR0_EM 0x00000004 #define X86_CR0_TS 0x00000008 +#define X86_CR0_NE 0x00000020 #define X86_CR0_WP 0x00010000 #define X86_CR0_AM 0x00040000 #define X86_CR0_NW 0x20000000 diff --git a/lib/x86/setup.c b/lib/x86/setup.c index 3a60762494d6..0c299d3dd9bc 100644 --- a/lib/x86/setup.c +++ b/lib/x86/setup.c @@ -64,6 +64,11 @@ static struct mbi_bootinfo *bootinfo; #ifdef __x86_64__ void find_highmem(void) { +#ifdef TARGET_EFI + /* The largest free memory region is already chosen in setup_efi() */ + return; +#endif /* TARGET_EFI */ + /* Memory above 4 GB is only supported on 64-bit systems. 
 	 */
 	if (!(bootinfo->flags & 64))
 		return;
diff --git a/lib/x86/vm.c b/lib/x86/vm.c
index 56be57be673a..4ead6ed358ae 100644
--- a/lib/x86/vm.c
+++ b/lib/x86/vm.c
@@ -3,6 +3,7 @@
 #include "vmalloc.h"
 #include "alloc_page.h"
 #include "smp.h"
+#include "tdx.h"
 
 static pteval_t pte_opt_mask;
 
@@ -16,7 +17,12 @@ pteval_t *install_pte(pgd_t *cr3,
 	pteval_t *pt = cr3;
 	unsigned offset;
 
-	for (level = PAGE_LEVEL; level > pte_level; --level) {
+	if (read_cr4() & X86_CR4_LA57)
+		level = 5;
+	else
+		level = PAGE_LEVEL;
+
+	for (; level > pte_level; --level) {
 		offset = PGDIR_OFFSET((uintptr_t)virt, level);
 		if (!(pt[offset] & PT_PRESENT_MASK)) {
 			pteval_t *new_pt = pt_page;
@@ -187,7 +193,11 @@ void *setup_mmu(phys_addr_t end_of_memory, void *opt_mask)
 #ifndef __x86_64__
 	write_cr4(X86_CR4_PSE);
 #endif
-	write_cr0(X86_CR0_PG |X86_CR0_PE | X86_CR0_WP);
+	/* According to TDX module spec 10.6.1 CR0.NE should be 1 */
+	if (is_tdx_guest())
+		write_cr0(X86_CR0_PG | X86_CR0_PE | X86_CR0_WP | X86_CR0_NE);
+	else
+		write_cr0(X86_CR0_PG | X86_CR0_PE | X86_CR0_WP);
 
 	printf("paging enabled\n");
 	printf("cr0 = %lx\n", read_cr0());

From patchwork Thu Mar 3 07:19:04 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Duan, Zhenzhong"
X-Patchwork-Id: 12767148
From: Zhenzhong Duan
To: kvm@vger.kernel.org
Cc: pbonzini@redhat.com, seanjc@google.com, yu.c.zhang@intel.com,
 zixuanwang@google.com, marcorr@google.com, jun.nakajima@intel.com,
 erdemaktas@google.com
Subject: [kvm-unit-tests RFC PATCH 14/17] x86 TDX: Add TDX specific test case
Date: Thu, 3 Mar 2022 15:19:04 +0800
Message-Id: <20220303071907.650203-15-zhenzhong.duan@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220303071907.650203-1-zhenzhong.duan@intel.com>
References: <20220303071907.650203-1-zhenzhong.duan@intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

sub-test1: Test that an APIC self-IPI with a vector < 16 triggers #VE.
sub-test2: Test that single-stepping over emulated instructions works
with the single-step emulation in the #VE handler. cpuid(0xb) and
wrmsr(0x1a0) are used for the test. Note that not every cpuid leaf
triggers #VE; e.g. cpuid(0) does not.

Signed-off-by: Zhenzhong Duan
Reviewed-by: Yu Zhang
---
 x86/Makefile.x86_64 |  1 +
 x86/intel_tdx.c     | 94 +++++++++++++++++++++++++++++++++++++++++++++
 x86/unittests.cfg   |  4 ++
 3 files changed, 99 insertions(+)
 create mode 100644 x86/intel_tdx.c

diff --git a/x86/Makefile.x86_64 b/x86/Makefile.x86_64
index a3cb75ae5868..de79212951a3 100644
--- a/x86/Makefile.x86_64
+++ b/x86/Makefile.x86_64
@@ -31,6 +31,7 @@ tests += $(TEST_DIR)/vmware_backdoors.$(exe)
 tests += $(TEST_DIR)/rdpru.$(exe)
 tests += $(TEST_DIR)/pks.$(exe)
 tests += $(TEST_DIR)/pmu_lbr.$(exe)
+tests += $(TEST_DIR)/intel_tdx.$(exe)
 
 ifeq ($(TARGET_EFI),y)
 tests += $(TEST_DIR)/amd_sev.$(exe)
diff --git a/x86/intel_tdx.c b/x86/intel_tdx.c
new file mode 100644
index 000000000000..e7e65fb32b89
--- /dev/null
+++ b/x86/intel_tdx.c
@@ -0,0 +1,94 @@
+#include "libcflat.h"
+#include "x86/processor.h"
+#include "x86/apic-defs.h"
+#include "x86/tdx.h"
+#include "msr.h"
+
+static volatile unsigned long db_addr[10], dr6[10];
+static volatile unsigned int n;
+
+static void test_selfipi_msr(void)
+{
+	unsigned char vector;
+	u64 i;
+
+	printf("start APIC_SELF_IPI MSR write test.\n");
+
+	for (i = 0; i < 16; i++) {
+		vector = wrmsr_checking(APIC_SELF_IPI, i);
+		report(vector == VE_VECTOR,
+		       "Expected #VE on WRMSR(%s, 0x%lx), got vector %d",
+		       "APIC_SELF_IPI", i, vector);
+	}
+
+	printf("end APIC_SELF_IPI MSR write test.\n");
+}
+
+static void handle_db(struct ex_regs *regs)
+{
+	db_addr[n] = regs->rip;
+	dr6[n] = read_dr6();
+
+	if (dr6[n] & 0x1)
+		regs->rflags |= (1 << 16);
+
+	if (++n >= 10) {
+		regs->rflags &= ~(1 << 8);
+		write_dr7(0x00000400);
+	}
+}
+
+static void test_single_step(void)
+{
+	unsigned long start;
+
+	handle_exception(DB_VECTOR, handle_db);
+
+	/*
+	 * cpuid(0xb) and wrmsr(0x1a0) trigger #VE and are then emulated.
+	 * Test #DB on these instructions, as the #VE handler emulates
+	 * single-stepping. This complements x86/debug.c, which tests the
+	 * cpuid(0) and in(0x3fd) instructions. In fact, cpuid(0) is
+	 * emulated by the SEAM module.
+	 */
+	n = 0;
+	write_dr6(0);
+	asm volatile(
+		"pushf\n\t"
+		"pop %%rax\n\t"
+		"or $(1<<8),%%rax\n\t"
+		"push %%rax\n\t"
+		"lea (%%rip),%0\n\t"
+		"popf\n\t"
+		"and $~(1<<8),%%rax\n\t"
+		"push %%rax\n\t"
+		"mov $0xb,%%rax\n\t"
+		"cpuid\n\t"
+		"movl $0x1a0,%%ecx\n\t"
+		"rdmsr\n\t"
+		"wrmsr\n\t"
+		"popf\n\t"
+		: "=r" (start) : : "rax", "ebx", "ecx", "edx");
+	report(n == 8 &&
+	       db_addr[0] == start + 1 + 6 && dr6[0] == 0xffff4ff0 &&
+	       db_addr[1] == start + 1 + 6 + 1 && dr6[1] == 0xffff4ff0 &&
+	       db_addr[2] == start + 1 + 6 + 1 + 7 && dr6[2] == 0xffff4ff0 &&
+	       db_addr[3] == start + 1 + 6 + 1 + 7 + 2 && dr6[3] == 0xffff4ff0 &&
+	       db_addr[4] == start + 1 + 6 + 1 + 7 + 2 + 5 && dr6[4] == 0xffff4ff0 &&
+	       db_addr[5] == start + 1 + 6 + 1 + 7 + 2 + 5 + 2 && dr6[5] == 0xffff4ff0 &&
+	       db_addr[6] == start + 1 + 6 + 1 + 7 + 2 + 5 + 2 + 2 && dr6[6] == 0xffff4ff0 &&
+	       db_addr[7] == start + 1 + 6 + 1 + 7 + 2 + 5 + 2 + 2 + 1 && dr6[7] == 0xffff4ff0,
+	       "single step emulated instructions");
+}
+
+int main(void)
+{
+	if (!is_tdx_guest()) {
+		printf("Not TDX environment!\n");
+		return report_summary();
+	}
+
+	test_selfipi_msr();
+	test_single_step();
+	return report_summary();
+}
diff --git a/x86/unittests.cfg b/x86/unittests.cfg
index 9a70ba3b4f2e..840e2054d54d 100644
--- a/x86/unittests.cfg
+++ b/x86/unittests.cfg
@@ -437,3 +437,7 @@ file = cet.flat
 arch = x86_64
 smp = 2
 extra_params = -enable-kvm -m 2048 -cpu host
+
+[intel_tdx]
+file = intel_tdx.flat
+arch = x86_64

From patchwork Thu Mar 3 07:19:05 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Duan, Zhenzhong"
X-Patchwork-Id: 12767149
From: Zhenzhong Duan
To: kvm@vger.kernel.org
Cc: pbonzini@redhat.com, seanjc@google.com, yu.c.zhang@intel.com,
 zixuanwang@google.com, marcorr@google.com, jun.nakajima@intel.com,
 erdemaktas@google.com
Subject: [kvm-unit-tests RFC PATCH 15/17] x86 TDX: bypass unsupported sub-test for TDX
Date: Thu, 3 Mar 2022 15:19:05 +0800
Message-Id: <20220303071907.650203-16-zhenzhong.duan@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220303071907.650203-1-zhenzhong.duan@intel.com>
References: <20220303071907.650203-1-zhenzhong.duan@intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

According to TDX Module V0.931, Table 18.2 (MSR Virtualization):

1. MSR_IA32_MISC_ENABLE is read natively; writes trigger #VE.
2. MSR_CSTAR triggers #VE on both reads and writes.

MSR_CSTAR emulation is also not supported on the TDX host side. This
means that changing these MSRs is unsupported, so bypass the related
sub-tests.

Signed-off-by: Zhenzhong Duan
Reviewed-by: Yu Zhang
---
 x86/msr.c     | 6 ++++++
 x86/syscall.c | 3 ++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/x86/msr.c b/x86/msr.c
index 44fbb3b233e9..3a538c9ba693 100644
--- a/x86/msr.c
+++ b/x86/msr.c
@@ -4,6 +4,7 @@
 #include "processor.h"
 #include "msr.h"
 #include
+#include "tdx.h"
 
 /**
  * This test allows two modes:
@@ -89,6 +90,11 @@ static void test_rdmsr_fault(struct msr_info *msr)
 
 static void test_msr(struct msr_info *msr, bool is_64bit_host)
 {
+	/* Changing MSR_IA32_MISC_ENABLE and MSR_CSTAR is unsupported in TDX */
+	if ((msr->index == MSR_IA32_MISC_ENABLE || msr->index == MSR_CSTAR) &&
+	    is_tdx_guest())
+		return;
+
 	if (is_64bit_host || !msr->is_64bit_only) {
 		test_msr_rw(msr, msr->value);
 
diff --git a/x86/syscall.c b/x86/syscall.c
index b0df07200f50..270dfdfcce19 100644
--- a/x86/syscall.c
+++ b/x86/syscall.c
@@ -5,6 +5,7 @@
 #include "msr.h"
 #include "desc.h"
 #include "fwcfg.h"
+#include "tdx.h"
 
 static void test_syscall_lazy_load(void)
 {
@@ -106,7 +107,7 @@ int main(int ac, char **av)
 {
 	test_syscall_lazy_load();
 
-	if (!no_test_device || !is_intel())
+	if ((!no_test_device || !is_intel()) && !is_tdx_guest())
 		test_syscall_tf();
 	else
 		report_skip("syscall TF handling");

From patchwork Thu Mar 3 07:19:06 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Duan, Zhenzhong"
X-Patchwork-Id: 12767150
From: Zhenzhong Duan
To: kvm@vger.kernel.org
Cc: pbonzini@redhat.com, seanjc@google.com, yu.c.zhang@intel.com,
 zixuanwang@google.com, marcorr@google.com, jun.nakajima@intel.com,
 erdemaktas@google.com
Subject: [kvm-unit-tests RFC PATCH 16/17] x86 UEFI: Add support for parameter passing
Date: Thu, 3 Mar 2022 15:19:06 +0800
Message-Id: <20220303071907.650203-17-zhenzhong.duan@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220303071907.650203-1-zhenzhong.duan@intel.com>
References: <20220303071907.650203-1-zhenzhong.duan@intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

UEFI provides the command line as a 16-bit Unicode string. Convert it
to ASCII and pass it to main() as parameters. Only characters with code
points below 0x80 are supported.
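
For illustration only, the conversion described above can be sketched in
standalone C. This is a minimal sketch, not the code in this patch: the
helper name ucs2_to_ascii, its parameters, and its 0/-1 return convention
are assumptions made for the example, and the caller is assumed to supply
a destination buffer of at least n_chars + 1 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): narrow n_chars 16-bit
 * code units from src into dst, which must hold n_chars + 1 bytes.
 * Returns 0 on success, or -1 if any code unit is >= 0x80.
 */
static int ucs2_to_ascii(const uint16_t *src, size_t n_chars, char *dst)
{
	size_t i;

	assert(src != NULL && dst != NULL);

	for (i = 0; i < n_chars; i++) {
		if (src[i] >= 0x80)
			return -1;	/* non-ASCII: reject rather than truncate */
		dst[i] = (char)src[i];
	}
	dst[i] = '\0';	/* terminate even if the input was not */
	return 0;
}
```

Rejecting any code unit >= 0x80 instead of masking it keeps a mangled
argument from silently reaching main().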

Signed-off-by: Zhenzhong Duan
Reviewed-by: Yu Zhang
---
 lib/argv.c      |  2 +-
 lib/argv.h      |  1 +
 lib/efi.c       | 73 +++++++++++++++++++++++++++++++++++++++++++++++++
 lib/linux/efi.h | 18 ++++++++++++
 4 files changed, 93 insertions(+), 1 deletion(-)

diff --git a/lib/argv.c b/lib/argv.c
index 0312d74011d3..4d5c318a4bc4 100644
--- a/lib/argv.c
+++ b/lib/argv.c
@@ -44,7 +44,7 @@ void __setup_args(void)
 	__argc = argv - __argv;
 }
 
-static void setup_args(const char *args)
+void setup_args(const char *args)
 {
 	if (!args)
 		return;
diff --git a/lib/argv.h b/lib/argv.h
index 1fd746dc2177..0fa7772549da 100644
--- a/lib/argv.h
+++ b/lib/argv.h
@@ -9,6 +9,7 @@
 #define _ARGV_H_
 
 extern void __setup_args(void);
+extern void setup_args(const char *args);
 extern void setup_args_progname(const char *args);
 extern void setup_env(char *env, int size);
 extern void add_setup_arg(const char *arg);
diff --git a/lib/efi.c b/lib/efi.c
index 64cc9789274e..69dbfa1d1f24 100644
--- a/lib/efi.c
+++ b/lib/efi.c
@@ -10,6 +10,7 @@
 #include "efi.h"
 #include
 #include
+#include
 
 /* From lib/argv.c */
 extern int __argc, __envc;
@@ -96,6 +97,72 @@ static void efi_exit(efi_status_t code)
 	efi_rs_call(reset_system, EFI_RESET_SHUTDOWN, code, 0, NULL);
 }
 
+/*
+ * Convert the Unicode UEFI command line to ASCII (code points < 0x80 only).
+ * The number of converted characters is returned in *cmd_line_len.
+ */
+static efi_status_t efi_convert_cmdline(efi_loaded_image_t *image,
+					char **cmd_line_ptr, int *cmd_line_len)
+{
+	char *cmdline_addr = 0;
+	int options_chars = image->load_options_size;
+	const u16 *options = image->load_options;
+	int options_bytes = 0;
+	efi_status_t status;
+
+	if (!options || !options_chars)
+		return EFI_NOT_FOUND;
+
+	options_chars /= sizeof(*options);
+	status = efi_bs_call(allocate_pool, EFI_LOADER_DATA, options_chars + 1,
+			     (void **)&cmdline_addr);
+	if (status != EFI_SUCCESS)
+		return status;
+
+	while (options_bytes < options_chars) {
+		if (options[options_bytes] >= 0x80)
+			return EFI_UNSUPPORTED;
+
+		cmdline_addr[options_bytes] = (char)options[options_bytes];
+		options_bytes++;
+	}
+
+	/*
+	 * The UEFI command line should already include a NUL terminator;
+	 * add one anyway, just in case.
+	 */
+	cmdline_addr[options_bytes] = '\0';
+
+	*cmd_line_len = options_bytes;
+	*cmd_line_ptr = (char *)cmdline_addr;
+	return EFI_SUCCESS;
+}
+
+static efi_status_t setup_efi_args(efi_handle_t handle)
+{
+	efi_guid_t proto = LOADED_IMAGE_PROTOCOL_GUID;
+	efi_loaded_image_t *image = NULL;
+	char *cmdline_ptr;
+	int options_size = 0;
+	efi_status_t status;
+
+	status = efi_bs_call(handle_protocol, handle, &proto, (void **)&image);
+	if (status != EFI_SUCCESS) {
+		printf("Failed to get handle for LOADED_IMAGE_PROTOCOL\n");
+		return status;
+	}
+
+	status = efi_convert_cmdline(image, &cmdline_ptr, &options_size);
+
+	if (status != EFI_SUCCESS && status != EFI_NOT_FOUND)
+		return status;
+
+	if (status == EFI_SUCCESS)
+		setup_args(cmdline_ptr);
+
+	return EFI_SUCCESS;
+}
+
 efi_status_t efi_main(efi_handle_t handle, efi_system_table_t *sys_tab)
 {
 	int ret;
@@ -104,6 +171,12 @@ efi_status_t efi_main(efi_handle_t handle, efi_system_table_t *sys_tab)
 
 	efi_system_table = sys_tab;
 
+	status = setup_efi_args(handle);
+	if (status != EFI_SUCCESS) {
+		printf("Failed to get efi parameters\n");
+		goto efi_main_error;
+	}
+
 	/* Memory map struct values */
 	efi_memory_desc_t *map = NULL;
 	unsigned long map_size = 0, desc_size = 0, key = 0, buff_size = 0;
diff --git a/lib/linux/efi.h b/lib/linux/efi.h
index df9fa7974d87..8b9fa06f84ba 100644
--- a/lib/linux/efi.h
+++ b/lib/linux/efi.h
@@ -59,6 +59,7 @@ typedef guid_t efi_guid_t;
 	(c) & 0xff, ((c) >> 8) & 0xff, d } }
 
 #define ACPI_TABLE_GUID EFI_GUID(0xeb9d2d30, 0x2d88, 0x11d3, 0x9a, 0x16, 0x00, 0x90, 0x27, 0x3f, 0xc1, 0x4d)
+#define LOADED_IMAGE_PROTOCOL_GUID EFI_GUID(0x5b1b31a1, 0x9562, 0x11d2, 0x8e, 0x3f, 0x00, 0xa0, 0xc9, 0x69, 0x72, 0x3b)
 
 typedef struct {
 	efi_guid_t guid;
@@ -417,6 +418,23 @@ struct efi_boot_memmap {
 	unsigned long *buff_size;
 };
 
+#define __aligned_u64 u64 __attribute__((aligned(8)))
+typedef struct {
+	u32 revision;
+	efi_handle_t parent_handle;
+	efi_system_table_t *system_table;
+	efi_handle_t device_handle;
+	void *file_path;
+	void *reserved;
+	u32 load_options_size;
+	void *load_options;
+	void *image_base;
+	__aligned_u64 image_size;
+	unsigned int image_code_type;
+	unsigned int image_data_type;
+	efi_status_t (__efiapi *unload)(efi_handle_t image_handle);
+} efi_loaded_image_t;
+
 /*
  * efi_memdesc_ptr - get the n-th EFI memmap descriptor
  * @map: the start of efi memmap

From patchwork Thu Mar 3 07:19:07 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Duan, Zhenzhong"
X-Patchwork-Id: 12767152
From: Zhenzhong Duan
To: kvm@vger.kernel.org
Cc: pbonzini@redhat.com, seanjc@google.com, yu.c.zhang@intel.com,
 zixuanwang@google.com, marcorr@google.com, jun.nakajima@intel.com,
 erdemaktas@google.com
Subject: [kvm-unit-tests RFC PATCH 17/17] x86 TDX: Make run_tests.sh work with TDX
Date: Thu, 3 Mar 2022 15:19:07 +0800
Message-Id: <20220303071907.650203-18-zhenzhong.duan@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220303071907.650203-1-zhenzhong.duan@intel.com>
References: <20220303071907.650203-1-zhenzhong.duan@intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

Define a special group 'tdx' for the test cases supported by TDX, so
that when the 'tdx' group is specified, these test cases run in a
TDX-protected environment if EFI_TDX=y.

For example:

    EFI_TDX=y ./run_tests.sh -g tdx

Signed-off-by: Zhenzhong Duan
Reviewed-by: Yu Zhang
---
 README.md         |  6 ++++++
 x86/unittests.cfg | 19 +++++++++++++++++--
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 6e82dc22570e..a84460e9f96b 100644
--- a/README.md
+++ b/README.md
@@ -137,6 +137,12 @@ when the user does not provide an environ, then an environ generated
 from the ./errata.txt file and the host's kernel version is provided to
 all unit tests.
 
+# Unit tests in a TDX environment
+
+    All test cases supported by TDX belong to the 'tdx' group. The
+    command "EFI_TDX=y ./run_tests.sh -g tdx" runs all of them in a
+    TDX-protected environment.
+
 # Contributing
 
 ## Directory structure
diff --git a/x86/unittests.cfg b/x86/unittests.cfg
index 840e2054d54d..8cb32e6e7bee 100644
--- a/x86/unittests.cfg
+++ b/x86/unittests.cfg
@@ -56,10 +56,12 @@ arch = i386
 [smptest]
 file = smptest.flat
 smp = 2
+groups = tdx
 
 [smptest3]
 file = smptest.flat
 smp = 3
+groups = tdx
 
 [vmexit_cpuid]
 file = vmexit.flat
@@ -155,6 +157,7 @@ file = hypercall.flat
 [idt_test]
 file = idt_test.flat
 arch = x86_64
+groups = tdx
 
 #[init]
 #file = init.flat
@@ -163,6 +166,7 @@ arch = x86_64
 file = memory.flat
 extra_params = -cpu max
 arch = x86_64
+groups = tdx
 
 [msr]
 # Use GenuineIntel to ensure SYSENTER MSRs are fully preserved, and to test
@@ -171,6 +175,7 @@ arch = x86_64
 # will fail due to shortcomings in KVM.
 file = msr.flat
 extra_params = -cpu max,vendor=GenuineIntel
+groups = tdx
 
 [pmu]
 file = pmu.flat
@@ -207,6 +212,7 @@ file = s3.flat
 
 [setjmp]
 file = setjmp.flat
+groups = tdx
 
 [sieve]
 file = sieve.flat
@@ -216,23 +222,28 @@ timeout = 180
 file = syscall.flat
 arch = x86_64
 extra_params = -cpu Opteron_G1,vendor=AuthenticAMD
+groups = tdx
 
 [tsc]
 file = tsc.flat
 extra_params = -cpu kvm64,+rdtscp
+groups = tdx
 
 [tsc_adjust]
 file = tsc_adjust.flat
 extra_params = -cpu max
+groups = tdx
 
 [xsave]
 file = xsave.flat
 arch = x86_64
 extra_params = -cpu max
+groups = tdx
 
 [rmap_chain]
 file = rmap_chain.flat
 arch = x86_64
+groups = tdx
 
 [svm]
 file = svm.flat
@@ -259,7 +270,7 @@ extra_params = --append "10000000 `date +%s`"
 file = pcid.flat
 extra_params = -cpu qemu64,+pcid,+invpcid
 arch = x86_64
-groups = pcid
+groups = pcid tdx
 
 [pcid-disabled]
 file = pcid.flat
@@ -277,10 +288,12 @@ groups = pcid
 file = rdpru.flat
 extra_params = -cpu max
 arch = x86_64
+groups = tdx
 
 [umip]
 file = umip.flat
 extra_params = -cpu qemu64,+umip
+groups = tdx
 
 [la57]
 file = la57.flat
@@ -393,6 +406,7 @@ check = /sys/module/kvm_intel/parameters/allow_smaller_maxphyaddr=Y
 [debug]
 file = debug.flat
 arch = x86_64
+groups = tdx
 
 [hyperv_synic]
 file = hyperv_synic.flat
@@ -431,6 +445,6 @@ extra_params = -M q35,kernel-irqchip=split -device intel-iommu,intremap=on,eim=o
 file = tsx-ctrl.flat
 extra_params = -cpu max
-groups = tsx-ctrl
+groups = tsx-ctrl tdx
 
 [intel_cet]
 file = cet.flat
@@ -441,3 +455,4 @@ extra_params = -enable-kvm -m 2048 -cpu host
 [intel_tdx]
 file = intel_tdx.flat
 arch = x86_64
+groups = tdx nodefault