[v18,067/121] KVM: TDX: Implement TDX vcpu enter/exit path

Message ID	421905898dfe4966168bc0b7c29b7bcf682b440e.1705965635.git.isaku.yamahata@intel.com (mailing list archive)
State	New, archived
Headers	show Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B2D4D6A33D; Mon, 22 Jan 2024 23:55:59 +0000 (UTC) From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini <pbonzini@redhat.com>, erdemaktas@google.com, Sean Christopherson <seanjc@google.com>, Sagi Shahar <sagis@google.com>, Kai Huang <kai.huang@intel.com>, chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com Subject: [PATCH v18 067/121] KVM: TDX: Implement TDX vcpu enter/exit path Date: Mon, 22 Jan 2024 15:53:43 -0800 Message-Id: <421905898dfe4966168bc0b7c29b7bcf682b440e.1705965635.git.isaku.yamahata@intel.com> In-Reply-To: <cover.1705965634.git.isaku.yamahata@intel.com> References: <cover.1705965634.git.isaku.yamahata@intel.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	KVM TDX basic feature support \| expand [v18,000/121] KVM TDX basic feature support [v18,001/121] x86/virt/tdx: Export TDX KeyID information [v18,002/121] x86/virt/tdx: Export SEAMCALL functions [v18,003/121] KVM: x86: Add is_vm_type_supported callback [v18,004/121] KVM: VMX: Move out vmx_x86_ops to 'main.c' to wrap VMX and TDX [v18,005/121] KVM: x86/vmx: initialize loaded_vmcss_on_cpu in vmx_hardware_setup() [v18,006/121] KVM: x86/vmx: Refactor KVM VMX module init/exit functions [v18,007/121] KVM: VMX: Reorder vmx initialization with kvm vendor initialization [v18,008/121] KVM: TDX: Initialize the TDX module when loading the KVM intel kernel module [v18,009/121] KVM: TDX: Add placeholders for TDX VM/vcpu structure [v18,010/121] KVM: TDX: Make TDX VM type supported [v18,011/121,MARKER] The start of TDX KVM patch series: TDX architectural definitions [v18,012/121] KVM: TDX: Define TDX architectural definitions [v18,013/121] KVM: TDX: Add TDX "architectural" error codes [v18,014/121] KVM: TDX: Add C wrapper functions for SEAMCALLs to the TDX module [v18,015/121] KVM: TDX: Retry SEAMCALL on the lack of entropy error [v18,016/121] KVM: TDX: Add helper functions to print TDX SEAMCALL error [v18,017/121,MARKER] The start of TDX KVM patch series: TD VM creation/destruction [v18,018/121] KVM: TDX: Add helper functions to allocate/free TDX private host key id [v18,019/121] KVM: TDX: Add helper function to read TDX metadata in array [v18,020/121] x86/virt/tdx: Get system-wide info about TDX module on initialization [v18,021/121] KVM: TDX: Add place holder for TDX VM specific mem_enc_op ioctl [v18,022/121] KVM: TDX: x86: Add ioctl to get TDX systemwide parameters [v18,023/121] KVM: TDX: Make KVM_CAP_MAX_VCPUS backend specific [v18,024/121] KVM: TDX: create/destroy VM structure [v18,025/121] KVM: TDX: initialize VM with TDX specific parameters [v18,026/121] KVM: TDX: Make pmu_intel.c ignore guest TD case [v18,027/121] KVM: TDX: Refuse to unplug the last cpu on the package [v18,028/121,MARKER] The start of TDX KVM patch series: TD vcpu creation/destruction [v18,029/121] KVM: TDX: create/free TDX vcpu structure [v18,030/121] KVM: TDX: Do TDX specific vcpu initialization [v18,031/121,MARKER] The start of TDX KVM patch series: KVM MMU GPA shared bits [v18,032/121] KVM: x86/mmu: introduce config for PRIVATE KVM MMU [v18,033/121] KVM: x86/mmu: Add address conversion functions for TDX shared bit of GPA [v18,034/121,MARKER] The start of TDX KVM patch series: KVM TDP refactoring for TDX [v18,035/121] KVM: Allow page-sized MMU caches to be initialized with custom 64-bit values [v18,036/121] KVM: x86/mmu: Replace hardcoded value 0 for the initial value for SPTE [v18,037/121] KVM: x86/mmu: Allow non-zero value for non-present SPTE and removed SPTE [v18,038/121] KVM: x86/mmu: Add Suppress VE bit to shadow_mmio_mask/shadow_present_mask [v18,039/121] KVM: x86/mmu: Track shadow MMIO value on a per-VM basis [v18,040/121] KVM: x86/mmu: Disallow fast page fault on private GPA [v18,041/121] KVM: x86/mmu: Allow per-VM override of the TDP max page level [v18,042/121] KVM: VMX: Introduce test mode related to EPT violation VE [v18,043/121,MARKER] The start of TDX KVM patch series: KVM TDP MMU hooks [v18,044/121] KVM: x86/mmu: Assume guest MMIOs are shared [v18,045/121] KVM: x86/tdp_mmu: Init role member of struct kvm_mmu_page at allocation [v18,046/121] KVM: x86/mmu: Add a new is_private member for union kvm_mmu_page_role [v18,047/121] KVM: x86/mmu: Add a private pointer to struct kvm_mmu_page [v18,048/121] KVM: x86/tdp_mmu: Don't zap private pages for unsupported cases [v18,049/121] KVM: x86/tdp_mmu: Apply mmu notifier callback to only shared GPA [v18,050/121] KVM: x86/tdp_mmu: Sprinkle __must_check [v18,051/121] KVM: x86/tdp_mmu: Support TDX private mapping for TDP MMU [v18,052/121,MARKER] The start of TDX KVM patch series: TDX EPT violation [v18,053/121] KVM: x86/mmu: TDX: Do not enable page track for TD guest [v18,054/121] KVM: VMX: Split out guts of EPT violation to common/exposed function [v18,055/121] KVM: VMX: Move setting of EPT MMU masks to common VT-x code [v18,056/121] KVM: TDX: Add accessors VMX VMCS helpers [v18,057/121] KVM: TDX: Add load_mmu_pgd method for TDX [v18,058/121] KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with operand SEPT [v18,059/121] KVM: TDX: Require TDP MMU and mmio caching for TDX [v18,060/121] KVM: TDX: TDP MMU TDX support [v18,061/121] KVM: TDX: MTRR: implement get_mt_mask() for TDX [v18,062/121,MARKER] The start of TDX KVM patch series: TD finalization [v18,063/121] KVM: x86/mmu: Introduce kvm_mmu_map_tdp_page() for use by TDX [v18,064/121] KVM: TDX: Create initial guest memory [v18,065/121] KVM: TDX: Finalize VM initialization [v18,066/121,MARKER] The start of TDX KVM patch series: TD vcpu enter/exit [v18,067/121] KVM: TDX: Implement TDX vcpu enter/exit path [v18,068/121] KVM: TDX: vcpu_run: save/restore host state(host kernel gs) [v18,069/121] KVM: TDX: restore host xsave state when exit from the guest TD [v18,070/121] KVM: x86: Allow to update cached values in kvm_user_return_msrs w/o wrmsr [v18,071/121] KVM: TDX: restore user ret MSRs [v18,072/121] KVM: TDX: Add TSX_CTRL msr into uret_msrs list [v18,073/121,MARKER] The start of TDX KVM patch series: TD vcpu exits/interrupts/hypercalls [v18,074/121] KVM: TDX: complete interrupts after tdexit [v18,075/121] KVM: TDX: restore debug store when TD exit [v18,076/121] KVM: TDX: handle vcpu migration over logical processor [v18,077/121] KVM: x86: Add a switch_db_regs flag to handle TDX's auto-switched behavior [v18,078/121] KVM: TDX: Add support for find pending IRQ in a protected local APIC [v18,079/121] KVM: x86: Assume timer IRQ was injected if APIC state is proteced [v18,080/121] KVM: TDX: remove use of struct vcpu_vmx from posted_interrupt.c [v18,081/121] KVM: TDX: Implement interrupt injection [v18,082/121] KVM: TDX: Implements vcpu request_immediate_exit [v18,083/121] KVM: TDX: Implement methods to inject NMI [v18,084/121] KVM: VMX: Modify NMI and INTR handlers to take intr_info as function argument [v18,085/121] KVM: VMX: Move NMI/exception handler to common helper [v18,086/121] KVM: x86: Split core of hypercall emulation to helper function [v18,087/121] KVM: TDX: Add a place holder to handle TDX VM exit [v18,088/121] KVM: TDX: Handle vmentry failure for INTEL TD guest [v18,089/121] KVM: TDX: handle EXIT_REASON_OTHER_SMI [v18,090/121] KVM: TDX: handle ept violation/misconfig exit [v18,091/121] KVM: TDX: handle EXCEPTION_NMI and EXTERNAL_INTERRUPT [v18,092/121] KVM: TDX: Handle EXIT_REASON_OTHER_SMI with MSMI [v18,093/121] KVM: TDX: Add a place holder for handler of TDX hypercalls (TDG.VP.VMCALL) [v18,094/121] KVM: TDX: handle KVM hypercall with TDG.VP.VMCALL [v18,095/121] KVM: TDX: Add KVM Exit for TDX TDG.VP.VMCALL [v18,096/121] KVM: TDX: Handle TDX PV CPUID hypercall [v18,097/121] KVM: TDX: Handle TDX PV HLT hypercall [v18,098/121] KVM: TDX: Handle TDX PV port io hypercall [v18,099/121] KVM: TDX: Handle TDX PV MMIO hypercall [v18,100/121] KVM: TDX: Implement callbacks for MSR operations for TDX [v18,101/121] KVM: TDX: Handle TDX PV rdmsr/wrmsr hypercall [v18,102/121] KVM: TDX: Handle MSR MTRRCap and MTRRDefType access [v18,103/121] KVM: TDX: Handle MSR IA32_FEAT_CTL MSR and IA32_MCG_EXT_CTL [v18,104/121] KVM: TDX: Handle TDG.VP.VMCALL<GetTdVmCallInfo> hypercall [v18,105/121] KVM: TDX: Silently discard SMI request [v18,106/121] KVM: TDX: Silently ignore INIT/SIPI [v18,107/121] KVM: TDX: Add methods to ignore accesses to CPU state [v18,108/121] KVM: TDX: Add methods to ignore guest instruction emulation [v18,109/121] KVM: TDX: Add a method to ignore dirty logging [v18,110/121] KVM: TDX: Add methods to ignore VMX preemption timer [v18,111/121] KVM: TDX: Add methods to ignore accesses to TSC [v18,112/121] KVM: TDX: Ignore setting up mce [v18,113/121] KVM: TDX: Add a method to ignore for TDX to ignore hypercall patch [v18,114/121] KVM: TDX: Add methods to ignore virtual apic related operation [v18,115/121] KVM: TDX: Inhibit APICv for TDX guest [v18,116/121] Documentation/virt/kvm: Document on Trust Domain Extensions(TDX) [v18,117/121] KVM: x86: design documentation on TDX support of x86 KVM TDP MMU [v18,118/121] KVM: TDX: Add hint TDX ioctl to release Secure-EPT [v18,119/121] RFC: KVM: x86: Add x86 callback to check cpuid [v18,120/121] RFC: KVM: x86, TDX: Add check for KVM_SET_CPUID2 [v18,121/121,MARKER] the end of (the first phase of) TDX KVM patch series

Message ID

421905898dfe4966168bc0b7c29b7bcf682b440e.1705965635.git.isaku.yamahata@intel.com (mailing list archive)

State

New, archived

Headers

From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com,
	isaku.yamahata@gmail.com,
	Paolo Bonzini <pbonzini@redhat.com>,
	erdemaktas@google.com,
	Sean Christopherson <seanjc@google.com>,
	Sagi Shahar <sagis@google.com>,
	Kai Huang <kai.huang@intel.com>,
	chen.bo@intel.com,
	hang.yuan@intel.com,
	tina.zhang@intel.com
Subject: [PATCH v18 067/121] KVM: TDX: Implement TDX vcpu enter/exit path
Date: Mon, 22 Jan 2024 15:53:43 -0800
Message-Id: 
 <421905898dfe4966168bc0b7c29b7bcf682b440e.1705965635.git.isaku.yamahata@intel.com>
In-Reply-To: <cover.1705965634.git.isaku.yamahata@intel.com>
References: <cover.1705965634.git.isaku.yamahata@intel.com>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Series

KVM TDX basic feature support | expand

Commit Message

Isaku Yamahata Jan. 22, 2024, 11:53 p.m. UTC

From: Isaku Yamahata <isaku.yamahata@intel.com>

This patch implements running TDX vcpu.  Once vcpu runs on the logical
processor (LP), the TDX vcpu is associated with it.  When the TDX vcpu
moves to another LP, the TDX vcpu needs to flush its status on the LP.
When destroying TDX vcpu, it needs to complete flush and flush cpu memory
cache.  Track which LP the TDX vcpu run and flush it as necessary.

Do nothing on sched_in event as TDX doesn't support pause loop.

TDX vcpu execution requires restoring PMU debug store after returning back
to KVM because the TDX module unconditionally resets the value.  To reuse
the existing code, export perf_restore_debug_store.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>

---
Changes v15 -> v16:
- use __seamcall_saved_ret()
- As struct tdx_module_args doesn't match with vcpu.arch.regs, copy regs
  before/after calling __seamcall_saved_ret().
---
 arch/x86/kvm/vmx/main.c    | 21 +++++++++-
 arch/x86/kvm/vmx/tdx.c     | 84 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/tdx.h     | 33 +++++++++++++++
 arch/x86/kvm/vmx/x86_ops.h |  2 +
 arch/x86/kvm/x86.c         |  1 +
 5 files changed, 139 insertions(+), 2 deletions(-)

Comments

Binbin Wu Feb. 20, 2024, 7:27 a.m. UTC | #1

On 1/23/2024 7:53 AM, isaku.yamahata@intel.com wrote:
> From: Isaku Yamahata <isaku.yamahata@intel.com>
>
> This patch implements running TDX vcpu.  Once vcpu runs on the logical
> processor (LP), the TDX vcpu is associated with it.  When the TDX vcpu
> moves to another LP, the TDX vcpu needs to flush its status on the LP.
> When destroying TDX vcpu, it needs to complete flush and flush cpu memory
> cache.  Track which LP the TDX vcpu run and flush it as necessary.
>
> Do nothing on sched_in event as TDX doesn't support pause loop.
>
> TDX vcpu execution requires restoring PMU debug store after returning back
> to KVM because the TDX module unconditionally resets the value.  To reuse
> the existing code, export perf_restore_debug_store.

The changelog doesn't match the code implemented in this patch.

>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
>
> ---
> Changes v15 -> v16:
> - use __seamcall_saved_ret()
> - As struct tdx_module_args doesn't match with vcpu.arch.regs, copy regs
>    before/after calling __seamcall_saved_ret().
> ---
>   arch/x86/kvm/vmx/main.c    | 21 +++++++++-
>   arch/x86/kvm/vmx/tdx.c     | 84 ++++++++++++++++++++++++++++++++++++++
>   arch/x86/kvm/vmx/tdx.h     | 33 +++++++++++++++
>   arch/x86/kvm/vmx/x86_ops.h |  2 +
>   arch/x86/kvm/x86.c         |  1 +
>   5 files changed, 139 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
> index 0784290d846f..89ab8411500d 100644
> --- a/arch/x86/kvm/vmx/main.c
> +++ b/arch/x86/kvm/vmx/main.c
> @@ -173,6 +173,23 @@ static void vt_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>   	vmx_vcpu_reset(vcpu, init_event);
>   }
>   
> +static int vt_vcpu_pre_run(struct kvm_vcpu *vcpu)
> +{
> +	if (is_td_vcpu(vcpu))
> +		/* Unconditionally continue to vcpu_run(). */
> +		return 1;
> +
> +	return vmx_vcpu_pre_run(vcpu);
> +}
> +
> +static fastpath_t vt_vcpu_run(struct kvm_vcpu *vcpu)
> +{
> +	if (is_td_vcpu(vcpu))
> +		return tdx_vcpu_run(vcpu);
> +
> +	return vmx_vcpu_run(vcpu);
> +}
> +
>   static void vt_flush_tlb_all(struct kvm_vcpu *vcpu)
>   {
>   	if (is_td_vcpu(vcpu)) {
> @@ -325,8 +342,8 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
>   	.flush_tlb_gva = vt_flush_tlb_gva,
>   	.flush_tlb_guest = vt_flush_tlb_guest,
>   
> -	.vcpu_pre_run = vmx_vcpu_pre_run,
> -	.vcpu_run = vmx_vcpu_run,
> +	.vcpu_pre_run = vt_vcpu_pre_run,
> +	.vcpu_run = vt_vcpu_run,
>   	.handle_exit = vmx_handle_exit,
>   	.skip_emulated_instruction = vmx_skip_emulated_instruction,
>   	.update_emulated_instruction = vmx_update_emulated_instruction,
> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> index faa04d8922b6..5a64ac4fd5fb 100644
> --- a/arch/x86/kvm/vmx/tdx.c
> +++ b/arch/x86/kvm/vmx/tdx.c
> @@ -11,6 +11,9 @@
>   #include "vmx.h"
>   #include "x86.h"
>   
> +#include <trace/events/kvm.h>
> +#include "trace.h"
> +
>   #undef pr_fmt
>   #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>   
> @@ -541,6 +544,87 @@ void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>   	 */
>   }
>   
> +static noinstr void tdx_vcpu_enter_exit(struct vcpu_tdx *tdx)
> +{
> +	struct tdx_module_args args;
> +
> +	/*
> +	 * Avoid section mismatch with to_tdx() with KVM_VM_BUG().  The caller
> +	 * should call to_tdx().
> +	 */
> +	struct kvm_vcpu *vcpu = &tdx->vcpu;
> +
> +	guest_state_enter_irqoff();
> +
> +	/*
> +	 * TODO: optimization:
> +	 * - Eliminate copy between args and vcpu->arch.regs.
> +	 * - copyin/copyout registers only if (tdx->tdvmvall.regs_mask != 0)
> +	 *   which means TDG.VP.VMCALL.
> +	 */
> +	args = (struct tdx_module_args) {
> +		.rcx = tdx->tdvpr_pa,
> +#define REG(reg, REG)	.reg = vcpu->arch.regs[VCPU_REGS_ ## REG]
> +		REG(rdx, RDX),
> +		REG(r8,  R8),
> +		REG(r9,  R9),
> +		REG(r10, R10),
> +		REG(r11, R11),
> +		REG(r12, R12),
> +		REG(r13, R13),
> +		REG(r14, R14),
> +		REG(r15, R15),
> +		REG(rbx, RBX),
> +		REG(rdi, RDI),
> +		REG(rsi, RSI),
> +#undef REG
> +	};
> +
> +	tdx->exit_reason.full = __seamcall_saved_ret(TDH_VP_ENTER, &args);
> +
> +#define REG(reg, REG)	vcpu->arch.regs[VCPU_REGS_ ## REG] = args.reg
> +		REG(rcx, RCX);
> +		REG(rdx, RDX);
> +		REG(r8,  R8);
> +		REG(r9,  R9);
> +		REG(r10, R10);
> +		REG(r11, R11);
> +		REG(r12, R12);
> +		REG(r13, R13);
> +		REG(r14, R14);
> +		REG(r15, R15);
> +		REG(rbx, RBX);
> +		REG(rdi, RDI);
> +		REG(rsi, RSI);
> +#undef REG
> +
> +	WARN_ON_ONCE(!kvm_rebooting &&
> +		     (tdx->exit_reason.full & TDX_SW_ERROR) == TDX_SW_ERROR);
> +
> +	guest_state_exit_irqoff();
> +}
> +
> +fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu)
> +{
> +	struct vcpu_tdx *tdx = to_tdx(vcpu);
> +
> +	if (unlikely(!tdx->initialized))
> +		return -EINVAL;
> +	if (unlikely(vcpu->kvm->vm_bugged)) {
> +		tdx->exit_reason.full = TDX_NON_RECOVERABLE_VCPU;
> +		return EXIT_FASTPATH_NONE;
> +	}
> +
> +	trace_kvm_entry(vcpu);
> +
> +	tdx_vcpu_enter_exit(tdx);
> +
> +	vcpu->arch.regs_avail &= ~VMX_REGS_LAZY_LOAD_SET;
> +	trace_kvm_exit(vcpu, KVM_ISA_VMX);
> +
> +	return EXIT_FASTPATH_NONE;
> +}
> +
>   void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int pgd_level)
>   {
>   	td_vmcs_write64(to_tdx(vcpu), SHARED_EPT_POINTER, root_hpa & PAGE_MASK);
> diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
> index d589a2caedfb..45b0b88a9b28 100644
> --- a/arch/x86/kvm/vmx/tdx.h
> +++ b/arch/x86/kvm/vmx/tdx.h
> @@ -25,6 +25,37 @@ struct kvm_tdx {
>   	u64 tsc_offset;
>   };
>   
> +union tdx_exit_reason {
> +	struct {
> +		/* 31:0 mirror the VMX Exit Reason format */
> +		u64 basic		: 16;
> +		u64 reserved16		: 1;
> +		u64 reserved17		: 1;
> +		u64 reserved18		: 1;
> +		u64 reserved19		: 1;
> +		u64 reserved20		: 1;
> +		u64 reserved21		: 1;
> +		u64 reserved22		: 1;
> +		u64 reserved23		: 1;
> +		u64 reserved24		: 1;
> +		u64 reserved25		: 1;
> +		u64 bus_lock_detected	: 1;
> +		u64 enclave_mode	: 1;
> +		u64 smi_pending_mtf	: 1;
> +		u64 smi_from_vmx_root	: 1;
> +		u64 reserved30		: 1;
> +		u64 failed_vmentry	: 1;
> +
> +		/* 63:32 are TDX specific */
> +		u64 details_l1		: 8;
> +		u64 class		: 8;
> +		u64 reserved61_48	: 14;
> +		u64 non_recoverable	: 1;
> +		u64 error		: 1;
> +	};
> +	u64 full;
> +};
> +
>   struct vcpu_tdx {
>   	struct kvm_vcpu	vcpu;
>   
> @@ -32,6 +63,8 @@ struct vcpu_tdx {
>   	unsigned long *tdvpx_pa;
>   	bool td_vcpu_created;
>   
> +	union tdx_exit_reason exit_reason;
> +
>   	bool initialized;
>   
>   	/*
> diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
> index 5a9aabf39c02..9061284487e8 100644
> --- a/arch/x86/kvm/vmx/x86_ops.h
> +++ b/arch/x86/kvm/vmx/x86_ops.h
> @@ -150,6 +150,7 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp);
>   int tdx_vcpu_create(struct kvm_vcpu *vcpu);
>   void tdx_vcpu_free(struct kvm_vcpu *vcpu);
>   void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event);
> +fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu);
>   u8 tdx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
>   
>   int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
> @@ -177,6 +178,7 @@ static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOP
>   static inline int tdx_vcpu_create(struct kvm_vcpu *vcpu) { return -EOPNOTSUPP; }
>   static inline void tdx_vcpu_free(struct kvm_vcpu *vcpu) {}
>   static inline void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) {}
> +static inline fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu) { return EXIT_FASTPATH_NONE; }
>   static inline u8 tdx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio) { return 0; }
>   
>   static inline int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) { return -EOPNOTSUPP; }
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index eee63b08f14f..2371a8df9be3 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -315,6 +315,7 @@ const struct kvm_stats_header kvm_vcpu_stats_header = {
>   };
>   
>   u64 __read_mostly host_xcr0;
> +EXPORT_SYMBOL_GPL(host_xcr0);

It's not used in this patch, can it be moved to where it is used?

>   
>   static struct kmem_cache *x86_emulator_cache;
>

diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 0784290d846f..89ab8411500d 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -173,6 +173,23 @@  static void vt_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	vmx_vcpu_reset(vcpu, init_event);
 }
 
+static int vt_vcpu_pre_run(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		/* Unconditionally continue to vcpu_run(). */
+		return 1;
+
+	return vmx_vcpu_pre_run(vcpu);
+}
+
+static fastpath_t vt_vcpu_run(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return tdx_vcpu_run(vcpu);
+
+	return vmx_vcpu_run(vcpu);
+}
+
 static void vt_flush_tlb_all(struct kvm_vcpu *vcpu)
 {
 	if (is_td_vcpu(vcpu)) {
@@ -325,8 +342,8 @@  struct kvm_x86_ops vt_x86_ops __initdata = {
 	.flush_tlb_gva = vt_flush_tlb_gva,
 	.flush_tlb_guest = vt_flush_tlb_guest,
 
-	.vcpu_pre_run = vmx_vcpu_pre_run,
-	.vcpu_run = vmx_vcpu_run,
+	.vcpu_pre_run = vt_vcpu_pre_run,
+	.vcpu_run = vt_vcpu_run,
 	.handle_exit = vmx_handle_exit,
 	.skip_emulated_instruction = vmx_skip_emulated_instruction,
 	.update_emulated_instruction = vmx_update_emulated_instruction,
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index faa04d8922b6..5a64ac4fd5fb 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -11,6 +11,9 @@ 
 #include "vmx.h"
 #include "x86.h"
 
+#include <trace/events/kvm.h>
+#include "trace.h"
+
 #undef pr_fmt
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
@@ -541,6 +544,87 @@  void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	 */
 }
 
+static noinstr void tdx_vcpu_enter_exit(struct vcpu_tdx *tdx)
+{
+	struct tdx_module_args args;
+
+	/*
+	 * Avoid section mismatch with to_tdx() with KVM_VM_BUG().  The caller
+	 * should call to_tdx().
+	 */
+	struct kvm_vcpu *vcpu = &tdx->vcpu;
+
+	guest_state_enter_irqoff();
+
+	/*
+	 * TODO: optimization:
+	 * - Eliminate copy between args and vcpu->arch.regs.
+	 * - copyin/copyout registers only if (tdx->tdvmvall.regs_mask != 0)
+	 *   which means TDG.VP.VMCALL.
+	 */
+	args = (struct tdx_module_args) {
+		.rcx = tdx->tdvpr_pa,
+#define REG(reg, REG)	.reg = vcpu->arch.regs[VCPU_REGS_ ## REG]
+		REG(rdx, RDX),
+		REG(r8,  R8),
+		REG(r9,  R9),
+		REG(r10, R10),
+		REG(r11, R11),
+		REG(r12, R12),
+		REG(r13, R13),
+		REG(r14, R14),
+		REG(r15, R15),
+		REG(rbx, RBX),
+		REG(rdi, RDI),
+		REG(rsi, RSI),
+#undef REG
+	};
+
+	tdx->exit_reason.full = __seamcall_saved_ret(TDH_VP_ENTER, &args);
+
+#define REG(reg, REG)	vcpu->arch.regs[VCPU_REGS_ ## REG] = args.reg
+		REG(rcx, RCX);
+		REG(rdx, RDX);
+		REG(r8,  R8);
+		REG(r9,  R9);
+		REG(r10, R10);
+		REG(r11, R11);
+		REG(r12, R12);
+		REG(r13, R13);
+		REG(r14, R14);
+		REG(r15, R15);
+		REG(rbx, RBX);
+		REG(rdi, RDI);
+		REG(rsi, RSI);
+#undef REG
+
+	WARN_ON_ONCE(!kvm_rebooting &&
+		     (tdx->exit_reason.full & TDX_SW_ERROR) == TDX_SW_ERROR);
+
+	guest_state_exit_irqoff();
+}
+
+fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_tdx *tdx = to_tdx(vcpu);
+
+	if (unlikely(!tdx->initialized))
+		return -EINVAL;
+	if (unlikely(vcpu->kvm->vm_bugged)) {
+		tdx->exit_reason.full = TDX_NON_RECOVERABLE_VCPU;
+		return EXIT_FASTPATH_NONE;
+	}
+
+	trace_kvm_entry(vcpu);
+
+	tdx_vcpu_enter_exit(tdx);
+
+	vcpu->arch.regs_avail &= ~VMX_REGS_LAZY_LOAD_SET;
+	trace_kvm_exit(vcpu, KVM_ISA_VMX);
+
+	return EXIT_FASTPATH_NONE;
+}
+
 void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int pgd_level)
 {
 	td_vmcs_write64(to_tdx(vcpu), SHARED_EPT_POINTER, root_hpa & PAGE_MASK);
diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index d589a2caedfb..45b0b88a9b28 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -25,6 +25,37 @@  struct kvm_tdx {
 	u64 tsc_offset;
 };
 
+union tdx_exit_reason {
+	struct {
+		/* 31:0 mirror the VMX Exit Reason format */
+		u64 basic		: 16;
+		u64 reserved16		: 1;
+		u64 reserved17		: 1;
+		u64 reserved18		: 1;
+		u64 reserved19		: 1;
+		u64 reserved20		: 1;
+		u64 reserved21		: 1;
+		u64 reserved22		: 1;
+		u64 reserved23		: 1;
+		u64 reserved24		: 1;
+		u64 reserved25		: 1;
+		u64 bus_lock_detected	: 1;
+		u64 enclave_mode	: 1;
+		u64 smi_pending_mtf	: 1;
+		u64 smi_from_vmx_root	: 1;
+		u64 reserved30		: 1;
+		u64 failed_vmentry	: 1;
+
+		/* 63:32 are TDX specific */
+		u64 details_l1		: 8;
+		u64 class		: 8;
+		u64 reserved61_48	: 14;
+		u64 non_recoverable	: 1;
+		u64 error		: 1;
+	};
+	u64 full;
+};
+
 struct vcpu_tdx {
 	struct kvm_vcpu	vcpu;
 
@@ -32,6 +63,8 @@  struct vcpu_tdx {
 	unsigned long *tdvpx_pa;
 	bool td_vcpu_created;
 
+	union tdx_exit_reason exit_reason;
+
 	bool initialized;
 
 	/*
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index 5a9aabf39c02..9061284487e8 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -150,6 +150,7 @@  int tdx_vm_ioctl(struct kvm *kvm, void __user *argp);
 int tdx_vcpu_create(struct kvm_vcpu *vcpu);
 void tdx_vcpu_free(struct kvm_vcpu *vcpu);
 void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event);
+fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu);
 u8 tdx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
 
 int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
@@ -177,6 +178,7 @@  static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOP
 static inline int tdx_vcpu_create(struct kvm_vcpu *vcpu) { return -EOPNOTSUPP; }
 static inline void tdx_vcpu_free(struct kvm_vcpu *vcpu) {}
 static inline void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) {}
+static inline fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu) { return EXIT_FASTPATH_NONE; }
 static inline u8 tdx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio) { return 0; }
 
 static inline int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) { return -EOPNOTSUPP; }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index eee63b08f14f..2371a8df9be3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -315,6 +315,7 @@  const struct kvm_stats_header kvm_vcpu_stats_header = {
 };
 
 u64 __read_mostly host_xcr0;
+EXPORT_SYMBOL_GPL(host_xcr0);
 
 static struct kmem_cache *x86_emulator_cache;

[v18,067/121] KVM: TDX: Implement TDX vcpu enter/exit path

Commit Message

Comments

Patch