From patchwork Wed Jan 29 09:58:50 2025
X-Patchwork-Submitter: Adrian Hunter
X-Patchwork-Id: 13953580
From: Adrian Hunter <adrian.hunter@intel.com>
To: pbonzini@redhat.com, seanjc@google.com
Cc: kvm@vger.kernel.org, rick.p.edgecombe@intel.com, kai.huang@intel.com,
    adrian.hunter@intel.com, reinette.chatre@intel.com, xiaoyao.li@intel.com,
    tony.lindgren@linux.intel.com, binbin.wu@linux.intel.com,
    dmatlack@google.com, isaku.yamahata@intel.com, nik.borisov@suse.com,
    linux-kernel@vger.kernel.org, yan.y.zhao@intel.com, chao.gao@intel.com,
    weijiang.yang@intel.com, dave.hansen@linux.intel.com, x86@kernel.org
Subject: [PATCH V2 01/12] x86/virt/tdx: Make tdh_vp_enter() noinstr
Date: Wed, 29 Jan 2025 11:58:50 +0200
Message-ID: <20250129095902.16391-2-adrian.hunter@intel.com>
In-Reply-To: <20250129095902.16391-1-adrian.hunter@intel.com>
References: <20250129095902.16391-1-adrian.hunter@intel.com>

Make tdh_vp_enter() noinstr, because KVM requires VM entry to be noinstr
for two reasons:

 1. The use of context tracking via guest_state_enter_irqoff() and
    guest_state_exit_irqoff().

 2. The need to avoid IRET between VM-exit and NMI handling, in order to
    avoid prematurely releasing the NMI inhibit.

Consequently, make __seamcall_saved_ret() noinstr also. Currently,
tdh_vp_enter() is the only caller of __seamcall_saved_ret().

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
TD vcpu enter/exit v2:
 - New patch
---
 arch/x86/virt/vmx/tdx/seamcall.S | 3 +++
 arch/x86/virt/vmx/tdx/tdx.c      | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/virt/vmx/tdx/seamcall.S b/arch/x86/virt/vmx/tdx/seamcall.S
index 5b1f2286aea9..6854c52c374b 100644
--- a/arch/x86/virt/vmx/tdx/seamcall.S
+++ b/arch/x86/virt/vmx/tdx/seamcall.S
@@ -41,6 +41,9 @@ SYM_FUNC_START(__seamcall_ret)
 	TDX_MODULE_CALL host=1 ret=1
 SYM_FUNC_END(__seamcall_ret)
 
+/* KVM requires non-instrumentable __seamcall_saved_ret() for TDH.VP.ENTER */
+.section .noinstr.text, "ax"
+
 /*
  * __seamcall_saved_ret() - Host-side interface functions to SEAM software
  * (the P-SEAMLDR or the TDX module), with saving output registers to the
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 4a010e65276d..1515c467dd86 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1511,7 +1511,7 @@ static void tdx_clflush_page(struct page *page)
 	clflush_cache_range(page_to_virt(page), PAGE_SIZE);
 }
 
-u64 tdh_vp_enter(struct tdx_vp *td, struct tdx_module_args *args)
+noinstr u64 tdh_vp_enter(struct tdx_vp *td, struct tdx_module_args *args)
 {
 	args->rcx = tdx_tdvpr_pa(td);
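[Context: a minimal sketch of the noinstr entry pattern this patch caters
to, not actual KVM code. Everything between guest_state_enter_irqoff() and
guest_state_exit_irqoff() runs while context tracking believes the CPU is
in the guest, so only noinstr code may execute there;
do_low_level_vm_entry() is a hypothetical placeholder for tdh_vp_enter():]

	static noinstr void example_vcpu_enter_exit(struct kvm_vcpu *vcpu)
	{
		/* Tell context tracking/RCU that the CPU is entering the guest. */
		guest_state_enter_irqoff();

		/*
		 * Anything called here must itself be noinstr, which is why
		 * tdh_vp_enter() and __seamcall_saved_ret() are made noinstr.
		 */
		do_low_level_vm_entry(vcpu);

		/* Back in host context; instrumentation is allowed again. */
		guest_state_exit_irqoff();
	}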
From patchwork Wed Jan 29 09:58:51 2025
X-Patchwork-Submitter: Adrian Hunter
X-Patchwork-Id: 13953581
From: Adrian Hunter <adrian.hunter@intel.com>
To: pbonzini@redhat.com, seanjc@google.com
Cc: kvm@vger.kernel.org, rick.p.edgecombe@intel.com, kai.huang@intel.com,
    adrian.hunter@intel.com, reinette.chatre@intel.com, xiaoyao.li@intel.com,
    tony.lindgren@linux.intel.com, binbin.wu@linux.intel.com,
    dmatlack@google.com, isaku.yamahata@intel.com, nik.borisov@suse.com,
    linux-kernel@vger.kernel.org, yan.y.zhao@intel.com, chao.gao@intel.com,
    weijiang.yang@intel.com
Subject: [PATCH V2 02/12] KVM: x86: Allow the use of kvm_load_host_xsave_state() with guest_state_protected
Date: Wed, 29 Jan 2025 11:58:51 +0200
Message-ID: <20250129095902.16391-3-adrian.hunter@intel.com>
In-Reply-To: <20250129095902.16391-1-adrian.hunter@intel.com>
References: <20250129095902.16391-1-adrian.hunter@intel.com>

From: Sean Christopherson <seanjc@google.com>

Allow the use of kvm_load_host_xsave_state() with
vcpu->arch.guest_state_protected == true. This will allow TDX to reuse
kvm_load_host_xsave_state() instead of creating its own version. For
consistency, amend kvm_load_guest_xsave_state() also.
Ensure that guest state that kvm_load_host_xsave_state() depends upon,
such as MSR_IA32_XSS, cannot be changed by user space if
guest_state_protected.

[Adrian: wrote commit message]

Link: https://lore.kernel.org/r/Z2GiQS_RmYeHU09L@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
TD vcpu enter/exit v2:
 - New patch
---
 arch/x86/kvm/svm/svm.c |  7 +++++--
 arch/x86/kvm/x86.c     | 18 +++++++++++-------
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 7640a84e554a..b4bcfe15ad5e 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4253,7 +4253,9 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu,
 	svm_set_dr6(svm, DR6_ACTIVE_LOW);
 
 	clgi();
-	kvm_load_guest_xsave_state(vcpu);
+
+	if (!vcpu->arch.guest_state_protected)
+		kvm_load_guest_xsave_state(vcpu);
 
 	kvm_wait_lapic_expire(vcpu);
 
@@ -4282,7 +4284,8 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu,
 	if (unlikely(svm->vmcb->control.exit_code == SVM_EXIT_NMI))
 		kvm_before_interrupt(vcpu, KVM_HANDLING_NMI);
 
-	kvm_load_host_xsave_state(vcpu);
+	if (!vcpu->arch.guest_state_protected)
+		kvm_load_host_xsave_state(vcpu);
 	stgi();
 
 	/* Any pending NMI will happen here */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index bbb6b7f40b3a..5cf9f023fd4b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1169,11 +1169,9 @@ EXPORT_SYMBOL_GPL(kvm_lmsw);
 
 void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu)
 {
-	if (vcpu->arch.guest_state_protected)
-		return;
+	WARN_ON_ONCE(vcpu->arch.guest_state_protected);
 
 	if (kvm_is_cr4_bit_set(vcpu, X86_CR4_OSXSAVE)) {
-
 		if (vcpu->arch.xcr0 != kvm_host.xcr0)
 			xsetbv(XCR_XFEATURE_ENABLED_MASK, vcpu->arch.xcr0);
 
@@ -1192,13 +1190,11 @@ EXPORT_SYMBOL_GPL(kvm_load_guest_xsave_state);
 
 void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu)
 {
-	if (vcpu->arch.guest_state_protected)
-		return;
-
 	if (cpu_feature_enabled(X86_FEATURE_PKU) &&
 	    ((vcpu->arch.xcr0 & XFEATURE_MASK_PKRU) ||
 	     kvm_is_cr4_bit_set(vcpu, X86_CR4_PKE))) {
-		vcpu->arch.pkru = rdpkru();
+		if (!vcpu->arch.guest_state_protected)
+			vcpu->arch.pkru = rdpkru();
 		if (vcpu->arch.pkru != vcpu->arch.host_pkru)
 			wrpkru(vcpu->arch.host_pkru);
 	}
@@ -3916,6 +3912,10 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		if (!msr_info->host_initiated &&
 		    !guest_cpuid_has(vcpu, X86_FEATURE_XSAVES))
 			return 1;
+
+		if (vcpu->arch.guest_state_protected)
+			return 1;
+
 		/*
 		 * KVM supports exposing PT to the guest, but does not support
 		 * IA32_XSS[bit 8]. Guests have to use RDMSR/WRMSR rather than
@@ -4375,6 +4375,10 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		if (!msr_info->host_initiated &&
 		    !guest_cpuid_has(vcpu, X86_FEATURE_XSAVES))
 			return 1;
+
+		if (vcpu->arch.guest_state_protected)
+			return 1;
+
 		msr_info->data = vcpu->arch.ia32_xss;
 		break;
 	case MSR_K7_CLK_CTL:
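[Context: the new contract in a nutshell. Callers of the xsave-state
helpers, rather than the helpers themselves, now check
guest_state_protected, mirroring the svm_vcpu_run() hunk above. A minimal
sketch, with enter_guest() as a hypothetical placeholder for the vendor
entry path:]

	static void example_vcpu_run(struct kvm_vcpu *vcpu)
	{
		/*
		 * The guest-side helper now WARNs if called with protected
		 * state, so the caller must guard it.
		 */
		if (!vcpu->arch.guest_state_protected)
			kvm_load_guest_xsave_state(vcpu);

		enter_guest(vcpu);

		/*
		 * After this patch the host-side helper is safe either way;
		 * it simply skips reading guest-owned state (e.g. PKRU) when
		 * guest_state_protected is set.
		 */
		kvm_load_host_xsave_state(vcpu);
	}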
From patchwork Wed Jan 29 09:58:52 2025
X-Patchwork-Submitter: Adrian Hunter
X-Patchwork-Id: 13953582
From: Adrian Hunter <adrian.hunter@intel.com>
To: pbonzini@redhat.com, seanjc@google.com
Cc: kvm@vger.kernel.org, rick.p.edgecombe@intel.com, kai.huang@intel.com,
    adrian.hunter@intel.com, reinette.chatre@intel.com, xiaoyao.li@intel.com,
    tony.lindgren@linux.intel.com, binbin.wu@linux.intel.com,
    dmatlack@google.com, isaku.yamahata@intel.com, nik.borisov@suse.com,
    linux-kernel@vger.kernel.org, yan.y.zhao@intel.com, chao.gao@intel.com,
    weijiang.yang@intel.com
Subject: [PATCH V2 03/12] KVM: TDX: Set arch.has_protected_state to true
Date: Wed, 29 Jan 2025 11:58:52 +0200
Message-ID: <20250129095902.16391-4-adrian.hunter@intel.com>
In-Reply-To: <20250129095902.16391-1-adrian.hunter@intel.com>
References: <20250129095902.16391-1-adrian.hunter@intel.com>

TDX VMs have protected state. Accordingly, set arch.has_protected_state
to true. This will cause the following IOCTL functions to return an
error:

	kvm_arch_vcpu_ioctl() case KVM_GET_SREGS2
	kvm_arch_vcpu_ioctl() case KVM_SET_SREGS2
	kvm_arch_vcpu_ioctl_get_regs()
	kvm_arch_vcpu_ioctl_set_regs()
	kvm_arch_vcpu_ioctl_get_sregs()
	kvm_arch_vcpu_ioctl_set_sregs()
	kvm_vcpu_ioctl_x86_get_debugregs()
	kvm_vcpu_ioctl_x86_set_debugregs()
	kvm_vcpu_ioctl_x86_get_xcrs()
	kvm_vcpu_ioctl_x86_set_xcrs()

In addition, the following will error for confidential FPU state:

	kvm_vcpu_ioctl_x86_get_xsave()
	kvm_vcpu_ioctl_x86_get_xsave2()
	kvm_vcpu_ioctl_x86_set_xsave()
	kvm_arch_vcpu_ioctl_get_fpu()
	kvm_arch_vcpu_ioctl_set_fpu()

And finally, in accordance with commit 66155de93bcf ("KVM: x86: Disallow
read-only memslots for SEV-ES and SEV-SNP (and TDX)"), read-only memslots
will be disallowed.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
TD vcpu enter/exit v2:
 - New patch
---
 arch/x86/kvm/vmx/tdx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index ea9498028212..a7ebdafdfd82 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -553,6 +553,7 @@ int tdx_vm_init(struct kvm *kvm)
 {
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
 
+	kvm->arch.has_protected_state = true;
 	kvm->arch.has_private_mem = true;
 
 	/*
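[Context: how arch.has_protected_state takes effect. The listed ioctls
share a guard of roughly this shape, simplified from the generic checks in
arch/x86/kvm/x86.c, not the verbatim upstream code:]

	static int example_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
	{
		/* Register state is confidential; refuse to expose it. */
		if (vcpu->kvm->arch.has_protected_state &&
		    vcpu->arch.guest_state_protected)
			return -EINVAL;

		/* ...otherwise copy out the register state as usual... */
		return 0;
	}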
From patchwork Wed Jan 29 09:58:53 2025
X-Patchwork-Submitter: Adrian Hunter
X-Patchwork-Id: 13953583
From: Adrian Hunter <adrian.hunter@intel.com>
To: pbonzini@redhat.com, seanjc@google.com
Cc: kvm@vger.kernel.org, rick.p.edgecombe@intel.com, kai.huang@intel.com,
    adrian.hunter@intel.com, reinette.chatre@intel.com, xiaoyao.li@intel.com,
    tony.lindgren@linux.intel.com, binbin.wu@linux.intel.com,
    dmatlack@google.com, isaku.yamahata@intel.com, nik.borisov@suse.com,
    linux-kernel@vger.kernel.org, yan.y.zhao@intel.com, chao.gao@intel.com,
    weijiang.yang@intel.com
Subject: [PATCH V2 04/12] KVM: VMX: Move common fields of struct vcpu_{vmx,tdx} to a struct
Date: Wed, 29 Jan 2025 11:58:53 +0200
Message-ID: <20250129095902.16391-5-adrian.hunter@intel.com>
In-Reply-To: <20250129095902.16391-1-adrian.hunter@intel.com>
References: <20250129095902.16391-1-adrian.hunter@intel.com>

From: Binbin Wu <binbin.wu@linux.intel.com>

Move the fields common to struct vcpu_vmx and struct vcpu_tdx into a new
struct vcpu_vt, to share code between VMX and TDX as much as possible and
to make TDX exit handling more VMX-like.

No functional change intended.

[Adrian: move code that depends on struct vcpu_vmx back to vmx.h]

Suggested-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/Z1suNzg2Or743a7e@google.com
Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
TD vcpu enter/exit v2:
 - New patch
---
 arch/x86/kvm/vmx/common.h      |  68 +++++++++++++++++++++
 arch/x86/kvm/vmx/main.c        |   4 ++
 arch/x86/kvm/vmx/nested.c      |  10 ++--
 arch/x86/kvm/vmx/posted_intr.c |  18 +++---
 arch/x86/kvm/vmx/tdx.h         |  16 +----
 arch/x86/kvm/vmx/vmx.c         |  99 +++++++++++++++----------------
 arch/x86/kvm/vmx/vmx.h         | 104 +++++++++++++--------------------
 7 files changed, 178 insertions(+), 141 deletions(-)

diff --git a/arch/x86/kvm/vmx/common.h b/arch/x86/kvm/vmx/common.h
index 7a592467a044..9d4982694f06 100644
--- a/arch/x86/kvm/vmx/common.h
+++ b/arch/x86/kvm/vmx/common.h
@@ -6,6 +6,74 @@
 
 #include "mmu.h"
 
+union vmx_exit_reason {
+	struct {
+		u32 basic		: 16;
+		u32 reserved16		: 1;
+		u32 reserved17		: 1;
+		u32 reserved18		: 1;
+		u32 reserved19		: 1;
+		u32 reserved20		: 1;
+		u32 reserved21		: 1;
+		u32 reserved22		: 1;
+		u32 reserved23		: 1;
+		u32 reserved24		: 1;
+		u32 reserved25		: 1;
+		u32 bus_lock_detected	: 1;
+		u32 enclave_mode	: 1;
+		u32 smi_pending_mtf	: 1;
+		u32 smi_from_vmx_root	: 1;
+		u32 reserved30		: 1;
+		u32 failed_vmentry	: 1;
+	};
+	u32 full;
+};
+
+struct vcpu_vt {
+	/* Posted interrupt descriptor */
+	struct pi_desc pi_desc;
+
+	/* Used if this vCPU is waiting for PI notification wakeup. */
+	struct list_head pi_wakeup_list;
+
+	union vmx_exit_reason exit_reason;
+
+	unsigned long	exit_qualification;
+	u32		exit_intr_info;
+
+	/*
+	 * If true, guest state has been loaded into hardware, and host state
+	 * saved into vcpu_{vt,vmx,tdx}. If false, host state is loaded into
+	 * hardware.
+	 */
+	bool guest_state_loaded;
+
+#ifdef CONFIG_X86_64
+	u64 msr_host_kernel_gs_base;
+#endif
+
+	unsigned long host_debugctlmsr;
+};
+
+#ifdef CONFIG_INTEL_TDX_HOST
+
+static __always_inline bool is_td(struct kvm *kvm)
+{
+	return kvm->arch.vm_type == KVM_X86_TDX_VM;
+}
+
+static __always_inline bool is_td_vcpu(struct kvm_vcpu *vcpu)
+{
+	return is_td(vcpu->kvm);
+}
+
+#else
+
+static inline bool is_td(struct kvm *kvm) { return false; }
+static inline bool is_td_vcpu(struct kvm_vcpu *vcpu) { return false; }
+
+#endif
+
 static inline bool vt_is_tdx_private_gpa(struct kvm *kvm, gpa_t gpa)
 {
 	/* For TDX the direct mask is the shared mask. */
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 0ff7394f8466..1cc1c06461f2 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -10,6 +10,10 @@
 #include "tdx.h"
 #include "tdx_arch.h"
 
+#ifdef CONFIG_INTEL_TDX_HOST
+static_assert(offsetof(struct vcpu_vmx, vt) == offsetof(struct vcpu_tdx, vt));
+#endif
+
 static void vt_disable_virtualization_cpu(void)
 {
 	/* Note, TDX *and* VMX need to be disabled if TDX is enabled. */
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 8a7af02d466e..3add9f1073ff 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -275,7 +275,7 @@ static void vmx_sync_vmcs_host_state(struct vcpu_vmx *vmx,
 {
 	struct vmcs_host_state *dest, *src;
 
-	if (unlikely(!vmx->guest_state_loaded))
+	if (unlikely(!vmx->vt.guest_state_loaded))
 		return;
 
 	src = &prev->host_state;
@@ -425,7 +425,7 @@ static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu,
 	 * tables also changed, but KVM should not treat EPT Misconfig
 	 * VM-Exits as writes.
 	 */
-	WARN_ON_ONCE(vmx->exit_reason.basic != EXIT_REASON_EPT_VIOLATION);
+	WARN_ON_ONCE(vmx->vt.exit_reason.basic != EXIT_REASON_EPT_VIOLATION);
 
 	/*
 	 * PML Full and EPT Violation VM-Exits both use bit 12 to report
@@ -4622,7 +4622,7 @@ static void prepare_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
 {
 	/* update exit information fields: */
 	vmcs12->vm_exit_reason = vm_exit_reason;
-	if (to_vmx(vcpu)->exit_reason.enclave_mode)
+	if (vmx_get_exit_reason(vcpu).enclave_mode)
 		vmcs12->vm_exit_reason |= VMX_EXIT_REASONS_SGX_ENCLAVE_MODE;
 	vmcs12->exit_qualification = exit_qualification;
 
@@ -6115,7 +6115,7 @@ static int handle_vmfunc(struct kvm_vcpu *vcpu)
 	 * nested VM-Exit. Pass the original exit reason, i.e. don't hardcode
 	 * EXIT_REASON_VMFUNC as the exit reason.
	 */
-	nested_vmx_vmexit(vcpu, vmx->exit_reason.full,
+	nested_vmx_vmexit(vcpu, vmx->vt.exit_reason.full,
 			  vmx_get_intr_info(vcpu),
 			  vmx_get_exit_qual(vcpu));
 	return 1;
@@ -6560,7 +6560,7 @@ static bool nested_vmx_l1_wants_exit(struct kvm_vcpu *vcpu,
 bool nested_vmx_reflect_vmexit(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
-	union vmx_exit_reason exit_reason = vmx->exit_reason;
+	union vmx_exit_reason exit_reason = vmx->vt.exit_reason;
 	unsigned long exit_qual;
 	u32 exit_intr_info;
 
diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
index ec08fa3caf43..5696e0f9f924 100644
--- a/arch/x86/kvm/vmx/posted_intr.c
+++ b/arch/x86/kvm/vmx/posted_intr.c
@@ -33,7 +33,7 @@ static DEFINE_PER_CPU(raw_spinlock_t, wakeup_vcpus_on_cpu_lock);
 
 static inline struct pi_desc *vcpu_to_pi_desc(struct kvm_vcpu *vcpu)
 {
-	return &(to_vmx(vcpu)->pi_desc);
+	return &(to_vt(vcpu)->pi_desc);
 }
 
 static int pi_try_set_control(struct pi_desc *pi_desc, u64 *pold, u64 new)
@@ -53,7 +53,7 @@ static int pi_try_set_control(struct pi_desc *pi_desc, u64 *pold, u64 new)
 void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
 {
 	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
-	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	struct vcpu_vt *vt = to_vt(vcpu);
 	struct pi_desc old, new;
 	unsigned long flags;
 	unsigned int dest;
@@ -90,7 +90,7 @@ void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
 	 */
 	if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR) {
 		raw_spin_lock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu));
-		list_del(&vmx->pi_wakeup_list);
+		list_del(&vt->pi_wakeup_list);
 		raw_spin_unlock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu));
 	}
 
@@ -146,14 +146,14 @@ static bool vmx_can_use_vtd_pi(struct kvm *kvm)
 static void pi_enable_wakeup_handler(struct kvm_vcpu *vcpu)
 {
 	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
-	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	struct vcpu_vt *vt = to_vt(vcpu);
 	struct pi_desc old, new;
 	unsigned long flags;
 
 	local_irq_save(flags);
 
 	raw_spin_lock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu));
-	list_add_tail(&vmx->pi_wakeup_list,
+	list_add_tail(&vt->pi_wakeup_list,
 		      &per_cpu(wakeup_vcpus_on_cpu, vcpu->cpu));
 	raw_spin_unlock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu));
 
@@ -220,13 +220,13 @@ void pi_wakeup_handler(void)
 	int cpu = smp_processor_id();
 	struct list_head *wakeup_list = &per_cpu(wakeup_vcpus_on_cpu, cpu);
 	raw_spinlock_t *spinlock = &per_cpu(wakeup_vcpus_on_cpu_lock, cpu);
-	struct vcpu_vmx *vmx;
+	struct vcpu_vt *vt;
 
 	raw_spin_lock(spinlock);
-	list_for_each_entry(vmx, wakeup_list, pi_wakeup_list) {
+	list_for_each_entry(vt, wakeup_list, pi_wakeup_list) {
 
-		if (pi_test_on(&vmx->pi_desc))
-			kvm_vcpu_wake_up(&vmx->vcpu);
+		if (pi_test_on(&vt->pi_desc))
+			kvm_vcpu_wake_up(vt_to_vcpu(vt));
 	}
 	raw_spin_unlock(spinlock);
 }
diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index 3904479892f3..ba880dae547f 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -6,6 +6,8 @@
 #include "tdx_errno.h"
 
 #ifdef CONFIG_INTEL_TDX_HOST
+#include "common.h"
+
 int tdx_bringup(void);
 void tdx_cleanup(void);
 
@@ -43,6 +45,7 @@ enum vcpu_tdx_state {
 
 struct vcpu_tdx {
 	struct kvm_vcpu	vcpu;
+	struct vcpu_vt	vt;
 
 	struct tdx_vp vp;
 
@@ -55,16 +58,6 @@ void tdh_vp_rd_failed(struct vcpu_tdx *tdx, char *uclass, u32 field, u64 err);
 void tdh_vp_wr_failed(struct vcpu_tdx *tdx, char *uclass, char *op, u32 field,
 		      u64 val, u64 err);
 
-static inline bool is_td(struct kvm *kvm)
-{
-	return kvm->arch.vm_type == KVM_X86_TDX_VM;
-}
-
-static inline bool is_td_vcpu(struct kvm_vcpu *vcpu)
-{
-	return is_td(vcpu->kvm);
-}
-
 static __always_inline u64 td_tdcs_exec_read64(struct kvm_tdx *kvm_tdx, u32 field)
 {
 	u64 err, data;
@@ -174,9 +167,6 @@ struct vcpu_tdx {
 	struct kvm_vcpu	vcpu;
 };
 
-static inline bool is_td(struct kvm *kvm) { return false; }
-static inline bool is_td_vcpu(struct kvm_vcpu *vcpu) { return false; }
-
 #endif
 
 #endif
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index e22df2b1e887..5475abb11533 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1282,6 +1282,7 @@ void vmx_set_host_fs_gs(struct vmcs_host_state *host, u16 fs_sel, u16 gs_sel,
 void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	struct vcpu_vt *vt = to_vt(vcpu);
 	struct vmcs_host_state *host_state;
 #ifdef CONFIG_X86_64
 	int cpu = raw_smp_processor_id();
@@ -1310,7 +1311,7 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 	if (vmx->nested.need_vmcs12_to_shadow_sync)
 		nested_sync_vmcs12_to_shadow(vcpu);
 
-	if (vmx->guest_state_loaded)
+	if (vt->guest_state_loaded)
 		return;
 
 	host_state = &vmx->loaded_vmcs->host_state;
@@ -1331,12 +1332,12 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 		fs_sel = current->thread.fsindex;
 		gs_sel = current->thread.gsindex;
 		fs_base = current->thread.fsbase;
-		vmx->msr_host_kernel_gs_base = current->thread.gsbase;
+		vt->msr_host_kernel_gs_base = current->thread.gsbase;
 	} else {
 		savesegment(fs, fs_sel);
 		savesegment(gs, gs_sel);
 		fs_base = read_msr(MSR_FS_BASE);
-		vmx->msr_host_kernel_gs_base = read_msr(MSR_KERNEL_GS_BASE);
+		vt->msr_host_kernel_gs_base = read_msr(MSR_KERNEL_GS_BASE);
 	}
 
 	wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
@@ -1348,14 +1349,14 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 #endif
 
 	vmx_set_host_fs_gs(host_state, fs_sel, gs_sel, fs_base, gs_base);
-	vmx->guest_state_loaded = true;
+	vt->guest_state_loaded = true;
 }
 
 static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx)
 {
 	struct vmcs_host_state *host_state;
 
-	if (!vmx->guest_state_loaded)
+	if (!vmx->vt.guest_state_loaded)
 		return;
 
 	host_state = &vmx->loaded_vmcs->host_state;
@@ -1383,10 +1384,10 @@ static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx)
 #endif
 	invalidate_tss_limit();
 #ifdef CONFIG_X86_64
-	wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_host_kernel_gs_base);
+	wrmsrl(MSR_KERNEL_GS_BASE, vmx->vt.msr_host_kernel_gs_base);
 #endif
 	load_fixmap_gdt(raw_smp_processor_id());
-	vmx->guest_state_loaded = false;
+	vmx->vt.guest_state_loaded = false;
 	vmx->guest_uret_msrs_loaded = false;
 }
 
@@ -1394,7 +1395,7 @@ static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx)
 static u64 vmx_read_guest_kernel_gs_base(struct vcpu_vmx *vmx)
 {
 	preempt_disable();
-	if (vmx->guest_state_loaded)
+	if (vmx->vt.guest_state_loaded)
 		rdmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
 	preempt_enable();
 	return vmx->msr_guest_kernel_gs_base;
@@ -1403,7 +1404,7 @@ static u64 vmx_read_guest_kernel_gs_base(struct vcpu_vmx *vmx)
 static void vmx_write_guest_kernel_gs_base(struct vcpu_vmx *vmx, u64 data)
 {
 	preempt_disable();
-	if (vmx->guest_state_loaded)
+	if (vmx->vt.guest_state_loaded)
 		wrmsrl(MSR_KERNEL_GS_BASE, data);
 	preempt_enable();
 	vmx->msr_guest_kernel_gs_base = data;
@@ -1524,7 +1525,7 @@ void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 
 	vmx_vcpu_pi_load(vcpu, cpu);
 
-	vmx->host_debugctlmsr = get_debugctlmsr();
+	vmx->vt.host_debugctlmsr = get_debugctlmsr();
 }
 
 void vmx_vcpu_put(struct kvm_vcpu *vcpu)
@@ -1703,7 +1704,7 @@ int vmx_check_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
 	 * so that guest userspace can't DoS the guest simply by triggering
 	 * emulation (enclaves are CPL3 only).
 	 */
-	if (to_vmx(vcpu)->exit_reason.enclave_mode) {
+	if (vmx_get_exit_reason(vcpu).enclave_mode) {
 		kvm_queue_exception(vcpu, UD_VECTOR);
 		return X86EMUL_PROPAGATE_FAULT;
 	}
@@ -1718,7 +1719,7 @@ int vmx_check_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
 
 static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
 {
-	union vmx_exit_reason exit_reason = to_vmx(vcpu)->exit_reason;
+	union vmx_exit_reason exit_reason = vmx_get_exit_reason(vcpu);
 	unsigned long rip, orig_rip;
 	u32 instr_len;
 
@@ -4277,7 +4278,7 @@ static int vmx_deliver_nested_posted_interrupt(struct kvm_vcpu *vcpu,
 	 */
 static int vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
 {
-	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	struct vcpu_vt *vt = to_vt(vcpu);
 	int r;
 
 	r = vmx_deliver_nested_posted_interrupt(vcpu, vector);
@@ -4288,11 +4289,11 @@ static int vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
 	if (!vcpu->arch.apic->apicv_active)
 		return -1;
 
-	if (pi_test_and_set_pir(vector, &vmx->pi_desc))
+	if (pi_test_and_set_pir(vector, &vt->pi_desc))
 		return 0;
 
 	/* If a previous notification has sent the IPI, nothing to do. */
-	if (pi_test_and_set_on(&vmx->pi_desc))
+	if (pi_test_and_set_on(&vt->pi_desc))
 		return 0;
 
 	/*
@@ -4768,7 +4769,7 @@ static void init_vmcs(struct vcpu_vmx *vmx)
 		vmcs_write16(GUEST_INTR_STATUS, 0);
 
 		vmcs_write16(POSTED_INTR_NV, POSTED_INTR_VECTOR);
-		vmcs_write64(POSTED_INTR_DESC_ADDR, __pa((&vmx->pi_desc)));
+		vmcs_write64(POSTED_INTR_DESC_ADDR, __pa((&vmx->vt.pi_desc)));
 	}
 
 	if (vmx_can_use_ipiv(&vmx->vcpu)) {
@@ -4881,8 +4882,8 @@ static void __vmx_vcpu_reset(struct kvm_vcpu *vcpu)
 	 * Enforce invariant: pi_desc.nv is always either POSTED_INTR_VECTOR
 	 * or POSTED_INTR_WAKEUP_VECTOR.
 	 */
-	vmx->pi_desc.nv = POSTED_INTR_VECTOR;
-	__pi_set_sn(&vmx->pi_desc);
+	vmx->vt.pi_desc.nv = POSTED_INTR_VECTOR;
+	__pi_set_sn(&vmx->vt.pi_desc);
 }
 
 void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
@@ -6062,7 +6063,7 @@ static int handle_bus_lock_vmexit(struct kvm_vcpu *vcpu)
 	 * VM-Exits. Unconditionally set the flag here and leave the handling to
	 * vmx_handle_exit().
 	 */
-	to_vmx(vcpu)->exit_reason.bus_lock_detected = true;
+	to_vt(vcpu)->exit_reason.bus_lock_detected = true;
 	return 1;
 }
 
@@ -6160,9 +6161,9 @@ void vmx_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason,
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
-	*reason = vmx->exit_reason.full;
+	*reason = vmx->vt.exit_reason.full;
 	*info1 = vmx_get_exit_qual(vcpu);
-	if (!(vmx->exit_reason.failed_vmentry)) {
+	if (!(vmx->vt.exit_reason.failed_vmentry)) {
 		*info2 = vmx->idt_vectoring_info;
 		*intr_info = vmx_get_intr_info(vcpu);
 		if (is_exception_with_error_code(*intr_info))
@@ -6458,7 +6459,7 @@ void dump_vmcs(struct kvm_vcpu *vcpu)
 static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
-	union vmx_exit_reason exit_reason = vmx->exit_reason;
+	union vmx_exit_reason exit_reason = vmx_get_exit_reason(vcpu);
 	u32 vectoring_info = vmx->idt_vectoring_info;
 	u16 exit_handler_index;
 
@@ -6624,7 +6625,7 @@ int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 	 * Exit to user space when bus lock detected to inform that there is
 	 * a bus lock in guest.
	 */
-	if (to_vmx(vcpu)->exit_reason.bus_lock_detected) {
+	if (vmx_get_exit_reason(vcpu).bus_lock_detected) {
 		if (ret > 0)
 			vcpu->run->exit_reason = KVM_EXIT_X86_BUS_LOCK;
 
@@ -6903,22 +6904,22 @@ static void vmx_set_rvi(int vector)
 
 int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
 {
-	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	struct vcpu_vt *vt = to_vt(vcpu);
 	int max_irr;
 	bool got_posted_interrupt;
 
 	if (KVM_BUG_ON(!enable_apicv, vcpu->kvm))
 		return -EIO;
 
-	if (pi_test_on(&vmx->pi_desc)) {
-		pi_clear_on(&vmx->pi_desc);
+	if (pi_test_on(&vt->pi_desc)) {
+		pi_clear_on(&vt->pi_desc);
 		/*
 		 * IOMMU can write to PID.ON, so the barrier matters even on UP.
 		 * But on x86 this is just a compiler barrier anyway.
 		 */
 		smp_mb__after_atomic();
 		got_posted_interrupt =
-			kvm_apic_update_irr(vcpu, vmx->pi_desc.pir, &max_irr);
+			kvm_apic_update_irr(vcpu, vt->pi_desc.pir, &max_irr);
 	} else {
 		max_irr = kvm_lapic_find_highest_irr(vcpu);
 		got_posted_interrupt = false;
@@ -6960,10 +6961,10 @@ void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
 
 void vmx_apicv_pre_state_restore(struct kvm_vcpu *vcpu)
 {
-	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	struct vcpu_vt *vt = to_vt(vcpu);
 
-	pi_clear_on(&vmx->pi_desc);
-	memset(vmx->pi_desc.pir, 0, sizeof(vmx->pi_desc.pir));
+	pi_clear_on(&vt->pi_desc);
+	memset(vt->pi_desc.pir, 0, sizeof(vt->pi_desc.pir));
 }
 
 void vmx_do_interrupt_irqoff(unsigned long entry);
@@ -7028,9 +7029,9 @@ void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu)
 	if (vmx->emulation_required)
 		return;
 
-	if (vmx->exit_reason.basic == EXIT_REASON_EXTERNAL_INTERRUPT)
+	if (vmx_get_exit_reason(vcpu).basic == EXIT_REASON_EXTERNAL_INTERRUPT)
 		handle_external_interrupt_irqoff(vcpu, vmx_get_intr_info(vcpu));
-	else if (vmx->exit_reason.basic == EXIT_REASON_EXCEPTION_NMI)
+	else if (vmx_get_exit_reason(vcpu).basic == EXIT_REASON_EXCEPTION_NMI)
 		handle_exception_irqoff(vcpu, vmx_get_intr_info(vcpu));
 }
 
@@ -7261,10 +7262,10 @@ static fastpath_t vmx_exit_handlers_fastpath(struct kvm_vcpu *vcpu,
 	 * the fastpath even, all other exits must use the slow path.
	 */
 	if (is_guest_mode(vcpu) &&
-	    to_vmx(vcpu)->exit_reason.basic != EXIT_REASON_PREEMPTION_TIMER)
+	    vmx_get_exit_reason(vcpu).basic != EXIT_REASON_PREEMPTION_TIMER)
 		return EXIT_FASTPATH_NONE;
 
-	switch (to_vmx(vcpu)->exit_reason.basic) {
+	switch (vmx_get_exit_reason(vcpu).basic) {
 	case EXIT_REASON_MSR_WRITE:
 		return handle_fastpath_set_msr_irqoff(vcpu);
 	case EXIT_REASON_PREEMPTION_TIMER:
@@ -7311,15 +7312,15 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 	vmx_enable_fb_clear(vmx);
 
 	if (unlikely(vmx->fail)) {
-		vmx->exit_reason.full = 0xdead;
+		vmx->vt.exit_reason.full = 0xdead;
 		goto out;
 	}
 
-	vmx->exit_reason.full = vmcs_read32(VM_EXIT_REASON);
-	if (likely(!vmx->exit_reason.failed_vmentry))
+	vmx->vt.exit_reason.full = vmcs_read32(VM_EXIT_REASON);
+	if (likely(!vmx_get_exit_reason(vcpu).failed_vmentry))
 		vmx->idt_vectoring_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);
 
-	if ((u16)vmx->exit_reason.basic == EXIT_REASON_EXCEPTION_NMI &&
+	if ((u16)vmx_get_exit_reason(vcpu).basic == EXIT_REASON_EXCEPTION_NMI &&
 	    is_nmi(vmx_get_intr_info(vcpu))) {
 		kvm_before_interrupt(vcpu, KVM_HANDLING_NMI);
 		if (cpu_feature_enabled(X86_FEATURE_FRED))
@@ -7351,12 +7352,12 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
 
 	if (unlikely(vmx->emulation_required)) {
 		vmx->fail = 0;
 
-		vmx->exit_reason.full = EXIT_REASON_INVALID_STATE;
-		vmx->exit_reason.failed_vmentry = 1;
+		vmx->vt.exit_reason.full = EXIT_REASON_INVALID_STATE;
+		vmx->vt.exit_reason.failed_vmentry = 1;
 		kvm_register_mark_available(vcpu, VCPU_EXREG_EXIT_INFO_1);
-		vmx->exit_qualification = ENTRY_FAIL_DEFAULT;
+		vmx->vt.exit_qualification = ENTRY_FAIL_DEFAULT;
 		kvm_register_mark_available(vcpu, VCPU_EXREG_EXIT_INFO_2);
-		vmx->exit_intr_info = 0;
+		vmx->vt.exit_intr_info = 0;
 
 		return EXIT_FASTPATH_NONE;
 	}
@@ -7437,8 +7438,8 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
 	}
 
 	/* MSR_IA32_DEBUGCTLMSR is zeroed on vmexit. Restore it if needed */
-	if (vmx->host_debugctlmsr)
-		update_debugctlmsr(vmx->host_debugctlmsr);
+	if (vmx->vt.host_debugctlmsr)
+		update_debugctlmsr(vmx->vt.host_debugctlmsr);
 
 #ifndef CONFIG_X86_64
 	/*
@@ -7463,7 +7464,7 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
 	 * checking.
	 */
 	if (vmx->nested.nested_run_pending &&
-	    !vmx->exit_reason.failed_vmentry)
+	    !vmx_get_exit_reason(vcpu).failed_vmentry)
 		++vcpu->stat.nested_run;
 
 	vmx->nested.nested_run_pending = 0;
@@ -7472,12 +7473,12 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
 	if (unlikely(vmx->fail))
 		return EXIT_FASTPATH_NONE;
 
-	if (unlikely((u16)vmx->exit_reason.basic == EXIT_REASON_MCE_DURING_VMENTRY))
+	if (unlikely((u16)vmx_get_exit_reason(vcpu).basic == EXIT_REASON_MCE_DURING_VMENTRY))
 		kvm_machine_check();
 
 	trace_kvm_exit(vcpu, KVM_ISA_VMX);
 
-	if (unlikely(vmx->exit_reason.failed_vmentry))
+	if (unlikely(vmx_get_exit_reason(vcpu).failed_vmentry))
 		return EXIT_FASTPATH_NONE;
 
 	vmx->loaded_vmcs->launched = 1;
@@ -7509,7 +7510,7 @@ int vmx_vcpu_create(struct kvm_vcpu *vcpu)
 	BUILD_BUG_ON(offsetof(struct vcpu_vmx, vcpu) != 0);
 	vmx = to_vmx(vcpu);
 
-	INIT_LIST_HEAD(&vmx->pi_wakeup_list);
+	INIT_LIST_HEAD(&vmx->vt.pi_wakeup_list);
 
 	err = -ENOMEM;
 
@@ -7607,7 +7608,7 @@ int vmx_vcpu_create(struct kvm_vcpu *vcpu)
 
 	if (vmx_can_use_ipiv(vcpu))
 		WRITE_ONCE(to_kvm_vmx(vcpu->kvm)->pid_table[vcpu->vcpu_id],
-			   __pa(&vmx->pi_desc) | PID_TABLE_ENTRY_VALID);
+			   __pa(&vmx->vt.pi_desc) | PID_TABLE_ENTRY_VALID);
 
 	return 0;
 
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index a58b940f0634..e635199901e2 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -17,6 +17,7 @@
 #include "../cpuid.h"
 #include "run_flags.h"
 #include "../mmu.h"
+#include "common.h"
 
 #define X2APIC_MSR(r)	(APIC_BASE_MSR + ((r) >> 4))
 
@@ -68,29 +69,6 @@ struct pt_desc {
 	struct pt_ctx guest;
 };
 
-union vmx_exit_reason {
-	struct {
-		u32 basic		: 16;
-		u32 reserved16		: 1;
-		u32 reserved17		: 1;
-		u32 reserved18		: 1;
-		u32 reserved19		: 1;
-		u32 reserved20		: 1;
-		u32 reserved21		: 1;
-		u32 reserved22		: 1;
-		u32 reserved23		: 1;
-		u32 reserved24		: 1;
-		u32 reserved25		: 1;
-		u32 bus_lock_detected	: 1;
-		u32 enclave_mode	: 1;
-		u32 smi_pending_mtf	: 1;
-		u32 smi_from_vmx_root	: 1;
-		u32 reserved30		: 1;
-		u32 failed_vmentry	: 1;
-	};
-	u32 full;
-};
-
 /*
  * The nested_vmx structure is part of vcpu_vmx, and holds information we need
  * for correct emulation of VMX (i.e., nested VMX) on this vcpu.
@@ -231,20 +209,10 @@ struct nested_vmx {
 
 struct vcpu_vmx {
 	struct kvm_vcpu	vcpu;
+	struct vcpu_vt	vt;
 	u8	fail;
 	u8	x2apic_msr_bitmap_mode;
 
-	/*
-	 * If true, host state has been stored in vmx->loaded_vmcs for
-	 * the CPU registers that only need to be switched when transitioning
-	 * to/from the kernel, and the registers have been loaded with guest
-	 * values. If false, host state is loaded in the CPU registers
-	 * and vmx->loaded_vmcs->host_state is invalid.
-	 */
-	bool	guest_state_loaded;
-
-	unsigned long	exit_qualification;
-	u32	exit_intr_info;
 	u32	idt_vectoring_info;
 	ulong	rflags;
 
@@ -257,7 +225,6 @@ struct vcpu_vmx {
 	struct vmx_uret_msr	guest_uret_msrs[MAX_NR_USER_RETURN_MSRS];
 	bool	guest_uret_msrs_loaded;
 #ifdef CONFIG_X86_64
-	u64	msr_host_kernel_gs_base;
 	u64	msr_guest_kernel_gs_base;
 #endif
 
@@ -298,14 +265,6 @@ struct vcpu_vmx {
 	int vpid;
 	bool emulation_required;
 
-	union vmx_exit_reason exit_reason;
-
-	/* Posted interrupt descriptor */
-	struct pi_desc pi_desc;
-
-	/* Used if this vCPU is waiting for PI notification wakeup. */
-	struct list_head pi_wakeup_list;
-
 	/* Support for a guest hypervisor (nested VMX) */
 	struct nested_vmx nested;
 
@@ -323,8 +282,6 @@ struct vcpu_vmx {
 	/* apic deadline value in host tsc */
 	u64 hv_deadline_tsc;
 
-	unsigned long host_debugctlmsr;
-
 	/*
 	 * Only bits masked by msr_ia32_feature_control_valid_bits can be set in
 	 * msr_ia32_feature_control. FEAT_CTL_LOCKED is always included
@@ -361,6 +318,43 @@ struct kvm_vmx {
 	u64 *pid_table;
 };
 
+static __always_inline struct vcpu_vt *to_vt(struct kvm_vcpu *vcpu)
+{
+	return &(container_of(vcpu, struct vcpu_vmx, vcpu)->vt);
+}
+
+static __always_inline struct kvm_vcpu *vt_to_vcpu(struct vcpu_vt *vt)
+{
+	return &(container_of(vt, struct vcpu_vmx, vt)->vcpu);
+}
+
+static __always_inline union vmx_exit_reason vmx_get_exit_reason(struct kvm_vcpu *vcpu)
+{
+	return to_vt(vcpu)->exit_reason;
+}
+
+static __always_inline unsigned long vmx_get_exit_qual(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vt *vt = to_vt(vcpu);
+
+	if (!kvm_register_test_and_mark_available(vcpu, VCPU_EXREG_EXIT_INFO_1) &&
+	    !WARN_ON_ONCE(is_td_vcpu(vcpu)))
+		vt->exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
+
+	return vt->exit_qualification;
+}
+
+static __always_inline u32 vmx_get_intr_info(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vt *vt = to_vt(vcpu);
+
+	if (!kvm_register_test_and_mark_available(vcpu, VCPU_EXREG_EXIT_INFO_2) &&
+	    !WARN_ON_ONCE(is_td_vcpu(vcpu)))
+		vt->exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
+
+	return vt->exit_intr_info;
+}
+
 void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, int cpu,
 			struct loaded_vmcs *buddy);
 int allocate_vpid(void);
@@ -651,26 +645,6 @@ void intel_pmu_cross_mapped_check(struct kvm_pmu *pmu);
 int intel_pmu_create_guest_lbr_event(struct kvm_vcpu *vcpu);
 void vmx_passthrough_lbr_msrs(struct kvm_vcpu *vcpu);
 
-static __always_inline unsigned long vmx_get_exit_qual(struct kvm_vcpu *vcpu)
-{
-	struct vcpu_vmx *vmx = to_vmx(vcpu);
-
-	if (!kvm_register_test_and_mark_available(vcpu, VCPU_EXREG_EXIT_INFO_1))
-		vmx->exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
-
-	return vmx->exit_qualification;
-}
-
-static __always_inline u32 vmx_get_intr_info(struct kvm_vcpu *vcpu)
-{
-	struct vcpu_vmx *vmx = to_vmx(vcpu);
-
-	if (!kvm_register_test_and_mark_available(vcpu, VCPU_EXREG_EXIT_INFO_2))
-		vmx->exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
-
-	return vmx->exit_intr_info;
-}
-
 struct vmcs *alloc_vmcs_cpu(bool shadow, int cpu, gfp_t flags);
 void free_vmcs(struct vmcs *vmcs);
 int alloc_loaded_vmcs(struct loaded_vmcs *loaded_vmcs);
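[Context: what the refactor buys. Code shared between VMX and TDX can now
reach the common fields through struct vcpu_vt without knowing the vCPU
flavor; the static_assert() in main.c guarantees the vt field sits at the
same offset in both vcpu_vmx and vcpu_tdx, so the container_of() in to_vt()
is valid for either. A minimal sketch using the helpers added above;
example_exit_was_failed_vmentry() itself is hypothetical:]

	static bool example_exit_was_failed_vmentry(struct kvm_vcpu *vcpu)
	{
		/* Works for both VMX and TDX vCPUs. */
		return vmx_get_exit_reason(vcpu).failed_vmentry;
	}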
From patchwork Wed Jan 29 09:58:54 2025
X-Patchwork-Submitter: Adrian Hunter
X-Patchwork-Id: 13953584
From: Adrian Hunter <adrian.hunter@intel.com>
To: pbonzini@redhat.com, seanjc@google.com
Cc: kvm@vger.kernel.org, rick.p.edgecombe@intel.com, kai.huang@intel.com,
    adrian.hunter@intel.com, reinette.chatre@intel.com, xiaoyao.li@intel.com,
    tony.lindgren@linux.intel.com, binbin.wu@linux.intel.com,
    dmatlack@google.com, isaku.yamahata@intel.com, nik.borisov@suse.com,
    linux-kernel@vger.kernel.org, yan.y.zhao@intel.com, chao.gao@intel.com,
    weijiang.yang@intel.com
Subject: [PATCH V2 05/12] KVM: TDX: Implement TDX vcpu enter/exit path
Date: Wed, 29 Jan 2025 11:58:54 +0200
Message-ID: <20250129095902.16391-6-adrian.hunter@intel.com>
In-Reply-To: <20250129095902.16391-1-adrian.hunter@intel.com>
References: <20250129095902.16391-1-adrian.hunter@intel.com>

From: Isaku Yamahata <isaku.yamahata@intel.com>

Implement callbacks to enter/exit a TDX VCPU by calling tdh_vp_enter().
Ensure the TDX VCPU is in a correct state to run.

Do not pass arguments from/to vcpu->arch.regs[] unconditionally.
Instead, marshal state to/from the appropriate x86 registers only when
needed, i.e. to handle some TDVMCALL sub-leaves following KVM's ABI, in
order to leverage the existing code.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
TD vcpu enter/exit v2:
 - Move VCPU_TD_STATE_INITIALIZED check to tdx_vcpu_pre_run() (Xiaoyao)
 - Check TD_STATE_RUNNABLE also in tdx_vcpu_pre_run() (Yan)
 - Add back 'noinstr' for tdx_vcpu_enter_exit() (Sean)
 - Add WARN_ON_ONCE if force_immediate_exit (Sean)
 - Add vp_enter_args to vcpu_tdx to store the input/output arguments
   for tdh_vp_enter().
 - Don't copy arguments to/from vcpu->arch.regs[] unconditionally. (Sean)

TD vcpu enter/exit v1:
 - Make argument of tdx_vcpu_enter_exit() struct kvm_vcpu.
 - Update for the wrapper functions for SEAMCALLs. (Sean)
 - Remove noinstr (Sean)
 - Add a missing comma, clarify sched_in part, and update changelog to
   match code by dropping the PMU related paragraph (Binbin)
   https://lore.kernel.org/lkml/c0029d4d-3dee-4f11-a929-d64d2651bfb3@linux.intel.com/
 - Remove the union tdx_exit_reason. (Sean)
   https://lore.kernel.org/kvm/ZfSExlemFMKjBtZb@google.com/
 - Remove the code of special handling of vcpu->kvm->vm_bugged (Rick)
   https://lore.kernel.org/kvm/20240318234010.GD1645738@ls.amr.corp.intel.com/
 - For !tdx->initialized case, set tdx->vp_enter_ret to TDX_SW_ERROR to
   avoid collision with EXIT_REASON_EXCEPTION_NMI.

v19:
 - Removed export_symbol_gpl(host_xcr0) to the patch that uses it

Changes v15 -> v16:
 - use __seamcall_saved_ret()
 - As struct tdx_module_args doesn't match with vcpu.arch.regs, copy regs
   before/after calling __seamcall_saved_ret().
---
 arch/x86/kvm/vmx/main.c    | 20 ++++++++++++++--
 arch/x86/kvm/vmx/tdx.c     | 47 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/tdx.h     |  3 +++
 arch/x86/kvm/vmx/x86_ops.h |  7 ++++++
 4 files changed, 75 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 1cc1c06461f2..301c1a26606f 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -133,6 +133,22 @@ static void vt_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	vmx_vcpu_load(vcpu, cpu);
 }
 
+static int vt_vcpu_pre_run(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return tdx_vcpu_pre_run(vcpu);
+
+	return vmx_vcpu_pre_run(vcpu);
+}
+
+static fastpath_t vt_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
+{
+	if (is_td_vcpu(vcpu))
+		return tdx_vcpu_run(vcpu, force_immediate_exit);
+
+	return vmx_vcpu_run(vcpu, force_immediate_exit);
+}
+
 static void vt_flush_tlb_all(struct kvm_vcpu *vcpu)
 {
 	if (is_td_vcpu(vcpu)) {
@@ -272,8 +288,8 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 	.flush_tlb_gva = vt_flush_tlb_gva,
 	.flush_tlb_guest = vt_flush_tlb_guest,
 
-	.vcpu_pre_run = vmx_vcpu_pre_run,
-	.vcpu_run = vmx_vcpu_run,
+	.vcpu_pre_run = vt_vcpu_pre_run,
+	.vcpu_run = vt_vcpu_run,
 	.handle_exit = vmx_handle_exit,
 	.skip_emulated_instruction = vmx_skip_emulated_instruction,
 	.update_emulated_instruction = vmx_update_emulated_instruction,
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index a7ebdafdfd82..95420ffd0022 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -11,6 +11,8 @@
 #include "vmx.h"
 #include "mmu/spte.h"
 #include "common.h"
+#include
+#include "trace.h"
 
 #pragma GCC poison to_vmx
 
@@ -673,6 +675,51 @@ void tdx_vcpu_free(struct kvm_vcpu *vcpu)
 	tdx->state = VCPU_TD_STATE_UNINITIALIZED;
 }
 
+int tdx_vcpu_pre_run(struct kvm_vcpu *vcpu)
+{
+	if (unlikely(to_tdx(vcpu)->state != VCPU_TD_STATE_INITIALIZED ||
+		     to_kvm_tdx(vcpu->kvm)->state != TD_STATE_RUNNABLE))
+		return -EINVAL;
+
+	return 1;
+}
+
+static noinstr void tdx_vcpu_enter_exit(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_tdx *tdx = to_tdx(vcpu);
+
+	guest_state_enter_irqoff();
+
+	tdx->vp_enter_ret = tdh_vp_enter(&tdx->vp, &tdx->vp_enter_args);
+
+	guest_state_exit_irqoff();
+}
+
+#define TDX_REGS_UNSUPPORTED_SET	(BIT(VCPU_EXREG_RFLAGS) | \
+					 BIT(VCPU_EXREG_SEGMENTS))
+
+fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
+{
+	/*
+	 * force_immediate_exit requires entering the vCPU to inject an event,
+	 * followed by an immediate exit. But the TDX module does not guarantee
+	 * entry: it is already possible for KVM to _think_ it completed entry
+	 * to the guest without actually having done so. Since KVM never needs
+	 * to force an immediate exit for TDX, and cannot do direct injection,
+	 * just warn on force_immediate_exit.
+	 */
+	WARN_ON_ONCE(force_immediate_exit);
+
+	trace_kvm_entry(vcpu, force_immediate_exit);
+
+	tdx_vcpu_enter_exit(vcpu);
+
+	vcpu->arch.regs_avail &= ~TDX_REGS_UNSUPPORTED_SET;
+
+	trace_kvm_exit(vcpu, KVM_ISA_VMX);
+
+	return EXIT_FASTPATH_NONE;
+}
 
 void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int pgd_level)
 {
diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index ba880dae547f..8339bbf0fdd4 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -46,11 +46,14 @@ enum vcpu_tdx_state {
 struct vcpu_tdx {
 	struct kvm_vcpu	vcpu;
 	struct vcpu_vt	vt;
+	struct tdx_module_args vp_enter_args;
 
 	struct tdx_vp vp;
 
 	struct list_head cpu_list;
 
+	u64 vp_enter_ret;
+
 	enum vcpu_tdx_state state;
 };
 
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index ff6370787926..83aac44b779b 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -131,6 +131,8 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp);
 int tdx_vcpu_create(struct kvm_vcpu *vcpu);
 void tdx_vcpu_free(struct kvm_vcpu *vcpu);
 void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
+int tdx_vcpu_pre_run(struct kvm_vcpu *vcpu);
+fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit);
 
 int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
 
@@ -158,6 +160,11 @@ static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOP
 static inline int tdx_vcpu_create(struct kvm_vcpu *vcpu) { return -EOPNOTSUPP; }
 static inline void tdx_vcpu_free(struct kvm_vcpu *vcpu) {}
 static inline void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu) {}
+static inline int tdx_vcpu_pre_run(struct kvm_vcpu *vcpu) { return -EOPNOTSUPP; }
+static inline fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
+{
+	return EXIT_FASTPATH_NONE;
+}
 
 static inline int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) { return -EOPNOTSUPP; }
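[Context: the "marshal only when needed" idea. tdh_vp_enter() input/output
lives in tdx->vp_enter_args rather than being copied to and from
vcpu->arch.regs[] on every entry; only TDVMCALL leaves that follow KVM's
ABI touch the GPR array. A hypothetical leaf-completion helper as a sketch:]

	static void example_complete_tdvmcall(struct vcpu_tdx *tdx)
	{
		/*
		 * Hand the status back to the guest directly through the
		 * saved SEAMCALL arguments; R10 carries the TDVMCALL return
		 * code, with 0 meaning success per the GHCI.
		 */
		tdx->vp_enter_args.r10 = 0;
	}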
From patchwork Wed Jan 29 09:58:55 2025
From: Adrian Hunter
To: pbonzini@redhat.com, seanjc@google.com
Cc: kvm@vger.kernel.org, rick.p.edgecombe@intel.com, kai.huang@intel.com, adrian.hunter@intel.com, reinette.chatre@intel.com, xiaoyao.li@intel.com, tony.lindgren@linux.intel.com, binbin.wu@linux.intel.com, dmatlack@google.com, isaku.yamahata@intel.com, nik.borisov@suse.com, linux-kernel@vger.kernel.org, yan.y.zhao@intel.com, chao.gao@intel.com, weijiang.yang@intel.com
Subject: [PATCH V2 06/12] KVM: TDX: vcpu_run: save/restore host state(host kernel gs)
Date: Wed, 29 Jan 2025 11:58:55 +0200
Message-ID: <20250129095902.16391-7-adrian.hunter@intel.com>
In-Reply-To: <20250129095902.16391-1-adrian.hunter@intel.com>
References: <20250129095902.16391-1-adrian.hunter@intel.com>
From: Isaku Yamahata

On entering/exiting a TDX vcpu, the CPU state that is preserved or
clobbered differs from the VMX case. Add TDX hooks to save/restore
host/guest CPU state. Save/restore the kernel GS base MSR.

Signed-off-by: Isaku Yamahata
Signed-off-by: Adrian Hunter
Reviewed-by: Paolo Bonzini
---
TD vcpu enter/exit v2:
 - Use 1 variable named 'guest_state_loaded' to track host state
   save/restore (Sean)
 - Rebased due to moving guest_state_loaded/msr_host_kernel_gs_base to
   struct vcpu_vt.
TD vcpu enter/exit v1:
 - Clarify comment (Binbin)
 - Use lower case preserved and add the for VMX in log (Tony)
 - Fix bisectability issue with includes (Kai)
---
 arch/x86/kvm/vmx/main.c    | 24 +++++++++++++++++++++--
 arch/x86/kvm/vmx/tdx.c     | 40 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/x86_ops.h |  4 ++++
 3 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 301c1a26606f..341aa537ca72 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -133,6 +133,26 @@ static void vt_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	vmx_vcpu_load(vcpu, cpu);
 }
 
+static void vt_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu)) {
+		tdx_prepare_switch_to_guest(vcpu);
+		return;
+	}
+
+	vmx_prepare_switch_to_guest(vcpu);
+}
+
+static void vt_vcpu_put(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu)) {
+		tdx_vcpu_put(vcpu);
+		return;
+	}
+
+	vmx_vcpu_put(vcpu);
+}
+
 static int vt_vcpu_pre_run(struct kvm_vcpu *vcpu)
 {
 	if (is_td_vcpu(vcpu))
@@ -253,9 +273,9 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 	.vcpu_free = vt_vcpu_free,
 	.vcpu_reset = vt_vcpu_reset,
 
-	.prepare_switch_to_guest = vmx_prepare_switch_to_guest,
+	.prepare_switch_to_guest = vt_prepare_switch_to_guest,
 	.vcpu_load = vt_vcpu_load,
-	.vcpu_put = vmx_vcpu_put,
+	.vcpu_put = vt_vcpu_put,
 
 	.update_exception_bitmap = vmx_update_exception_bitmap,
 	.get_feature_msr = vmx_get_feature_msr,
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 95420ffd0022..3f3d61935a58 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -2,6 +2,7 @@
 #include
 #include
 #include
+#include
 #include
 #include "capabilities.h"
 #include "mmu.h"
@@ -11,6 +12,7 @@
 #include "vmx.h"
 #include "mmu/spte.h"
 #include "common.h"
+#include "posted_intr.h"
 #include
 #include "trace.h"
 
@@ -642,6 +644,44 @@ void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	local_irq_enable();
 }
 
+/*
+ * Compared to vmx_prepare_switch_to_guest(), there is not much to do
+ * as SEAMCALL/SEAMRET calls take care of most of the save and restore.
+ */
+void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vt *vt = to_vt(vcpu);
+
+	if (vt->guest_state_loaded)
+		return;
+
+	if (likely(is_64bit_mm(current->mm)))
+		vt->msr_host_kernel_gs_base = current->thread.gsbase;
+	else
+		vt->msr_host_kernel_gs_base = read_msr(MSR_KERNEL_GS_BASE);
+
+	vt->guest_state_loaded = true;
+}
+
+static void tdx_prepare_switch_to_host(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vt *vt = to_vt(vcpu);
+
+	if (!vt->guest_state_loaded)
+		return;
+
+	++vcpu->stat.host_state_reload;
+	wrmsrl(MSR_KERNEL_GS_BASE, vt->msr_host_kernel_gs_base);
+
+	vt->guest_state_loaded = false;
+}
+
+void tdx_vcpu_put(struct kvm_vcpu *vcpu)
+{
+	vmx_vcpu_pi_put(vcpu);
+	tdx_prepare_switch_to_host(vcpu);
+}
+
 void tdx_vcpu_free(struct kvm_vcpu *vcpu)
 {
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index 83aac44b779b..f856eac8f1e8 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -133,6 +133,8 @@ void tdx_vcpu_free(struct kvm_vcpu *vcpu);
 void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
 int tdx_vcpu_pre_run(struct kvm_vcpu *vcpu);
 fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit);
+void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu);
+void tdx_vcpu_put(struct kvm_vcpu *vcpu);
 
 int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
@@ -165,6 +167,8 @@ static inline fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediat
 {
 	return EXIT_FASTPATH_NONE;
 }
+static inline void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) {}
+static inline void tdx_vcpu_put(struct kvm_vcpu *vcpu) {}
 static inline int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) { return -EOPNOTSUPP; }
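The guest_state_loaded pairing above can be seen in isolation with a small
stand-alone sketch (user-space simulation; the MSR is modeled as a plain
variable and everything here is illustrative, not kernel code):

	#include <stdbool.h>
	#include <stdint.h>
	#include <stdio.h>

	static uint64_t msr_kernel_gs_base = 0x1000;	/* simulated MSR */

	struct host_state {
		uint64_t kernel_gs_base;
		bool loaded;
	};

	static void prepare_switch_to_guest(struct host_state *s)
	{
		if (s->loaded)		/* idempotent: a second call is a no-op */
			return;
		s->kernel_gs_base = msr_kernel_gs_base;	/* save */
		s->loaded = true;
	}

	static void prepare_switch_to_host(struct host_state *s)
	{
		if (!s->loaded)		/* nothing saved, nothing to restore */
			return;
		msr_kernel_gs_base = s->kernel_gs_base;	/* restore */
		s->loaded = false;
	}

	int main(void)
	{
		struct host_state s = { 0 };

		prepare_switch_to_guest(&s);
		msr_kernel_gs_base = 0xdead;	/* clobbered across TD enter/exit */
		prepare_switch_to_host(&s);
		printf("restored: %#llx\n", (unsigned long long)msr_kernel_gs_base);
		return 0;
	}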
From patchwork Wed Jan 29 09:58:56 2025
From: Adrian Hunter
To: pbonzini@redhat.com, seanjc@google.com
Cc: kvm@vger.kernel.org, rick.p.edgecombe@intel.com, kai.huang@intel.com, adrian.hunter@intel.com, reinette.chatre@intel.com, xiaoyao.li@intel.com, tony.lindgren@linux.intel.com, binbin.wu@linux.intel.com, dmatlack@google.com, isaku.yamahata@intel.com, nik.borisov@suse.com, linux-kernel@vger.kernel.org, yan.y.zhao@intel.com, chao.gao@intel.com, weijiang.yang@intel.com
Subject: [PATCH V2 07/12] KVM: TDX: restore host xsave state when exit from the guest TD
Date: Wed, 29 Jan 2025 11:58:56 +0200
Message-ID: <20250129095902.16391-8-adrian.hunter@intel.com>
In-Reply-To: <20250129095902.16391-1-adrian.hunter@intel.com>
References: <20250129095902.16391-1-adrian.hunter@intel.com>

From: Isaku Yamahata

On exiting from the guest TD, the xsave state is clobbered, so restore it
on TD exit. Set up guest state so that the existing
kvm_load_host_xsave_state() can be used. Do not allow vCPU entry if guest
state conflicts with the TD's configuration.
Signed-off-by: Isaku Yamahata
Signed-off-by: Adrian Hunter
---
TD vcpu enter/exit v2:
 - Drop PT and CET feature flags (Chao)
 - Use cpu_feature_enabled() instead of static_cpu_has() (Chao)
 - Restore PKRU only if the host value differs from the defined exit
   value (Chao)
 - Use defined masks to separate XFAM bits into XCR0/XSS (Adrian)
 - Use existing kvm_load_host_xsave_state() in place of
   tdx_restore_host_xsave_state() by defining guest CR4, XCR0, XSS and
   PKRU (Sean)
 - Do not enter if vital guest state is invalid (Adrian)
TD vcpu enter/exit v1:
 - Remove noinstr on tdx_vcpu_enter_exit() (Sean)
 - Switch to kvm_host struct for xcr0 and xss
v19:
 - Add EXPORT_SYMBOL_GPL(host_xcr0)
v15 -> v16:
 - Added CET flag mask
---
 arch/x86/kvm/vmx/tdx.c | 72 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 66 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 3f3d61935a58..e4355553569a 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -83,16 +83,21 @@ static u64 tdx_get_supported_attrs(const struct tdx_sys_info_td_conf *td_conf)
 	return val;
 }
 
+/*
+ * Before returning from TDH.VP.ENTER, the TDX Module assigns:
+ *   XCR0 to the TD's user-mode feature bits of XFAM (bits 7:0, 9, 18:17)
+ *   IA32_XSS to the TD's supervisor-mode feature bits of XFAM (bits 8, 16:10)
+ */
+#define TDX_XFAM_XCR0_MASK	(GENMASK(7, 0) | BIT(9) | GENMASK(18, 17))
+#define TDX_XFAM_XSS_MASK	(BIT(8) | GENMASK(16, 10))
+#define TDX_XFAM_MASK		(TDX_XFAM_XCR0_MASK | TDX_XFAM_XSS_MASK)
+
 static u64 tdx_get_supported_xfam(const struct tdx_sys_info_td_conf *td_conf)
 {
 	u64 val = kvm_caps.supported_xcr0 | kvm_caps.supported_xss;
 
-	/*
-	 * PT and CET can be exposed to TD guest regardless of KVM's XSS, PT
-	 * and, CET support.
-	 */
-	val |= XFEATURE_MASK_PT | XFEATURE_MASK_CET_USER |
-	       XFEATURE_MASK_CET_KERNEL;
+	/* Ensure features are in the masks */
+	val &= TDX_XFAM_MASK;
 
 	if ((val & td_conf->xfam_fixed1) != td_conf->xfam_fixed1)
 		return 0;
@@ -724,6 +729,19 @@ int tdx_vcpu_pre_run(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
+static bool tdx_guest_state_is_invalid(struct kvm_vcpu *vcpu)
+{
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
+
+	return vcpu->arch.xcr0 != (kvm_tdx->xfam & TDX_XFAM_XCR0_MASK) ||
+	       vcpu->arch.ia32_xss != (kvm_tdx->xfam & TDX_XFAM_XSS_MASK) ||
+	       vcpu->arch.pkru ||
+	       (cpu_feature_enabled(X86_FEATURE_XSAVE) &&
+		!kvm_is_cr4_bit_set(vcpu, X86_CR4_OSXSAVE)) ||
+	       (cpu_feature_enabled(X86_FEATURE_XSAVES) &&
+		!guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVES));
+}
+
 static noinstr void tdx_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_tdx *tdx = to_tdx(vcpu);
@@ -740,6 +758,8 @@ static noinstr void tdx_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 
 fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
 {
+	struct vcpu_tdx *tdx = to_tdx(vcpu);
+
 	/*
 	 * force_immediate_exit requires vCPU entry for event injection,
 	 * followed by an immediate exit. But the TDX module doesn't guarantee
@@ -750,10 +770,22 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
 	 */
 	WARN_ON_ONCE(force_immediate_exit);
 
+	if (WARN_ON_ONCE(tdx_guest_state_is_invalid(vcpu))) {
+		/*
+		 * Invalid exit_reason becomes KVM_EXIT_INTERNAL_ERROR, refer
+		 * to tdx_handle_exit().
+		 */
+		tdx->vt.exit_reason.full = -1u;
+		tdx->vp_enter_ret = -1u;
+		return EXIT_FASTPATH_NONE;
+	}
+
 	trace_kvm_entry(vcpu, force_immediate_exit);
 
 	tdx_vcpu_enter_exit(vcpu);
 
+	kvm_load_host_xsave_state(vcpu);
+
 	vcpu->arch.regs_avail &= ~TDX_REGS_UNSUPPORTED_SET;
 
 	trace_kvm_exit(vcpu, KVM_ISA_VMX);
@@ -1878,9 +1910,23 @@ static int tdx_vcpu_get_cpuid(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *cmd)
 	return r;
 }
 
+static u64 tdx_guest_cr0(struct kvm_vcpu *vcpu, u64 cr4)
+{
+	u64 cr0 = ~CR0_RESERVED_BITS;
+
+	if (cr4 & X86_CR4_CET)
+		cr0 |= X86_CR0_WP;
+
+	cr0 |= X86_CR0_PE | X86_CR0_NE;
+	cr0 &= ~(X86_CR0_NW | X86_CR0_CD);
+
+	return cr0;
+}
+
 static int tdx_vcpu_init(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *cmd)
 {
 	u64 apic_base;
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
 	struct vcpu_tdx *tdx = to_tdx(vcpu);
 	int ret;
 
@@ -1903,6 +1949,20 @@ static int tdx_vcpu_init(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *cmd)
 	if (ret)
 		return ret;
 
+	vcpu->arch.cr4 = ~vcpu->arch.cr4_guest_rsvd_bits;
+	vcpu->arch.cr0 = tdx_guest_cr0(vcpu, vcpu->arch.cr4);
+	/*
+	 * On return from VP.ENTER, the TDX Module sets XCR0 and XSS to the
+	 * maximal values supported by the guest, and zeroes PKRU, so from
+	 * KVM's perspective, those are the guest's values at all times.
+	 */
+	vcpu->arch.ia32_xss = kvm_tdx->xfam & TDX_XFAM_XSS_MASK;
+	vcpu->arch.xcr0 = kvm_tdx->xfam & TDX_XFAM_XCR0_MASK;
+	vcpu->arch.pkru = 0;
+
+	/* TODO: freeze vCPU model before kvm_update_cpuid_runtime() */
+	kvm_update_cpuid_runtime(vcpu);
+
 	tdx->state = VCPU_TD_STATE_INITIALIZED;
 
 	return 0;
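To make the XFAM split above concrete, here is a stand-alone sketch that
reproduces the mask arithmetic (the xfam value is a made-up example;
GENMASK64()/BIT64() are local stand-ins for the kernel macros):

	#include <stdint.h>
	#include <stdio.h>

	#define GENMASK64(h, l)	(((~0ULL) >> (63 - (h))) & ((~0ULL) << (l)))
	#define BIT64(n)	(1ULL << (n))

	#define TDX_XFAM_XCR0_MASK	(GENMASK64(7, 0) | BIT64(9) | GENMASK64(18, 17))
	#define TDX_XFAM_XSS_MASK	(BIT64(8) | GENMASK64(16, 10))

	int main(void)
	{
		uint64_t xfam = 0x6e7;	/* made-up example XFAM value */

		/* User-mode features go to XCR0, supervisor-mode to IA32_XSS. */
		printf("xcr0 = %#llx\n", (unsigned long long)(xfam & TDX_XFAM_XCR0_MASK));
		printf("xss  = %#llx\n", (unsigned long long)(xfam & TDX_XFAM_XSS_MASK));
		return 0;
	}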
From patchwork Wed Jan 29 09:58:57 2025
From: Adrian Hunter
To: pbonzini@redhat.com, seanjc@google.com
Cc: kvm@vger.kernel.org, rick.p.edgecombe@intel.com, kai.huang@intel.com, adrian.hunter@intel.com, reinette.chatre@intel.com, xiaoyao.li@intel.com, tony.lindgren@linux.intel.com, binbin.wu@linux.intel.com, dmatlack@google.com, isaku.yamahata@intel.com, nik.borisov@suse.com, linux-kernel@vger.kernel.org, yan.y.zhao@intel.com, chao.gao@intel.com, weijiang.yang@intel.com
Subject: [PATCH V2 08/12] KVM: x86: Allow to update cached values in kvm_user_return_msrs w/o wrmsr
Date: Wed, 29 Jan 2025 11:58:57 +0200
Message-ID: <20250129095902.16391-9-adrian.hunter@intel.com>
In-Reply-To: <20250129095902.16391-1-adrian.hunter@intel.com>
References: <20250129095902.16391-1-adrian.hunter@intel.com>

From: Chao Gao

Several MSRs are constant and only used in userspace (ring 3), but VMs may
have different values. KVM uses kvm_set_user_return_msr() to switch to the
guest's values and leverages the user return notifier to restore them when
the kernel is about to return to userspace. To eliminate unnecessary wrmsr,
KVM also caches the value it last wrote to an MSR.

The TDX module unconditionally resets some of these MSRs to their
architectural INIT state on TD exit. This makes the cached values in
kvm_user_return_msrs inconsistent with the values in hardware. The
inconsistency needs to be fixed; otherwise it may mislead
kvm_on_user_return() into skipping the restoration of some MSRs to the
host's values. kvm_set_user_return_msr() can correct this case, but it is
not optimal as it always does a wrmsr. So, introduce a variation of
kvm_set_user_return_msr() that updates the cached value and skips the
wrmsr.
Signed-off-by: Chao Gao
Signed-off-by: Isaku Yamahata
Signed-off-by: Adrian Hunter
Reviewed-by: Paolo Bonzini
---
TD vcpu enter/exit v2:
 - No changes
TD vcpu enter/exit v1:
 - Rename functions and remove useless comment (Binbin)
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/x86.c              | 24 +++++++++++++++++++-----
 2 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6b686d62c735..e557a441fade 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -2322,6 +2322,7 @@ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
 int kvm_add_user_return_msr(u32 msr);
 int kvm_find_user_return_msr(u32 msr);
 int kvm_set_user_return_msr(unsigned index, u64 val, u64 mask);
+void kvm_user_return_msr_update_cache(unsigned int index, u64 val);
 
 static inline bool kvm_is_supported_user_return_msr(u32 msr)
 {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5cf9f023fd4b..15447fe7687c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -636,6 +636,15 @@ static void kvm_user_return_msr_cpu_online(void)
 	}
 }
 
+static void kvm_user_return_register_notifier(struct kvm_user_return_msrs *msrs)
+{
+	if (!msrs->registered) {
+		msrs->urn.on_user_return = kvm_on_user_return;
+		user_return_notifier_register(&msrs->urn);
+		msrs->registered = true;
+	}
+}
+
 int kvm_set_user_return_msr(unsigned slot, u64 value, u64 mask)
 {
 	struct kvm_user_return_msrs *msrs = this_cpu_ptr(user_return_msrs);
@@ -649,15 +658,20 @@ int kvm_set_user_return_msr(unsigned slot, u64 value, u64 mask)
 		return 1;
 
 	msrs->values[slot].curr = value;
-	if (!msrs->registered) {
-		msrs->urn.on_user_return = kvm_on_user_return;
-		user_return_notifier_register(&msrs->urn);
-		msrs->registered = true;
-	}
+	kvm_user_return_register_notifier(msrs);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(kvm_set_user_return_msr);
 
+void kvm_user_return_msr_update_cache(unsigned int slot, u64 value)
+{
+	struct kvm_user_return_msrs *msrs = this_cpu_ptr(user_return_msrs);
+
+	msrs->values[slot].curr = value;
+	kvm_user_return_register_notifier(msrs);
+}
+EXPORT_SYMBOL_GPL(kvm_user_return_msr_update_cache);
+
 static void drop_user_return_notifiers(void)
 {
 	struct kvm_user_return_msrs *msrs = this_cpu_ptr(user_return_msrs);
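The failure mode described above, where a stale cached 'curr' value makes
kvm_on_user_return() skip a needed wrmsr, can be reproduced in miniature
(user-space simulation; the struct mirrors kvm_user_return_msrs
conceptually, all values are illustrative):

	#include <stdint.h>
	#include <stdio.h>

	struct urm { uint64_t host, curr; };

	static uint64_t hw_msr;	/* simulated hardware MSR */

	static void on_user_return(struct urm *m)
	{
		if (m->curr != m->host) {	/* restore only when believed dirty */
			hw_msr = m->host;
			m->curr = m->host;
		}
	}

	/* Analogue of kvm_user_return_msr_update_cache(): no wrmsr, just cache. */
	static void update_cache(struct urm *m, uint64_t val)
	{
		m->curr = val;
	}

	int main(void)
	{
		struct urm m = { .host = 0x130, .curr = 0x130 };

		hw_msr = 0x130;
		hw_msr = 0;		/* TD exit resets the MSR behind KVM's back */

		/*
		 * Without this cache update, curr == host and on_user_return()
		 * would skip the restore, leaving hw_msr at the INIT value.
		 */
		update_cache(&m, 0);
		on_user_return(&m);
		printf("hw_msr = %#llx (expect 0x130)\n", (unsigned long long)hw_msr);
		return 0;
	}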
From patchwork Wed Jan 29 09:58:58 2025
From: Adrian Hunter
To: pbonzini@redhat.com, seanjc@google.com
Cc: kvm@vger.kernel.org, rick.p.edgecombe@intel.com, kai.huang@intel.com, adrian.hunter@intel.com, reinette.chatre@intel.com, xiaoyao.li@intel.com, tony.lindgren@linux.intel.com, binbin.wu@linux.intel.com, dmatlack@google.com, isaku.yamahata@intel.com, nik.borisov@suse.com, linux-kernel@vger.kernel.org, yan.y.zhao@intel.com, chao.gao@intel.com, weijiang.yang@intel.com
Subject: [PATCH V2 09/12] KVM: TDX: restore user ret MSRs
Date: Wed, 29 Jan 2025 11:58:58 +0200
Message-ID: <20250129095902.16391-10-adrian.hunter@intel.com>
In-Reply-To: <20250129095902.16391-1-adrian.hunter@intel.com>
References: <20250129095902.16391-1-adrian.hunter@intel.com>

From: Isaku Yamahata

Several user-return MSRs are clobbered on TD exit. Restore those values on
TD exit, and before returning to ring 3.
Co-developed-by: Tony Lindgren
Signed-off-by: Tony Lindgren
Signed-off-by: Isaku Yamahata
Signed-off-by: Adrian Hunter
Reviewed-by: Paolo Bonzini
---
TD vcpu enter/exit v2:
 - No changes
TD vcpu enter/exit v1:
 - Rename tdx_user_return_update_cache() ->
   tdx_user_return_msr_update_cache() (extrapolated from Binbin)
 - Adjust to rename in previous patches (Binbin)
 - Simplify comment (Tony)
 - Move code change in tdx_hardware_setup() to __tdx_bringup().
---
 arch/x86/kvm/vmx/tdx.c | 44 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index e4355553569a..a0f5cdfd290b 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -729,6 +729,28 @@ int tdx_vcpu_pre_run(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
+struct tdx_uret_msr {
+	u32 msr;
+	unsigned int slot;
+	u64 defval;
+};
+
+static struct tdx_uret_msr tdx_uret_msrs[] = {
+	{.msr = MSR_SYSCALL_MASK, .defval = 0x20200 },
+	{.msr = MSR_STAR,},
+	{.msr = MSR_LSTAR,},
+	{.msr = MSR_TSC_AUX,},
+};
+
+static void tdx_user_return_msr_update_cache(void)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(tdx_uret_msrs); i++)
+		kvm_user_return_msr_update_cache(tdx_uret_msrs[i].slot,
+						 tdx_uret_msrs[i].defval);
+}
+
 static bool tdx_guest_state_is_invalid(struct kvm_vcpu *vcpu)
 {
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
@@ -784,6 +806,8 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
 
 	tdx_vcpu_enter_exit(vcpu);
 
+	tdx_user_return_msr_update_cache();
+
 	kvm_load_host_xsave_state(vcpu);
 
 	vcpu->arch.regs_avail &= ~TDX_REGS_UNSUPPORTED_SET;
@@ -2245,7 +2269,25 @@ static bool __init kvm_can_support_tdx(void)
 static int __init __tdx_bringup(void)
 {
 	const struct tdx_sys_info_td_conf *td_conf;
-	int r;
+	int r, i;
+
+	for (i = 0; i < ARRAY_SIZE(tdx_uret_msrs); i++) {
+		/*
+		 * Check if MSRs (tdx_uret_msrs) can be saved/restored
+		 * before returning to user space.
+		 *
+		 * this_cpu_ptr(user_return_msrs)->registered isn't checked
+		 * because the registration is done at vcpu runtime by
+		 * tdx_user_return_msr_update_cache().
+		 */
+		tdx_uret_msrs[i].slot = kvm_find_user_return_msr(tdx_uret_msrs[i].msr);
+		if (tdx_uret_msrs[i].slot == -1) {
+			/* If any MSR isn't supported, it is a KVM bug */
+			pr_err("MSR %x isn't included by kvm_find_user_return_msr\n",
+			       tdx_uret_msrs[i].msr);
+			return -EIO;
+		}
+	}
 
 	/*
	 * Enabling TDX requires enabling hardware virtualization first,
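The slot lookup in __tdx_bringup() above follows the usual table-fixup
pattern: resolve each MSR to its registered slot once at load time, fail
hard if any is missing. A stand-alone sketch (user-space stand-in; the MSR
index values are the architectural ones for SFMASK/STAR/LSTAR/TSC_AUX,
the slot numbering is illustrative):

	#include <stdint.h>
	#include <stdio.h>

	struct uret_msr { uint32_t msr; int slot; uint64_t defval; };

	/* MSRs the (simulated) core already registered, in slot order. */
	static const uint32_t registered[] = {
		0xc0000084,	/* MSR_SYSCALL_MASK */
		0xc0000081,	/* MSR_STAR */
		0xc0000082,	/* MSR_LSTAR */
		0xc0000103,	/* MSR_TSC_AUX */
	};

	static int find_user_return_msr(uint32_t msr)
	{
		for (unsigned int i = 0; i < sizeof(registered) / sizeof(registered[0]); i++)
			if (registered[i] == msr)
				return (int)i;
		return -1;
	}

	int main(void)
	{
		struct uret_msr msrs[] = {
			{ .msr = 0xc0000084, .defval = 0x20200 },
			{ .msr = 0xc0000081 },
		};

		for (unsigned int i = 0; i < 2; i++) {
			msrs[i].slot = find_user_return_msr(msrs[i].msr);
			if (msrs[i].slot == -1) {
				fprintf(stderr, "MSR %#x not registered\n", msrs[i].msr);
				return 1;	/* analogous to the -EIO above */
			}
		}
		printf("slots: %d %d\n", msrs[0].slot, msrs[1].slot);
		return 0;
	}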
From patchwork Wed Jan 29 09:58:59 2025
From: Adrian Hunter
To: pbonzini@redhat.com, seanjc@google.com
Cc: kvm@vger.kernel.org, rick.p.edgecombe@intel.com, kai.huang@intel.com, adrian.hunter@intel.com, reinette.chatre@intel.com, xiaoyao.li@intel.com, tony.lindgren@linux.intel.com, binbin.wu@linux.intel.com, dmatlack@google.com, isaku.yamahata@intel.com, nik.borisov@suse.com, linux-kernel@vger.kernel.org, yan.y.zhao@intel.com, chao.gao@intel.com, weijiang.yang@intel.com
Subject: [PATCH V2 10/12] KVM: TDX: Disable support for TSX and WAITPKG
Date: Wed, 29 Jan 2025 11:58:59 +0200
Message-ID: <20250129095902.16391-11-adrian.hunter@intel.com>
In-Reply-To: <20250129095902.16391-1-adrian.hunter@intel.com>
References: <20250129095902.16391-1-adrian.hunter@intel.com>

Support for restoring the IA32_TSX_CTRL MSR and the IA32_UMWAIT_CONTROL
MSR is not yet implemented, so disable support for TSX and WAITPKG for
now. Clear the associated CPUID bits returned by KVM_TDX_CAPABILITIES, and
return an error if those bits are set in KVM_TDX_INIT_VM.

Signed-off-by: Adrian Hunter
---
TD vcpu enter/exit v2:
 - New patch
---
 arch/x86/kvm/vmx/tdx.c | 43 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index a0f5cdfd290b..70996af4be64 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -117,6 +117,44 @@ static u32 tdx_set_guest_phys_addr_bits(const u32 eax, int addr_bits)
 	return (eax & ~GENMASK(23, 16)) | (addr_bits & 0xff) << 16;
 }
 
+#define TDX_FEATURE_TSX (__feature_bit(X86_FEATURE_HLE) | __feature_bit(X86_FEATURE_RTM))
+
+static bool has_tsx(const struct kvm_cpuid_entry2 *entry)
+{
+	return entry->function == 7 && entry->index == 0 &&
+	       (entry->ebx & TDX_FEATURE_TSX);
+}
+
+static void clear_tsx(struct kvm_cpuid_entry2 *entry)
+{
+	entry->ebx &= ~TDX_FEATURE_TSX;
+}
+
+static bool has_waitpkg(const struct kvm_cpuid_entry2 *entry)
+{
+	return entry->function == 7 && entry->index == 0 &&
+	       (entry->ecx & __feature_bit(X86_FEATURE_WAITPKG));
+}
+
+static void clear_waitpkg(struct kvm_cpuid_entry2 *entry)
+{
+	entry->ecx &= ~__feature_bit(X86_FEATURE_WAITPKG);
+}
+
+static void tdx_clear_unsupported_cpuid(struct kvm_cpuid_entry2 *entry)
+{
+	if (has_tsx(entry))
+		clear_tsx(entry);
+
+	if (has_waitpkg(entry))
+		clear_waitpkg(entry);
+}
+
+static bool tdx_unsupported_cpuid(const struct kvm_cpuid_entry2 *entry)
+{
+	return has_tsx(entry) || has_waitpkg(entry);
+}
+
 #define KVM_TDX_CPUID_NO_SUBLEAF ((__u32)-1)
 
 static void td_init_cpuid_entry2(struct kvm_cpuid_entry2 *entry, unsigned char idx)
@@ -140,6 +178,8 @@ static void td_init_cpuid_entry2(struct kvm_cpuid_entry2 *entry, unsigned char i
 	 */
 	if (entry->function == 0x80000008)
 		entry->eax = tdx_set_guest_phys_addr_bits(entry->eax, 0xff);
+
+	tdx_clear_unsupported_cpuid(entry);
 }
 
 static int init_kvm_tdx_caps(const struct tdx_sys_info_td_conf *td_conf,
@@ -1214,6 +1254,9 @@ static int setup_tdparams_cpuids(struct kvm_cpuid2 *cpuid,
 		if (!entry)
 			continue;
 
+		if (tdx_unsupported_cpuid(entry))
+			return -EINVAL;
+
 		copy_cnt++;
 
 		value = &td_params->cpuid_values[i];
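The CPUID filtering above can be exercised in isolation. A stand-alone
sketch with the leaf 7, sub-leaf 0 bit positions written out (HLE is EBX
bit 4, RTM is EBX bit 11, WAITPKG is ECX bit 5; the struct is a stand-in
for kvm_cpuid_entry2):

	#include <stdint.h>
	#include <stdio.h>

	struct cpuid_entry { uint32_t function, index, ebx, ecx; };

	#define HLE_BIT		(1u << 4)	/* CPUID.7.0:EBX.HLE */
	#define RTM_BIT		(1u << 11)	/* CPUID.7.0:EBX.RTM */
	#define WAITPKG_BIT	(1u << 5)	/* CPUID.7.0:ECX.WAITPKG */
	#define TSX_BITS	(HLE_BIT | RTM_BIT)

	static void clear_unsupported(struct cpuid_entry *e)
	{
		if (e->function != 7 || e->index != 0)
			return;		/* only leaf 7, sub-leaf 0 carries these bits */
		e->ebx &= ~TSX_BITS;
		e->ecx &= ~WAITPKG_BIT;
	}

	int main(void)
	{
		struct cpuid_entry e = { .function = 7, .index = 0,
					 .ebx = RTM_BIT | 0x100, .ecx = WAITPKG_BIT };

		clear_unsupported(&e);
		/* TSX and WAITPKG bits are gone, unrelated bits survive. */
		printf("ebx=%#x ecx=%#x\n", e.ebx, e.ecx);
		return 0;
	}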
From patchwork Wed Jan 29 09:59:00 2025
From: Adrian Hunter
To: pbonzini@redhat.com, seanjc@google.com
Cc: kvm@vger.kernel.org, rick.p.edgecombe@intel.com, kai.huang@intel.com, adrian.hunter@intel.com, reinette.chatre@intel.com, xiaoyao.li@intel.com, tony.lindgren@linux.intel.com, binbin.wu@linux.intel.com, dmatlack@google.com, isaku.yamahata@intel.com, nik.borisov@suse.com, linux-kernel@vger.kernel.org, yan.y.zhao@intel.com, chao.gao@intel.com, weijiang.yang@intel.com
Subject: [PATCH V2 11/12] KVM: TDX: Save and restore IA32_DEBUGCTL
Date: Wed, 29 Jan 2025 11:59:00 +0200
Message-ID: <20250129095902.16391-12-adrian.hunter@intel.com>
In-Reply-To: <20250129095902.16391-1-adrian.hunter@intel.com>
References: <20250129095902.16391-1-adrian.hunter@intel.com>

Save the IA32_DEBUGCTL MSR before entering a TDX vCPU and restore it
afterwards. The TDX Module preserves bits 1, 12, and 14, so if no other
bits are set, no restore is done.

Signed-off-by: Adrian Hunter
---
TD vcpu enter/exit v2:
 - New patch
 - Rebased due to moving host_debugctlmsr to struct vcpu_vt.
---
 arch/x86/kvm/vmx/tdx.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 70996af4be64..0bce00415f42 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -705,6 +705,8 @@ void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 	else
 		vt->msr_host_kernel_gs_base = read_msr(MSR_KERNEL_GS_BASE);
 
+	vt->host_debugctlmsr = get_debugctlmsr();
+
 	vt->guest_state_loaded = true;
 }
 
@@ -818,9 +820,14 @@ static noinstr void tdx_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 #define TDX_REGS_UNSUPPORTED_SET	(BIT(VCPU_EXREG_RFLAGS) | \
					 BIT(VCPU_EXREG_SEGMENTS))
 
+#define TDX_DEBUGCTL_PRESERVED (DEBUGCTLMSR_BTF | \
+				DEBUGCTLMSR_FREEZE_PERFMON_ON_PMI | \
+				DEBUGCTLMSR_FREEZE_IN_SMM)
+
 fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
 {
 	struct vcpu_tdx *tdx = to_tdx(vcpu);
+	struct vcpu_vt *vt = to_vt(vcpu);
 
 	/*
	 * force_immediate_exit requires vCPU entry for event injection,
@@ -846,6 +853,9 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
 
 	tdx_vcpu_enter_exit(vcpu);
 
+	if (vt->host_debugctlmsr & ~TDX_DEBUGCTL_PRESERVED)
+		update_debugctlmsr(vt->host_debugctlmsr);
+
 	tdx_user_return_msr_update_cache();
 
 	kvm_load_host_xsave_state(vcpu);
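The restore condition above reduces to a single mask test: skip the wrmsr
when only the TDX-preserved bits (1, 12 and 14, per the commit message)
are set. A stand-alone sketch of that arithmetic:

	#include <stdint.h>
	#include <stdio.h>

	#define BTF			(1ULL << 1)
	#define FREEZE_PERFMON_ON_PMI	(1ULL << 12)
	#define FREEZE_IN_SMM		(1ULL << 14)
	#define TDX_DEBUGCTL_PRESERVED	(BTF | FREEZE_PERFMON_ON_PMI | FREEZE_IN_SMM)

	static int needs_restore(uint64_t host_debugctl)
	{
		/* Bits outside the preserved set were lost across TDH.VP.ENTER. */
		return (host_debugctl & ~TDX_DEBUGCTL_PRESERVED) != 0;
	}

	int main(void)
	{
		printf("%d\n", needs_restore(BTF));		    /* 0: preserved */
		printf("%d\n", needs_restore(BTF | (1ULL << 0)));   /* 1: LBR bit lost */
		return 0;
	}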
From patchwork Wed Jan 29 09:59:01 2025
From: Adrian Hunter
To: pbonzini@redhat.com, seanjc@google.com
Cc: kvm@vger.kernel.org, rick.p.edgecombe@intel.com, kai.huang@intel.com, adrian.hunter@intel.com, reinette.chatre@intel.com, xiaoyao.li@intel.com, tony.lindgren@linux.intel.com, binbin.wu@linux.intel.com, dmatlack@google.com, isaku.yamahata@intel.com, nik.borisov@suse.com, linux-kernel@vger.kernel.org, yan.y.zhao@intel.com, chao.gao@intel.com, weijiang.yang@intel.com
Subject: [PATCH V2 12/12] KVM: x86: Add a switch_db_regs flag to handle TDX's auto-switched behavior
Date: Wed, 29 Jan 2025 11:59:01 +0200
Message-ID: <20250129095902.16391-13-adrian.hunter@intel.com>
In-Reply-To: <20250129095902.16391-1-adrian.hunter@intel.com>
References: <20250129095902.16391-1-adrian.hunter@intel.com>

From: Isaku Yamahata

Add a flag, KVM_DEBUGREG_AUTO_SWITCH, to skip saving/restoring guest DRs.
TDX-SEAM unconditionally saves/restores guest DRs on TD exit/enter, and
resets DRs to their architectural INIT state on TD exit. Use the new flag
KVM_DEBUGREG_AUTO_SWITCH to indicate that KVM doesn't need to save/restore
guest DRs. KVM still needs to restore host DRs after TD exit if there are
active breakpoints in the host, which is covered by the existing code.

MOV-DR exiting is always cleared for TDX guests, so the handler for DR
access is never called, and KVM_DEBUGREG_WONT_EXIT is never set. Add a
warning if both KVM_DEBUGREG_WONT_EXIT and KVM_DEBUGREG_AUTO_SWITCH are
set.

Opportunistically convert the KVM_DEBUGREG_* definitions to use BIT().
Reported-by: Xiaoyao Li
Signed-off-by: Sean Christopherson
Co-developed-by: Chao Gao
Signed-off-by: Chao Gao
Signed-off-by: Isaku Yamahata
[binbin: rework changelog]
Signed-off-by: Binbin Wu
Message-ID: <20241210004946.3718496-2-binbin.wu@linux.intel.com>
Signed-off-by: Paolo Bonzini
---
TD vcpu enter/exit v2:
 - Moved from TDX "the rest" to "TD vcpu enter/exit"
TDX "the rest" v1:
 - Update the comment about KVM_DEBUGREG_AUTO_SWITCH.
 - Check explicitly KVM_DEBUGREG_AUTO_SWITCH is not set in switch_db_regs
   before restoring guest DRs, because KVM_DEBUGREG_BP_ENABLED could be
   set by userspace. (Paolo)
   https://lore.kernel.org/lkml/ea136ac6-53cf-cdc5-a741-acfb437819b1@redhat.com/
 - Fix the issue that host DRs are not restored in v19 (Binbin)
   https://lore.kernel.org/kvm/20240413002026.GP3039520@ls.amr.corp.intel.com/
 - Update the changelog a bit.
---
 arch/x86/include/asm/kvm_host.h | 11 +++++++++--
 arch/x86/kvm/vmx/tdx.c          |  1 +
 arch/x86/kvm/x86.c              |  4 +++-
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e557a441fade..bcfd89c28308 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -606,8 +606,15 @@ struct kvm_pmu {
 struct kvm_pmu_ops;
 
 enum {
-	KVM_DEBUGREG_BP_ENABLED = 1,
-	KVM_DEBUGREG_WONT_EXIT = 2,
+	KVM_DEBUGREG_BP_ENABLED = BIT(0),
+	KVM_DEBUGREG_WONT_EXIT = BIT(1),
+	/*
+	 * Guest debug registers (DR0-3, DR6 and DR7) are saved/restored by
+	 * hardware on exit from and entry to the guest, so KVM doesn't need
+	 * to switch them. Because DR0-3, DR6 and DR7 are set to their
+	 * architectural INIT value on VM exit, the host values still need
+	 * to be restored.
+	 */
+	KVM_DEBUGREG_AUTO_SWITCH = BIT(2),
 };
 
 struct kvm_mtrr {
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 0bce00415f42..0863bdaf761a 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -652,6 +652,7 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu)
 
 	vcpu->arch.efer = EFER_SCE | EFER_LME | EFER_LMA | EFER_NX;
 
+	vcpu->arch.switch_db_regs = KVM_DEBUGREG_AUTO_SWITCH;
 	vcpu->arch.cr0_guest_owned_bits = -1ul;
 	vcpu->arch.cr4_guest_owned_bits = -1ul;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 15447fe7687c..b023283e7ed4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10977,7 +10977,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	if (vcpu->arch.guest_fpu.xfd_err)
 		wrmsrl(MSR_IA32_XFD_ERR, vcpu->arch.guest_fpu.xfd_err);
 
-	if (unlikely(vcpu->arch.switch_db_regs)) {
+	if (unlikely(vcpu->arch.switch_db_regs &&
+		     !(vcpu->arch.switch_db_regs & KVM_DEBUGREG_AUTO_SWITCH))) {
 		set_debugreg(0, 7);
 		set_debugreg(vcpu->arch.eff_db[0], 0);
 		set_debugreg(vcpu->arch.eff_db[1], 1);
@@ -11024,6 +11025,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	 */
 	if (unlikely(vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)) {
 		WARN_ON(vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP);
+		WARN_ON(vcpu->arch.switch_db_regs & KVM_DEBUGREG_AUTO_SWITCH);
 		kvm_x86_call(sync_dirty_debug_regs)(vcpu);
 		kvm_update_dr0123(vcpu);
 		kvm_update_dr7(vcpu);
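The flag interaction above condenses to the check vcpu_enter_guest() now
makes before manually loading guest debug registers. A stand-alone sketch
using the same BIT() encoding (values mirror the enum in the patch):

	#include <stdint.h>
	#include <stdio.h>

	#define BIT(n) (1u << (n))
	#define KVM_DEBUGREG_BP_ENABLED		BIT(0)
	#define KVM_DEBUGREG_WONT_EXIT		BIT(1)
	#define KVM_DEBUGREG_AUTO_SWITCH	BIT(2)

	static int should_load_guest_drs(unsigned int switch_db_regs)
	{
		/* Skip the manual DR load when hardware auto-switches them. */
		return switch_db_regs &&
		       !(switch_db_regs & KVM_DEBUGREG_AUTO_SWITCH);
	}

	int main(void)
	{
		printf("%d\n", should_load_guest_drs(KVM_DEBUGREG_BP_ENABLED));	/* 1 */
		printf("%d\n", should_load_guest_drs(KVM_DEBUGREG_AUTO_SWITCH |
						     KVM_DEBUGREG_BP_ENABLED));	/* 0 */
		return 0;
	}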