
[RFC,v5,042/104] KVM: x86/mmu: Track shadow MMIO value/mask on a per-VM basis

Message ID b494b94bf2d6a5d841cb76e63e255d4cff906d83.1646422845.git.isaku.yamahata@intel.com (mailing list archive)
State New, archived
Series KVM TDX basic feature support

Commit Message

Isaku Yamahata March 4, 2022, 7:48 p.m. UTC
From: Sean Christopherson <sean.j.christopherson@intel.com>

Define the EPT Violation #VE control bit, #VE info VMCS fields, and the
suppress #VE bit for EPT entries.

TDX will use a different shadow PTE value for MMIO than VMX does.  Add
members to kvm_arch and track the MMIO value/mask per-VM.  By using a
per-VM EPT entry value for MMIO, the existing VMX logic keeps working.

In the case of a VMX VM, the EPT entry for MMIO is either a non-present PTE
(present bit cleared) with no backing guest page (on EPT violation, KVM
searches the backing guest memory and finds there is no backing page), or
the value that triggers EPT misconfiguration.  The latter is a fast path:
once MMIO is triggered on the EPT entry, the entry is updated for future
MMIO so that KVM knows the access is MMIO without searching the backing
guest pages.  KVM then parses the guest instruction to figure out the
address/value/width of the MMIO access.

In the case of a guest TD, the guest memory is protected, so the VMM can't
parse the guest instruction that triggered the EPT violation.  Instead, the
VMM sets up the (shared) EPT to trigger #VE.  When the guest TD issues MMIO,
#VE is injected, and the guest #VE handler converts the MMIO access into an
MMIO hypercall that passes the address/value/width to the VMM (or the guest
directly paravirtualizes MMIO into a hypercall).  The VMM can then handle
the MMIO hypercall without parsing the guest instruction.
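
For illustration, a rough sketch of that guest-side conversion; the helper
names used here (tdx_get_ve_info(), decode_mmio(), tdx_mmio_hypercall()) are
hypothetical placeholders, not the actual TDX guest code:

	/* Sketch of the guest #VE -> MMIO-hypercall path (illustrative only). */
	static bool handle_mmio_ve(struct pt_regs *regs)
	{
		struct ve_info ve;	/* #VE info: GPA, instruction length, ... */
		int size;
		bool write;
		u64 val;

		tdx_get_ve_info(&ve);	/* hypothetical: read the #VE info */

		/* Decode the faulting instruction: width, direction, value. */
		if (!decode_mmio(regs, &size, &write, &val))
			return false;

		/* Pass address/value/width to the VMM via the MMIO hypercall. */
		if (tdx_mmio_hypercall(ve.gpa, size, write, &val))
			return false;

		/*
		 * For a read, val would also be stored back into the
		 * destination register here.
		 */
		regs->ip += ve.instr_len;	/* skip the emulated instruction */
		return true;
	}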

When the guest accesses a GPA while the "EPT-violation #VE" VM-execution
control is set and the suppress-VE bit in the EPT entry is cleared, a #VE
(virtualization exception) is injected into the guest instead of causing an
EPT violation VM exit.  Because the TDX guest vCPU state and memory are
protected, the VMM can't emulate MMIO for the TDX guest on an EPT violation
by snooping vCPU state and parsing the instruction to figure out the MMIO
address and value.  Instead, PV MMIO (the MMIO hypercall) is adopted: on EPT
violation, the CPU injects #VE into the guest and the guest converts the
MMIO instruction into PV MMIO, or the guest directly issues the MMIO
hypercall.

The existing VMX code uses zero as the initial value for an EPT entry.  TDX
will enable the EPT-violation #VE VM-execution control and requires the
suppress-VE bit to be cleared in a shared EPT entry in order to inject #VE
into the TDX guest.  To keep the existing behavior for VMX, the suppress-VE
bit needs to be set.  Allow specifying an initial value for EPT entries and,
if TDX is enabled, set the initial EPT entry value to have the suppress-VE
bit set.  The EPT-violation #VE VM-execution control will be enabled, and
the suppress-VE bit will be cleared only in TDX shared EPT entries.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  3 +++
 arch/x86/include/asm/vmx.h      |  1 +
 arch/x86/kvm/mmu.h              |  6 ++++--
 arch/x86/kvm/mmu/mmu.c          | 19 +++++++++++------
 arch/x86/kvm/mmu/spte.c         | 38 ++++++++++++++++-----------------
 arch/x86/kvm/mmu/spte.h         |  9 ++++----
 arch/x86/kvm/mmu/tdp_mmu.c      |  6 +++---
 arch/x86/kvm/svm/svm.c          |  2 +-
 arch/x86/kvm/vmx/main.c         |  7 ++++--
 arch/x86/kvm/vmx/tdx.c          |  2 +-
 arch/x86/kvm/vmx/tdx.h          |  2 ++
 arch/x86/kvm/vmx/vmx.c          |  8 +++++++
 12 files changed, 63 insertions(+), 40 deletions(-)

Comments

Paolo Bonzini April 5, 2022, 3:25 p.m. UTC | #1
On 3/4/22 20:48, isaku.yamahata@intel.com wrote:
> +	if (enable_ept) {
> +		const u64 init_value = enable_tdx ? VMX_EPT_SUPPRESS_VE_BIT : 0ull;
>   		kvm_mmu_set_ept_masks(enable_ept_ad_bits,
> -				      cpu_has_vmx_ept_execute_only());
> +				      cpu_has_vmx_ept_execute_only(), init_value);
> +		kvm_mmu_set_spte_init_value(init_value);
> +	}

I think kvm-intel.ko should use VMX_EPT_SUPPRESS_VE_BIT unconditionally 
as the init value.  The bit is ignored anyway if the "EPT-violation #VE" 
execution control is 0.  Otherwise looks good, but I have a couple more 
crazy ideas:

1) there could even be a test mode where KVM enables the execution 
control, traps #VE in the exception bitmap, and shouts loudly if it gets 
a #VE.  That might avoid hard-to-find bugs due to forgetting about 
VMX_EPT_SUPPRESS_VE_BIT.

2) or even, perhaps the init_value for the TDP MMU could set bit 63 
_unconditionally_, because KVM always sets the NX bit on AMD hardware. 
That would remove the whole infrastructure to keep shadow_init_value, 
because it would be constant 0 in mmu.c and constant BIT(63) in tdp_mmu.c.

Sean, what do you think?

Paolo
Huang, Kai April 6, 2022, 11:06 a.m. UTC | #2
On Fri, 2022-03-04 at 11:48 -0800, isaku.yamahata@intel.com wrote:
> From: Sean Christopherson <sean.j.christopherson@intel.com>
> 
> Define the EPT Violation #VE control bit, #VE info VMCS fields, and the
> suppress #VE bit for EPT entries.

It appears only the last one is introduced in this patch.

> 
> TDX will use a different shadow PTE entry value for MMIO from VMX.  Add
> members to kvm_arch and track value for MMIO per-VM.  By using per-VM EPT
> entry value for MMIO, the existing VMX logic is kept working.
> 
> In the case of VMX VM case, the EPT entry for MMIO is non-present PTE
> (present bit cleared) without backing guest physical address (on EPT
> violation, KVM seaching backing guest memory and it finds there is no
> backing guest page.) or the value to trigger EPT misconfiguration.  For
> fast path. Once MMIO is triggered on the EPT entry, the EPT entry is
> updated for the future MMIO.  It allows KVM to understand the memory access
> is for MMIO without searching backing guest pages.). And then KVM parses
> guest instruction to figure out address/value/width for MMIO.

What does "For fast path." meaning?

There are also grammar issues in above paragraph.

> 
> In the case of the guest TD, the guest memory is protected so that VMM
> can't parse guest instruction to trigger EPT violation.  
> 

Even if the VMM could parse the guest instruction, it cannot trigger an EPT
violation.  Only the guest can trigger an EPT violation.

> Instead VMM sets
> up (Shared) EPT to trigger #VE.  When the guest TD issues MMIO, #VE is
> injected.  guest VE handler converts MMIO access into MMIO hypercall to
> pass address/value/width for MMIO to VMM. (or directly paravirtualize MMIO
> into hypercall.)  Then VMM can handle the MMIO hypercall without parsing
> guest instruction.

There are a couple of grammar issues in the above two paragraphs.

> 
> When the guest accesses GPA if "the EPT Violation #VE" control bit is set
> and EPT SUPPRESS VE bit in EPT entry is cleared, #VE, virtualization
> exception, is injected into the guest.  Because the TDX guest vCPU state
> and memory are protected, a VMM can't emulate MMIO by the TDX guest on EPT
> violation by snooping vCPU state and parsing instruction to figure out MMIO
> address and value.  Instead, PV MMIO (MMIO hypercall) is adapted.  On EPT
> violation, CPU injects #VE to guest and the guest converts MMIO instruction
> into PV MMIO.  Or guest directly issues MMIO hypercall.

Isn't this paragraph kinda a duplicate of the paragraph above?

> 
> The existing VMX code uses zero as an initial value for EPT entry.  TDX
> will enable EPT-violation #VE VM-execution control and requires suppress VE
> bit cleared in shared EPT entry to inject #VE into the TDX guest.  To keep
> the same behavior for VMX, suppress VE bit needs to be set.  Allow to
> specify an initial value for EPT entry and if TDX is enabled, set initial
> EPT entry value to suppress VE bit set.  EPT-violation #VE VM-execution
> control will be enabled, and For TDX shared EPT suppress VE bit will be
> cleared for TDX shared EPT entry.

Isn't the last paragraph talking about the same thing as you did in patch 37,
"KVM: x86/mmu: Allow non-zero init value for shadow PTE"?  That patch already
handles setting a non-zero initial PTE value.

Btw, please put this patch and patch 37 together since they handle similar
things (right now there are a couple of unrelated patches in between).

This patch talks about changing MMIO value/mask from global to per-VM tracking.
Please focus on explaining the rationale behind this -- i.e. unlike a legacy VMX
VM (and legacy AMD VMs), a TD guest requires a different MMIO value/mask in
order to configure the MMIO EPT entry to not generate an EPT misconfiguration,
but instead to inject #VE by setting up a non-present SPTE that causes an EPT
violation with the "Suppress #VE" bit clear.


> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  3 +++
>  arch/x86/include/asm/vmx.h      |  1 +
>  arch/x86/kvm/mmu.h              |  6 ++++--
>  arch/x86/kvm/mmu/mmu.c          | 19 +++++++++++------
>  arch/x86/kvm/mmu/spte.c         | 38 ++++++++++++++++-----------------
>  arch/x86/kvm/mmu/spte.h         |  9 ++++----
>  arch/x86/kvm/mmu/tdp_mmu.c      |  6 +++---
>  arch/x86/kvm/svm/svm.c          |  2 +-
>  arch/x86/kvm/vmx/main.c         |  7 ++++--
>  arch/x86/kvm/vmx/tdx.c          |  2 +-
>  arch/x86/kvm/vmx/tdx.h          |  2 ++
>  arch/x86/kvm/vmx/vmx.c          |  8 +++++++
>  12 files changed, 63 insertions(+), 40 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index d33d79f2af2d..fcab2337819c 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1069,6 +1069,9 @@ struct kvm_arch {
>  	 */
>  	spinlock_t mmu_unsync_pages_lock;
>  
> +	u64 shadow_mmio_value;
> +	u64 shadow_mmio_mask;
> +
>  	struct list_head assigned_dev_head;
>  	struct iommu_domain *iommu_domain;
>  	bool iommu_noncoherent;
> diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
> index 0ffaa3156a4e..88d9b8cc7dde 100644
> --- a/arch/x86/include/asm/vmx.h
> +++ b/arch/x86/include/asm/vmx.h
> @@ -498,6 +498,7 @@ enum vmcs_field {
>  #define VMX_EPT_IPAT_BIT    			(1ull << 6)
>  #define VMX_EPT_ACCESS_BIT			(1ull << 8)
>  #define VMX_EPT_DIRTY_BIT			(1ull << 9)
> +#define VMX_EPT_SUPPRESS_VE_BIT			(1ull << 63)
>  #define VMX_EPT_RWX_MASK                        (VMX_EPT_READABLE_MASK |       \
>  						 VMX_EPT_WRITABLE_MASK |       \
>  						 VMX_EPT_EXECUTABLE_MASK)
> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> index 650989c37f2e..b49841e4faaa 100644
> --- a/arch/x86/kvm/mmu.h
> +++ b/arch/x86/kvm/mmu.h
> @@ -64,8 +64,10 @@ static __always_inline u64 rsvd_bits(int s, int e)
>  	return ((2ULL << (e - s)) - 1) << s;
>  }
>  
> -void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask);
> -void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only);
> +void kvm_mmu_set_mmio_spte_mask(struct kvm *kvm, u64 mmio_value, u64 mmio_mask,
> +				u64 access_mask);
> +void kvm_mmu_set_default_mmio_spte_mask(u64 mask);
> +void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only, u64 init_value);
>  void kvm_mmu_set_spte_init_value(u64 init_value);
>  
>  void kvm_init_mmu(struct kvm_vcpu *vcpu);
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index d8c1505155b0..6e9847b1124b 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -2336,7 +2336,7 @@ static int mmu_page_zap_pte(struct kvm *kvm, struct kvm_mmu_page *sp,
>  				return kvm_mmu_prepare_zap_page(kvm, child,
>  								invalid_list);
>  		}
> -	} else if (is_mmio_spte(pte)) {
> +	} else if (is_mmio_spte(kvm, pte)) {
>  		mmu_spte_clear_no_track(spte);
>  	}
>  	return 0;
> @@ -3069,9 +3069,12 @@ static bool handle_abnormal_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fa
>  		/*
>  		 * If MMIO caching is disabled, emulate immediately without
>  		 * touching the shadow page tables as attempting to install an
> -		 * MMIO SPTE will just be an expensive nop.
> +		 * MMIO SPTE will just be an expensive nop, but excludes the
> +		 * INTEL TD guest due to it also uses shadow_mmio_value = 0
> +		 * to emulating MMIO access.

The comment doesn't seem accurate.  If I read correctly, for a TD guest,
shadow_mmio_value is initialized to shadow_default_mmio_mask, which isn't always
0.  See changes to kvm_mmu_reset_all_pte_masks().

>  		 */
> -		if (unlikely(!shadow_mmio_value)) {
> +		if (unlikely(!vcpu->kvm->arch.shadow_mmio_value)
> +		    && !kvm_gfn_stolen_mask(vcpu->kvm)) {

I don't like using kvm_gfn_stolen_mask() here.  Similar to the comment below
related to is_mmio_spte().

>  			*ret_val = RET_PF_EMULATE;
>  			return true;
>  		}
> @@ -3209,7 +3212,8 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
>  			break;
>  
>  		sp = sptep_to_sp(sptep);
> -		if (!is_last_spte(spte, sp->role.level) || is_mmio_spte(spte))
> +		if (!is_last_spte(spte, sp->role.level) ||
> +			is_mmio_spte(vcpu->kvm, spte))
>  			break;
>  
>  		/*
> @@ -3892,7 +3896,7 @@ static int handle_mmio_page_fault(struct kvm_vcpu *vcpu, u64 addr, bool direct)
>  	if (WARN_ON(reserved))
>  		return -EINVAL;
>  
> -	if (is_mmio_spte(spte)) {
> +	if (is_mmio_spte(vcpu->kvm, spte)) {
>  		gfn_t gfn = get_mmio_spte_gfn(spte);
>  		unsigned int access = get_mmio_spte_access(spte);
>  
> @@ -4294,7 +4298,7 @@ static unsigned long get_cr3(struct kvm_vcpu *vcpu)
>  static bool sync_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, gfn_t gfn,
>  			   unsigned int access)
>  {
> -	if (unlikely(is_mmio_spte(*sptep))) {
> +	if (unlikely(is_mmio_spte(vcpu->kvm, *sptep))) {
>  		if (gfn != get_mmio_spte_gfn(*sptep)) {
>  			mmu_spte_clear_no_track(sptep);
>  			return true;
> @@ -5791,6 +5795,9 @@ void kvm_mmu_init_vm(struct kvm *kvm)
>  	kvm_page_track_register_notifier(kvm, node);
>  
>  	kvm->arch.tdp_max_page_level = KVM_MAX_HUGEPAGE_LEVEL;
> +	kvm_mmu_set_mmio_spte_mask(kvm, shadow_default_mmio_mask,
> +				   shadow_default_mmio_mask,
> +				   ACC_WRITE_MASK | ACC_USER_MASK);
>  }
>  
>  void kvm_mmu_uninit_vm(struct kvm *kvm)
> diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
> index 5071e8332db2..ea83927b9231 100644
> --- a/arch/x86/kvm/mmu/spte.c
> +++ b/arch/x86/kvm/mmu/spte.c
> @@ -29,8 +29,7 @@ u64 __read_mostly shadow_x_mask; /* mutual exclusive with nx_mask */
>  u64 __read_mostly shadow_user_mask;
>  u64 __read_mostly shadow_accessed_mask;
>  u64 __read_mostly shadow_dirty_mask;
> -u64 __read_mostly shadow_mmio_value;
> -u64 __read_mostly shadow_mmio_mask;
> +u64 __read_mostly shadow_default_mmio_mask;
>  u64 __read_mostly shadow_mmio_access_mask;
>  u64 __read_mostly shadow_present_mask;
>  u64 __read_mostly shadow_me_mask;
> @@ -59,10 +58,11 @@ u64 make_mmio_spte(struct kvm_vcpu *vcpu, u64 gfn, unsigned int access)
>  	u64 spte = generation_mmio_spte_mask(gen);
>  	u64 gpa = gfn << PAGE_SHIFT;
>  
> -	WARN_ON_ONCE(!shadow_mmio_value);
> +	WARN_ON_ONCE(!vcpu->kvm->arch.shadow_mmio_value &&
> +		     !kvm_gfn_stolen_mask(vcpu->kvm));
>  
>  	access &= shadow_mmio_access_mask;
> -	spte |= shadow_mmio_value | access;
> +	spte |= vcpu->kvm->arch.shadow_mmio_value | access;
>  	spte |= gpa | shadow_nonpresent_or_rsvd_mask;
>  	spte |= (gpa & shadow_nonpresent_or_rsvd_mask)
>  		<< SHADOW_NONPRESENT_OR_RSVD_MASK_LEN;
> @@ -279,7 +279,8 @@ u64 mark_spte_for_access_track(u64 spte)
>  	return spte;
>  }
>  
> -void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask)
> +void kvm_mmu_set_mmio_spte_mask(struct kvm *kvm, u64 mmio_value, u64 mmio_mask,
> +				u64 access_mask)
>  {
>  	BUG_ON((u64)(unsigned)access_mask != access_mask);
>  	WARN_ON(mmio_value & shadow_nonpresent_or_rsvd_lower_gfn_mask);
> @@ -308,39 +309,32 @@ void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask)
>  	    WARN_ON(mmio_value && (REMOVED_SPTE & mmio_mask) == mmio_value))
>  		mmio_value = 0;
>  
> -	shadow_mmio_value = mmio_value;
> -	shadow_mmio_mask  = mmio_mask;
> +	kvm->arch.shadow_mmio_value = mmio_value;
> +	kvm->arch.shadow_mmio_mask = mmio_mask;
>  	shadow_mmio_access_mask = access_mask;
>  }
>  EXPORT_SYMBOL_GPL(kvm_mmu_set_mmio_spte_mask);
>  
> -void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only)
> +void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only, u64 init_value)
>  {
>  	shadow_user_mask	= VMX_EPT_READABLE_MASK;
>  	shadow_accessed_mask	= has_ad_bits ? VMX_EPT_ACCESS_BIT : 0ull;
>  	shadow_dirty_mask	= has_ad_bits ? VMX_EPT_DIRTY_BIT : 0ull;
>  	shadow_nx_mask		= 0ull;
>  	shadow_x_mask		= VMX_EPT_EXECUTABLE_MASK;
> -	shadow_present_mask	= has_exec_only ? 0ull : VMX_EPT_READABLE_MASK;
> +	shadow_present_mask	=
> +		(has_exec_only ? 0ull : VMX_EPT_READABLE_MASK) | init_value;

This change doesn't seem to make any sense.  Why should the "Suppress #VE" bit
be set for a present PTE?

>  	shadow_acc_track_mask	= VMX_EPT_RWX_MASK;
>  	shadow_me_mask		= 0ull;
>  
>  	shadow_host_writable_mask = EPT_SPTE_HOST_WRITABLE;
>  	shadow_mmu_writable_mask  = EPT_SPTE_MMU_WRITABLE;
> -
> -	/*
> -	 * EPT Misconfigurations are generated if the value of bits 2:0
> -	 * of an EPT paging-structure entry is 110b (write/execute).
> -	 */
> -	kvm_mmu_set_mmio_spte_mask(VMX_EPT_MISCONFIG_WX_VALUE,
> -				   VMX_EPT_RWX_MASK, 0);
>  }
>  EXPORT_SYMBOL_GPL(kvm_mmu_set_ept_masks);
>  
>  void kvm_mmu_reset_all_pte_masks(void)
>  {
>  	u8 low_phys_bits;
> -	u64 mask;
>  
>  	shadow_phys_bits = kvm_get_shadow_phys_bits();
>  
> @@ -389,9 +383,13 @@ void kvm_mmu_reset_all_pte_masks(void)
>  	 * PTEs and so the reserved PA approach must be disabled.
>  	 */
>  	if (shadow_phys_bits < 52)
> -		mask = BIT_ULL(51) | PT_PRESENT_MASK;
> +		shadow_default_mmio_mask = BIT_ULL(51) | PT_PRESENT_MASK;

Hmm...  Not related to this patch, but it seems there's a bug here.  On a MKTME
enabled system (but not TDX) with 52 physical bits, the shadow_phys_bits will be
set to < 52 (depending on how many MKTME KeyIDs are configured by BIOS).  In
this case, bit 51 is set, but actually bit 51 isn't a reserved bit in this case.
Instead, it is a MKTME KeyID bit.  Therefore, above setting won't cause #PF, but
will use a non-zero MKTME keyID to access the physical address.

Paolo/Sean, any comments here?

>  	else
> -		mask = 0;
> +		shadow_default_mmio_mask = 0;
> +}
>  
> -	kvm_mmu_set_mmio_spte_mask(mask, mask, ACC_WRITE_MASK | ACC_USER_MASK);
> +void kvm_mmu_set_default_mmio_spte_mask(u64 mask)
> +{
> +	shadow_default_mmio_mask = mask;
>  }
> +EXPORT_SYMBOL_GPL(kvm_mmu_set_default_mmio_spte_mask);
> diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
> index 8e13a35ab8c9..bde843bce878 100644
> --- a/arch/x86/kvm/mmu/spte.h
> +++ b/arch/x86/kvm/mmu/spte.h
> @@ -165,8 +165,7 @@ extern u64 __read_mostly shadow_x_mask; /* mutual exclusive with nx_mask */
>  extern u64 __read_mostly shadow_user_mask;
>  extern u64 __read_mostly shadow_accessed_mask;
>  extern u64 __read_mostly shadow_dirty_mask;
> -extern u64 __read_mostly shadow_mmio_value;
> -extern u64 __read_mostly shadow_mmio_mask;
> +extern u64 __read_mostly shadow_default_mmio_mask;
>  extern u64 __read_mostly shadow_mmio_access_mask;
>  extern u64 __read_mostly shadow_present_mask;
>  extern u64 __read_mostly shadow_me_mask;
> @@ -229,10 +228,10 @@ extern u64 __read_mostly shadow_nonpresent_or_rsvd_lower_gfn_mask;
>   */
>  extern u8 __read_mostly shadow_phys_bits;
>  
> -static inline bool is_mmio_spte(u64 spte)
> +static inline bool is_mmio_spte(struct kvm *kvm, u64 spte)
>  {
> -	return (spte & shadow_mmio_mask) == shadow_mmio_value &&
> -	       likely(shadow_mmio_value);
> +	return (spte & kvm->arch.shadow_mmio_mask) == kvm->arch.shadow_mmio_value &&
> +		likely(kvm->arch.shadow_mmio_value || kvm_gfn_stolen_mask(kvm));

I don't like using kvm_gfn_stolen_mask() to check whether an SPTE is MMIO. 
kvm_gfn_stolen_mask() really doesn't imply anything regarding how the MMIO SPTE
value is set up.  At least, I guess we could use some is_protected_vm() sort of
helper, since that implies guest memory is protected and therefore the legacy
way of handling MMIO doesn't work (i.e. you cannot parse the MMIO instruction).

>  }
>  
>  static inline bool is_shadow_present_pte(u64 pte)
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index bc9e3553fba2..ebd0a02620e8 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -447,8 +447,8 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
>  		 * impact the guest since both the former and current SPTEs
>  		 * are nonpresent.
>  		 */
> -		if (WARN_ON(!is_mmio_spte(old_spte) &&
> -			    !is_mmio_spte(new_spte) &&
> +		if (WARN_ON(!is_mmio_spte(kvm, old_spte) &&
> +			    !is_mmio_spte(kvm, new_spte) &&
>  			    !is_removed_spte(new_spte)))
>  			pr_err("Unexpected SPTE change! Nonpresent SPTEs\n"
>  			       "should not be replaced with another,\n"
> @@ -927,7 +927,7 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
>  	}
>  
>  	/* If a MMIO SPTE is installed, the MMIO will need to be emulated. */
> -	if (unlikely(is_mmio_spte(new_spte))) {
> +	if (unlikely(is_mmio_spte(vcpu->kvm, new_spte))) {
>  		trace_mark_mmio_spte(rcu_dereference(iter->sptep), iter->gfn,
>  				     new_spte);
>  		ret = RET_PF_EMULATE;
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 778075b71dc3..c7eec23e9ebe 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4704,7 +4704,7 @@ static __init void svm_adjust_mmio_mask(void)
>  	 */
>  	mask = (mask_bit < 52) ? rsvd_bits(mask_bit, 51) | PT_PRESENT_MASK : 0;
>  
> -	kvm_mmu_set_mmio_spte_mask(mask, mask, PT_WRITABLE_MASK | PT_USER_MASK);
> +	kvm_mmu_set_default_mmio_spte_mask(mask);
>  }
>  
>  static __init void svm_set_cpu_caps(void)
> diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
> index 51aaafe6b432..b242a9dc9e29 100644
> --- a/arch/x86/kvm/vmx/main.c
> +++ b/arch/x86/kvm/vmx/main.c
> @@ -23,9 +23,12 @@ static __init int vt_hardware_setup(void)
>  
>  	tdx_hardware_setup(&vt_x86_ops);
>  
> -	if (enable_ept)
> +	if (enable_ept) {
> +		const u64 init_value = enable_tdx ? VMX_EPT_SUPPRESS_VE_BIT : 0ull;
>  		kvm_mmu_set_ept_masks(enable_ept_ad_bits,
> -				      cpu_has_vmx_ept_execute_only());
> +				      cpu_has_vmx_ept_execute_only(), init_value);
> +		kvm_mmu_set_spte_init_value(init_value);
> +	}
>  
>  	return 0;
>  }
> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> index f86a257dd71b..c3434b33c452 100644
> --- a/arch/x86/kvm/vmx/tdx.c
> +++ b/arch/x86/kvm/vmx/tdx.c
> @@ -11,7 +11,7 @@
>  #undef pr_fmt
>  #define pr_fmt(fmt) "tdx: " fmt
>  
> -static bool __read_mostly enable_tdx = true;
> +bool __read_mostly enable_tdx = true;
>  module_param_named(tdx, enable_tdx, bool, 0644);
>  
>  #define TDX_MAX_NR_CPUID_CONFIGS					\
> diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
> index 4ce7fcab6f64..b32e068c51b4 100644
> --- a/arch/x86/kvm/vmx/tdx.h
> +++ b/arch/x86/kvm/vmx/tdx.h
> @@ -6,6 +6,7 @@
>  
>  #include "tdx_ops.h"
>  
> +extern bool __read_mostly enable_tdx;
>  int tdx_module_setup(void);
>  
>  struct tdx_td_page {
> @@ -166,6 +167,7 @@ static __always_inline u64 td_tdcs_exec_read64(struct kvm_tdx *kvm_tdx, u32 fiel
>  }
>  
>  #else
> +#define enable_tdx false
>  static inline int tdx_module_setup(void) { return -ENODEV; };
>  
>  struct kvm_tdx;
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 07fd892768be..00f88aa25047 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7065,6 +7065,14 @@ int vmx_vm_init(struct kvm *kvm)
>  	if (!ple_gap)
>  		kvm->arch.pause_in_guest = true;
>  
> +	/*
> +	 * EPT Misconfigurations can be generated if the value of bits 2:0
> +	 * of an EPT paging-structure entry is 110b (write/execute).
> +	 */
> +	if (enable_ept)
> +		kvm_mmu_set_mmio_spte_mask(kvm, VMX_EPT_MISCONFIG_WX_VALUE,
> +					   VMX_EPT_MISCONFIG_WX_VALUE, 0);

Should be:

	kvm_mmu_set_mmio_spte_mask(kvm, VMX_EPT_MISCONFIG_WX_VALUE,
				   	VMX_EPT_RWX_MASK, 0);

> +
>  	if (boot_cpu_has(X86_BUG_L1TF) && enable_ept) {
>  		switch (l1tf_mitigation) {
>  		case L1TF_MITIGATION_OFF:
Huang, Kai April 7, 2022, 3:05 a.m. UTC | #3
On Wed, 2022-04-06 at 23:06 +1200, Kai Huang wrote:
> >   void kvm_mmu_reset_all_pte_masks(void)
> >   {
> >   	u8 low_phys_bits;
> > -	u64 mask;
> >   
> >   	shadow_phys_bits = kvm_get_shadow_phys_bits();
> >   
> > @@ -389,9 +383,13 @@ void kvm_mmu_reset_all_pte_masks(void)
> >   	 * PTEs and so the reserved PA approach must be disabled.
> >   	 */
> >   	if (shadow_phys_bits < 52)
> > -		mask = BIT_ULL(51) | PT_PRESENT_MASK;
> > +		shadow_default_mmio_mask = BIT_ULL(51) | PT_PRESENT_MASK;
> 
> Hmm...  Not related to this patch, but it seems there's a bug here.  On a MKTME
> enabled system (but not TDX) with 52 physical bits, the shadow_phys_bits will be
> set to < 52 (depending on how many MKTME KeyIDs are configured by BIOS).  In
> this case, bit 51 is set, but actually bit 51 isn't a reserved bit in this case.
> Instead, it is a MKTME KeyID bit.  Therefore, above setting won't cause #PF, but
> will use a non-zero MKTME keyID to access the physical address.
> 
> Paolo/Sean, any comments here?

After looking at the code more carefully, this is not correct.  shadow_phys_bits
will be 52 on a MKTME-enabled system.  Please ignore this.
Isaku Yamahata April 8, 2022, 6:46 p.m. UTC | #4
On Tue, Apr 05, 2022 at 05:25:34PM +0200,
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 3/4/22 20:48, isaku.yamahata@intel.com wrote:
> > +	if (enable_ept) {
> > +		const u64 init_value = enable_tdx ? VMX_EPT_SUPPRESS_VE_BIT : 0ull;
> >   		kvm_mmu_set_ept_masks(enable_ept_ad_bits,
> > -				      cpu_has_vmx_ept_execute_only());
> > +				      cpu_has_vmx_ept_execute_only(), init_value);
> > +		kvm_mmu_set_spte_init_value(init_value);
> > +	}
> 
> I think kvm-intel.ko should use VMX_EPT_SUPPRESS_VE_BIT unconditionally as
> the init value.  The bit is ignored anyway if the "EPT-violation #VE"
> execution control is 0.  Otherwise looks good, but I have a couple more
> crazy ideas:
> 
> 1) there could even be a test mode where KVM enables the execution control,
> traps #VE in the exception bitmap, and shouts loudly if it gets a #VE.  That
> might avoid hard-to-find bugs due to forgetting about
> VMX_EPT_SUPPRESS_VE_BIT.
> 
> 2) or even, perhaps the init_value for the TDP MMU could set bit 63
> _unconditionally_, because KVM always sets the NX bit on AMD hardware. That
> would remove the whole infrastructure to keep shadow_init_value, because it
> would be constant 0 in mmu.c and constant BIT(63) in tdp_mmu.c.
> 
> Sean, what do you think?

Then, I'll start with 1) because it's a bit hard for me to test 2) with real AMD
hardware.  If someone is willing to test 2), I'm quite fine to implement 2)
on top of 1).  2) isn't exclusive with 1).
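
For what it's worth, a rough sketch of what 1) could look like; the
ept_violation_ve_test knob, the VE_VECTOR value (20) and the
SECONDARY_EXEC_EPT_VIOLATION_VE name below are assumptions for illustration,
not definitions taken from this series:

	/* VMCS setup (sketch): opt in to the "EPT-violation #VE" control. */
	if (ept_violation_ve_test)
		secondary_exec_controls_setbit(vmx, SECONDARY_EXEC_EPT_VIOLATION_VE);

	/* vmx_update_exception_bitmap() (sketch): intercept #VE as well. */
	if (ept_violation_ve_test)
		eb |= 1u << VE_VECTOR;		/* VE_VECTOR assumed to be 20 */

	/* handle_exception_nmi() (sketch): shout loudly if a #VE arrives. */
	if ((intr_info & INTR_INFO_VECTOR_MASK) == VE_VECTOR) {
		WARN_ONCE(1, "Unexpected #VE: a SPTE lost VMX_EPT_SUPPRESS_VE_BIT\n");
		return 1;
	}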
Isaku Yamahata April 8, 2022, 7:12 p.m. UTC | #5
On Wed, Apr 06, 2022 at 11:06:41PM +1200,
Kai Huang <kai.huang@intel.com> wrote:

> > diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
> > index 5071e8332db2..ea83927b9231 100644
> > --- a/arch/x86/kvm/mmu/spte.c
> > +++ b/arch/x86/kvm/mmu/spte.c
> > @@ -29,8 +29,7 @@ u64 __read_mostly shadow_x_mask; /* mutual exclusive with nx_mask */
> >  u64 __read_mostly shadow_user_mask;
> >  u64 __read_mostly shadow_accessed_mask;
> >  u64 __read_mostly shadow_dirty_mask;
> > -u64 __read_mostly shadow_mmio_value;
> > -u64 __read_mostly shadow_mmio_mask;
> > +u64 __read_mostly shadow_default_mmio_mask;
> >  u64 __read_mostly shadow_mmio_access_mask;
> >  u64 __read_mostly shadow_present_mask;
> >  u64 __read_mostly shadow_me_mask;
> > @@ -59,10 +58,11 @@ u64 make_mmio_spte(struct kvm_vcpu *vcpu, u64 gfn, unsigned int access)
> >  	u64 spte = generation_mmio_spte_mask(gen);
> >  	u64 gpa = gfn << PAGE_SHIFT;
> >  
> > -	WARN_ON_ONCE(!shadow_mmio_value);
> > +	WARN_ON_ONCE(!vcpu->kvm->arch.shadow_mmio_value &&
> > +		     !kvm_gfn_stolen_mask(vcpu->kvm));
> >  
> >  	access &= shadow_mmio_access_mask;
> > -	spte |= shadow_mmio_value | access;
> > +	spte |= vcpu->kvm->arch.shadow_mmio_value | access;
> >  	spte |= gpa | shadow_nonpresent_or_rsvd_mask;
> >  	spte |= (gpa & shadow_nonpresent_or_rsvd_mask)
> >  		<< SHADOW_NONPRESENT_OR_RSVD_MASK_LEN;
> > @@ -279,7 +279,8 @@ u64 mark_spte_for_access_track(u64 spte)
> >  	return spte;
> >  }
> >  
> > -void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask)
> > +void kvm_mmu_set_mmio_spte_mask(struct kvm *kvm, u64 mmio_value, u64 mmio_mask,
> > +				u64 access_mask)
> >  {
> >  	BUG_ON((u64)(unsigned)access_mask != access_mask);
> >  	WARN_ON(mmio_value & shadow_nonpresent_or_rsvd_lower_gfn_mask);
> > @@ -308,39 +309,32 @@ void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask)
> >  	    WARN_ON(mmio_value && (REMOVED_SPTE & mmio_mask) == mmio_value))
> >  		mmio_value = 0;
> >  
> > -	shadow_mmio_value = mmio_value;
> > -	shadow_mmio_mask  = mmio_mask;
> > +	kvm->arch.shadow_mmio_value = mmio_value;
> > +	kvm->arch.shadow_mmio_mask = mmio_mask;
> >  	shadow_mmio_access_mask = access_mask;
> >  }
> >  EXPORT_SYMBOL_GPL(kvm_mmu_set_mmio_spte_mask);
> >  
> > -void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only)
> > +void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only, u64 init_value)
> >  {
> >  	shadow_user_mask	= VMX_EPT_READABLE_MASK;
> >  	shadow_accessed_mask	= has_ad_bits ? VMX_EPT_ACCESS_BIT : 0ull;
> >  	shadow_dirty_mask	= has_ad_bits ? VMX_EPT_DIRTY_BIT : 0ull;
> >  	shadow_nx_mask		= 0ull;
> >  	shadow_x_mask		= VMX_EPT_EXECUTABLE_MASK;
> > -	shadow_present_mask	= has_exec_only ? 0ull : VMX_EPT_READABLE_MASK;
> > +	shadow_present_mask	=
> > +		(has_exec_only ? 0ull : VMX_EPT_READABLE_MASK) | init_value;
> 
> This change doesn't seem make any sense.  Why should "Suppress #VE" bit be set
> for a present PTE?

Because W or NX violation also needs #VE.  Although the name uses present, it's
actually readable.


> >  	shadow_acc_track_mask	= VMX_EPT_RWX_MASK;
> >  	shadow_me_mask		= 0ull;
> >  
> >  	shadow_host_writable_mask = EPT_SPTE_HOST_WRITABLE;
> >  	shadow_mmu_writable_mask  = EPT_SPTE_MMU_WRITABLE;
> > -
> > -	/*
> > -	 * EPT Misconfigurations are generated if the value of bits 2:0
> > -	 * of an EPT paging-structure entry is 110b (write/execute).
> > -	 */
> > -	kvm_mmu_set_mmio_spte_mask(VMX_EPT_MISCONFIG_WX_VALUE,
> > -				   VMX_EPT_RWX_MASK, 0);
> >  }
> >  EXPORT_SYMBOL_GPL(kvm_mmu_set_ept_masks);
> >  
> >  void kvm_mmu_reset_all_pte_masks(void)
> >  {
> >  	u8 low_phys_bits;
> > -	u64 mask;
> >  
> >  	shadow_phys_bits = kvm_get_shadow_phys_bits();
> >  
> > @@ -389,9 +383,13 @@ void kvm_mmu_reset_all_pte_masks(void)
> >  	 * PTEs and so the reserved PA approach must be disabled.
> >  	 */
> >  	if (shadow_phys_bits < 52)
> > -		mask = BIT_ULL(51) | PT_PRESENT_MASK;
> > +		shadow_default_mmio_mask = BIT_ULL(51) | PT_PRESENT_MASK;
> 
> Hmm...  Not related to this patch, but it seems there's a bug here.  On a MKTME
> enabled system (but not TDX) with 52 physical bits, the shadow_phys_bits will be
> set to < 52 (depending on how many MKTME KeyIDs are configured by BIOS).  In
> this case, bit 51 is set, but actually bit 51 isn't a reserved bit in this case.
> Instead, it is a MKTME KeyID bit.  Therefore, above setting won't cause #PF, but
> will use a non-zero MKTME keyID to access the physical address.
> 
> Paolo/Sean, any comments here?
> 
> >  	else
> > -		mask = 0;
> > +		shadow_default_mmio_mask = 0;
> > +}
> >  
> > -	kvm_mmu_set_mmio_spte_mask(mask, mask, ACC_WRITE_MASK | ACC_USER_MASK);
> > +void kvm_mmu_set_default_mmio_spte_mask(u64 mask)
> > +{
> > +	shadow_default_mmio_mask = mask;
> >  }
> > +EXPORT_SYMBOL_GPL(kvm_mmu_set_default_mmio_spte_mask);
> > diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
> > index 8e13a35ab8c9..bde843bce878 100644
> > --- a/arch/x86/kvm/mmu/spte.h
> > +++ b/arch/x86/kvm/mmu/spte.h
> > @@ -165,8 +165,7 @@ extern u64 __read_mostly shadow_x_mask; /* mutual exclusive with nx_mask */
> >  extern u64 __read_mostly shadow_user_mask;
> >  extern u64 __read_mostly shadow_accessed_mask;
> >  extern u64 __read_mostly shadow_dirty_mask;
> > -extern u64 __read_mostly shadow_mmio_value;
> > -extern u64 __read_mostly shadow_mmio_mask;
> > +extern u64 __read_mostly shadow_default_mmio_mask;
> >  extern u64 __read_mostly shadow_mmio_access_mask;
> >  extern u64 __read_mostly shadow_present_mask;
> >  extern u64 __read_mostly shadow_me_mask;
> > @@ -229,10 +228,10 @@ extern u64 __read_mostly shadow_nonpresent_or_rsvd_lower_gfn_mask;
> >   */
> >  extern u8 __read_mostly shadow_phys_bits;
> >  
> > -static inline bool is_mmio_spte(u64 spte)
> > +static inline bool is_mmio_spte(struct kvm *kvm, u64 spte)
> >  {
> > -	return (spte & shadow_mmio_mask) == shadow_mmio_value &&
> > -	       likely(shadow_mmio_value);
> > +	return (spte & kvm->arch.shadow_mmio_mask) == kvm->arch.shadow_mmio_value &&
> > +		likely(kvm->arch.shadow_mmio_value || kvm_gfn_stolen_mask(kvm));
> 
> I don't like using kvm_gfn_stolen_mask() to check whether SPTE is MMIO. 
> kvm_gfn_stolen_mask() really doesn't imply anything regarding to setting up the
> value of MMIO SPTE.  At least, I guess we can use some is_protected_vm() sort of
> things since it implies guest memory is protected therefore legacy way handling
> of MMIO doesn't work (i.e. you cannot parse MMIO instruction).

As discussed in other thread, let's rename those functions.


> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 07fd892768be..00f88aa25047 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -7065,6 +7065,14 @@ int vmx_vm_init(struct kvm *kvm)
> >  	if (!ple_gap)
> >  		kvm->arch.pause_in_guest = true;
> >  
> > +	/*
> > +	 * EPT Misconfigurations can be generated if the value of bits 2:0
> > +	 * of an EPT paging-structure entry is 110b (write/execute).
> > +	 */
> > +	if (enable_ept)
> > +		kvm_mmu_set_mmio_spte_mask(kvm, VMX_EPT_MISCONFIG_WX_VALUE,
> > +					   VMX_EPT_MISCONFIG_WX_VALUE, 0);
> 
> Should be:
> 
> 	kvm_mmu_set_mmio_spte_mask(kvm, VMX_EPT_MISCONFIG_WX_VALUE,
> 				   	VMX_EPT_RWX_MASK, 0);

Thanks for catching it.  It's fixed in github repo.
Huang, Kai April 8, 2022, 11:34 p.m. UTC | #6
On Fri, 2022-04-08 at 12:12 -0700, Isaku Yamahata wrote:
> > > -	shadow_present_mask	= has_exec_only ? 0ull : VMX_EPT_READABLE_MASK;
> > > +	shadow_present_mask	=
> > > +		(has_exec_only ? 0ull : VMX_EPT_READABLE_MASK) | init_value;
> > 
> > This change doesn't seem make any sense.  Why should "Suppress #VE" bit be set
> > for a present PTE?
> 
> Because W or NX violation also needs #VE.  Although the name uses present, it's
> actually readable.

Yeah I forgot this.  Thanks!
Sean Christopherson April 19, 2022, 7:55 p.m. UTC | #7
Sorry, missed my name...

On Fri, Apr 08, 2022, Isaku Yamahata wrote:
> On Tue, Apr 05, 2022 at 05:25:34PM +0200,
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> > On 3/4/22 20:48, isaku.yamahata@intel.com wrote:
> > > +	if (enable_ept) {
> > > +		const u64 init_value = enable_tdx ? VMX_EPT_SUPPRESS_VE_BIT : 0ull;
> > >   		kvm_mmu_set_ept_masks(enable_ept_ad_bits,
> > > -				      cpu_has_vmx_ept_execute_only());
> > > +				      cpu_has_vmx_ept_execute_only(), init_value);
> > > +		kvm_mmu_set_spte_init_value(init_value);
> > > +	}
> > 
> > I think kvm-intel.ko should use VMX_EPT_SUPPRESS_VE_BIT unconditionally as
> > the init value.  The bit is ignored anyway if the "EPT-violation #VE"
> > execution control is 0.
> > Otherwise looks good, but I have a couple more crazy ideas:
> > 
> > 1) there could even be a test mode where KVM enables the execution control,
> > traps #VE in the exception bitmap, and shouts loudly if it gets a #VE.  That
> > might avoid hard-to-find bugs due to forgetting about
> > VMX_EPT_SUPPRESS_VE_BIT.
> > 
> > 2) or even, perhaps the init_value for the TDP MMU could set bit 63
> > _unconditionally_, because KVM always sets the NX bit on AMD hardware.

Heh, took me a minute to realize you mean EFER.NX.  To clarify:

KVM requires NX support in hardware

	if (!boot_cpu_has(X86_FEATURE_NX)) {
		pr_err_ratelimited("NX (Execute Disable) not supported\n");
		return -EOPNOTSUPP;
	}

and 64-bit or PAE paging to enable NPT

	if (!IS_ENABLED(CONFIG_X86_64) && !IS_ENABLED(CONFIG_X86_PAE))
		npt_enabled = false;

and the _kernel_ forces EFER.NX=1 for 64-bit and PAE kernels.

But whether or not EFER.NX is enabled is irrelevant, it's only the initial value,
i.e. the SPTE is guaranteed to be !PRESENT, so hardware will never generate a
reserved bit #PF.

> > That would remove the whole infrastructure to keep shadow_init_value,
> > because it would be constant 0 in mmu.c and constant BIT(63) in tdp_mmu.c.
> > 
> > Sean, what do you think?

I like #2, though I suspect we'll still want shadow_init_value so that the MMU
caches can be shared without creating a mess.   But I still like keeping that
detail in the MMUs and out of the vendor modules, even though there's obviously
a hard dependency on the MMU doing the right thing.

> Then, I'll start with 1) because it's a bit hard for me to test 2) with real AMD
> hardware.  If someone is willing to test 2), I'm quite fine to implement 2)
> on top of 1).  2) isn't exclusive with 1).

I can test #2.

Tangentially related, the kvm_gfn_stolen_mask() exception to MMIO SPTEs is unnecessarily
convoluted and gross.  That's partly my fault as I should have just updated
enable_mmio_caching when hardware can't support it instead of using shadow_mmio_value
to convey that information.  I'll submit a patch to fix that, then is_mmio_spte()
can be left alone in the TDX series.
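
That later cleanup might look roughly like the following (a sketch of the idea
only, not the actual patch):

	/*
	 * In kvm_mmu_set_mmio_spte_mask() (sketch): if the value had to be
	 * zeroed because no usable bit pattern exists, record that MMIO
	 * caching is off instead of overloading shadow_mmio_value == 0.
	 */
	if (!mmio_value)
		enable_mmio_caching = false;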
diff mbox series

Patch

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d33d79f2af2d..fcab2337819c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1069,6 +1069,9 @@  struct kvm_arch {
 	 */
 	spinlock_t mmu_unsync_pages_lock;
 
+	u64 shadow_mmio_value;
+	u64 shadow_mmio_mask;
+
 	struct list_head assigned_dev_head;
 	struct iommu_domain *iommu_domain;
 	bool iommu_noncoherent;
diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 0ffaa3156a4e..88d9b8cc7dde 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -498,6 +498,7 @@  enum vmcs_field {
 #define VMX_EPT_IPAT_BIT    			(1ull << 6)
 #define VMX_EPT_ACCESS_BIT			(1ull << 8)
 #define VMX_EPT_DIRTY_BIT			(1ull << 9)
+#define VMX_EPT_SUPPRESS_VE_BIT			(1ull << 63)
 #define VMX_EPT_RWX_MASK                        (VMX_EPT_READABLE_MASK |       \
 						 VMX_EPT_WRITABLE_MASK |       \
 						 VMX_EPT_EXECUTABLE_MASK)
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 650989c37f2e..b49841e4faaa 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -64,8 +64,10 @@  static __always_inline u64 rsvd_bits(int s, int e)
 	return ((2ULL << (e - s)) - 1) << s;
 }
 
-void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask);
-void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only);
+void kvm_mmu_set_mmio_spte_mask(struct kvm *kvm, u64 mmio_value, u64 mmio_mask,
+				u64 access_mask);
+void kvm_mmu_set_default_mmio_spte_mask(u64 mask);
+void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only, u64 init_value);
 void kvm_mmu_set_spte_init_value(u64 init_value);
 
 void kvm_init_mmu(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index d8c1505155b0..6e9847b1124b 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2336,7 +2336,7 @@  static int mmu_page_zap_pte(struct kvm *kvm, struct kvm_mmu_page *sp,
 				return kvm_mmu_prepare_zap_page(kvm, child,
 								invalid_list);
 		}
-	} else if (is_mmio_spte(pte)) {
+	} else if (is_mmio_spte(kvm, pte)) {
 		mmu_spte_clear_no_track(spte);
 	}
 	return 0;
@@ -3069,9 +3069,12 @@  static bool handle_abnormal_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fa
 		/*
 		 * If MMIO caching is disabled, emulate immediately without
 		 * touching the shadow page tables as attempting to install an
-		 * MMIO SPTE will just be an expensive nop.
+		 * MMIO SPTE will just be an expensive nop, but excludes the
+		 * INTEL TD guest due to it also uses shadow_mmio_value = 0
+		 * to emulating MMIO access.
 		 */
-		if (unlikely(!shadow_mmio_value)) {
+		if (unlikely(!vcpu->kvm->arch.shadow_mmio_value)
+		    && !kvm_gfn_stolen_mask(vcpu->kvm)) {
 			*ret_val = RET_PF_EMULATE;
 			return true;
 		}
@@ -3209,7 +3212,8 @@  static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 			break;
 
 		sp = sptep_to_sp(sptep);
-		if (!is_last_spte(spte, sp->role.level) || is_mmio_spte(spte))
+		if (!is_last_spte(spte, sp->role.level) ||
+			is_mmio_spte(vcpu->kvm, spte))
 			break;
 
 		/*
@@ -3892,7 +3896,7 @@  static int handle_mmio_page_fault(struct kvm_vcpu *vcpu, u64 addr, bool direct)
 	if (WARN_ON(reserved))
 		return -EINVAL;
 
-	if (is_mmio_spte(spte)) {
+	if (is_mmio_spte(vcpu->kvm, spte)) {
 		gfn_t gfn = get_mmio_spte_gfn(spte);
 		unsigned int access = get_mmio_spte_access(spte);
 
@@ -4294,7 +4298,7 @@  static unsigned long get_cr3(struct kvm_vcpu *vcpu)
 static bool sync_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, gfn_t gfn,
 			   unsigned int access)
 {
-	if (unlikely(is_mmio_spte(*sptep))) {
+	if (unlikely(is_mmio_spte(vcpu->kvm, *sptep))) {
 		if (gfn != get_mmio_spte_gfn(*sptep)) {
 			mmu_spte_clear_no_track(sptep);
 			return true;
@@ -5791,6 +5795,9 @@  void kvm_mmu_init_vm(struct kvm *kvm)
 	kvm_page_track_register_notifier(kvm, node);
 
 	kvm->arch.tdp_max_page_level = KVM_MAX_HUGEPAGE_LEVEL;
+	kvm_mmu_set_mmio_spte_mask(kvm, shadow_default_mmio_mask,
+				   shadow_default_mmio_mask,
+				   ACC_WRITE_MASK | ACC_USER_MASK);
 }
 
 void kvm_mmu_uninit_vm(struct kvm *kvm)
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 5071e8332db2..ea83927b9231 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -29,8 +29,7 @@  u64 __read_mostly shadow_x_mask; /* mutual exclusive with nx_mask */
 u64 __read_mostly shadow_user_mask;
 u64 __read_mostly shadow_accessed_mask;
 u64 __read_mostly shadow_dirty_mask;
-u64 __read_mostly shadow_mmio_value;
-u64 __read_mostly shadow_mmio_mask;
+u64 __read_mostly shadow_default_mmio_mask;
 u64 __read_mostly shadow_mmio_access_mask;
 u64 __read_mostly shadow_present_mask;
 u64 __read_mostly shadow_me_mask;
@@ -59,10 +58,11 @@  u64 make_mmio_spte(struct kvm_vcpu *vcpu, u64 gfn, unsigned int access)
 	u64 spte = generation_mmio_spte_mask(gen);
 	u64 gpa = gfn << PAGE_SHIFT;
 
-	WARN_ON_ONCE(!shadow_mmio_value);
+	WARN_ON_ONCE(!vcpu->kvm->arch.shadow_mmio_value &&
+		     !kvm_gfn_stolen_mask(vcpu->kvm));
 
 	access &= shadow_mmio_access_mask;
-	spte |= shadow_mmio_value | access;
+	spte |= vcpu->kvm->arch.shadow_mmio_value | access;
 	spte |= gpa | shadow_nonpresent_or_rsvd_mask;
 	spte |= (gpa & shadow_nonpresent_or_rsvd_mask)
 		<< SHADOW_NONPRESENT_OR_RSVD_MASK_LEN;
@@ -279,7 +279,8 @@  u64 mark_spte_for_access_track(u64 spte)
 	return spte;
 }
 
-void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask)
+void kvm_mmu_set_mmio_spte_mask(struct kvm *kvm, u64 mmio_value, u64 mmio_mask,
+				u64 access_mask)
 {
 	BUG_ON((u64)(unsigned)access_mask != access_mask);
 	WARN_ON(mmio_value & shadow_nonpresent_or_rsvd_lower_gfn_mask);
@@ -308,39 +309,32 @@  void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask)
 	    WARN_ON(mmio_value && (REMOVED_SPTE & mmio_mask) == mmio_value))
 		mmio_value = 0;
 
-	shadow_mmio_value = mmio_value;
-	shadow_mmio_mask  = mmio_mask;
+	kvm->arch.shadow_mmio_value = mmio_value;
+	kvm->arch.shadow_mmio_mask = mmio_mask;
 	shadow_mmio_access_mask = access_mask;
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_set_mmio_spte_mask);
 
-void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only)
+void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only, u64 init_value)
 {
 	shadow_user_mask	= VMX_EPT_READABLE_MASK;
 	shadow_accessed_mask	= has_ad_bits ? VMX_EPT_ACCESS_BIT : 0ull;
 	shadow_dirty_mask	= has_ad_bits ? VMX_EPT_DIRTY_BIT : 0ull;
 	shadow_nx_mask		= 0ull;
 	shadow_x_mask		= VMX_EPT_EXECUTABLE_MASK;
-	shadow_present_mask	= has_exec_only ? 0ull : VMX_EPT_READABLE_MASK;
+	shadow_present_mask	=
+		(has_exec_only ? 0ull : VMX_EPT_READABLE_MASK) | init_value;
 	shadow_acc_track_mask	= VMX_EPT_RWX_MASK;
 	shadow_me_mask		= 0ull;
 
 	shadow_host_writable_mask = EPT_SPTE_HOST_WRITABLE;
 	shadow_mmu_writable_mask  = EPT_SPTE_MMU_WRITABLE;
-
-	/*
-	 * EPT Misconfigurations are generated if the value of bits 2:0
-	 * of an EPT paging-structure entry is 110b (write/execute).
-	 */
-	kvm_mmu_set_mmio_spte_mask(VMX_EPT_MISCONFIG_WX_VALUE,
-				   VMX_EPT_RWX_MASK, 0);
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_set_ept_masks);
 
 void kvm_mmu_reset_all_pte_masks(void)
 {
 	u8 low_phys_bits;
-	u64 mask;
 
 	shadow_phys_bits = kvm_get_shadow_phys_bits();
 
@@ -389,9 +383,13 @@  void kvm_mmu_reset_all_pte_masks(void)
 	 * PTEs and so the reserved PA approach must be disabled.
 	 */
 	if (shadow_phys_bits < 52)
-		mask = BIT_ULL(51) | PT_PRESENT_MASK;
+		shadow_default_mmio_mask = BIT_ULL(51) | PT_PRESENT_MASK;
 	else
-		mask = 0;
+		shadow_default_mmio_mask = 0;
+}
 
-	kvm_mmu_set_mmio_spte_mask(mask, mask, ACC_WRITE_MASK | ACC_USER_MASK);
+void kvm_mmu_set_default_mmio_spte_mask(u64 mask)
+{
+	shadow_default_mmio_mask = mask;
 }
+EXPORT_SYMBOL_GPL(kvm_mmu_set_default_mmio_spte_mask);
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 8e13a35ab8c9..bde843bce878 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -165,8 +165,7 @@  extern u64 __read_mostly shadow_x_mask; /* mutual exclusive with nx_mask */
 extern u64 __read_mostly shadow_user_mask;
 extern u64 __read_mostly shadow_accessed_mask;
 extern u64 __read_mostly shadow_dirty_mask;
-extern u64 __read_mostly shadow_mmio_value;
-extern u64 __read_mostly shadow_mmio_mask;
+extern u64 __read_mostly shadow_default_mmio_mask;
 extern u64 __read_mostly shadow_mmio_access_mask;
 extern u64 __read_mostly shadow_present_mask;
 extern u64 __read_mostly shadow_me_mask;
@@ -229,10 +228,10 @@  extern u64 __read_mostly shadow_nonpresent_or_rsvd_lower_gfn_mask;
  */
 extern u8 __read_mostly shadow_phys_bits;
 
-static inline bool is_mmio_spte(u64 spte)
+static inline bool is_mmio_spte(struct kvm *kvm, u64 spte)
 {
-	return (spte & shadow_mmio_mask) == shadow_mmio_value &&
-	       likely(shadow_mmio_value);
+	return (spte & kvm->arch.shadow_mmio_mask) == kvm->arch.shadow_mmio_value &&
+		likely(kvm->arch.shadow_mmio_value || kvm_gfn_stolen_mask(kvm));
 }
 
 static inline bool is_shadow_present_pte(u64 pte)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index bc9e3553fba2..ebd0a02620e8 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -447,8 +447,8 @@  static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
 		 * impact the guest since both the former and current SPTEs
 		 * are nonpresent.
 		 */
-		if (WARN_ON(!is_mmio_spte(old_spte) &&
-			    !is_mmio_spte(new_spte) &&
+		if (WARN_ON(!is_mmio_spte(kvm, old_spte) &&
+			    !is_mmio_spte(kvm, new_spte) &&
 			    !is_removed_spte(new_spte)))
 			pr_err("Unexpected SPTE change! Nonpresent SPTEs\n"
 			       "should not be replaced with another,\n"
@@ -927,7 +927,7 @@  static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
 	}
 
 	/* If a MMIO SPTE is installed, the MMIO will need to be emulated. */
-	if (unlikely(is_mmio_spte(new_spte))) {
+	if (unlikely(is_mmio_spte(vcpu->kvm, new_spte))) {
 		trace_mark_mmio_spte(rcu_dereference(iter->sptep), iter->gfn,
 				     new_spte);
 		ret = RET_PF_EMULATE;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 778075b71dc3..c7eec23e9ebe 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4704,7 +4704,7 @@  static __init void svm_adjust_mmio_mask(void)
 	 */
 	mask = (mask_bit < 52) ? rsvd_bits(mask_bit, 51) | PT_PRESENT_MASK : 0;
 
-	kvm_mmu_set_mmio_spte_mask(mask, mask, PT_WRITABLE_MASK | PT_USER_MASK);
+	kvm_mmu_set_default_mmio_spte_mask(mask);
 }
 
 static __init void svm_set_cpu_caps(void)
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 51aaafe6b432..b242a9dc9e29 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -23,9 +23,12 @@  static __init int vt_hardware_setup(void)
 
 	tdx_hardware_setup(&vt_x86_ops);
 
-	if (enable_ept)
+	if (enable_ept) {
+		const u64 init_value = enable_tdx ? VMX_EPT_SUPPRESS_VE_BIT : 0ull;
 		kvm_mmu_set_ept_masks(enable_ept_ad_bits,
-				      cpu_has_vmx_ept_execute_only());
+				      cpu_has_vmx_ept_execute_only(), init_value);
+		kvm_mmu_set_spte_init_value(init_value);
+	}
 
 	return 0;
 }
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index f86a257dd71b..c3434b33c452 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -11,7 +11,7 @@ 
 #undef pr_fmt
 #define pr_fmt(fmt) "tdx: " fmt
 
-static bool __read_mostly enable_tdx = true;
+bool __read_mostly enable_tdx = true;
 module_param_named(tdx, enable_tdx, bool, 0644);
 
 #define TDX_MAX_NR_CPUID_CONFIGS					\
diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index 4ce7fcab6f64..b32e068c51b4 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -6,6 +6,7 @@ 
 
 #include "tdx_ops.h"
 
+extern bool __read_mostly enable_tdx;
 int tdx_module_setup(void);
 
 struct tdx_td_page {
@@ -166,6 +167,7 @@  static __always_inline u64 td_tdcs_exec_read64(struct kvm_tdx *kvm_tdx, u32 fiel
 }
 
 #else
+#define enable_tdx false
 static inline int tdx_module_setup(void) { return -ENODEV; };
 
 struct kvm_tdx;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 07fd892768be..00f88aa25047 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7065,6 +7065,14 @@  int vmx_vm_init(struct kvm *kvm)
 	if (!ple_gap)
 		kvm->arch.pause_in_guest = true;
 
+	/*
+	 * EPT Misconfigurations can be generated if the value of bits 2:0
+	 * of an EPT paging-structure entry is 110b (write/execute).
+	 */
+	if (enable_ept)
+		kvm_mmu_set_mmio_spte_mask(kvm, VMX_EPT_MISCONFIG_WX_VALUE,
+					   VMX_EPT_MISCONFIG_WX_VALUE, 0);
+
 	if (boot_cpu_has(X86_BUG_L1TF) && enable_ept) {
 		switch (l1tf_mitigation) {
 		case L1TF_MITIGATION_OFF: