[RFC,v3,31/59] KVM: x86: Add infrastructure for stolen GPA bits

From: Rick Edgecombe <rick.p.edgecombe@intel.com>

Add support in KVM's MMU for aliasing multiple GPAs (from a hardware
perspective) to a single GPA (from a memslot perspective). GPA aliasing
will be used to repurpose GPA bits as attribute bits, e.g. to expose an
execute-only permission bit to the guest. To keep the implementation
simple (relatively speaking), GPA aliasing is only supported via TDP.

Today KVM assumes two things that are broken by GPA aliasing.
  1. GPAs coming from hardware can be simply shifted to get the GFNs.
  2. GPA bits 51:MAXPHYADDR are reserved to zero.

With GPA aliasing, translating a GPA to GFN requires masking off the
repurposed bit, and a repurposed bit may reside in 51:MAXPHYADDR.
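As an illustration of assumption (2), the reserved-bit check has to stop
treating a repurposed bit as reserved. The following is a minimal userspace
sketch, not code from this patch: bits_mask() and gpa_rsvd_mask() are
hypothetical names, and the stolen mask is assumed to be a plain bitmask of
GPA bits.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t gpa_t;

/* Mask with bits lo..hi (inclusive) set. */
static inline uint64_t bits_mask(unsigned int hi, unsigned int lo)
{
	assert(lo <= hi && hi <= 63);
	return (~0ULL >> (63 - hi)) & (~0ULL << lo);
}

/*
 * Hypothetical helper: GPA bits that must still be zero once the stolen
 * bits are carved out of the 51:MAXPHYADDR range.
 */
static inline gpa_t gpa_rsvd_mask(unsigned int maxphyaddr, gpa_t stolen_mask)
{
	/* No reserved range below bit 52 if MAXPHYADDR already covers it. */
	if (maxphyaddr > 51)
		return 0;

	return bits_mask(51, maxphyaddr) & ~stolen_mask;
}
```

With MAXPHYADDR = 46 and bit 51 stolen, bits 50:46 remain reserved while a
GPA with bit 51 set is no longer rejected.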

To support GPA aliasing, introduce the concept of per-VM GPA stolen bits,
that is, bits stolen from the GPA to act as new virtualized attribute
bits. A bit set in the mask causes the MMU code to create aliases of the
GPA. The mask is also used to derive the GFN from a GPA reported by a TDP
fault.

To handle case (1) from above, retain any stolen bits when passing a GPA
in KVM's MMU code, but strip them when converting to a GFN so that the
GFN contains only the "real" GFN, i.e. never has repurposed bits set.
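The difference between the two conversions can be sketched as follows. This
is a self-contained userspace illustration under stated assumptions, not the
helpers added by this patch: the function names are hypothetical, PAGE_SHIFT
is taken to be 12, and the stolen mask is passed in explicitly rather than
looked up per VM.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t gpa_t;
typedef uint64_t gfn_t;

#define PAGE_SHIFT 12

/* Plain shift: any stolen bit set in the GPA leaks into the result. */
static inline gfn_t gpa_to_gfn_raw(gpa_t gpa)
{
	return gpa >> PAGE_SHIFT;
}

/* Hypothetical helper: strip stolen bits so only the "real" GFN remains. */
static inline gfn_t gpa_to_gfn_unaliased(gpa_t gpa, gpa_t stolen_mask)
{
	/* Stolen bits are assumed to sit above the page offset. */
	assert((stolen_mask & ((1ULL << PAGE_SHIFT) - 1)) == 0);

	return (gpa & ~stolen_mask) >> PAGE_SHIFT;
}
```

Both the aliased and the unaliased GPA map to the same GFN through
gpa_to_gfn_unaliased(), which is the property memslot lookups rely on.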

GFNs (without stolen bits) continue to be used to:
	-Specify physical memory by userspace via memslots
	-Map GPAs to TDP PTEs via RMAP
	-Specify dirty tracking and write protection
	-Look up MTRR types
	-Inject async page faults

Since there are now multiple aliases of the same GPA, both aliases need to
be modified when the userspace memory backing the memslots is paged out.
Fortunately this happens automatically: rmap already supports multiple
mappings for the same GFN (for PTE shadowing based paging), so by
adding/removing each alias PTE under its GFN, kvm_handle_hva() based
operations are applied to both aliases.
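The effect can be modelled with a toy reverse map. This is only a sketch of
the idea, assuming a flat array in place of KVM's real rmap structures; all
toy_* names are hypothetical and do not exist in KVM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gpa_t;
typedef uint64_t gfn_t;

#define PAGE_SHIFT 12
#define RMAP_MAX   8

struct toy_pte { gpa_t gpa; int present; };
struct toy_rmap_entry { gfn_t gfn; struct toy_pte *pte; };
struct toy_rmap { struct toy_rmap_entry e[RMAP_MAX]; size_t n; };

/* Record a PTE under its GFN with the stolen bits stripped. */
static inline void toy_rmap_add(struct toy_rmap *r, struct toy_pte *pte,
				gpa_t stolen_mask)
{
	assert(r->n < RMAP_MAX);
	r->e[r->n].gfn = (pte->gpa & ~stolen_mask) >> PAGE_SHIFT;
	r->e[r->n].pte = pte;
	r->n++;
}

/* Zap every PTE recorded for @gfn; returns how many were zapped. */
static inline size_t toy_rmap_zap_gfn(struct toy_rmap *r, gfn_t gfn)
{
	size_t zapped = 0;

	for (size_t i = 0; i < r->n; i++) {
		if (r->e[i].gfn == gfn && r->e[i].pte->present) {
			r->e[i].pte->present = 0;
			zapped++;
		}
	}
	return zapped;
}

/* Usage: two aliases of GFN 0x5 plus one unrelated page. */
static inline size_t toy_demo_zap(void)
{
	const gpa_t stolen = 1ULL << 51;
	struct toy_pte plain = { .gpa = 0x5000, .present = 1 };
	struct toy_pte alias = { .gpa = stolen | 0x5000, .present = 1 };
	struct toy_pte other = { .gpa = 0x6000, .present = 1 };
	struct toy_rmap rmap = { .n = 0 };

	toy_rmap_add(&rmap, &plain, stolen);
	toy_rmap_add(&rmap, &alias, stolen);
	toy_rmap_add(&rmap, &other, stolen);

	return toy_rmap_zap_gfn(&rmap, 0x5);
}
```

Because both alias PTEs are filed under the same stripped GFN, a single
GFN-based zap reaches both and leaves the unrelated page alone.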

If the rmap is removed in the future, the needed
information could be recovered by iterating over the stolen bits and
walking the TDP page tables.

For TLB flushes that are address based, make sure to flush both aliases
in the stolen bits case.
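One way to visit every alias of a GPA is to enumerate all combinations of
the stolen bits. The sketch below is an assumption about how that could
look, not the flush path from this patch: flush_gpa_all_aliases() and the
callback type are hypothetical, and the loop generalizes the "both aliases"
case (one stolen bit) to any number of stolen bits.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t gpa_t;

/* Hypothetical per-address flush callback. */
typedef void (*flush_fn)(gpa_t gpa, void *opaque);

/* Call @flush once for each alias of @gpa under @stolen_mask. */
static inline void flush_gpa_all_aliases(gpa_t gpa, gpa_t stolen_mask,
					 flush_fn flush, void *opaque)
{
	gpa_t base = gpa & ~stolen_mask;
	gpa_t alias = 0;

	assert(flush);

	do {
		flush(base | alias, opaque);
		/* Step to the next subset of the stolen bits, wrapping to 0. */
		alias = (alias - stolen_mask) & stolen_mask;
	} while (alias);
}

/* Usage: count how many addresses get flushed. */
static inline void count_flush(gpa_t gpa, void *opaque)
{
	(void)gpa;
	(*(unsigned int *)opaque)++;
}

static inline unsigned int count_aliases(gpa_t gpa, gpa_t stolen_mask)
{
	unsigned int n = 0;

	flush_gpa_all_aliases(gpa, stolen_mask, count_flush, &n);
	return n;
}
```

With a single stolen bit the callback runs twice, once for the GPA with the
bit clear and once with it set; with no stolen bits it runs once.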

Only support stolen bits in 64-bit guest paging modes (long, PAE).
Features that use this infrastructure should restrict the stolen bits to
exclude the other paging modes. Don't support stolen bits for shadow EPT.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 arch/x86/kvm/mmu.h              | 27 +++++++++++
 arch/x86/kvm/mmu/mmu.c          | 82 +++++++++++++++++++++++----------
 arch/x86/kvm/mmu/mmu_internal.h |  1 +
 arch/x86/kvm/mmu/paging_tmpl.h  | 25 ++++++----
 4 files changed, 100 insertions(+), 35 deletions(-)

Message ID	89046548aa74778658c6e66d219e157e71e439ab.1637799475.git.isaku.yamahata@intel.com (mailing list archive)
State	New, archived
Headers	From: isaku.yamahata@intel.com
	To: Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, "H . Peter Anvin" <hpa@zytor.com>, Paolo Bonzini <pbonzini@redhat.com>, Vitaly Kuznetsov <vkuznets@redhat.com>, Wanpeng Li <wanpengli@tencent.com>, Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>, erdemaktas@google.com, Connor Kuehl <ckuehl@redhat.com>, Sean Christopherson <seanjc@google.com>, linux-kernel@vger.kernel.org, kvm@vger.kernel.org
	Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Rick Edgecombe <rick.p.edgecombe@intel.com>
	Subject: [RFC PATCH v3 31/59] KVM: x86: Add infrastructure for stolen GPA bits
	Date: Wed, 24 Nov 2021 16:20:14 -0800
	In-Reply-To: <cover.1637799475.git.isaku.yamahata@intel.com>
Series	KVM: X86: TDX support
