From patchwork Sat Feb 13 13:28:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Huang, Kai" X-Patchwork-Id: 12086745 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI, MENTIONS_GIT_HOSTING,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B48A8C433E0 for ; Sat, 13 Feb 2021 13:29:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 694F464E44 for ; Sat, 13 Feb 2021 13:29:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229691AbhBMN3X (ORCPT ); Sat, 13 Feb 2021 08:29:23 -0500 Received: from mga01.intel.com ([192.55.52.88]:30281 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229672AbhBMN3V (ORCPT ); Sat, 13 Feb 2021 08:29:21 -0500 IronPort-SDR: R9VKivXYBsKYl173QNdGo2L7EHQskaHmgNoCdqPeh+8JnFyb9kA1iWxml5oFWGffNt2yC68BJR lk8vTG9TlrUA== X-IronPort-AV: E=McAfee;i="6000,8403,9893"; a="201682344" X-IronPort-AV: E=Sophos;i="5.81,176,1610438400"; d="scan'208";a="201682344" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Feb 2021 05:28:39 -0800 IronPort-SDR: ZlGz64bjU5OxLr0TP9p+5e5jQch2dzT2u21y9q0ZsYEawi+DES8Sq+lposPyWMOlc/GaPA07/w FNA1eR6Xm9Ww== X-IronPort-AV: E=Sophos;i="5.81,176,1610438400"; d="scan'208";a="398365933" Received: from kshah-mobl1.amr.corp.intel.com (HELO khuang2-desk.gar.corp.intel.com) ([10.255.230.239]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Feb 2021 05:28:35 -0800 From: Kai Huang To: linux-sgx@vger.kernel.org, kvm@vger.kernel.org, x86@kernel.org Cc: seanjc@google.com, jarkko@kernel.org, luto@kernel.org, dave.hansen@intel.com, rick.p.edgecombe@intel.com, haitao.huang@intel.com, pbonzini@redhat.com, bp@alien8.de, tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com, jethro@fortanix.com, b.thiel@posteo.de, jmattson@google.com, joro@8bytes.org, vkuznets@redhat.com, wanpengli@tencent.com, corbet@lwn.net Subject: [RFC PATCH v5 00/26] KVM SGX virtualization support Date: Sun, 14 Feb 2021 02:28:28 +1300 Message-Id: X-Mailer: git-send-email 2.29.2 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-sgx@vger.kernel.org --- Disclaimer --- These patches were originally written by Sean Christopherson while at Intel. Now that Sean has left Intel, I (Kai) have taken over getting them upstream. This series needs more review before it can be merged. It is being posted publicly and under RFC so Sean and others can review it. X86 Maintainers are safe ignoring it for now. ------------------ Hi all, This series adds KVM SGX virtualization support. The first 14 patches starting with x86/sgx or x86/cpu.. are necessary changes to x86 and SGX core/driver to support KVM SGX virtualization, while the rest are patches to KVM subsystem. Please help to review this series. Any feedback is highly appreciated. Please let me know if I forgot to CC anyone, or anyone wants to be removed from CC. Thanks in advance! This series is based against tip/x86/sgx. You can also get the code from upstream branch of kvm-sgx repo on github: https://github.com/intel/kvm-sgx.git upstream It also requires Qemu changes to create VM with SGX support. You can find Qemu repo here: https://github.com/intel/qemu-sgx.git upstream Please refer to README.md of above qemu-sgx repo for detail on how to create guest with SGX support. At meantime, for your quick reference you can use below command to create SGX guest: #qemu-system-x86_64 -smp 4 -m 2G -drive file=,if=virtio \ -cpu host,+sgx_provisionkey \ -sgx-epc id=epc1,memdev=mem1 \ -object memory-backend-epc,id=mem1,size=64M,prealloc Please note that the SGX relevant part is: -cpu host,+sgx_provisionkey \ -sgx-epc id=epc1,memdev=mem1 \ -object memory-backend-epc,id=mem1,size=64M,prealloc And you can change other parameters of your qemu command based on your needs. ========= Changelog: (Changelog here is for global changes. Please see each patch's changelog for changes made to specific patch.) v4->v5 (No big update from v4): - Updated commit message of patch 4, per Jarkko. - Updated comments and commit msg for patch 3, per Dave and Sean. v3->v4: - Removed Jarkko's first patch, which removes WARN() against EPC page's SGX_EPC_PAGE_RECLAIMER_TRACKED flag. Dave and Sean were not convinced, and Sean "tripped more than once in the past during one of the many rebases of the virtual EPC and EPC cgroup branches". - Switched the patch to introduce SGX1/SGX2 feature bits, and the patch to move SGX_LC to cpuid_deps[] table, per Jarkko. Now SGX_LC patch can be applied to existing upstream code directly, if needed. - Fixed two problems in one KVM patch (trap ECREATE) after review from Rick and Sean. - Some other minor changes, i.e. update new file's copyright to 2021. Hi Dave, Sean, Paolo, Rick, and others, Btw, in last review, we identified that malicious guest can cause KVM to access EPC, which will trigger "page abort", which drops on write, and returns all ones on read. If I understand correctly, we agreed that a better option is to make memremap() deny EPC. However so far this series doesn't contain this work, since looks more discussion is required, and this work can be done independently from KVM SGX series. v2->v3: - Split original "x86/cpufeatures: Add SGX1 and SGX2 sub-features" patch into two patches, by splitting moving SGX_LC bit also into cpuid-deps table logic into a separate patch 2: [RFC PATCH v3 01/27] x86/cpufeatures: Add SGX1 and SGX2 sub-features [RFC PATCH v3 02/27] x86/cpufeatures: Make SGX_LC feature bit depend on SGX bit - Changed from /dev/sgx_virt_epc to /dev/sgx_vepc, per Jarkko. And accordingly, changed prefix 'sgx_virt_epc_xx' to 'sgx_vepc_xx' in various functions and structures. - Changed CONFIG_X86_SGX_VIRTUALIZATION to CONFIG_X86_SGX_KVM, per Dave. Couple of x86 patches and KVM patches are changed too due to the renaming. v1->v2: - Refined this cover letter by addressing comments from Dave and Jarkko. - The original patch which introduced new X86_FEATURE_SGX1/SGX2 were replaced by 3 new patches from Sean, following Boris and Sean's discussion. [RFC PATCH v2 01/26] x86/cpufeatures: Add SGX1 and SGX2 sub-features [RFC PATCH v2 18/26] KVM: x86: Add support for reverse CPUID lookup of scattered features [RFC PATCH v2 19/26] KVM: x86: Add reverse-CPUID lookup support for scattered SGX features - The original patch 1 x86/sgx: Split out adding EPC page to free list to separate helper was replaced with 2 new patches from Jarkko [RFC PATCH v2 02/26] x86/sgx: Remove a warn from sgx_free_epc_page() [RFC PATCH v2 03/26] x86/sgx: Wipe out EREMOVE from sgx_free_epc_page() addressing Jarkko's comments. - Moved modifying sgx_init() to always initialize sgx_virt_epc_init() out of patch x86/sgx: Introduce virtual EPC for use by KVM guests to a separate patch: [RFC PATCH v2 07/26] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled to address Dave's comment that patch ordering can be improved due to before patch "Allow SGX virtualization without Launch Control support", all SGX, including SGX virtualization, is actually disabled when SGX LC is not present. ========= KVM SGX virtualization Overview - Virtual EPC SGX enclave memory is special and is reserved specifically for enclave use. In bare-metal SGX enclaves, the kernel allocates enclave pages, copies data into the pages with privileged instructions, then allows the enclave to start. In this scenario, only initialized pages already assigned to an enclave are mapped to userspace. In virtualized environments, the hypervisor still needs to do the physical enclave page allocation. The guest kernel is responsible for the data copying (among other things). This means that the job of starting an enclave is now split between hypervisor and guest. This series introduces a new misc device: /dev/sgx_vepc. This device allows the host to map *uninitialized* enclave memory into userspace, which can then be passed into a guest. While it might be *possible* to start a host-side enclave with /dev/sgx_enclave and pass its memory into a guest, it would be wasteful and convoluted. Implement the *raw* EPC allocation in the x86 core-SGX subsystem via /dev/sgx_vepc rather than in KVM. Doing so has two major advantages: - Does not require changes to KVM's uAPI, e.g. EPC gets handled as just another memory backend for guests. - EPC management is wholly contained in the SGX subsystem, e.g. SGX does not have to export any symbols, changes to reclaim flows don't need to be routed through KVM, SGX's dirty laundry doesn't have to get aired out for the world to see, and so on and so forth. The virtual EPC pages allocated to guests are currently not reclaimable. Reclaiming EPC page used by enclave requires a special reclaim mechanism separate from normal page reclaim, and that mechanism is not supported for virutal EPC pages. Due to the complications of handling reclaim conflicts between guest and host, reclaiming virtual EPC pages is significantly more complex than basic support for SGX virtualization. - Support SGX virtualization without SGX Flexible Launch Control SGX hardware supports two "launch control" modes to limit which enclaves can run. In the "locked" mode, the hardware prevents enclaves from running unless they are blessed by a third party. In the unlocked mode, the kernel is in full control of which enclaves can run. The bare-metal SGX code refuses to launch enclaves unless it is in the unlocked mode. This sgx_virt_epc driver does not have such a restriction. This allows guests which are OK with the locked mode to use SGX, even if the host kernel refuses to. - Support exposing SGX2 Due to the same reason above, SGX2 feature detection is added to core SGX code to allow KVM to expose SGX2 to guest, even currently SGX driver doesn't support SGX2, because SGX2 can work just fine in guest w/o any interaction to host SGX driver. - Restricit SGX guest access to provisioning key To grant guest being able to fully use SGX, guest needs to be able to access provisioning key. The provisioning key is sensitive, and accessing to it should be restricted. In bare-metal driver, allowing enclave to access provisioning key is restricted by being able to open /dev/sgx_provision. Add a new KVM_CAP_SGX_ATTRIBUTE to KVM uAPI to extend above mechanism to KVM guests as well. When userspace hypervisor creates a new VM, the new cap is only added to VM when userspace hypervisior is able to open /dev/sgx_provision, following the same role as in bare-metal driver. KVM then traps ECREATE from guest, and only allows ECREATE with provisioning key bit to run when guest supports KVM_CAP_SGX_ATTRIBUTE. Jarkko Sakkinen (1): x86/sgx: Wipe out EREMOVE from sgx_free_epc_page() Kai Huang (3): x86/cpufeatures: Make SGX_LC feature bit depend on SGX bit x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled x86/sgx: Add helper to update SGX_LEPUBKEYHASHn MSRs Sean Christopherson (22): x86/cpufeatures: Add SGX1 and SGX2 sub-features x86/sgx: Add SGX_CHILD_PRESENT hardware error code x86/sgx: Introduce virtual EPC for use by KVM guests x86/cpu/intel: Allow SGX virtualization without Launch Control support x86/sgx: Expose SGX architectural definitions to the kernel x86/sgx: Move ENCLS leaf definitions to sgx_arch.h x86/sgx: Add SGX2 ENCLS leaf definitions (EAUG, EMODPR and EMODT) x86/sgx: Add encls_faulted() helper x86/sgx: Add helpers to expose ECREATE and EINIT to KVM x86/sgx: Move provisioning device creation out of SGX driver KVM: VMX: Convert vcpu_vmx.exit_reason to a union KVM: x86: Export kvm_mmu_gva_to_gpa_{read,write}() for SGX (VMX) KVM: x86: Define new #PF SGX error code bit KVM: x86: Add support for reverse CPUID lookup of scattered features KVM: x86: Add reverse-CPUID lookup support for scattered SGX features KVM: VMX: Add basic handling of VM-Exit from SGX enclave KVM: VMX: Frame in ENCLS handler for SGX virtualization KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions KVM: VMX: Add emulation of SGX Launch Control LE hash MSRs KVM: VMX: Add ENCLS[EINIT] handler to support SGX Launch Control (LC) KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC KVM: x86: Add capability to grant VM access to privileged SGX attribute Documentation/virt/kvm/api.rst | 23 + arch/x86/Kconfig | 12 + arch/x86/include/asm/cpufeatures.h | 2 + arch/x86/include/asm/kvm_host.h | 5 + arch/x86/include/asm/sgx.h | 19 + .../cpu/sgx/arch.h => include/asm/sgx_arch.h} | 20 + arch/x86/include/asm/vmx.h | 1 + arch/x86/include/uapi/asm/vmx.h | 1 + arch/x86/kernel/cpu/cpuid-deps.c | 3 + arch/x86/kernel/cpu/feat_ctl.c | 69 ++- arch/x86/kernel/cpu/scattered.c | 2 + arch/x86/kernel/cpu/sgx/Makefile | 1 + arch/x86/kernel/cpu/sgx/driver.c | 17 - arch/x86/kernel/cpu/sgx/encl.c | 21 +- arch/x86/kernel/cpu/sgx/encls.h | 30 +- arch/x86/kernel/cpu/sgx/ioctl.c | 23 +- arch/x86/kernel/cpu/sgx/main.c | 93 +++- arch/x86/kernel/cpu/sgx/sgx.h | 13 +- arch/x86/kernel/cpu/sgx/virt.c | 353 +++++++++++++ arch/x86/kvm/Makefile | 2 + arch/x86/kvm/cpuid.c | 89 +++- arch/x86/kvm/cpuid.h | 50 +- arch/x86/kvm/vmx/nested.c | 70 ++- arch/x86/kvm/vmx/nested.h | 5 + arch/x86/kvm/vmx/sgx.c | 466 ++++++++++++++++++ arch/x86/kvm/vmx/sgx.h | 34 ++ arch/x86/kvm/vmx/vmcs12.c | 1 + arch/x86/kvm/vmx/vmcs12.h | 4 +- arch/x86/kvm/vmx/vmx.c | 171 +++++-- arch/x86/kvm/vmx/vmx.h | 27 +- arch/x86/kvm/x86.c | 24 + include/uapi/linux/kvm.h | 1 + tools/testing/selftests/sgx/defines.h | 2 +- 33 files changed, 1498 insertions(+), 156 deletions(-) create mode 100644 arch/x86/include/asm/sgx.h rename arch/x86/{kernel/cpu/sgx/arch.h => include/asm/sgx_arch.h} (96%) create mode 100644 arch/x86/kernel/cpu/sgx/virt.c create mode 100644 arch/x86/kvm/vmx/sgx.c create mode 100644 arch/x86/kvm/vmx/sgx.h