[v21,27/28] docs: x86/sgx: Document kernel internals
diff mbox series

Message ID 20190713170804.2340-28-jarkko.sakkinen@linux.intel.com
State New
Headers show
  • Intel SGX foundations
Related show

Commit Message

Jarkko Sakkinen July 13, 2019, 5:08 p.m. UTC
From: Sean Christopherson <sean.j.christopherson@intel.com>

Document some of the more tricky parts of the kernel implementation

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Co-developed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
 Documentation/x86/sgx/2.Kernel-internals.rst | 76 ++++++++++++++++++++
 Documentation/x86/sgx/index.rst              |  1 +
 2 files changed, 77 insertions(+)
 create mode 100644 Documentation/x86/sgx/2.Kernel-internals.rst

diff mbox series

diff --git a/Documentation/x86/sgx/2.Kernel-internals.rst b/Documentation/x86/sgx/2.Kernel-internals.rst
new file mode 100644
index 000000000000..5c90a65936f2
--- /dev/null
+++ b/Documentation/x86/sgx/2.Kernel-internals.rst
@@ -0,0 +1,76 @@ 
+.. SPDX-License-Identifier: GPL-2.0
+Kernel Internals
+CPU configuration
+Because SGX has an ever evolving and expanding feature set, it's possible for
+a BIOS or VMM to configure a system in such a way that not all CPUs are equal,
+e.g. where Launch Control is only enabled on a subset of CPUs.  Linux does
+*not* support such a heterogeneous system configuration, nor does it even
+attempt to play nice in the face of a misconfigured system.  With the exception
+of Launch Control's hash MSRs, which can vary per CPU, Linux assumes that all
+CPUs have a configuration that is identical to the boot CPU.
+EPC management
+Because the kernel can't arbitrarily read EPC memory or share RO backing pages
+between enclaves, traditional memory models such as CoW and fork() do not work
+with enclaves.  In other words, the architectural rules of EPC forces it to be
+treated as MAP_SHARED at all times.
+The inability to employ traditional memory models also means that EPC memory
+must be isolated from normal memory pools, e.g. attempting to use EPC memory
+for normal mappings would result in faults and/or perceived data corruption.
+Furthermore, EPC is not enumerated by as normal memory, e.g. BIOS enumerates
+EPC as reserved memory in the e820 tables, or not at all.  As a result, EPC
+memory is directly managed by the SGX subsystem, e.g. SGX employs VM_PFNMAP to
+manually insert/zap/swap page table entries, and exposes EPC to userspace via
+a well known device, /dev/sgx/enclave.
+The net effect is that all enclave VMAs must be MAP_SHARED and are backed by
+a single file, /dev/sgx/enclave.
+EPC oversubscription
+SGX allows to have larger enclaves than amount of available EPC by providing a
+subset of leaf instruction for swapping EPC pages to the system memory.  The
+details of these instructions are discussed in the architecture document. Due
+to the unique requirements for swapping EPC pages, and because EPC pages do not
+have associated page structures, management of the EPC is not handled by the
+standard memory subsystem.
+SGX directly handles swapping of EPC pages, including a thread to initiate the
+reclaiming process and a rudimentary LRU mechanism. When the amount of free EPC
+pages goes below a low watermark the swapping thread starts reclaiming pages.
+The pages that have not been recently accessed (i.e. do not have the A bit set)
+are selected as victim pages. Each enclave holds an shmem file as a backing
+storage for reclaimed pages.
+Launch Control
+The current kernel implementation supports only writable MSRs. The launch is
+performed by setting the MSRs to the hash of the public key modulus of the
+enclave signer and a token with the valid bit set to zero. Because kernel makes
+ultimately all the launch decisions token are not needed for anything.  We
+don't need or have a launch enclave for generating them as the MSRs must always
+be writable.
+The use of provisioning must be controlled because it allows to get access to
+the provisioning keys to attest to a remote party that the software is running
+inside a legit enclave. This could be used by a malware network to ensure that
+its nodes are running inside legit enclaves.
+The driver introduces a special device file /dev/sgx/provision and a special
+ioctl SGX_IOC_ENCLAVE_SET_ATTRIBUTE to accomplish this. A file descriptor
+pointing to /dev/sgx/provision is passed to ioctl from which kernel authorizes
+the PROVISION_KEY attribute to the enclave.
diff --git a/Documentation/x86/sgx/index.rst b/Documentation/x86/sgx/index.rst
index c5dfef62e612..5d660e83d984 100644
--- a/Documentation/x86/sgx/index.rst
+++ b/Documentation/x86/sgx/index.rst
@@ -14,3 +14,4 @@  potentially malicious.
    :maxdepth: 1
+   2.Kernel-internals