Message ID | 20200817131003.56650-1-andraprs@amazon.com (mailing list archive) |
---|---|
Series | Add support for Nitro Enclaves |
On 17.08.20 15:09, Andra Paraschiv wrote:
> Nitro Enclaves (NE) is a new Amazon Elastic Compute Cloud (EC2) capability
> that allows customers to carve out isolated compute environments within EC2
> instances [1].
>
> For example, an application that processes sensitive data and runs in a VM
> can be separated from other applications running in the same VM. This
> application then runs in a VM separate from the primary VM, namely an enclave.
>
> An enclave runs alongside the VM that spawned it. This setup matches the needs
> of low-latency applications. The resources that are allocated for the enclave,
> such as memory and CPUs, are carved out of the primary VM. Each enclave is
> mapped to a process running in the primary VM that communicates with the NE
> driver via an ioctl interface.
>
> In this sense, there are two components:
>
> 1. An enclave abstraction process - a user space process running in the primary
> VM guest that uses the provided ioctl interface of the NE driver to spawn an
> enclave VM (that's 2 below).
>
> There is an NE emulated PCI device exposed to the primary VM. The driver for
> this new PCI device is included in the NE driver.
>
> The ioctl logic is mapped to PCI device commands, e.g. the NE_START_ENCLAVE
> ioctl maps to an enclave start PCI command. The PCI device commands are then
> translated into actions taken on the hypervisor side; that's the Nitro
> hypervisor running on the host where the primary VM is running. The Nitro
> hypervisor is based on core KVM technology.
>
> 2. The enclave itself - a VM running on the same host as the primary VM that
> spawned it. Memory and CPUs are carved out of the primary VM and are dedicated
> to the enclave VM. An enclave does not have persistent storage attached.
>
> The memory regions carved out of the primary VM and given to an enclave need to
> be aligned 2 MiB / 1 GiB physically contiguous memory regions (or a multiple of
> this size, e.g. 8 MiB). The memory can be allocated e.g. by using hugetlbfs
> from user space [2][3]. The memory size for an enclave needs to be at least
> 64 MiB. The enclave memory and CPUs need to be from the same NUMA node.
>
> An enclave runs on dedicated cores. CPU 0 and its CPU siblings need to remain
> available for the primary VM. A CPU pool has to be set for NE purposes by a
> user with admin capability. See the cpu list section from the kernel
> documentation [4] for how a CPU pool format looks.
>
> An enclave communicates with the primary VM via a local communication channel,
> using virtio-vsock [5]. The primary VM has a virtio-pci vsock emulated device,
> while the enclave VM has a virtio-mmio vsock emulated device. The vsock device
> uses eventfd for signaling. The enclave VM sees the usual interfaces - local
> APIC and IOAPIC - to get interrupts from the virtio-vsock device. The
> virtio-mmio device is placed in memory below the typical 4 GiB.
>
> The application that runs in the enclave needs to be packaged in an enclave
> image together with the OS (e.g. kernel, ramdisk, init) that will run in the
> enclave VM. The enclave VM has its own kernel and follows the standard Linux
> boot protocol.
>
> The kernel bzImage, the kernel command line and the ramdisk(s) are part of the
> Enclave Image Format (EIF), plus an EIF header including metadata such as magic
> number, EIF version, image size and CRC.
>
> Hash values are computed for the entire enclave image (EIF), the kernel and
> ramdisk(s). That's used, for example, to check that the enclave image that is
> loaded in the enclave VM is the one that was intended to be run.
>
> These crypto measurements are included in a signed attestation document
> generated by the Nitro Hypervisor and further used to prove the identity of the
> enclave; KMS is an example of a service that NE is integrated with and that
> checks the attestation doc.
>
> The enclave image (EIF) is loaded in the enclave memory at offset 8 MiB. The
> init process in the enclave connects to the vsock CID of the primary VM and a
> predefined port - 9000 - to send a heartbeat value - 0xb7. This mechanism is
> used to check in the primary VM that the enclave has booted.
>
> If the enclave VM crashes or gracefully exits, an interrupt event is received by
> the NE driver. This event is sent further to the user space enclave process
> running in the primary VM via a poll notification mechanism. Then the user space
> enclave process can exit.
>
> Thank you.
>

This version reads very well, thanks a lot Andra!

Greg, would you mind having another look over it?

Reviewed-by: Alexander Graf <graf@amazon.com>

Alex

Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Managing directors: Christian Schlaeger, Jonathan Weiss
Registered at Amtsgericht Charlottenburg under HRB 149173 B
Registered office: Berlin
VAT ID: DE 289 237 879

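The memory constraints stated in the cover letter (regions in multiples of 2 MiB, at least 64 MiB in total) can be illustrated with a small user-space check. This is only a sketch under assumptions: the function name and the idea of validating sizes before handing regions to the driver are hypothetical, and it does not check physical contiguity or NUMA placement, which cannot be seen from region sizes alone.

```python
MIB = 1024 * 1024
REGION_GRANULARITY = 2 * MIB    # smallest region size named in the cover letter
MIN_ENCLAVE_MEMORY = 64 * MIB   # minimum total enclave memory named in the cover letter


def check_enclave_memory(region_sizes):
    """Return a list of problems; an empty list means the sizes look acceptable.

    Hypothetical helper: it only looks at sizes in bytes. Physical contiguity
    and NUMA node placement are outside what this sketch can verify.
    """
    problems = []
    for index, size in enumerate(region_sizes):
        if size <= 0 or size % REGION_GRANULARITY != 0:
            problems.append(
                f"region {index}: size {size} is not a positive multiple of 2 MiB"
            )
    total = sum(region_sizes)
    if total < MIN_ENCLAVE_MEMORY:
        problems.append(f"total of {total} bytes is below the 64 MiB minimum")
    return problems
```

For instance, two 32 MiB regions pass this check, while a single 8 MiB region is flagged because the total stays below 64 MiB.
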
On Wed, Aug 19, 2020 at 01:15:59PM +0200, Alexander Graf wrote:
> On 17.08.20 15:09, Andra Paraschiv wrote:
> > [...]
>
> This version reads very well, thanks a lot Andra!
>
> Greg, would you mind having another look over it?

Will do, it's in my to-review queue, behind lots of other patches...

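The CPU pool rule from the cover letter (CPU 0 and its siblings stay with the primary VM) can be sketched as a check on a cpu list string. The helper names are hypothetical, the sibling set is passed in by the caller rather than read from the system, and only plain ids and "first-last" ranges are parsed; any other forms described in the kernel documentation are not handled here.

```python
def parse_cpu_list(text):
    """Parse a simple cpu list such as "2-5,8" into a sorted list of CPU ids."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = (int(value) for value in part.split("-", 1))
            if first > last:
                raise ValueError(f"invalid range: {part}")
            cpus.update(range(first, last + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def check_cpu_pool(text, cpu0_siblings=(0,)):
    """Return the parsed pool, or raise if it takes CPU 0 or one of its siblings.

    cpu0_siblings is an assumption supplied by the caller; a real tool would
    have to look up the CPU topology itself.
    """
    pool = parse_cpu_list(text)
    reserved = set(cpu0_siblings) | {0}
    taken = sorted(reserved.intersection(pool))
    if taken:
        raise ValueError(f"CPUs {taken} must stay available for the primary VM")
    return pool
```

With `cpu0_siblings=(0, 4)`, a pool of "1-3" is accepted, while "0-3" or "3-5" would be rejected.
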
On 19/08/2020 14:26, Greg KH wrote:
> On Wed, Aug 19, 2020 at 01:15:59PM +0200, Alexander Graf wrote:
>> On 17.08.20 15:09, Andra Paraschiv wrote:
>>> [...]
>>
>> This version reads very well, thanks a lot Andra!

Glad that the review experience has been improved and the patch series is in
better shape.

>> Greg, would you mind having another look over it?
>
> Will do, it's in my to-review queue, behind lots of other patches...

Thanks both for taking time to go through the patch series.

Andra

Amazon Development Center (Romania) S.R.L. registered office: 27A Sf. Lazar
Street, UBC5, floor 2, Iasi, Iasi County, 700045, Romania. Registered in
Romania. Registration number J22/2621/2005.

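The boot heartbeat described in the cover letter (the enclave init process connects to the primary VM on port 9000 and sends 0xb7) can be sketched from the primary VM's side. This is an illustration under assumptions: the function names are hypothetical, the heartbeat is assumed to be a single byte, and a local socket pair stands in for the vsock connection so that the sketch runs anywhere. On a real primary VM the listening side would presumably be an AF_VSOCK socket bound to the heartbeat port.

```python
import socket

HEARTBEAT_PORT = 9000    # predefined vsock port from the cover letter
HEARTBEAT_VALUE = 0xB7   # heartbeat value from the cover letter


def enclave_has_booted(conn):
    """Read one byte from a connected socket and compare it with the heartbeat."""
    data = conn.recv(1)
    return len(data) == 1 and data[0] == HEARTBEAT_VALUE


def demo():
    # Stand-in for the vsock connection: a local socket pair. The real
    # listener would accept a connection on HEARTBEAT_PORT instead.
    primary_side, enclave_side = socket.socketpair()
    try:
        enclave_side.sendall(bytes([HEARTBEAT_VALUE]))
        return enclave_has_booted(primary_side)
    finally:
        primary_side.close()
        enclave_side.close()
```

Keeping the comparison in `enclave_has_booted` separate from the transport means the same check could be reused once a real vsock connection is available.
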
On 19/08/2020 14:26, Greg KH wrote:
> On Wed, Aug 19, 2020 at 01:15:59PM +0200, Alexander Graf wrote:
>> On 17.08.20 15:09, Andra Paraschiv wrote:
>>> [...]
>>
>> This version reads very well, thanks a lot Andra!
>>
>> Greg, would you mind having another look over it?
>
> Will do, it's in my to-review queue, behind lots of other patches...

I have a set of updates that can be included in a new revision, v8, e.g. new
NE custom error codes for invalid flags / enclave CID, a "shutdown" function
for the NE PCI device driver, a couple more checks wrt invalid flags and
enclave vsock CID, documentation and sample updates. There is also the
option to have these updates as follow-up patches.

Greg, let me know what would work fine for you with regard to the review of
the patch series.

Thanks,
Andra

On Mon, Aug 31, 2020 at 11:19:19AM +0300, Paraschiv, Andra-Irina wrote:
> On 19/08/2020 14:26, Greg KH wrote:
> > On Wed, Aug 19, 2020 at 01:15:59PM +0200, Alexander Graf wrote:
> > > On 17.08.20 15:09, Andra Paraschiv wrote:
> > > > [...]
> > >
> > > This version reads very well, thanks a lot Andra!
> > >
> > > Greg, would you mind having another look over it?
> >
> > Will do, it's in my to-review queue, behind lots of other patches...
>
> I have a set of updates that can be included in a new revision, v8, e.g. new
> NE custom error codes for invalid flags / enclave CID, a "shutdown" function
> for the NE PCI device driver, a couple more checks wrt invalid flags and
> enclave vsock CID, documentation and sample updates. There is also the
> option to have these updates as follow-up patches.
>
> Greg, let me know what would work fine for you with regard to the review of
> the patch series.

A new series is always fine with me...

thanks,

greg k-h

On 04/09/2020 19:13, Greg KH wrote:
> On Mon, Aug 31, 2020 at 11:19:19AM +0300, Paraschiv, Andra-Irina wrote:
>> On 19/08/2020 14:26, Greg KH wrote:
>>> On Wed, Aug 19, 2020 at 01:15:59PM +0200, Alexander Graf wrote:
>>>> On 17.08.20 15:09, Andra Paraschiv wrote:
>>>>> [...]
>>>>
>>>> This version reads very well, thanks a lot Andra!
>>>>
>>>> Greg, would you mind having another look over it?
>>>
>>> Will do, it's in my to-review queue, behind lots of other patches...
>>
>> I have a set of updates that can be included in a new revision, v8, e.g. new
>> NE custom error codes for invalid flags / enclave CID, a "shutdown" function
>> for the NE PCI device driver, a couple more checks wrt invalid flags and
>> enclave vsock CID, documentation and sample updates. There is also the
>> option to have these updates as follow-up patches.
>>
>> Greg, let me know what would work fine for you with regard to the review of
>> the patch series.
>
> A new series is always fine with me...

Alright, thank you. I sent out the new revision.

Andra