diff mbox series

[v5,01/18] nitro_enclaves: Add ioctl interface definition

Message ID 20200715194540.45532-2-andraprs@amazon.com (mailing list archive)
State New, archived
Headers show
Series Add support for Nitro Enclaves | expand

Commit Message

Paraschiv, Andra-Irina July 15, 2020, 7:45 p.m. UTC
The Nitro Enclaves driver handles the enclave lifetime management. This
includes enclave creation, termination and setting up its resources such
as memory and CPU.

An enclave runs alongside the VM that spawned it. It is abstracted as a
process running in the VM that launched it. The process interacts with
the NE driver, that exposes an ioctl interface for creating an enclave
and setting up its resources.

Signed-off-by: Alexandru Vasile <lexnv@amazon.com>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
---
Changelog

v4 -> v5

* Add more details about the ioctl calls usage e.g. error codes, file
  descriptors used.
* Update the ioctl to set an enclave vCPU to not return a file
  descriptor.
* Add specific NE error codes.

v3 -> v4

* Decouple NE ioctl interface from KVM API.
* Add NE API version and the corresponding ioctl call.
* Add enclave / image load flags options.

v2 -> v3

* Remove the GPL additional wording as SPDX-License-Identifier is
  already in place.

v1 -> v2

* Add ioctl for getting enclave image load metadata.
* Update NE_ENCLAVE_START ioctl name to NE_START_ENCLAVE.
* Add entry in Documentation/userspace-api/ioctl/ioctl-number.rst for NE
  ioctls.
* Update NE ioctls definition based on the updated ioctl range for major
  and minor.
---
 .../userspace-api/ioctl/ioctl-number.rst      |   5 +-
 include/linux/nitro_enclaves.h                |  11 +
 include/uapi/linux/nitro_enclaves.h           | 244 ++++++++++++++++++
 3 files changed, 259 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/nitro_enclaves.h
 create mode 100644 include/uapi/linux/nitro_enclaves.h

Comments

Stefan Hajnoczi July 16, 2020, 8:30 a.m. UTC | #1
On Wed, Jul 15, 2020 at 10:45:23PM +0300, Andra Paraschiv wrote:
> + * A NE CPU pool has be set before calling this function. The pool can be set

s/has be/has to be/

Thanks, this looks good!

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Paraschiv, Andra-Irina July 16, 2020, 9:30 a.m. UTC | #2
On 16/07/2020 11:30, Stefan Hajnoczi wrote:
> On Wed, Jul 15, 2020 at 10:45:23PM +0300, Andra Paraschiv wrote:
>> + * A NE CPU pool has be set before calling this function. The pool can be set
> s/has be/has to be/

Fixed.

>
> Thanks, this looks good!
>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

Thanks for review, glad to hear it's in a better shape.

Andra



Amazon Development Center (Romania) S.R.L. registered office: 27A Sf. Lazar Street, UBC5, floor 2, Iasi, Iasi County, 700045, Romania. Registered in Romania. Registration number J22/2621/2005.
Greg KH July 21, 2020, 12:12 p.m. UTC | #3
On Wed, Jul 15, 2020 at 10:45:23PM +0300, Andra Paraschiv wrote:
> The Nitro Enclaves driver handles the enclave lifetime management. This
> includes enclave creation, termination and setting up its resources such
> as memory and CPU.
> 
> An enclave runs alongside the VM that spawned it. It is abstracted as a
> process running in the VM that launched it. The process interacts with
> the NE driver, that exposes an ioctl interface for creating an enclave
> and setting up its resources.
> 
> Signed-off-by: Alexandru Vasile <lexnv@amazon.com>
> Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
> Reviewed-by: Alexander Graf <graf@amazon.com>
> ---
> Changelog
> 
> v4 -> v5
> 
> * Add more details about the ioctl calls usage e.g. error codes, file
>   descriptors used.
> * Update the ioctl to set an enclave vCPU to not return a file
>   descriptor.
> * Add specific NE error codes.
> 
> v3 -> v4
> 
> * Decouple NE ioctl interface from KVM API.
> * Add NE API version and the corresponding ioctl call.
> * Add enclave / image load flags options.
> 
> v2 -> v3
> 
> * Remove the GPL additional wording as SPDX-License-Identifier is
>   already in place.
> 
> v1 -> v2
> 
> * Add ioctl for getting enclave image load metadata.
> * Update NE_ENCLAVE_START ioctl name to NE_START_ENCLAVE.
> * Add entry in Documentation/userspace-api/ioctl/ioctl-number.rst for NE
>   ioctls.
> * Update NE ioctls definition based on the updated ioctl range for major
>   and minor.
> ---
>  .../userspace-api/ioctl/ioctl-number.rst      |   5 +-
>  include/linux/nitro_enclaves.h                |  11 +
>  include/uapi/linux/nitro_enclaves.h           | 244 ++++++++++++++++++
>  3 files changed, 259 insertions(+), 1 deletion(-)
>  create mode 100644 include/linux/nitro_enclaves.h
>  create mode 100644 include/uapi/linux/nitro_enclaves.h
> 
> diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
> index 59472cd6a11d..783440c6719b 100644
> --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
> +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
> @@ -328,8 +328,11 @@ Code  Seq#    Include File                                           Comments
>  0xAC  00-1F  linux/raw.h
>  0xAD  00                                                             Netfilter device in development:
>                                                                       <mailto:rusty@rustcorp.com.au>
> -0xAE  all    linux/kvm.h                                             Kernel-based Virtual Machine
> +0xAE  00-1F  linux/kvm.h                                             Kernel-based Virtual Machine
>                                                                       <mailto:kvm@vger.kernel.org>
> +0xAE  40-FF  linux/kvm.h                                             Kernel-based Virtual Machine
> +                                                                     <mailto:kvm@vger.kernel.org>
> +0xAE  20-3F  linux/nitro_enclaves.h                                  Nitro Enclaves
>  0xAF  00-1F  linux/fsl_hypervisor.h                                  Freescale hypervisor
>  0xB0  all                                                            RATIO devices in development:
>                                                                       <mailto:vgo@ratio.de>
> diff --git a/include/linux/nitro_enclaves.h b/include/linux/nitro_enclaves.h
> new file mode 100644
> index 000000000000..d91ef2bfdf47
> --- /dev/null
> +++ b/include/linux/nitro_enclaves.h
> @@ -0,0 +1,11 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
> + */
> +
> +#ifndef _LINUX_NITRO_ENCLAVES_H_
> +#define _LINUX_NITRO_ENCLAVES_H_
> +
> +#include <uapi/linux/nitro_enclaves.h>
> +
> +#endif /* _LINUX_NITRO_ENCLAVES_H_ */
> diff --git a/include/uapi/linux/nitro_enclaves.h b/include/uapi/linux/nitro_enclaves.h
> new file mode 100644
> index 000000000000..cf1192f6e923
> --- /dev/null
> +++ b/include/uapi/linux/nitro_enclaves.h
> @@ -0,0 +1,244 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +/*
> + * Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
> + */
> +
> +#ifndef _UAPI_LINUX_NITRO_ENCLAVES_H_
> +#define _UAPI_LINUX_NITRO_ENCLAVES_H_
> +
> +#include <linux/types.h>
> +
> +/* Nitro Enclaves (NE) Kernel Driver Interface */
> +
> +#define NE_API_VERSION (1)

Why do you need this version?  It shouldn't be needed, right?



> +
> +/**
> + * The command is used to get the version of the NE API. This way the user space
> + * processes can be aware of the feature sets provided by the NE kernel driver.
> + *
> + * The NE API version is returned as result of this ioctl call.
> + *
> + * The ioctl can be invoked on the /dev/nitro_enclaves fd, independent of
> + * enclaves already created / started or not.
> + *
> + * No errors are returned.
> + */
> +#define NE_GET_API_VERSION _IO(0xAE, 0x20)
> +
> +/**
> + * The command is used to create a slot that is associated with an enclave VM.
> + * Memory and vCPUs are then set for the slot mapped to an enclave.
> + *
> + * The generated unique slot id is an output parameter. An enclave file
> + * descriptor is returned as result of this ioctl call. The enclave fd can be
> + * further used with ioctl calls to set vCPUs and memory regions, then start
> + * the enclave.
> + *
> + * The ioctl can be invoked on the /dev/nitro_enclaves fd, before setting any
> + * resources, such as memory and vCPUs, for an enclave.
> + *
> + * A NE CPU pool has be set before calling this function. The pool can be set
> + * after the NE driver load, using /sys/module/nitro_enclaves/parameters/ne_cpus.
> + * Its format is the following:
> + * https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html#cpu-lists
> + *
> + * CPU 0 and its siblings have to remain available for the primary / parent VM,
> + * so they cannot be set for enclaves. Full CPU core(s), from the same NUMA
> + * node, need(s) be included in the CPU pool.
> + *
> + * The returned errors are:
> + * * -EFAULT - copy_to_user() failure.
> + * * -ENOMEM - memory allocation failure for internal bookkeeping variables.
> + * * -NE_ERR_NO_CPUS_AVAIL_IN_POOL - no NE CPU pool set / no CPUs available in the pool.
> + * * Error codes from get_unused_fd_flags() and anon_inode_getfile().
> + * * Error codes from the NE PCI device request.
> + */
> +#define NE_CREATE_VM _IOR(0xAE, 0x21, __u64)
> +
> +/**
> + * The command is used to set a vCPU for an enclave. The vCPU can be auto-chosen
> + * from the NE CPU pool or it can be set by the caller, with the note that it
> + * needs to be available in the NE CPU pool. Full CPU core(s), from the same
> + * NUMA node, need(s) to be associated with an enclave.
> + *
> + * The vCPU id is an input / output parameter. If its value is 0, then a CPU is
> + * chosen from the enclave CPU pool and returned via this parameter.
> + *
> + * The ioctl can be invoked on the enclave fd, before an enclave is started.
> + *
> + * The returned errors are:
> + * * -EFAULT - copy_from_user() / copy_to_user() failure.
> + * * -ENOMEM - memory allocation failure for internal bookkeeping variables.
> + * * -EIO - current task mm is not the same as the one that created the enclave.
> + * * -NE_ERR_NO_CPUS_AVAIL_IN_POOL - no CPUs available in the NE CPU pool.
> + * * -NE_ERR_VCPU_ALREADY_USED - the provided vCPU is already used.
> + * * -NE_ERR_VCPU_NOT_IN_POOL - the provided vCPU is not available in the NE CPU pool.
> + * * -NE_ERR_INVALID_CPU_CORE - the core id of the provided vCPU is invalid / out of range.
> + * * -NE_ERR_NOT_IN_INIT_STATE - the enclave is not in init state (init = before being started).
> + * * -NE_ERR_INVALID_VCPU - the provided vCPU is out the available CPUs range.
> + * * Error codes from the NE PCI device request.
> + */
> +#define NE_ADD_VCPU _IOWR(0xAE, 0x22, __u32)
> +
> +/**
> + * The command is used to get information needed for in-memory enclave image
> + * loading e.g. offset in enclave memory to start placing the enclave image.
> + *
> + * The image load info is an input / output parameter. It includes info provided
> + * by the caller - flags - and returns the offset in enclave memory where to
> + * start placing the enclave image.
> + *
> + * The ioctl can be invoked on the enclave fd, before an enclave is started.
> + *
> + * The returned errors are:
> + * * -EFAULT - copy_from_user() / copy_to_user() failure.
> + * * -NE_ERR_NOT_IN_INIT_STATE - the enclave is not in init state (init = before being started).
> + */
> +#define NE_GET_IMAGE_LOAD_INFO _IOWR(0xAE, 0x23, struct ne_image_load_info)
> +
> +/**
> + * The command is used to set a memory region for an enclave, given the
> + * allocated memory from the userspace. Enclave memory needs to be from the
> + * same NUMA node as the enclave CPUs.
> + *
> + * The user memory region is an input parameter. It includes info provided
> + * by the caller - flags, memory size and userspace address.
> + *
> + * The ioctl can be invoked on the enclave fd, before an enclave is started.
> + *
> + * The returned errors are:
> + * * -EFAULT - copy_from_user() failure.
> + * * -ENOMEM - memory allocation failure for internal bookkeeping variables.
> + * * -EIO - current task mm is not the same as the one that created the enclave.
> + * * -NE_ERR_NOT_IN_INIT_STATE - the enclave is not in init state (init = before being started).
> + * * -NE_ERR_INVALID_MEM_REGION_SIZE - the memory size of the region is not multiple of 2 MiB.
> + * * -NE_ERR_INVALID_MEM_REGION_ADDR - invalid user space address given.
> + * * -NE_ERR_UNALIGNED_MEM_REGION_ADDR - unaligned user space address given.
> + * * -NE_ERR_MEM_NOT_HUGE_PAGE - the memory regions is not backed by huge pages.
> + * * -NE_ERR_MEM_DIFF_NUMA_NODE - the memory region is not from the same NUMA node as the CPUs.
> + * * -NE_ERR_MEM_MAX_REGIONS - the number of memory regions set for the enclave reached maximum.
> + * * Error codes from get_user_pages().
> + * * Error codes from the NE PCI device request.
> + */
> +#define NE_SET_USER_MEMORY_REGION _IOW(0xAE, 0x24, struct ne_user_memory_region)
> +
> +/**
> + * The command is used to trigger enclave start after the enclave resources,
> + * such as memory and CPU, have been set.
> + *
> + * The enclave start info is an input / output parameter. It includes info
> + * provided by the caller - enclave cid and flags - and returns the cid (if
> + * input cid is 0).
> + *
> + * The ioctl can be invoked on the enclave fd, after an enclave slot is created
> + * and resources, such as memory and vCPUs are set for an enclave.
> + *
> + * The returned errors are:
> + * * -EFAULT - copy_from_user() / copy_to_user() failure.
> + * * -NE_ERR_NOT_IN_INIT_STATE - the enclave is not in init state (init = before being started).
> + * * -NE_ERR_NO_MEM_REGIONS_ADDED / -NE_ERR_NO_VCPUS_ADDED - no CPUs / memory regions are set.
> + * * -NE_ERR_FULL_CORES_NOT_USED - full core(s) not set for the enclave.
> + * * -NE_ERR_ENCLAVE_MEM_MIN_SIZE - enclave memory is less than minimum memory size (64 MiB).
> + * * Error codes from the NE PCI device request.
> + */
> +#define NE_START_ENCLAVE _IOWR(0xAE, 0x25, struct ne_enclave_start_info)
> +
> +/* NE specific error codes */
> +
> +/* The provided vCPU is already used. */
> +#define NE_ERR_VCPU_ALREADY_USED (512)
> +/* The provided vCPU is not available in the NE CPU pool. */
> +#define NE_ERR_VCPU_NOT_IN_POOL (513)
> +/* The core id of the provided vCPU is invalid / out of range of the NE CPU pool. */
> +#define NE_ERR_INVALID_CPU_CORE (514)
> +/* The user space memory region size is not multiple of 2 MiB. */
> +#define NE_ERR_INVALID_MEM_REGION_SIZE (515)
> +/* The user space memory region address range is invalid. */
> +#define NE_ERR_INVALID_MEM_REGION_ADDR (516)
> +/* The user space memory region address is not aligned. */
> +#define NE_ERR_UNALIGNED_MEM_REGION_ADDR (517)
> +/* The user space memory region is not backed by contiguous physical huge page(s). */
> +#define NE_ERR_MEM_NOT_HUGE_PAGE (518)
> +/* The user space memory region is backed by pages from different NUMA nodes than the CPUs. */
> +#define NE_ERR_MEM_DIFF_NUMA_NODE (519)
> +/* The supported max memory regions per enclaves has been reached. */
> +#define NE_ERR_MEM_MAX_REGIONS (520)
> +/* The command to start an enclave is triggered and no memory regions are added. */
> +#define NE_ERR_NO_MEM_REGIONS_ADDED (521)
> +/* The command to start an enclave is triggered and no vcpus are added. */
> +#define NE_ERR_NO_VCPUS_ADDED (522)
> +/* The enclave memory size is lower than the minimum supported. */
> +#define NE_ERR_ENCLAVE_MEM_MIN_SIZE (523)
> +/* The command to start an enclave is triggered and full CPU cores are not set. */
> +#define NE_ERR_FULL_CORES_NOT_USED (524)
> +/* The enclave is not in init state when setting resources or triggering start. */
> +#define NE_ERR_NOT_IN_INIT_STATE (525)
> +/* The provided vCPU is out of range of the available CPUs. */
> +#define NE_ERR_INVALID_VCPU (526)
> +/* The command to create an enclave is triggered and no CPUs are available in the pool. */
> +#define NE_ERR_NO_CPUS_AVAIL_IN_POOL (527)

Tabs are nice to use to align things :)

> +
> +/* Image load info flags */
> +
> +/* Enclave Image Format (EIF) */
> +#define NE_EIF_IMAGE (0x01)
> +
> +/* Info necessary for in-memory enclave image loading (in / out). */
> +struct ne_image_load_info {
> +	/**
> +	 * Flags to determine the enclave image type (e.g. Enclave Image Format
> +	 * - EIF) (in).
> +	 */
> +	__u64 flags;
> +
> +	/**
> +	 * Offset in enclave memory where to start placing the enclave image
> +	 * (out).
> +	 */


Why not use kerneldoc format to describe this structure instead of this
"custom" one?  Same for other structures in this file.

thanks,

greg k-h
Paraschiv, Andra-Irina July 22, 2020, 8:27 a.m. UTC | #4
On 21/07/2020 15:12, Greg KH wrote:
> On Wed, Jul 15, 2020 at 10:45:23PM +0300, Andra Paraschiv wrote:
>> The Nitro Enclaves driver handles the enclave lifetime management. This
>> includes enclave creation, termination and setting up its resources such
>> as memory and CPU.
>>
>> An enclave runs alongside the VM that spawned it. It is abstracted as a
>> process running in the VM that launched it. The process interacts with
>> the NE driver, that exposes an ioctl interface for creating an enclave
>> and setting up its resources.
>>
>> Signed-off-by: Alexandru Vasile <lexnv@amazon.com>
>> Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
>> Reviewed-by: Alexander Graf <graf@amazon.com>
>> ---
>> Changelog
>>
>> v4 -> v5
>>
>> * Add more details about the ioctl calls usage e.g. error codes, file
>>    descriptors used.
>> * Update the ioctl to set an enclave vCPU to not return a file
>>    descriptor.
>> * Add specific NE error codes.
>>
>> v3 -> v4
>>
>> * Decouple NE ioctl interface from KVM API.
>> * Add NE API version and the corresponding ioctl call.
>> * Add enclave / image load flags options.
>>
>> v2 -> v3
>>
>> * Remove the GPL additional wording as SPDX-License-Identifier is
>>    already in place.
>>
>> v1 -> v2
>>
>> * Add ioctl for getting enclave image load metadata.
>> * Update NE_ENCLAVE_START ioctl name to NE_START_ENCLAVE.
>> * Add entry in Documentation/userspace-api/ioctl/ioctl-number.rst for NE
>>    ioctls.
>> * Update NE ioctls definition based on the updated ioctl range for major
>>    and minor.
>> ---
>>   .../userspace-api/ioctl/ioctl-number.rst      |   5 +-
>>   include/linux/nitro_enclaves.h                |  11 +
>>   include/uapi/linux/nitro_enclaves.h           | 244 ++++++++++++++++++
>>   3 files changed, 259 insertions(+), 1 deletion(-)
>>   create mode 100644 include/linux/nitro_enclaves.h
>>   create mode 100644 include/uapi/linux/nitro_enclaves.h
>>
>> diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
>> index 59472cd6a11d..783440c6719b 100644
>> --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
>> +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
>> @@ -328,8 +328,11 @@ Code  Seq#    Include File                                           Comments
>>   0xAC  00-1F  linux/raw.h
>>   0xAD  00                                                             Netfilter device in development:
>>                                                                        <mailto:rusty@rustcorp.com.au>
>> -0xAE  all    linux/kvm.h                                             Kernel-based Virtual Machine
>> +0xAE  00-1F  linux/kvm.h                                             Kernel-based Virtual Machine
>>                                                                        <mailto:kvm@vger.kernel.org>
>> +0xAE  40-FF  linux/kvm.h                                             Kernel-based Virtual Machine
>> +                                                                     <mailto:kvm@vger.kernel.org>
>> +0xAE  20-3F  linux/nitro_enclaves.h                                  Nitro Enclaves
>>   0xAF  00-1F  linux/fsl_hypervisor.h                                  Freescale hypervisor
>>   0xB0  all                                                            RATIO devices in development:
>>                                                                        <mailto:vgo@ratio.de>
>> diff --git a/include/linux/nitro_enclaves.h b/include/linux/nitro_enclaves.h
>> new file mode 100644
>> index 000000000000..d91ef2bfdf47
>> --- /dev/null
>> +++ b/include/linux/nitro_enclaves.h
>> @@ -0,0 +1,11 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
>> + */
>> +
>> +#ifndef _LINUX_NITRO_ENCLAVES_H_
>> +#define _LINUX_NITRO_ENCLAVES_H_
>> +
>> +#include <uapi/linux/nitro_enclaves.h>
>> +
>> +#endif /* _LINUX_NITRO_ENCLAVES_H_ */
>> diff --git a/include/uapi/linux/nitro_enclaves.h b/include/uapi/linux/nitro_enclaves.h
>> new file mode 100644
>> index 000000000000..cf1192f6e923
>> --- /dev/null
>> +++ b/include/uapi/linux/nitro_enclaves.h
>> @@ -0,0 +1,244 @@
>> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
>> +/*
>> + * Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
>> + */
>> +
>> +#ifndef _UAPI_LINUX_NITRO_ENCLAVES_H_
>> +#define _UAPI_LINUX_NITRO_ENCLAVES_H_
>> +
>> +#include <linux/types.h>
>> +
>> +/* Nitro Enclaves (NE) Kernel Driver Interface */
>> +
>> +#define NE_API_VERSION (1)
> Why do you need this version?  It shouldn't be needed, right?

The version is used as a way for the user space tooling to sync on the 
features set provided by the driver e.g. in case an older version of the 
driver is available on the system and the user space tooling expects a 
set of features that is not included in that driver version.

>
>
>
>> +
>> +/**
>> + * The command is used to get the version of the NE API. This way the user space
>> + * processes can be aware of the feature sets provided by the NE kernel driver.
>> + *
>> + * The NE API version is returned as result of this ioctl call.
>> + *
>> + * The ioctl can be invoked on the /dev/nitro_enclaves fd, independent of
>> + * enclaves already created / started or not.
>> + *
>> + * No errors are returned.
>> + */
>> +#define NE_GET_API_VERSION _IO(0xAE, 0x20)
>> +
>> +/**
>> + * The command is used to create a slot that is associated with an enclave VM.
>> + * Memory and vCPUs are then set for the slot mapped to an enclave.
>> + *
>> + * The generated unique slot id is an output parameter. An enclave file
>> + * descriptor is returned as result of this ioctl call. The enclave fd can be
>> + * further used with ioctl calls to set vCPUs and memory regions, then start
>> + * the enclave.
>> + *
>> + * The ioctl can be invoked on the /dev/nitro_enclaves fd, before setting any
>> + * resources, such as memory and vCPUs, for an enclave.
>> + *
>> + * A NE CPU pool has be set before calling this function. The pool can be set
>> + * after the NE driver load, using /sys/module/nitro_enclaves/parameters/ne_cpus.
>> + * Its format is the following:
>> + * https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html#cpu-lists
>> + *
>> + * CPU 0 and its siblings have to remain available for the primary / parent VM,
>> + * so they cannot be set for enclaves. Full CPU core(s), from the same NUMA
>> + * node, need(s) be included in the CPU pool.
>> + *
>> + * The returned errors are:
>> + * * -EFAULT - copy_to_user() failure.
>> + * * -ENOMEM - memory allocation failure for internal bookkeeping variables.
>> + * * -NE_ERR_NO_CPUS_AVAIL_IN_POOL - no NE CPU pool set / no CPUs available in the pool.
>> + * * Error codes from get_unused_fd_flags() and anon_inode_getfile().
>> + * * Error codes from the NE PCI device request.
>> + */
>> +#define NE_CREATE_VM _IOR(0xAE, 0x21, __u64)
>> +
>> +/**
>> + * The command is used to set a vCPU for an enclave. The vCPU can be auto-chosen
>> + * from the NE CPU pool or it can be set by the caller, with the note that it
>> + * needs to be available in the NE CPU pool. Full CPU core(s), from the same
>> + * NUMA node, need(s) to be associated with an enclave.
>> + *
>> + * The vCPU id is an input / output parameter. If its value is 0, then a CPU is
>> + * chosen from the enclave CPU pool and returned via this parameter.
>> + *
>> + * The ioctl can be invoked on the enclave fd, before an enclave is started.
>> + *
>> + * The returned errors are:
>> + * * -EFAULT - copy_from_user() / copy_to_user() failure.
>> + * * -ENOMEM - memory allocation failure for internal bookkeeping variables.
>> + * * -EIO - current task mm is not the same as the one that created the enclave.
>> + * * -NE_ERR_NO_CPUS_AVAIL_IN_POOL - no CPUs available in the NE CPU pool.
>> + * * -NE_ERR_VCPU_ALREADY_USED - the provided vCPU is already used.
>> + * * -NE_ERR_VCPU_NOT_IN_POOL - the provided vCPU is not available in the NE CPU pool.
>> + * * -NE_ERR_INVALID_CPU_CORE - the core id of the provided vCPU is invalid / out of range.
>> + * * -NE_ERR_NOT_IN_INIT_STATE - the enclave is not in init state (init = before being started).
>> + * * -NE_ERR_INVALID_VCPU - the provided vCPU is out the available CPUs range.
>> + * * Error codes from the NE PCI device request.
>> + */
>> +#define NE_ADD_VCPU _IOWR(0xAE, 0x22, __u32)
>> +
>> +/**
>> + * The command is used to get information needed for in-memory enclave image
>> + * loading e.g. offset in enclave memory to start placing the enclave image.
>> + *
>> + * The image load info is an input / output parameter. It includes info provided
>> + * by the caller - flags - and returns the offset in enclave memory where to
>> + * start placing the enclave image.
>> + *
>> + * The ioctl can be invoked on the enclave fd, before an enclave is started.
>> + *
>> + * The returned errors are:
>> + * * -EFAULT - copy_from_user() / copy_to_user() failure.
>> + * * -NE_ERR_NOT_IN_INIT_STATE - the enclave is not in init state (init = before being started).
>> + */
>> +#define NE_GET_IMAGE_LOAD_INFO _IOWR(0xAE, 0x23, struct ne_image_load_info)
>> +
>> +/**
>> + * The command is used to set a memory region for an enclave, given the
>> + * allocated memory from the userspace. Enclave memory needs to be from the
>> + * same NUMA node as the enclave CPUs.
>> + *
>> + * The user memory region is an input parameter. It includes info provided
>> + * by the caller - flags, memory size and userspace address.
>> + *
>> + * The ioctl can be invoked on the enclave fd, before an enclave is started.
>> + *
>> + * The returned errors are:
>> + * * -EFAULT - copy_from_user() failure.
>> + * * -ENOMEM - memory allocation failure for internal bookkeeping variables.
>> + * * -EIO - current task mm is not the same as the one that created the enclave.
>> + * * -NE_ERR_NOT_IN_INIT_STATE - the enclave is not in init state (init = before being started).
>> + * * -NE_ERR_INVALID_MEM_REGION_SIZE - the memory size of the region is not multiple of 2 MiB.
>> + * * -NE_ERR_INVALID_MEM_REGION_ADDR - invalid user space address given.
>> + * * -NE_ERR_UNALIGNED_MEM_REGION_ADDR - unaligned user space address given.
>> + * * -NE_ERR_MEM_NOT_HUGE_PAGE - the memory regions is not backed by huge pages.
>> + * * -NE_ERR_MEM_DIFF_NUMA_NODE - the memory region is not from the same NUMA node as the CPUs.
>> + * * -NE_ERR_MEM_MAX_REGIONS - the number of memory regions set for the enclave reached maximum.
>> + * * Error codes from get_user_pages().
>> + * * Error codes from the NE PCI device request.
>> + */
>> +#define NE_SET_USER_MEMORY_REGION _IOW(0xAE, 0x24, struct ne_user_memory_region)
>> +
>> +/**
>> + * The command is used to trigger enclave start after the enclave resources,
>> + * such as memory and CPU, have been set.
>> + *
>> + * The enclave start info is an input / output parameter. It includes info
>> + * provided by the caller - enclave cid and flags - and returns the cid (if
>> + * input cid is 0).
>> + *
>> + * The ioctl can be invoked on the enclave fd, after an enclave slot is created
>> + * and resources, such as memory and vCPUs are set for an enclave.
>> + *
>> + * The returned errors are:
>> + * * -EFAULT - copy_from_user() / copy_to_user() failure.
>> + * * -NE_ERR_NOT_IN_INIT_STATE - the enclave is not in init state (init = before being started).
>> + * * -NE_ERR_NO_MEM_REGIONS_ADDED / -NE_ERR_NO_VCPUS_ADDED - no CPUs / memory regions are set.
>> + * * -NE_ERR_FULL_CORES_NOT_USED - full core(s) not set for the enclave.
>> + * * -NE_ERR_ENCLAVE_MEM_MIN_SIZE - enclave memory is less than minimum memory size (64 MiB).
>> + * * Error codes from the NE PCI device request.
>> + */
>> +#define NE_START_ENCLAVE _IOWR(0xAE, 0x25, struct ne_enclave_start_info)
>> +
>> +/* NE specific error codes */
>> +
>> +/* The provided vCPU is already used. */
>> +#define NE_ERR_VCPU_ALREADY_USED (512)
>> +/* The provided vCPU is not available in the NE CPU pool. */
>> +#define NE_ERR_VCPU_NOT_IN_POOL (513)
>> +/* The core id of the provided vCPU is invalid / out of range of the NE CPU pool. */
>> +#define NE_ERR_INVALID_CPU_CORE (514)
>> +/* The user space memory region size is not multiple of 2 MiB. */
>> +#define NE_ERR_INVALID_MEM_REGION_SIZE (515)
>> +/* The user space memory region address range is invalid. */
>> +#define NE_ERR_INVALID_MEM_REGION_ADDR (516)
>> +/* The user space memory region address is not aligned. */
>> +#define NE_ERR_UNALIGNED_MEM_REGION_ADDR (517)
>> +/* The user space memory region is not backed by contiguous physical huge page(s). */
>> +#define NE_ERR_MEM_NOT_HUGE_PAGE (518)
>> +/* The user space memory region is backed by pages from different NUMA nodes than the CPUs. */
>> +#define NE_ERR_MEM_DIFF_NUMA_NODE (519)
>> +/* The supported max memory regions per enclaves has been reached. */
>> +#define NE_ERR_MEM_MAX_REGIONS (520)
>> +/* The command to start an enclave is triggered and no memory regions are added. */
>> +#define NE_ERR_NO_MEM_REGIONS_ADDED (521)
>> +/* The command to start an enclave is triggered and no vcpus are added. */
>> +#define NE_ERR_NO_VCPUS_ADDED (522)
>> +/* The enclave memory size is lower than the minimum supported. */
>> +#define NE_ERR_ENCLAVE_MEM_MIN_SIZE (523)
>> +/* The command to start an enclave is triggered and full CPU cores are not set. */
>> +#define NE_ERR_FULL_CORES_NOT_USED (524)
>> +/* The enclave is not in init state when setting resources or triggering start. */
>> +#define NE_ERR_NOT_IN_INIT_STATE (525)
>> +/* The provided vCPU is out of range of the available CPUs. */
>> +#define NE_ERR_INVALID_VCPU (526)
>> +/* The command to create an enclave is triggered and no CPUs are available in the pool. */
>> +#define NE_ERR_NO_CPUS_AVAIL_IN_POOL (527)
> Tabs are nice to use to align things :)

Aham, I can update these defines and other places where necessary to 
have a better view of the values.

>
>> +
>> +/* Image load info flags */
>> +
>> +/* Enclave Image Format (EIF) */
>> +#define NE_EIF_IMAGE (0x01)
>> +
>> +/* Info necessary for in-memory enclave image loading (in / out). */
>> +struct ne_image_load_info {
>> +	/**
>> +	 * Flags to determine the enclave image type (e.g. Enclave Image Format
>> +	 * - EIF) (in).
>> +	 */
>> +	__u64 flags;
>> +
>> +	/**
>> +	 * Offset in enclave memory where to start placing the enclave image
>> +	 * (out).
>> +	 */
>
> Why not use kerneldoc format to describe this structure instead of this
> "custom" one?  Same for other structures in this file.

I'll update all the documentation to follow the kernel-doc format (part 
of it is already) and have the kernel-doc script used to automatically 
check all is fine.

Thank you.

Andra



Amazon Development Center (Romania) S.R.L. registered office: 27A Sf. Lazar Street, UBC5, floor 2, Iasi, Iasi County, 700045, Romania. Registered in Romania. Registration number J22/2621/2005.
Greg KH July 22, 2020, 9:57 a.m. UTC | #5
On Wed, Jul 22, 2020 at 11:27:29AM +0300, Paraschiv, Andra-Irina wrote:
> > > +#ifndef _UAPI_LINUX_NITRO_ENCLAVES_H_
> > > +#define _UAPI_LINUX_NITRO_ENCLAVES_H_
> > > +
> > > +#include <linux/types.h>
> > > +
> > > +/* Nitro Enclaves (NE) Kernel Driver Interface */
> > > +
> > > +#define NE_API_VERSION (1)
> > Why do you need this version?  It shouldn't be needed, right?
> 
> The version is used as a way for the user space tooling to sync on the
> features set provided by the driver e.g. in case an older version of the
> driver is available on the system and the user space tooling expects a set
> of features that is not included in that driver version.

That is guaranteed to get out of sync instantly with different distro
kernels backporting random things, combined with stable kernel patch
updates and the like.

Just use the normal api interfaces instead, don't try to "version"
anything, it will not work, trust us :)

If an ioctl returns -ENOTTY then hey, it's not present and your
userspace code can handle it that way.

thanks,

greg k-h
Paraschiv, Andra-Irina July 23, 2020, 9:23 a.m. UTC | #6
On 22/07/2020 12:57, Greg KH wrote:
> On Wed, Jul 22, 2020 at 11:27:29AM +0300, Paraschiv, Andra-Irina wrote:
>>>> +#ifndef _UAPI_LINUX_NITRO_ENCLAVES_H_
>>>> +#define _UAPI_LINUX_NITRO_ENCLAVES_H_
>>>> +
>>>> +#include <linux/types.h>
>>>> +
>>>> +/* Nitro Enclaves (NE) Kernel Driver Interface */
>>>> +
>>>> +#define NE_API_VERSION (1)
>>> Why do you need this version?  It shouldn't be needed, right?
>> The version is used as a way for the user space tooling to sync on the
>> features set provided by the driver e.g. in case an older version of the
>> driver is available on the system and the user space tooling expects a set
>> of features that is not included in that driver version.
> That is guaranteed to get out of sync instantly with different distro
> kernels backporting random things, combined with stable kernel patch
> updates and the like.
>
> Just use the normal api interfaces instead, don't try to "version"
> anything, it will not work, trust us :)
>
> If an ioctl returns -ENOTTY then hey, it's not present and your
> userspace code can handle it that way.

Correct, there could be a variety of kernel versions and user space 
tooling either in the original form, customized or written from scratch. 
And ENOTTY signals an ioctl not available or e.g. EINVAL (or custom 
error) if the parameter field value is not valid within a certain 
version. We have these in place, that's good. :)

However, I was thinking, for example, of an ioctl flow usage where a 
certain order needs to be followed e.g. create a VM, add resources to a 
VM, start a VM.

Let's say, for an use case wrt new features, ioctl A (create a VM) 
succeeds, ioctl B (add memory to the VM) succeeds, ioctl C (add CPU to 
the VM) succeeds and ioctl D (add any other type of resource before 
starting the VM) fails because it is not supported.

Would not need to call ioctl A to C and go through their underneath 
logic to realize ioctl D support is not there and rollback all the 
changes done till then within ioctl A to C logic. Of course, there could 
be ioctl A followed by ioctl D, and would need to rollback ioctl A 
changes, but I shared a more lengthy call chain that can be an option as 
well.

Thanks,
Andra



Amazon Development Center (Romania) S.R.L. registered office: 27A Sf. Lazar Street, UBC5, floor 2, Iasi, Iasi County, 700045, Romania. Registered in Romania. Registration number J22/2621/2005.
Greg KH July 23, 2020, 10:54 a.m. UTC | #7
On Thu, Jul 23, 2020 at 12:23:56PM +0300, Paraschiv, Andra-Irina wrote:
> 
> 
> On 22/07/2020 12:57, Greg KH wrote:
> > On Wed, Jul 22, 2020 at 11:27:29AM +0300, Paraschiv, Andra-Irina wrote:
> > > > > +#ifndef _UAPI_LINUX_NITRO_ENCLAVES_H_
> > > > > +#define _UAPI_LINUX_NITRO_ENCLAVES_H_
> > > > > +
> > > > > +#include <linux/types.h>
> > > > > +
> > > > > +/* Nitro Enclaves (NE) Kernel Driver Interface */
> > > > > +
> > > > > +#define NE_API_VERSION (1)
> > > > Why do you need this version?  It shouldn't be needed, right?
> > > The version is used as a way for the user space tooling to sync on the
> > > features set provided by the driver e.g. in case an older version of the
> > > driver is available on the system and the user space tooling expects a set
> > > of features that is not included in that driver version.
> > That is guaranteed to get out of sync instantly with different distro
> > kernels backporting random things, combined with stable kernel patch
> > updates and the like.
> > 
> > Just use the normal api interfaces instead, don't try to "version"
> > anything, it will not work, trust us :)
> > 
> > If an ioctl returns -ENOTTY then hey, it's not present and your
> > userspace code can handle it that way.
> 
> Correct, there could be a variety of kernel versions and user space tooling
> either in the original form, customized or written from scratch. And ENOTTY
> signals an ioctl not available or e.g. EINVAL (or custom error) if the
> parameter field value is not valid within a certain version. We have these
> in place, that's good. :)
> 
> However, I was thinking, for example, of an ioctl flow usage where a certain
> order needs to be followed e.g. create a VM, add resources to a VM, start a
> VM.
> 
> Let's say, for an use case wrt new features, ioctl A (create a VM) succeeds,
> ioctl B (add memory to the VM) succeeds, ioctl C (add CPU to the VM)
> succeeds and ioctl D (add any other type of resource before starting the VM)
> fails because it is not supported.
> 
> Would not need to call ioctl A to C and go through their underneath logic to
> realize ioctl D support is not there and rollback all the changes done till
> then within ioctl A to C logic. Of course, there could be ioctl A followed
> by ioctl D, and would need to rollback ioctl A changes, but I shared a more
> lengthy call chain that can be an option as well.

I think you are overthinking this.

If your interface is this complex, you have much larger issues as you
ALWAYS have to be able to handle error conditions properly, even if the
API is "supported".

Perhaps your API is showing to be too complex?

Also, where is the userspace code for all of this?  Did I miss a link to
it in the patches somewhere?

good luck!

greg k-h
Paraschiv, Andra-Irina July 23, 2020, 6:21 p.m. UTC | #8
On 23/07/2020 13:54, Greg KH wrote:
> On Thu, Jul 23, 2020 at 12:23:56PM +0300, Paraschiv, Andra-Irina wrote:
>>
>> On 22/07/2020 12:57, Greg KH wrote:
>>> On Wed, Jul 22, 2020 at 11:27:29AM +0300, Paraschiv, Andra-Irina wrote:
>>>>>> +#ifndef _UAPI_LINUX_NITRO_ENCLAVES_H_
>>>>>> +#define _UAPI_LINUX_NITRO_ENCLAVES_H_
>>>>>> +
>>>>>> +#include <linux/types.h>
>>>>>> +
>>>>>> +/* Nitro Enclaves (NE) Kernel Driver Interface */
>>>>>> +
>>>>>> +#define NE_API_VERSION (1)
>>>>> Why do you need this version?  It shouldn't be needed, right?
>>>> The version is used as a way for the user space tooling to sync on the
>>>> features set provided by the driver e.g. in case an older version of the
>>>> driver is available on the system and the user space tooling expects a set
>>>> of features that is not included in that driver version.
>>> That is guaranteed to get out of sync instantly with different distro
>>> kernels backporting random things, combined with stable kernel patch
>>> updates and the like.
>>>
>>> Just use the normal api interfaces instead, don't try to "version"
>>> anything, it will not work, trust us :)
>>>
>>> If an ioctl returns -ENOTTY then hey, it's not present and your
>>> userspace code can handle it that way.
>> Correct, there could be a variety of kernel versions and user space tooling
>> either in the original form, customized or written from scratch. And ENOTTY
>> signals an ioctl not available or e.g. EINVAL (or custom error) if the
>> parameter field value is not valid within a certain version. We have these
>> in place, that's good. :)
>>
>> However, I was thinking, for example, of an ioctl flow usage where a certain
>> order needs to be followed e.g. create a VM, add resources to a VM, start a
>> VM.
>>
>> Let's say, for an use case wrt new features, ioctl A (create a VM) succeeds,
>> ioctl B (add memory to the VM) succeeds, ioctl C (add CPU to the VM)
>> succeeds and ioctl D (add any other type of resource before starting the VM)
>> fails because it is not supported.
>>
>> Would not need to call ioctl A to C and go through their underneath logic to
>> realize ioctl D support is not there and rollback all the changes done till
>> then within ioctl A to C logic. Of course, there could be ioctl A followed
>> by ioctl D, and would need to rollback ioctl A changes, but I shared a more
>> lengthy call chain that can be an option as well.
> I think you are overthinking this.
>
> If your interface is this complex, you have much larger issues as you
> ALWAYS have to be able to handle error conditions properly, even if the
> API is "supported".

True, the error paths need to handled correctly on the kernel driver and 
on the user space logic side, independent of supported features or not. 
Cannot assume that all ioctl callers are behaving correctly or there are 
no errors in the system.

What I wanted to cover with that example is more towards the user space 
logic using new features, either early exiting before even trying the 
ioctl call flow path or going through part of the flow till getting the 
error e.g. ENOTTY for one of the ioctl calls.

>
> Perhaps your API is showing to be too complex?
>
> Also, where is the userspace code for all of this?  Did I miss a link to
> it in the patches somewhere?

Nope, you didn't miss any references to it. The codebase for the user 
space code is not publicly available for now, but it will be available 
on GitHub once the whole project is GA. And I'll include the refs, once 
available, in the NE kernel driver documentation.

I can summarize here the ioctl interface usage flow, let me know if I 
can help with more clarifications:

Enclave creation

* Open the misc device (/dev/nitro_enclaves) and get the device fd.
* Using the device fd, call NE_GET_API_VERSION to check the API version.
* Using the device fd, call NE_CREATE_VM and get an enclave fd.
* Using the enclave fd, call NE_GET_IMAGE_LOAD_INFO to get the offset in 
the enclave memory where to place the enclave image. Enclave memory 
regions consist of hugetlbfs huge pages. Place the enclave image in 
enclave memory.
* Using the enclave fd, call NE_SET_USER_MEMORY_REGION to set a memory 
region for an enclave. Repeat this step for all enclave memory regions.
* Using the enclave fd, call NE_ADD_VCPU to add a vCPU for an enclave. 
Repeat this step for all enclave vCPUs. The CPUs are part of the NE CPU 
pool, set using a sysfs file before starting to create an enclave.
* Using the enclave fd, call NE_START_ENCLAVE to start an enclave.


Enclave termination

* Close the enclave fd.


Thanks,
Andra

>
> good luck!
>
> greg k-h




Amazon Development Center (Romania) S.R.L. registered office: 27A Sf. Lazar Street, UBC5, floor 2, Iasi, Iasi County, 700045, Romania. Registered in Romania. Registration number J22/2621/2005.
Alexander Graf July 23, 2020, 11:04 p.m. UTC | #9
On 23.07.20 20:21, Paraschiv, Andra-Irina wrote:
> 
> 
> On 23/07/2020 13:54, Greg KH wrote:
>> On Thu, Jul 23, 2020 at 12:23:56PM +0300, Paraschiv, Andra-Irina wrote:
>>>
>>> On 22/07/2020 12:57, Greg KH wrote:
>>>> On Wed, Jul 22, 2020 at 11:27:29AM +0300, Paraschiv, Andra-Irina wrote:
>>>>>>> +#ifndef _UAPI_LINUX_NITRO_ENCLAVES_H_
>>>>>>> +#define _UAPI_LINUX_NITRO_ENCLAVES_H_
>>>>>>> +
>>>>>>> +#include <linux/types.h>
>>>>>>> +
>>>>>>> +/* Nitro Enclaves (NE) Kernel Driver Interface */
>>>>>>> +
>>>>>>> +#define NE_API_VERSION (1)
>>>>>> Why do you need this version?  It shouldn't be needed, right?
>>>>> The version is used as a way for the user space tooling to sync on the
>>>>> features set provided by the driver e.g. in case an older version 
>>>>> of the
>>>>> driver is available on the system and the user space tooling 
>>>>> expects a set
>>>>> of features that is not included in that driver version.
>>>> That is guaranteed to get out of sync instantly with different distro
>>>> kernels backporting random things, combined with stable kernel patch
>>>> updates and the like.
>>>>
>>>> Just use the normal api interfaces instead, don't try to "version"
>>>> anything, it will not work, trust us :)
>>>>
>>>> If an ioctl returns -ENOTTY then hey, it's not present and your
>>>> userspace code can handle it that way.
>>> Correct, there could be a variety of kernel versions and user space 
>>> tooling
>>> either in the original form, customized or written from scratch. And 
>>> ENOTTY
>>> signals an ioctl not available or e.g. EINVAL (or custom error) if the
>>> parameter field value is not valid within a certain version. We have 
>>> these
>>> in place, that's good. :)
>>>
>>> However, I was thinking, for example, of an ioctl flow usage where a 
>>> certain
>>> order needs to be followed e.g. create a VM, add resources to a VM, 
>>> start a
>>> VM.
>>>
>>> Let's say, for an use case wrt new features, ioctl A (create a VM) 
>>> succeeds,
>>> ioctl B (add memory to the VM) succeeds, ioctl C (add CPU to the VM)
>>> succeeds and ioctl D (add any other type of resource before starting 
>>> the VM)
>>> fails because it is not supported.
>>>
>>> Would not need to call ioctl A to C and go through their underneath 
>>> logic to
>>> realize ioctl D support is not there and rollback all the changes 
>>> done till
>>> then within ioctl A to C logic. Of course, there could be ioctl A 
>>> followed
>>> by ioctl D, and would need to rollback ioctl A changes, but I shared 
>>> a more
>>> lengthy call chain that can be an option as well.
>> I think you are overthinking this.
>>
>> If your interface is this complex, you have much larger issues as you
>> ALWAYS have to be able to handle error conditions properly, even if the
>> API is "supported".
> 
> True, the error paths need to handled correctly on the kernel driver and 
> on the user space logic side, independent of supported features or not. 
> Cannot assume that all ioctl callers are behaving correctly or there are 
> no errors in the system.
> 
> What I wanted to cover with that example is more towards the user space 
> logic using new features, either early exiting before even trying the 
> ioctl call flow path or going through part of the flow till getting the 
> error e.g. ENOTTY for one of the ioctl calls.

If we need an API to query for new features, we can add it once we add 
new features, no? The absence of the query API will indicate that no 
additional features are available.

So yes, sorry, oversight on my side :(. I agree with Greg: there really 
is no need for a version query API as of today.

> 
>>
>> Perhaps your API is showing to be too complex?
>>
>> Also, where is the userspace code for all of this?  Did I miss a link to
>> it in the patches somewhere?
> 
> Nope, you didn't miss any references to it. The codebase for the user 
> space code is not publicly available for now, but it will be available 
> on GitHub once the whole project is GA. And I'll include the refs, once 
> available, in the NE kernel driver documentation.

Patch 16/18 contains an example user space to drive the ioctl interface.

The code base Andra is referring to above is going to be a more complete 
framework to drive Nitro Enclaves that also consumes this kernel API.


Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879
Paraschiv, Andra-Irina July 24, 2020, 9:06 a.m. UTC | #10
On 24/07/2020 02:04, Alexander Graf wrote:
>
>
> On 23.07.20 20:21, Paraschiv, Andra-Irina wrote:
>>
>>
>> On 23/07/2020 13:54, Greg KH wrote:
>>> On Thu, Jul 23, 2020 at 12:23:56PM +0300, Paraschiv, Andra-Irina wrote:
>>>>
>>>> On 22/07/2020 12:57, Greg KH wrote:
>>>>> On Wed, Jul 22, 2020 at 11:27:29AM +0300, Paraschiv, Andra-Irina 
>>>>> wrote:
>>>>>>>> +#ifndef _UAPI_LINUX_NITRO_ENCLAVES_H_
>>>>>>>> +#define _UAPI_LINUX_NITRO_ENCLAVES_H_
>>>>>>>> +
>>>>>>>> +#include <linux/types.h>
>>>>>>>> +
>>>>>>>> +/* Nitro Enclaves (NE) Kernel Driver Interface */
>>>>>>>> +
>>>>>>>> +#define NE_API_VERSION (1)
>>>>>>> Why do you need this version?  It shouldn't be needed, right?
>>>>>> The version is used as a way for the user space tooling to sync 
>>>>>> on the
>>>>>> features set provided by the driver e.g. in case an older version 
>>>>>> of the
>>>>>> driver is available on the system and the user space tooling 
>>>>>> expects a set
>>>>>> of features that is not included in that driver version.
>>>>> That is guaranteed to get out of sync instantly with different distro
>>>>> kernels backporting random things, combined with stable kernel patch
>>>>> updates and the like.
>>>>>
>>>>> Just use the normal api interfaces instead, don't try to "version"
>>>>> anything, it will not work, trust us :)
>>>>>
>>>>> If an ioctl returns -ENOTTY then hey, it's not present and your
>>>>> userspace code can handle it that way.
>>>> Correct, there could be a variety of kernel versions and user space 
>>>> tooling
>>>> either in the original form, customized or written from scratch. 
>>>> And ENOTTY
>>>> signals an ioctl not available or e.g. EINVAL (or custom error) if the
>>>> parameter field value is not valid within a certain version. We 
>>>> have these
>>>> in place, that's good. :)
>>>>
>>>> However, I was thinking, for example, of an ioctl flow usage where 
>>>> a certain
>>>> order needs to be followed e.g. create a VM, add resources to a VM, 
>>>> start a
>>>> VM.
>>>>
>>>> Let's say, for an use case wrt new features, ioctl A (create a VM) 
>>>> succeeds,
>>>> ioctl B (add memory to the VM) succeeds, ioctl C (add CPU to the VM)
>>>> succeeds and ioctl D (add any other type of resource before 
>>>> starting the VM)
>>>> fails because it is not supported.
>>>>
>>>> Would not need to call ioctl A to C and go through their underneath 
>>>> logic to
>>>> realize ioctl D support is not there and rollback all the changes 
>>>> done till
>>>> then within ioctl A to C logic. Of course, there could be ioctl A 
>>>> followed
>>>> by ioctl D, and would need to rollback ioctl A changes, but I 
>>>> shared a more
>>>> lengthy call chain that can be an option as well.
>>> I think you are overthinking this.
>>>
>>> If your interface is this complex, you have much larger issues as you
>>> ALWAYS have to be able to handle error conditions properly, even if the
>>> API is "supported".
>>
>> True, the error paths need to handled correctly on the kernel driver 
>> and on the user space logic side, independent of supported features 
>> or not. Cannot assume that all ioctl callers are behaving correctly 
>> or there are no errors in the system.
>>
>> What I wanted to cover with that example is more towards the user 
>> space logic using new features, either early exiting before even 
>> trying the ioctl call flow path or going through part of the flow 
>> till getting the error e.g. ENOTTY for one of the ioctl calls.
>
> If we need an API to query for new features, we can add it once we add 
> new features, no? The absence of the query API will indicate that no 
> additional features are available.
>
> So yes, sorry, oversight on my side :(. I agree with Greg: there 
> really is no need for a version query API as of today.

No problem. I can remove the versioning logic for now, although I think 
it would have been fine to have it from the beginning if we want to move 
further with a version query API in the end.

The overall discussion here was more towards having the versioning logic 
or not at all, either within the current code base or while getting to 
new features.

>
>>
>>>
>>> Perhaps your API is showing to be too complex?
>>>
>>> Also, where is the userspace code for all of this?  Did I miss a 
>>> link to
>>> it in the patches somewhere?
>>
>> Nope, you didn't miss any references to it. The codebase for the user 
>> space code is not publicly available for now, but it will be 
>> available on GitHub once the whole project is GA. And I'll include 
>> the refs, once available, in the NE kernel driver documentation.
>
> Patch 16/18 contains an example user space to drive the ioctl interface.

Yup, and the flow mentioned in the previous mail is included in the 
ioctl usage sample.

Thanks,
Andra

>
> The code base Andra is referring to above is going to be a more 
> complete framework to drive Nitro Enclaves that also consumes this 
> kernel API.
>
>
> Alex




Amazon Development Center (Romania) S.R.L. registered office: 27A Sf. Lazar Street, UBC5, floor 2, Iasi, Iasi County, 700045, Romania. Registered in Romania. Registration number J22/2621/2005.
diff mbox series

Patch

diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index 59472cd6a11d..783440c6719b 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -328,8 +328,11 @@  Code  Seq#    Include File                                           Comments
 0xAC  00-1F  linux/raw.h
 0xAD  00                                                             Netfilter device in development:
                                                                      <mailto:rusty@rustcorp.com.au>
-0xAE  all    linux/kvm.h                                             Kernel-based Virtual Machine
+0xAE  00-1F  linux/kvm.h                                             Kernel-based Virtual Machine
                                                                      <mailto:kvm@vger.kernel.org>
+0xAE  40-FF  linux/kvm.h                                             Kernel-based Virtual Machine
+                                                                     <mailto:kvm@vger.kernel.org>
+0xAE  20-3F  linux/nitro_enclaves.h                                  Nitro Enclaves
 0xAF  00-1F  linux/fsl_hypervisor.h                                  Freescale hypervisor
 0xB0  all                                                            RATIO devices in development:
                                                                      <mailto:vgo@ratio.de>
diff --git a/include/linux/nitro_enclaves.h b/include/linux/nitro_enclaves.h
new file mode 100644
index 000000000000..d91ef2bfdf47
--- /dev/null
+++ b/include/linux/nitro_enclaves.h
@@ -0,0 +1,11 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
+ */
+
+#ifndef _LINUX_NITRO_ENCLAVES_H_
+#define _LINUX_NITRO_ENCLAVES_H_
+
+#include <uapi/linux/nitro_enclaves.h>
+
+#endif /* _LINUX_NITRO_ENCLAVES_H_ */
diff --git a/include/uapi/linux/nitro_enclaves.h b/include/uapi/linux/nitro_enclaves.h
new file mode 100644
index 000000000000..cf1192f6e923
--- /dev/null
+++ b/include/uapi/linux/nitro_enclaves.h
@@ -0,0 +1,244 @@ 
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
+ */
+
+#ifndef _UAPI_LINUX_NITRO_ENCLAVES_H_
+#define _UAPI_LINUX_NITRO_ENCLAVES_H_
+
+#include <linux/types.h>
+
+/* Nitro Enclaves (NE) Kernel Driver Interface */
+
+#define NE_API_VERSION (1)
+
+/**
+ * The command is used to get the version of the NE API. This way the user space
+ * processes can be aware of the feature sets provided by the NE kernel driver.
+ *
+ * The NE API version is returned as result of this ioctl call.
+ *
+ * The ioctl can be invoked on the /dev/nitro_enclaves fd, independent of
+ * enclaves already created / started or not.
+ *
+ * No errors are returned.
+ */
+#define NE_GET_API_VERSION _IO(0xAE, 0x20)
+
+/**
+ * The command is used to create a slot that is associated with an enclave VM.
+ * Memory and vCPUs are then set for the slot mapped to an enclave.
+ *
+ * The generated unique slot id is an output parameter. An enclave file
+ * descriptor is returned as result of this ioctl call. The enclave fd can be
+ * further used with ioctl calls to set vCPUs and memory regions, then start
+ * the enclave.
+ *
+ * The ioctl can be invoked on the /dev/nitro_enclaves fd, before setting any
+ * resources, such as memory and vCPUs, for an enclave.
+ *
+ * A NE CPU pool has be set before calling this function. The pool can be set
+ * after the NE driver load, using /sys/module/nitro_enclaves/parameters/ne_cpus.
+ * Its format is the following:
+ * https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html#cpu-lists
+ *
+ * CPU 0 and its siblings have to remain available for the primary / parent VM,
+ * so they cannot be set for enclaves. Full CPU core(s), from the same NUMA
+ * node, need(s) be included in the CPU pool.
+ *
+ * The returned errors are:
+ * * -EFAULT - copy_to_user() failure.
+ * * -ENOMEM - memory allocation failure for internal bookkeeping variables.
+ * * -NE_ERR_NO_CPUS_AVAIL_IN_POOL - no NE CPU pool set / no CPUs available in the pool.
+ * * Error codes from get_unused_fd_flags() and anon_inode_getfile().
+ * * Error codes from the NE PCI device request.
+ */
+#define NE_CREATE_VM _IOR(0xAE, 0x21, __u64)
+
+/**
+ * The command is used to set a vCPU for an enclave. The vCPU can be auto-chosen
+ * from the NE CPU pool or it can be set by the caller, with the note that it
+ * needs to be available in the NE CPU pool. Full CPU core(s), from the same
+ * NUMA node, need(s) to be associated with an enclave.
+ *
+ * The vCPU id is an input / output parameter. If its value is 0, then a CPU is
+ * chosen from the enclave CPU pool and returned via this parameter.
+ *
+ * The ioctl can be invoked on the enclave fd, before an enclave is started.
+ *
+ * The returned errors are:
+ * * -EFAULT - copy_from_user() / copy_to_user() failure.
+ * * -ENOMEM - memory allocation failure for internal bookkeeping variables.
+ * * -EIO - current task mm is not the same as the one that created the enclave.
+ * * -NE_ERR_NO_CPUS_AVAIL_IN_POOL - no CPUs available in the NE CPU pool.
+ * * -NE_ERR_VCPU_ALREADY_USED - the provided vCPU is already used.
+ * * -NE_ERR_VCPU_NOT_IN_POOL - the provided vCPU is not available in the NE CPU pool.
+ * * -NE_ERR_INVALID_CPU_CORE - the core id of the provided vCPU is invalid / out of range.
+ * * -NE_ERR_NOT_IN_INIT_STATE - the enclave is not in init state (init = before being started).
+ * * -NE_ERR_INVALID_VCPU - the provided vCPU is out the available CPUs range.
+ * * Error codes from the NE PCI device request.
+ */
+#define NE_ADD_VCPU _IOWR(0xAE, 0x22, __u32)
+
+/**
+ * The command is used to get information needed for in-memory enclave image
+ * loading e.g. offset in enclave memory to start placing the enclave image.
+ *
+ * The image load info is an input / output parameter. It includes info provided
+ * by the caller - flags - and returns the offset in enclave memory where to
+ * start placing the enclave image.
+ *
+ * The ioctl can be invoked on the enclave fd, before an enclave is started.
+ *
+ * The returned errors are:
+ * * -EFAULT - copy_from_user() / copy_to_user() failure.
+ * * -NE_ERR_NOT_IN_INIT_STATE - the enclave is not in init state (init = before being started).
+ */
+#define NE_GET_IMAGE_LOAD_INFO _IOWR(0xAE, 0x23, struct ne_image_load_info)
+
+/**
+ * The command is used to set a memory region for an enclave, given the
+ * allocated memory from the userspace. Enclave memory needs to be from the
+ * same NUMA node as the enclave CPUs.
+ *
+ * The user memory region is an input parameter. It includes info provided
+ * by the caller - flags, memory size and userspace address.
+ *
+ * The ioctl can be invoked on the enclave fd, before an enclave is started.
+ *
+ * The returned errors are:
+ * * -EFAULT - copy_from_user() failure.
+ * * -ENOMEM - memory allocation failure for internal bookkeeping variables.
+ * * -EIO - current task mm is not the same as the one that created the enclave.
+ * * -NE_ERR_NOT_IN_INIT_STATE - the enclave is not in init state (init = before being started).
+ * * -NE_ERR_INVALID_MEM_REGION_SIZE - the memory size of the region is not multiple of 2 MiB.
+ * * -NE_ERR_INVALID_MEM_REGION_ADDR - invalid user space address given.
+ * * -NE_ERR_UNALIGNED_MEM_REGION_ADDR - unaligned user space address given.
+ * * -NE_ERR_MEM_NOT_HUGE_PAGE - the memory regions is not backed by huge pages.
+ * * -NE_ERR_MEM_DIFF_NUMA_NODE - the memory region is not from the same NUMA node as the CPUs.
+ * * -NE_ERR_MEM_MAX_REGIONS - the number of memory regions set for the enclave reached maximum.
+ * * Error codes from get_user_pages().
+ * * Error codes from the NE PCI device request.
+ */
+#define NE_SET_USER_MEMORY_REGION _IOW(0xAE, 0x24, struct ne_user_memory_region)
+
+/**
+ * The command is used to trigger enclave start after the enclave resources,
+ * such as memory and CPU, have been set.
+ *
+ * The enclave start info is an input / output parameter. It includes info
+ * provided by the caller - enclave cid and flags - and returns the cid (if
+ * input cid is 0).
+ *
+ * The ioctl can be invoked on the enclave fd, after an enclave slot is created
+ * and resources, such as memory and vCPUs are set for an enclave.
+ *
+ * The returned errors are:
+ * * -EFAULT - copy_from_user() / copy_to_user() failure.
+ * * -NE_ERR_NOT_IN_INIT_STATE - the enclave is not in init state (init = before being started).
+ * * -NE_ERR_NO_MEM_REGIONS_ADDED / -NE_ERR_NO_VCPUS_ADDED - no CPUs / memory regions are set.
+ * * -NE_ERR_FULL_CORES_NOT_USED - full core(s) not set for the enclave.
+ * * -NE_ERR_ENCLAVE_MEM_MIN_SIZE - enclave memory is less than minimum memory size (64 MiB).
+ * * Error codes from the NE PCI device request.
+ */
+#define NE_START_ENCLAVE _IOWR(0xAE, 0x25, struct ne_enclave_start_info)
+
+/* NE specific error codes */
+
+/* The provided vCPU is already used. */
+#define NE_ERR_VCPU_ALREADY_USED (512)
+/* The provided vCPU is not available in the NE CPU pool. */
+#define NE_ERR_VCPU_NOT_IN_POOL (513)
+/* The core id of the provided vCPU is invalid / out of range of the NE CPU pool. */
+#define NE_ERR_INVALID_CPU_CORE (514)
+/* The user space memory region size is not multiple of 2 MiB. */
+#define NE_ERR_INVALID_MEM_REGION_SIZE (515)
+/* The user space memory region address range is invalid. */
+#define NE_ERR_INVALID_MEM_REGION_ADDR (516)
+/* The user space memory region address is not aligned. */
+#define NE_ERR_UNALIGNED_MEM_REGION_ADDR (517)
+/* The user space memory region is not backed by contiguous physical huge page(s). */
+#define NE_ERR_MEM_NOT_HUGE_PAGE (518)
+/* The user space memory region is backed by pages from different NUMA nodes than the CPUs. */
+#define NE_ERR_MEM_DIFF_NUMA_NODE (519)
+/* The supported max memory regions per enclaves has been reached. */
+#define NE_ERR_MEM_MAX_REGIONS (520)
+/* The command to start an enclave is triggered and no memory regions are added. */
+#define NE_ERR_NO_MEM_REGIONS_ADDED (521)
+/* The command to start an enclave is triggered and no vcpus are added. */
+#define NE_ERR_NO_VCPUS_ADDED (522)
+/* The enclave memory size is lower than the minimum supported. */
+#define NE_ERR_ENCLAVE_MEM_MIN_SIZE (523)
+/* The command to start an enclave is triggered and full CPU cores are not set. */
+#define NE_ERR_FULL_CORES_NOT_USED (524)
+/* The enclave is not in init state when setting resources or triggering start. */
+#define NE_ERR_NOT_IN_INIT_STATE (525)
+/* The provided vCPU is out of range of the available CPUs. */
+#define NE_ERR_INVALID_VCPU (526)
+/* The command to create an enclave is triggered and no CPUs are available in the pool. */
+#define NE_ERR_NO_CPUS_AVAIL_IN_POOL (527)
+
+/* Image load info flags */
+
+/* Enclave Image Format (EIF) */
+#define NE_EIF_IMAGE (0x01)
+
+/* Info necessary for in-memory enclave image loading (in / out). */
+struct ne_image_load_info {
+	/**
+	 * Flags to determine the enclave image type (e.g. Enclave Image Format
+	 * - EIF) (in).
+	 */
+	__u64 flags;
+
+	/**
+	 * Offset in enclave memory where to start placing the enclave image
+	 * (out).
+	 */
+	__u64 memory_offset;
+};
+
+/* User memory region flags */
+
+/* Memory region for enclave general usage. */
+#define NE_DEFAULT_MEMORY_REGION (0x00)
+
+/* Memory region to be set for an enclave (in). */
+struct ne_user_memory_region {
+	/**
+	 * Flags to determine the usage for the memory region (in).
+	 */
+	__u64 flags;
+
+	/**
+	 * The size, in bytes, of the memory region to be set for an enclave
+	 * (in).
+	 */
+	__u64 memory_size;
+
+	/**
+	 * The start of the userspace allocated memory of the memory region to
+	 * set for an enclave (in).
+	 */
+	__u64 userspace_addr;
+};
+
+/* Enclave start info flags */
+
+/* Start enclave in debug mode. */
+#define NE_ENCLAVE_DEBUG_MODE (0x01)
+
+/* Setup info necessary for enclave start (in / out). */
+struct ne_enclave_start_info {
+	/* Flags for the enclave to start with (e.g. debug mode) (in). */
+	__u64 flags;
+
+	/**
+	 * Context ID (CID) for the enclave vsock device. If 0 as input, the
+	 * CID is autogenerated by the hypervisor and returned back as output
+	 * by the driver (in / out).
+	 */
+	__u64 enclave_cid;
+};
+
+#endif /* _UAPI_LINUX_NITRO_ENCLAVES_H_ */