[RFC,v2,2/3] x86/vdso: Modify __vdso_sgx_enter_enclave() to allow parameter passing on untrusted stack
diff mbox series

Message ID 20190424062623.4345-3-cedric.xing@intel.com
State New
Headers show
Series
  • An alternative __vdso_sgx_enter_enclave() to allow enclave/host parameter passing using untrusted stack
Related show

Commit Message

Xing, Cedric April 24, 2019, 6:26 a.m. UTC
The previous __vdso_sgx_enter_enclave() requires enclaves to preserve %rsp,
which prohibits enclaves from allocating and passing parameters for
untrusted function calls (aka. o-calls).

This patch addresses the problem above by introducing a new ABI that preserves
%rbp instead of %rsp. Then __vdso_sgx_enter_enclave() can anchor its frame
using %rbp so that enclaves are allowed to allocate space on the untrusted
stack by decrementing %rsp. Please note that the stack space allocated in such
way will be part of __vdso_sgx_enter_enclave()'s frame so will be freed after
__vdso_sgx_enter_enclave() returns. Therefore, __vdso_sgx_enter_enclave() has
been revised to take a callback function as an optional parameter, which if
supplied, will be invoked upon enclave exits (both AEX (Asynchronous Enclave
eXit) and normal exits), with the value of %rsp left off by the enclave as a
parameter to the callback.

Here's the summary of API/ABI changes in this patch. More details could be
found in arch/x86/entry/vdso/vsgx_enter_enclave.S.
* 'struct sgx_enclave_exception' is renamed to 'struct sgx_enclave_exinfo'
  because it is filled upon both AEX (i.e. exceptions) and normal enclave
  exits.
* __vdso_sgx_enter_enclave() anchors its frame using %rbp (instead of %rsp in
  the previous implementation).
* __vdso_sgx_enter_enclave() takes one more parameter - a callback function to
  be invoked upon enclave exits. This callback is optional, and if not
  supplied, will cause __vdso_sgx_enter_enclave() to return upon enclave exits
  (same behavior as previous implementation).
* The callback function is given as a parameter the value of %rsp at enclave
  exit to address data "pushed" by the enclave. A positive value returned by
  the callback will be treated as an ENCLU leaf for re-entering the enclave,
  while a zero or negative value will be passed through as the return
  value of __vdso_sgx_enter_enclave() to its caller. It's also safe to
  leave callback by longjmp() or by throwing a C++ exception.

Signed-off-by: Cedric Xing <cedric.xing@intel.com>
---
 arch/x86/entry/vdso/vsgx_enter_enclave.S | 175 ++++++++++++++++-------
 arch/x86/include/uapi/asm/sgx.h          |  14 +-
 2 files changed, 128 insertions(+), 61 deletions(-)

Comments

Sean Christopherson April 24, 2019, 7:04 p.m. UTC | #1
On Tue, Apr 23, 2019 at 11:26:22PM -0700, Cedric Xing wrote:
> The previous __vdso_sgx_enter_enclave() requires enclaves to preserve %rsp,
> which prohibits enclaves from allocating and passing parameters for
> untrusted function calls (aka. o-calls).
> 
> This patch addresses the problem above by introducing a new ABI that preserves
> %rbp instead of %rsp. Then __vdso_sgx_enter_enclave() can anchor its frame
> using %rbp so that enclaves are allowed to allocate space on the untrusted
> stack by decrementing %rsp. Please note that the stack space allocated in such
> way will be part of __vdso_sgx_enter_enclave()'s frame so will be freed after
> __vdso_sgx_enter_enclave() returns. Therefore, __vdso_sgx_enter_enclave() has
> been revised to take a callback function as an optional parameter, which if
> supplied, will be invoked upon enclave exits (both AEX (Asynchronous Enclave
> eXit) and normal exits), with the value of %rsp left off by the enclave as a
> parameter to the callback.

AIUI, you and Andy had an off-list discussion regarding rewriting
__vdso_sgx_enter_enclave() vs. providing a second vDSO function.  Can you
please summarize the discussion so that it's clear why you're pursuing a
single function?

In the end I probably agree that's it's desirable to have a single ABI and
vDSO function since the cost is basically that the non-callback case needs
to pass params via the stack instead of registers, but I do miss the
simplicity of separate implementations.

> Here's the summary of API/ABI changes in this patch. More details could be
> found in arch/x86/entry/vdso/vsgx_enter_enclave.S.
> * 'struct sgx_enclave_exception' is renamed to 'struct sgx_enclave_exinfo'
>   because it is filled upon both AEX (i.e. exceptions) and normal enclave
>   exits.
> * __vdso_sgx_enter_enclave() anchors its frame using %rbp (instead of %rsp in
>   the previous implementation).
> * __vdso_sgx_enter_enclave() takes one more parameter - a callback function to
>   be invoked upon enclave exits. This callback is optional, and if not
>   supplied, will cause __vdso_sgx_enter_enclave() to return upon enclave exits
>   (same behavior as previous implementation).
> * The callback function is given as a parameter the value of %rsp at enclave
>   exit to address data "pushed" by the enclave. A positive value returned by
>   the callback will be treated as an ENCLU leaf for re-entering the enclave,
>   while a zero or negative value will be passed through as the return
>   value of __vdso_sgx_enter_enclave() to its caller. It's also safe to
>   leave callback by longjmp() or by throwing a C++ exception.
> 
> Signed-off-by: Cedric Xing <cedric.xing@intel.com>
> ---
>  arch/x86/entry/vdso/vsgx_enter_enclave.S | 175 ++++++++++++++++-------
>  arch/x86/include/uapi/asm/sgx.h          |  14 +-
>  2 files changed, 128 insertions(+), 61 deletions(-)
> 
> diff --git a/arch/x86/entry/vdso/vsgx_enter_enclave.S b/arch/x86/entry/vdso/vsgx_enter_enclave.S
> index fe0bf6671d6d..debecb0d785c 100644
> --- a/arch/x86/entry/vdso/vsgx_enter_enclave.S
> +++ b/arch/x86/entry/vdso/vsgx_enter_enclave.S
> @@ -19,83 +19,150 @@
>   * __vdso_sgx_enter_enclave() - Enter an SGX enclave
>   *
>   * @leaf:	**IN \%eax** - ENCLU leaf, must be EENTER or ERESUME
> - * @tcs:	**IN \%rbx** - TCS, must be non-NULL
> - * @ex_info:	**IN \%rcx** - Optional 'struct sgx_enclave_exception' pointer
> + * @tcs:	**IN 0x08(\%rsp)** - TCS, must be non-NULL
> + * @ex_info:	**IN 0x10(\%rsp)** - Optional 'struct sgx_enclave_exinfo'
> + *				     pointer
> + * @callback:	**IN 0x18(\%rsp)** - Optional callback function to be called on
> + *				     enclave exit or exception
>   *
>   * Return:
>   *  **OUT \%eax** -
> - *  %0 on a clean entry/exit to/from the enclave, %-EINVAL if ENCLU leaf is
> - *  not allowed or if TCS is NULL, %-EFAULT if ENCLU or the enclave faults
> + *  %0 on a clean entry/exit to/from the enclave, %-EINVAL if ENCLU leaf is not
> + *  allowed, %-EFAULT if ENCLU or the enclave faults, or a non-positive value
> + *  returned from ``callback`` (if one is supplied).
>   *
>   * **Important!**  __vdso_sgx_enter_enclave() is **NOT** compliant with the
> - * x86-64 ABI, i.e. cannot be called from standard C code.   As noted above,
> - * input parameters must be passed via ``%eax``, ``%rbx`` and ``%rcx``, with
> - * the return value passed via ``%eax``.  All registers except ``%rsp`` must
> - * be treated as volatile from the caller's perspective, including but not
> - * limited to GPRs, EFLAGS.DF, MXCSR, FCW, etc...  Conversely, the enclave
> - * being run **must** preserve the untrusted ``%rsp`` and stack.
> + * x86-64 ABI, i.e. cannot be called from standard C code. As noted above,
> + * input parameters must be passed via ``%eax``, ``8(%rsp)``, ``0x10(%rsp)`` and
> + * ``0x18(%rsp)``, with the return value passed via ``%eax``. All other
> + * registers will be passed through to the enclave as is. All registers except
> + * ``%rbp`` must be treated as volatile from the caller's perspective, including

Hmm, this is my fault since I wrote the original comment, but stating
"All registers except %rbp must be treated as volatile" is confusing, e.g.
at first I thought it meant that the assembly blob was mucking with the
registers and that they couldn't be used to pass information out of the
enclave.

Maybe something like:

 * Register ABI:
 *  - As noted above, input parameters are passed via %eax and the stack.
 *  - The return value is passed via %eax.
 *  - %rbx and %rcx must be treated as volatile as they are modified as part of
 *    enclaves transitions and are used as scratch regs.
 *  - %rbp **must** be preserved by the enclave in all cases, as it is used to
 *    reference paramaters when handling enclave exits.
 *  - %rdx, %rdi, %rsi and %r8-%r15 may be freely modified by the enclave
 *    and will not be passed back to the caller untouched.
 *  - %rsp is saved/restored across __vdso_sgx_enter_enclave(), but is passed
 *    as-is to the callback if one is provided, i.e. the enclave may use u_rsp
 *    to pass information to its associated callback.
 *
 * Callback ABI:
 *  <more information about callback ABI>
 * 

> + * but not limited to GPRs, EFLAGS.DF, MXCSR, FCW, etc... Conversely, the
> + * enclave being run **must** preserve the untrusted ``%rbp``.
> + *
> + * ``callback`` has the following signature:
> + * int callback(long rdi, long rsi, long rdx,
> + *		struct sgx_enclave_exinfo *exinfo, long r8, long r9,
> + *		void *tcs, long ursp);
> + * ``callback`` **shall** follow x86_64 ABI. All GPRs **except** ``%rax``,
> + * ``%rbx`` and ``rcx`` are passed through to ``callback``. ``%rdi``, ``%rsi``,
> + * ``%rdx``, ``%r8``, ``%r9``, along with the value of ``%rsp`` when the enclave
> + * exited/excepted, can be accessed directly as input parameters, while other
> + * GPRs can be accessed in assembly if needed.  A positive value returned from
> + * ``callback`` will be treated as an ENCLU leaf (e.g. EENTER/ERESUME) to
> + * reenter the enclave (without popping the extra data pushed by the enclave off
> + * the stack), while 0 (zero) or a negative return value will be passed back to
> + * the caller of __vdso_sgx_enter_enclave(). It is also safe to leave
> + * ``callback`` via ``longjmp()`` or by throwing a C++ exception.
>   */

While this may format nicely in .rst (I haven't checked), it's damn near
unreadable in its raw form.  Escaping '%' in kernel-doc is unresolved at
this point, but this definitely is an argument for allowing the %CONST
interpretation to be disabled entirely.

longjmp() is probably the only thing that needs to be explicitly marked,
I think it would be better to simply say "the callback" instead of using
``callback``, i.e. use regular English instead of referencing the param.

> -__vdso_sgx_enter_enclave(u32 leaf, void *tcs,
> -			 struct sgx_enclave_exception *ex_info)
> +typedef int (*sgx_callback)(long rdi, long rsi, long rdx,
> +			    struct sgx_enclave_exinfo *exinfo, long r8,
> +			    long r9, void *tcs, long ursp);
> +int __vdso_sgx_enter_enclave(int leaf, void *tcs,
> +			     struct sgx_enclave_exinfo *exinfo,
> +			     sgx_callback callback)
>  {
> -	if (leaf != SGX_EENTER && leaf != SGX_ERESUME)
> -		return -EINVAL;
> -
> -	if (!tcs)
> -		return -EINVAL;
> -
> -	try {
> -		ENCLU[leaf];
> -	} catch (exception) {
> -		if (e)
> -			*e = exception;
> -		return -EFAULT;
> +	while (leaf == EENTER || leaf == ERESUME) {

This gives the impression that looping is the common case.  I'd prefer
using 'goto' to show that the common case is a simple fall through whereas
the callback case can effectively loop on ENCLU.

> +		int rc;
> +		try {
> +			ENCLU[leaf];
> +			rc = 0;
> +			if (exinfo)
> +				exinfo->leaf = EEXIT;
> +		} catch (exception) {
> +			rc = -EFAULT;
> +			if (exinfo)
> +				*exinfo = exception;
> +		}
> +
> +		leaf = callback ? (*callback)(
> +			rdi, rsi, rdx, exinfo, r8, r9, tcs, ursp) : rc;
>  	}
>  
> -	return 0;
> +	return leaf > 0 ? -EINVAL : leaf;
>  }
>  #endif
>  ENTRY(__vdso_sgx_enter_enclave)
> -	/* EENTER <= leaf <= ERESUME */
> +	/* Prolog */
> +	.cfi_startproc
> +	push	%rbp
> +	.cfi_adjust_cfa_offset	8
> +	.cfi_rel_offset		%rbp, 0
> +	mov	%rsp, %rbp
> +	.cfi_def_cfa_register	%rbp
> +
> +1:	/* EENTER <= leaf <= ERESUME */
>  	cmp	$0x2, %eax
> -	jb	bad_input
> -
> +	jb	6f
>  	cmp	$0x3, %eax
> -	ja	bad_input
> +	ja	6f
>  
> -	/* TCS must be non-NULL */
> -	test	%rbx, %rbx
> -	je	bad_input
> +	/* Load TCS and AEP */
> +	mov	0x10(%rbp), %rbx
> +	lea	2f(%rip), %rcx
>  
> -	/* Save @exception_info */
> -	push	%rcx
> +	/* Single ENCLU serving as both EENTER and AEP (ERESUME) */
> +2:	enclu
>  
> -	/* Load AEP for ENCLU */
> -	lea	1f(%rip),  %rcx
> -1:	enclu
> +	/* EEXIT path */
> +	xor	%ebx, %ebx
> +3:	mov	0x18(%rbp), %rcx

@exinfo is optional, i.e. %ecx needs to be tested before its used.

> +	jrcxz	4f

I dislike the number of flags and values that are stashed away only to
be consumed later, it makes the code hard to follow.  AFAICT, a lot of
the shenanigans are done to reuse code because exinfo was overloaded,
but that's actually pointless, i.e. it's unnecessarily complex.

Overlading the struct is pointless becase if you make it mandatory when
using a callback then it can be nullified (when passed to the callback)
to indicate EEXIT.  If it's still optional, then the callback needs an
extra param to explicitly indicate EEXIT vs. -EFAULT, otherwise the
callback has no indication whatsover of why it was invoked.

My preference is for the latter since it's more explicit and obvious.

Tangetially related, I'm not opposed to renaming it
'struct sgx_enclave_exception_info' for clarity.

> +	mov	%eax, EX_LEAF(%rcx)
> +	jnc	4f
> +	mov	%di, EX_TRAPNR(%rcx)
> +	mov	%si, EX_ERROR_CODE(%rcx)
> +	mov	%rdx, EX_ADDRESS(%rcx)

My strong preference would be to organize the code to separate the various
flows, i.e. happy common case, invalid input, exception handler and
callback invocation.  And make the common case a straight shot so that
it's more obvious what happens in the happy non-callback case.

>  
> -	add	$0x8, %rsp
> -	xor	%eax, %eax
> +4:	/* Call *callback if supplied */
> +	mov	0x20(%rbp), %rax
> +	test	%rax, %rax
> +	/* At this point, %ebx holds the effective return value, which shall be
> +	 * returned if no callback is specified */

Use kernel-doc format for multi-line comments.  Missing punctionation at
the end.  Blank lines between logical blocks would also help readability.

E.g.:

4:      /* Call *callback if supplied */
        mov     0x20(%rbp), %rax
        test    %rax, %rax

        /*
         * At this point, %ebx holds the effective return value, which shall be
         * returned if no callback is specified.
         */
        cmovz   %rbx, %rax
        jz      7f

> +	cmovz	%rbx, %rax
> +	jz	7f
> +	/* Align stack per x86_64 ABI. The original %rsp is saved in %rbx to be
> +	 * restored after *callback returns. */
> +	mov	%rsp, %rbx
> +	and	$-0x10, %rsp
> +	/* Clear RFLAGS.DF per x86_64 ABI */
> +	cld
> +	/* Parameters for *callback */
> +	push	%rbx
> +	push	0x10(%rbp)
> +	/* Call *%rax via retpoline */
> +	call	40f

Another case of stashing a value it consuming it later.  This one is
especially brutal since %rax was loaded with CMOVcc, which means the
reader needs to backtrack even further to understand what %rax contains
at this point.

> +	/* Restore %rsp to its original value left off by the enclave from last
> +	 * exit */
> +	mov	%rbx, %rsp
> +	/* Positive return value from *callback will be interpreted as an ENCLU
> +	 * leaf, while a non-positive value will be interpreted as the return
> +	 * value to be passed back to the caller. */

Please use imperative mood in the comments, i.e. simply state what the
code is doing.  E.g. "will be interpreted" makes it sound like something
else is processing the callback's return value.

> +	jmp	1b
> +40:	/* retpoline */
> +	call	42f
> +41:	pause
> +	lfence
> +	jmp	41b
> +42:	mov	%rax, (%rsp)
>  	ret
>  
> -bad_input:
> -	mov     $(-EINVAL), %rax
> -	ret
> +5:	/* Exception path */
> +	mov	$-EFAULT, %ebx
> +	stc
> +	jmp	3b
>  
> -.pushsection .fixup, "ax"
> -	/* Re-load @exception_info and fill it (if it's non-NULL) */
> -2:	pop	%rcx
> -	test    %rcx, %rcx
> -	je      3f
> +6:	/* Unsupported ENCLU leaf */
> +	cmp	$0, %eax
> +	jle	7f
> +	mov	$-EINVAL, %eax
>  
> -	mov	%eax, EX_LEAF(%rcx)
> -	mov	%di,  EX_TRAPNR(%rcx)
> -	mov	%si,  EX_ERROR_CODE(%rcx)
> -	mov	%rdx, EX_ADDRESS(%rcx)
> -3:	mov	$(-EFAULT), %rax
> +7:	/* Epilog */
> +	leave
> +	.cfi_def_cfa		%rsp, 8
>  	ret
> -.popsection
> +	.cfi_endproc
>  
> -_ASM_VDSO_EXTABLE_HANDLE(1b, 2b)
> +_ASM_VDSO_EXTABLE_HANDLE(2b, 5b)
>  
>  ENDPROC(__vdso_sgx_enter_enclave)

Putting everything together, sans comments, I'd prefer something like the
following.  Pasted in raw code as that'll hopefully be easier to review
and discuss than a diff.

Note, swapping 'long rc' and '... *e' would allow the callback flow
to save one memory access, but IMO the exception info belongs at the end
since it's optional.

#ifdef SGX_KERNEL_DOC
typedef int (*sgx_callback)(long rdi, long rsi, long rdx, long ret, long r8,
			    long r9, void *tcs, long ursp,
			    struct sgx_enclave_exception_info *e);
int __vdso_sgx_enter_enclave(int leaf, void *tcs,
				struct sgx_enclave_exception_info *e,
				sgx_callback callback)
{
	int ret;

enter_enclave:
	if (leaf != EENTER && leaf != ERESUME) {
		ret = -EINVAL;
		goto out;
	}

	try {
		ENCLU[leaf];
		ret = 0;
	} catch (exception) {
		ret = -EFAULT;
		if (e)
			*e = exception;
	}
	if (callback)
		goto do_callback;
out:
	return ret;

do_callback:
	leaf = (*callback)(rdi, rsi, rdx, ret, r8, r9, e, tcs, ursp);
	if (leaf <= 0)
		goto out;
	goto enter_enclave;
}
#endif
ENTRY(__vdso_sgx_enter_enclave)
	/* Prolog */
	.cfi_startproc
	push	%rbp
	.cfi_adjust_cfa_offset	8
	.cfi_rel_offset		%rbp, 0
	mov	%rsp, %rbp
	.cfi_def_cfa_register	%rbp

1:	/* EENTER <= leaf <= ERESUME */
	cmp	$0x2, %eax
	jb	5f
	cmp	$0x3, %eax
	ja	5f

	/* Load TCS and AEP */
	mov	0x10(%rbp), %rbx
	lea	2f(%rip), %rcx

	/* Single ENCLU serving as both EENTER and AEP (ERESUME) */
2:	enclu

	/* EEXIT branches here unless the enclave is doing something fancy. */
	xor	%eax, %eax

3:	cmp	$0, 0x20(%rbp)
	jne	8f

4:	leave
	.cfi_def_cfa		%rsp, 8
	ret

5:	mov	$(-EINVAL), %rax
	jmp	4b

6:	mov	0x18(%rbp), %rcx
	test    %rcx, %rcx
	je	7f

	/* Fill optional exception info. */
	mov	%eax, EX_LEAF(%rcx)
	mov	%di,  EX_TRAPNR(%rcx)
	mov	%si,  EX_ERROR_CODE(%rcx)
	mov	%rdx, EX_ADDRESS(%rcx)
7:	mov	$(-EFAULT), %rax
	jmp	3b

8:	/*
	 * Align stack per x86_64 ABI.  Save the original %rsp in %rbx to be
	 * restored after the callback returns.
	 */
	mov	%rsp, %rbx
	and	$-0x10, %rsp

	/* Push @e, u_rsp and @tcs as parameters to the callback. */
	push	0x18(%rbp)
	push	%rbx
	push	0x10(%rbp)

	/* Pass the "return value" to the callback via %rcx. */
	mov	%rax, %rcx

	/* Clear RFLAGS.DF per x86_64 ABI */
	cld

	/* Load the callback pointer to %rax and invoke it via retpoline. */
	mov	0x20(%rbp), %rax
	call	40f

	/* Restore %rsp to its post-exit value. */
	mov	%rbx, %rsp

	/*
	 * If the return from callback is zero or negative, return immediately,
	 * otherwise attempt to re-execute ENCLU with the return interpreted as
	 * the requested ENCLU leaf.
	 */
	cmp	$0, %eax
	jle	4b
	jmp	1b

40:	/* retpoline */
	call	42f
41:	pause
	lfence
	jmp	41b
42:	mov	%rax, (%rsp)
	ret
	.cfi_endproc

_ASM_VDSO_EXTABLE_HANDLE(2b, 6b)

ENDPROC(__vdso_sgx_enter_enclave)
Xing, Cedric April 25, 2019, 11:31 p.m. UTC | #2
Hi Sean,

> AIUI, you and Andy had an off-list discussion regarding rewriting
> __vdso_sgx_enter_enclave() vs. providing a second vDSO function.  Can
> you
> please summarize the discussion so that it's clear why you're pursuing a
> single function?
> 
> In the end I probably agree that's it's desirable to have a single ABI
> and
> vDSO function since the cost is basically that the non-callback case
> needs
> to pass params via the stack instead of registers, but I do miss the
> simplicity of separate implementations.
> 

The major reason is we don't want to maintain 2 different ABIs (preserving %rsp vs. %rbp). Given the new one covers all known use cases, we prefer not to keep the other one.

> > Here's the summary of API/ABI changes in this patch. More details
> could be
> > found in arch/x86/entry/vdso/vsgx_enter_enclave.S.
> > * 'struct sgx_enclave_exception' is renamed to 'struct
> sgx_enclave_exinfo'
> >   because it is filled upon both AEX (i.e. exceptions) and normal
> enclave
> >   exits.
> > * __vdso_sgx_enter_enclave() anchors its frame using %rbp (instead
> of %rsp in
> >   the previous implementation).
> > * __vdso_sgx_enter_enclave() takes one more parameter - a callback
> function to
> >   be invoked upon enclave exits. This callback is optional, and if not
> >   supplied, will cause __vdso_sgx_enter_enclave() to return upon
> enclave exits
> >   (same behavior as previous implementation).
> > * The callback function is given as a parameter the value of %rsp at
> enclave
> >   exit to address data "pushed" by the enclave. A positive value
> returned by
> >   the callback will be treated as an ENCLU leaf for re-entering the
> enclave,
> >   while a zero or negative value will be passed through as the return
> >   value of __vdso_sgx_enter_enclave() to its caller. It's also safe to
> >   leave callback by longjmp() or by throwing a C++ exception.
> >
> > Signed-off-by: Cedric Xing <cedric.xing@intel.com>
> > ---
> >  arch/x86/entry/vdso/vsgx_enter_enclave.S | 175 ++++++++++++++++------
> -
> >  arch/x86/include/uapi/asm/sgx.h          |  14 +-
> >  2 files changed, 128 insertions(+), 61 deletions(-)
> >
> > diff --git a/arch/x86/entry/vdso/vsgx_enter_enclave.S
> b/arch/x86/entry/vdso/vsgx_enter_enclave.S
> > index fe0bf6671d6d..debecb0d785c 100644
> > --- a/arch/x86/entry/vdso/vsgx_enter_enclave.S
> > +++ b/arch/x86/entry/vdso/vsgx_enter_enclave.S
> > @@ -19,83 +19,150 @@
> >   * __vdso_sgx_enter_enclave() - Enter an SGX enclave
> >   *
> >   * @leaf:	**IN \%eax** - ENCLU leaf, must be EENTER or ERESUME
> > - * @tcs:	**IN \%rbx** - TCS, must be non-NULL
> > - * @ex_info:	**IN \%rcx** - Optional 'struct sgx_enclave_exception'
> pointer
> > + * @tcs:	**IN 0x08(\%rsp)** - TCS, must be non-NULL
> > + * @ex_info:	**IN 0x10(\%rsp)** - Optional 'struct
> sgx_enclave_exinfo'
> > + *				     pointer
> > + * @callback:	**IN 0x18(\%rsp)** - Optional callback function to
> be called on
> > + *				     enclave exit or exception
> >   *
> >   * Return:
> >   *  **OUT \%eax** -
> > - *  %0 on a clean entry/exit to/from the enclave, %-EINVAL if ENCLU
> leaf is
> > - *  not allowed or if TCS is NULL, %-EFAULT if ENCLU or the enclave
> faults
> > + *  %0 on a clean entry/exit to/from the enclave, %-EINVAL if ENCLU
> leaf is not
> > + *  allowed, %-EFAULT if ENCLU or the enclave faults, or a non-
> positive value
> > + *  returned from ``callback`` (if one is supplied).
> >   *
> >   * **Important!**  __vdso_sgx_enter_enclave() is **NOT** compliant
> with the
> > - * x86-64 ABI, i.e. cannot be called from standard C code.   As noted
> above,
> > - * input parameters must be passed via ``%eax``, ``%rbx`` and
> ``%rcx``, with
> > - * the return value passed via ``%eax``.  All registers except
> ``%rsp`` must
> > - * be treated as volatile from the caller's perspective, including
> but not
> > - * limited to GPRs, EFLAGS.DF, MXCSR, FCW, etc...  Conversely, the
> enclave
> > - * being run **must** preserve the untrusted ``%rsp`` and stack.
> > + * x86-64 ABI, i.e. cannot be called from standard C code. As noted
> above,
> > + * input parameters must be passed via ``%eax``, ``8(%rsp)``,
> ``0x10(%rsp)`` and
> > + * ``0x18(%rsp)``, with the return value passed via ``%eax``. All
> other
> > + * registers will be passed through to the enclave as is. All
> registers except
> > + * ``%rbp`` must be treated as volatile from the caller's perspective,
> including
> 
> Hmm, this is my fault since I wrote the original comment, but stating
> "All registers except %rbp must be treated as volatile" is confusing,
> e.g.
> at first I thought it meant that the assembly blob was mucking with the
> registers and that they couldn't be used to pass information out of the
> enclave.
> 
> Maybe something like:
> 
>  * Register ABI:
>  *  - As noted above, input parameters are passed via %eax and the stack.
>  *  - The return value is passed via %eax.
>  *  - %rbx and %rcx must be treated as volatile as they are modified as
> part of
>  *    enclaves transitions and are used as scratch regs.
>  *  - %rbp **must** be preserved by the enclave in all cases, as it is
> used to
>  *    reference paramaters when handling enclave exits.
>  *  - %rdx, %rdi, %rsi and %r8-%r15 may be freely modified by the
> enclave
>  *    and will not be passed back to the caller untouched.
>  *  - %rsp is saved/restored across __vdso_sgx_enter_enclave(), but is
> passed
>  *    as-is to the callback if one is provided, i.e. the enclave may use
> u_rsp
>  *    to pass information to its associated callback.
>  *
>  * Callback ABI:
>  *  <more information about callback ABI>
>  *

Really appreciated! I'll update the comments in the next revision.

> 
> > + * but not limited to GPRs, EFLAGS.DF, MXCSR, FCW, etc... Conversely,
> the
> > + * enclave being run **must** preserve the untrusted ``%rbp``.
> > + *
> > + * ``callback`` has the following signature:
> > + * int callback(long rdi, long rsi, long rdx,
> > + *		struct sgx_enclave_exinfo *exinfo, long r8, long r9,
> > + *		void *tcs, long ursp);
> > + * ``callback`` **shall** follow x86_64 ABI. All GPRs **except**
> ``%rax``,
> > + * ``%rbx`` and ``rcx`` are passed through to ``callback``. ``%rdi``,
> ``%rsi``,
> > + * ``%rdx``, ``%r8``, ``%r9``, along with the value of ``%rsp`` when
> the enclave
> > + * exited/excepted, can be accessed directly as input parameters,
> while other
> > + * GPRs can be accessed in assembly if needed.  A positive value
> returned from
> > + * ``callback`` will be treated as an ENCLU leaf (e.g. EENTER/ERESUME)
> to
> > + * reenter the enclave (without popping the extra data pushed by the
> enclave off
> > + * the stack), while 0 (zero) or a negative return value will be
> passed back to
> > + * the caller of __vdso_sgx_enter_enclave(). It is also safe to leave
> > + * ``callback`` via ``longjmp()`` or by throwing a C++ exception.
> >   */
> 
> While this may format nicely in .rst (I haven't checked), it's damn near
> unreadable in its raw form.  Escaping '%' in kernel-doc is unresolved at
> this point, but this definitely is an argument for allowing the %CONST
> interpretation to be disabled entirely.
> 

I know very little about kernel-doc. Not quite sure which markup means what. Or if you don't mind, could you just write up the whole thing (you have written half of it in your email already) so I can include it into my next revision?
 
> longjmp() is probably the only thing that needs to be explicitly marked,
> I think it would be better to simply say "the callback" instead of using
> ``callback``, i.e. use regular English instead of referencing the param.
> 

Are you suggesting not to mention C++ exception here?
 
> > -__vdso_sgx_enter_enclave(u32 leaf, void *tcs,
> > -			 struct sgx_enclave_exception *ex_info)
> > +typedef int (*sgx_callback)(long rdi, long rsi, long rdx,
> > +			    struct sgx_enclave_exinfo *exinfo, long r8,
> > +			    long r9, void *tcs, long ursp);
> > +int __vdso_sgx_enter_enclave(int leaf, void *tcs,
> > +			     struct sgx_enclave_exinfo *exinfo,
> > +			     sgx_callback callback)
> >  {
> > -	if (leaf != SGX_EENTER && leaf != SGX_ERESUME)
> > -		return -EINVAL;
> > -
> > -	if (!tcs)
> > -		return -EINVAL;
> > -
> > -	try {
> > -		ENCLU[leaf];
> > -	} catch (exception) {
> > -		if (e)
> > -			*e = exception;
> > -		return -EFAULT;
> > +	while (leaf == EENTER || leaf == ERESUME) {
> 
> This gives the impression that looping is the common case.  I'd prefer
> using 'goto' to show that the common case is a simple fall through
> whereas
> the callback case can effectively loop on ENCLU.

Looping IMO is indeed the common case. Just think of the case of an enclave making a sequence of o-calls. 
 
> 
> > +		int rc;
> > +		try {
> > +			ENCLU[leaf];
> > +			rc = 0;
> > +			if (exinfo)
> > +				exinfo->leaf = EEXIT;
> > +		} catch (exception) {
> > +			rc = -EFAULT;
> > +			if (exinfo)
> > +				*exinfo = exception;
> > +		}
> > +
> > +		leaf = callback ? (*callback)(
> > +			rdi, rsi, rdx, exinfo, r8, r9, tcs, ursp) : rc;
> >  	}
> >
> > -	return 0;
> > +	return leaf > 0 ? -EINVAL : leaf;
> >  }
> >  #endif
> >  ENTRY(__vdso_sgx_enter_enclave)
> > -	/* EENTER <= leaf <= ERESUME */
> > +	/* Prolog */
> > +	.cfi_startproc
> > +	push	%rbp
> > +	.cfi_adjust_cfa_offset	8
> > +	.cfi_rel_offset		%rbp, 0
> > +	mov	%rsp, %rbp
> > +	.cfi_def_cfa_register	%rbp
> > +
> > +1:	/* EENTER <= leaf <= ERESUME */
> >  	cmp	$0x2, %eax
> > -	jb	bad_input
> > -
> > +	jb	6f
> >  	cmp	$0x3, %eax
> > -	ja	bad_input
> > +	ja	6f
> >
> > -	/* TCS must be non-NULL */
> > -	test	%rbx, %rbx
> > -	je	bad_input
> > +	/* Load TCS and AEP */
> > +	mov	0x10(%rbp), %rbx
> > +	lea	2f(%rip), %rcx
> >
> > -	/* Save @exception_info */
> > -	push	%rcx
> > +	/* Single ENCLU serving as both EENTER and AEP (ERESUME) */
> > +2:	enclu
> >
> > -	/* Load AEP for ENCLU */
> > -	lea	1f(%rip),  %rcx
> > -1:	enclu
> > +	/* EEXIT path */
> > +	xor	%ebx, %ebx
> > +3:	mov	0x18(%rbp), %rcx
> 
> @exinfo is optional, i.e. %ecx needs to be tested before its used.
> 
> > +	jrcxz	4f

The above instruction (i.e. jrcxz) does test %rcx.

> 
> I dislike the number of flags and values that are stashed away only to
> be consumed later, it makes the code hard to follow.  AFAICT, a lot of
> the shenanigans are done to reuse code because exinfo was overloaded,
> but that's actually pointless, i.e. it's unnecessarily complex.
> 
> Overlading the struct is pointless becase if you make it mandatory when
> using a callback then it can be nullified (when passed to the callback)
> to indicate EEXIT.  If it's still optional, then the callback needs an
> extra param to explicitly indicate EEXIT vs. -EFAULT, otherwise the
> callback has no indication whatsover of why it was invoked.
> 
> My preference is for the latter since it's more explicit and obvious.

I wouldn't nullify @exinfo at EEXIT because that could be used as a "context" pointer to the callback.

Making it optionally because the callback can still use other registers (e.g. %rdi) determine the reason without it. For example, an enclave may set %rdi to non-zero to signify o-call/e-ret while zero (as set by AEX) would indicate an exception. My intention is not to impose unnecessary restrictions on applications.

> 
> Tangetially related, I'm not opposed to renaming it
> 'struct sgx_enclave_exception_info' for clarity.

I'd still prefer sgx_enclave_exinfo to imply its use in both _ex_it and _ex_ception cases, for the reason stated above.

> 
> > +	mov	%eax, EX_LEAF(%rcx)
> > +	jnc	4f
> > +	mov	%di, EX_TRAPNR(%rcx)
> > +	mov	%si, EX_ERROR_CODE(%rcx)
> > +	mov	%rdx, EX_ADDRESS(%rcx)
> 
> My strong preference would be to organize the code to separate the
> various
> flows, i.e. happy common case, invalid input, exception handler and
> callback invocation.  And make the common case a straight shot so that
> it's more obvious what happens in the happy non-callback case.

I'd prefer more concise code. After all, this is assembly without optimization done by the compiler. If it were for C, I'd agree with you because either case would end up in roughly the same code (in terms of code size or efficiency) due to the optimizer.

> >
> > -	add	$0x8, %rsp
> > -	xor	%eax, %eax
> > +4:	/* Call *callback if supplied */
> > +	mov	0x20(%rbp), %rax
> > +	test	%rax, %rax
> > +	/* At this point, %ebx holds the effective return value, which
> shall be
> > +	 * returned if no callback is specified */
> 
> Use kernel-doc format for multi-line comments.  Missing punctionation at
> the end.  Blank lines between logical blocks would also help readability.
> 
> E.g.:
> 
> 4:      /* Call *callback if supplied */
>         mov     0x20(%rbp), %rax
>         test    %rax, %rax
> 
>         /*
>          * At this point, %ebx holds the effective return value, which
> shall be
>          * returned if no callback is specified.
>          */
>         cmovz   %rbx, %rax
>         jz      7f

I know very little about kernel-doc. Would you please point me to documents describing it? What you suggested here will of course be incorporated into my next revision. 

> > +	cmovz	%rbx, %rax
> > +	jz	7f
> > +	/* Align stack per x86_64 ABI. The original %rsp is saved in %rbx
> to be
> > +	 * restored after *callback returns. */
> > +	mov	%rsp, %rbx
> > +	and	$-0x10, %rsp
> > +	/* Clear RFLAGS.DF per x86_64 ABI */
> > +	cld
> > +	/* Parameters for *callback */
> > +	push	%rbx
> > +	push	0x10(%rbp)
> > +	/* Call *%rax via retpoline */
> > +	call	40f
> 
> Another case of stashing a value it consuming it later.  This one is
> especially brutal since %rax was loaded with CMOVcc, which means the
> reader needs to backtrack even further to understand what %rax contains
> at this point.

%rax was loaded only 2 instructions ago, while what %rbx is containing is described in the comment immediately above it. I think I have done my best to make it readable. After all, this is assembly so can't expect something like giving proper names to variables in C because that just can't be done in assembly.

> 
> > +	/* Restore %rsp to its original value left off by the enclave from
> last
> > +	 * exit */
> > +	mov	%rbx, %rsp
> > +	/* Positive return value from *callback will be interpreted as an
> ENCLU
> > +	 * leaf, while a non-positive value will be interpreted as the
> return
> > +	 * value to be passed back to the caller. */
> 
> Please use imperative mood in the comments, i.e. simply state what the
> code is doing.  E.g. "will be interpreted" makes it sound like something
> else is processing the callback's return value.

Will do in the next revision.

> > +	jmp	1b
> > +40:	/* retpoline */
> > +	call	42f
> > +41:	pause
> > +	lfence
> > +	jmp	41b
> > +42:	mov	%rax, (%rsp)
> >  	ret
> >
> > -bad_input:
> > -	mov     $(-EINVAL), %rax
> > -	ret
> > +5:	/* Exception path */
> > +	mov	$-EFAULT, %ebx
> > +	stc
> > +	jmp	3b
> >
> > -.pushsection .fixup, "ax"
> > -	/* Re-load @exception_info and fill it (if it's non-NULL) */
> > -2:	pop	%rcx
> > -	test    %rcx, %rcx
> > -	je      3f
> > +6:	/* Unsupported ENCLU leaf */
> > +	cmp	$0, %eax
> > +	jle	7f
> > +	mov	$-EINVAL, %eax
> >
> > -	mov	%eax, EX_LEAF(%rcx)
> > -	mov	%di,  EX_TRAPNR(%rcx)
> > -	mov	%si,  EX_ERROR_CODE(%rcx)
> > -	mov	%rdx, EX_ADDRESS(%rcx)
> > -3:	mov	$(-EFAULT), %rax
> > +7:	/* Epilog */
> > +	leave
> > +	.cfi_def_cfa		%rsp, 8
> >  	ret
> > -.popsection
> > +	.cfi_endproc
> >
> > -_ASM_VDSO_EXTABLE_HANDLE(1b, 2b)
> > +_ASM_VDSO_EXTABLE_HANDLE(2b, 5b)
> >
> >  ENDPROC(__vdso_sgx_enter_enclave)
> 
> Putting everything together, sans comments, I'd prefer something like
> the
> following.  Pasted in raw code as that'll hopefully be easier to review
> and discuss than a diff.
> 
> Note, swapping 'long rc' and '... *e' would allow the callback flow
> to save one memory access, but IMO the exception info belongs at the end
> since it's optional.

You have probably misunderstood my code. 'ret' is _not_ passed while @exinfo is passed in %rcx. exinfo->leaf implies the reason of the callback.

Or are you suggesting to add 'ret' as one more parameter to the callback? I don't think it necessary because exinfo contains everything the callback would ever need.

> #ifdef SGX_KERNEL_DOC
> typedef int (*sgx_callback)(long rdi, long rsi, long rdx, long ret, long
> r8,
> 			    long r9, void *tcs, long ursp,
> 			    struct sgx_enclave_exception_info *e);
> int __vdso_sgx_enter_enclave(int leaf, void *tcs,
> 				struct sgx_enclave_exception_info *e,
> 				sgx_callback callback)
> {
> 	int ret;
> 
> enter_enclave:
> 	if (leaf != EENTER && leaf != ERESUME) {
> 		ret = -EINVAL;
> 		goto out;
> 	}
> 
> 	try {
> 		ENCLU[leaf];
> 		ret = 0;
> 	} catch (exception) {
> 		ret = -EFAULT;
> 		if (e)
> 			*e = exception;
> 	}
> 	if (callback)
> 		goto do_callback;
> out:
> 	return ret;
> 
> do_callback:
> 	leaf = (*callback)(rdi, rsi, rdx, ret, r8, r9, e, tcs, ursp);
> 	if (leaf <= 0)
> 		goto out;
> 	goto enter_enclave;
> }
> #endif
> ENTRY(__vdso_sgx_enter_enclave)
> 	/* Prolog */
> 	.cfi_startproc
> 	push	%rbp
> 	.cfi_adjust_cfa_offset	8
> 	.cfi_rel_offset		%rbp, 0
> 	mov	%rsp, %rbp
> 	.cfi_def_cfa_register	%rbp
> 
> 1:	/* EENTER <= leaf <= ERESUME */
> 	cmp	$0x2, %eax
> 	jb	5f
> 	cmp	$0x3, %eax
> 	ja	5f
> 
> 	/* Load TCS and AEP */
> 	mov	0x10(%rbp), %rbx
> 	lea	2f(%rip), %rcx
> 
> 	/* Single ENCLU serving as both EENTER and AEP (ERESUME) */
> 2:	enclu
> 
> 	/* EEXIT branches here unless the enclave is doing something fancy.
> */
> 	xor	%eax, %eax
> 
> 3:	cmp	$0, 0x20(%rbp)
> 	jne	8f
> 
> 4:	leave
> 	.cfi_def_cfa		%rsp, 8
> 	ret
> 
> 5:	mov	$(-EINVAL), %rax
> 	jmp	4b
> 
> 6:	mov	0x18(%rbp), %rcx
> 	test    %rcx, %rcx
> 	je	7f
> 
> 	/* Fill optional exception info. */
> 	mov	%eax, EX_LEAF(%rcx)
> 	mov	%di,  EX_TRAPNR(%rcx)
> 	mov	%si,  EX_ERROR_CODE(%rcx)
> 	mov	%rdx, EX_ADDRESS(%rcx)
> 7:	mov	$(-EFAULT), %rax
> 	jmp	3b
> 
> 8:	/*
> 	 * Align stack per x86_64 ABI.  Save the original %rsp in %rbx to
> be
> 	 * restored after the callback returns.
> 	 */
> 	mov	%rsp, %rbx
> 	and	$-0x10, %rsp
> 
> 	/* Push @e, u_rsp and @tcs as parameters to the callback. */
> 	push	0x18(%rbp)
> 	push	%rbx
> 	push	0x10(%rbp)
> 
> 	/* Pass the "return value" to the callback via %rcx. */
> 	mov	%rax, %rcx
> 
> 	/* Clear RFLAGS.DF per x86_64 ABI */
> 	cld
> 
> 	/* Load the callback pointer to %rax and invoke it via retpoline.
> */
> 	mov	0x20(%rbp), %rax
> 	call	40f
> 
> 	/* Restore %rsp to its post-exit value. */
> 	mov	%rbx, %rsp
> 
> 	/*
> 	 * If the return from callback is zero or negative, return
> immediately,
> 	 * otherwise attempt to re-execute ENCLU with the return
> interpreted as
> 	 * the requested ENCLU leaf.
> 	 */
> 	cmp	$0, %eax
> 	jle	4b
> 	jmp	1b
> 
> 40:	/* retpoline */
> 	call	42f
> 41:	pause
> 	lfence
> 	jmp	41b
> 42:	mov	%rax, (%rsp)
> 	ret
> 	.cfi_endproc
> 
> _ASM_VDSO_EXTABLE_HANDLE(2b, 6b)
> 
> ENDPROC(__vdso_sgx_enter_enclave)

Your code will work. The only missing thing is a ".cfi_def_cfa %rbp, 16" after 'ret' instruction in block 4. I decided to keep "leave; ret" at the end to avoid confusion around the frame pointer. As you may also notice, GCC generated code usually does the same thing. 

-Cedric
Sean Christopherson April 26, 2019, 9 p.m. UTC | #3
On Thu, Apr 25, 2019 at 04:31:40PM -0700, Xing, Cedric wrote:
> > While this may format nicely in .rst (I haven't checked), it's damn near
> > unreadable in its raw form.  Escaping '%' in kernel-doc is unresolved at
> > this point, but this definitely is an argument for allowing the %CONST
> > interpretation to be disabled entirely.
> > 
> 
> I know very little about kernel-doc. Not quite sure which markup means what.
> Or if you don't mind, could you just write up the whole thing (you have
> written half of it in your email already) so I can include it into my next
> revision?

Hmm, at this point I'd say just ignore the comment entirely.  It's
something we need to sort out in the "official" series anyways.

>  
> > longjmp() is probably the only thing that needs to be explicitly marked,
> > I think it would be better to simply say "the callback" instead of using
> > ``callback``, i.e. use regular English instead of referencing the param.
> > 
> 
> Are you suggesting not to mention C++ exception here?

No, I was saying that rather than do the backtick markup on callback,
just say "the callback".  The copious amount of markup makes it difficult
to read the raw text.

>  
> > > -__vdso_sgx_enter_enclave(u32 leaf, void *tcs,
> > > -			 struct sgx_enclave_exception *ex_info)
> > > +typedef int (*sgx_callback)(long rdi, long rsi, long rdx,
> > > +			    struct sgx_enclave_exinfo *exinfo, long r8,
> > > +			    long r9, void *tcs, long ursp);
> > > +int __vdso_sgx_enter_enclave(int leaf, void *tcs,
> > > +			     struct sgx_enclave_exinfo *exinfo,
> > > +			     sgx_callback callback)
> > >  {
> > > -	if (leaf != SGX_EENTER && leaf != SGX_ERESUME)
> > > -		return -EINVAL;
> > > -
> > > -	if (!tcs)
> > > -		return -EINVAL;
> > > -
> > > -	try {
> > > -		ENCLU[leaf];
> > > -	} catch (exception) {
> > > -		if (e)
> > > -			*e = exception;
> > > -		return -EFAULT;
> > > +	while (leaf == EENTER || leaf == ERESUME) {
> > 
> > This gives the impression that looping is the common case.  I'd prefer
> > using 'goto' to show that the common case is a simple fall through
> > whereas
> > the callback case can effectively loop on ENCLU.
> 
> Looping IMO is indeed the common case. Just think of the case of an enclave
> making a sequence of o-calls. 

Except that looping is only done when using a callback, and that most
definitely is not the common case.

> > > +		int rc;
> > > +		try {
> > > +			ENCLU[leaf];
> > > +			rc = 0;
> > > +			if (exinfo)
> > > +				exinfo->leaf = EEXIT;
> > > +		} catch (exception) {
> > > +			rc = -EFAULT;
> > > +			if (exinfo)
> > > +				*exinfo = exception;
> > > +		}
> > > +
> > > +		leaf = callback ? (*callback)(
> > > +			rdi, rsi, rdx, exinfo, r8, r9, tcs, ursp) : rc;
> > >  	}
> > >
> > > -	return 0;
> > > +	return leaf > 0 ? -EINVAL : leaf;
> > >  }
> > >  #endif
> > >  ENTRY(__vdso_sgx_enter_enclave)
> > > -	/* EENTER <= leaf <= ERESUME */
> > > +	/* Prolog */
> > > +	.cfi_startproc
> > > +	push	%rbp
> > > +	.cfi_adjust_cfa_offset	8
> > > +	.cfi_rel_offset		%rbp, 0
> > > +	mov	%rsp, %rbp
> > > +	.cfi_def_cfa_register	%rbp
> > > +
> > > +1:	/* EENTER <= leaf <= ERESUME */
> > >  	cmp	$0x2, %eax
> > > -	jb	bad_input
> > > -
> > > +	jb	6f
> > >  	cmp	$0x3, %eax
> > > -	ja	bad_input
> > > +	ja	6f
> > >
> > > -	/* TCS must be non-NULL */
> > > -	test	%rbx, %rbx
> > > -	je	bad_input
> > > +	/* Load TCS and AEP */
> > > +	mov	0x10(%rbp), %rbx
> > > +	lea	2f(%rip), %rcx
> > >
> > > -	/* Save @exception_info */
> > > -	push	%rcx
> > > +	/* Single ENCLU serving as both EENTER and AEP (ERESUME) */
> > > +2:	enclu
> > >
> > > -	/* Load AEP for ENCLU */
> > > -	lea	1f(%rip),  %rcx
> > > -1:	enclu
> > > +	/* EEXIT path */
> > > +	xor	%ebx, %ebx
> > > +3:	mov	0x18(%rbp), %rcx
> > 
> > @exinfo is optional, i.e. %ecx needs to be tested before its used.
> > 
> > > +	jrcxz	4f
> 
> The above instruction (i.e. jrcxz) does test %rcx.

I love x86 ISA.

> > I dislike the number of flags and values that are stashed away only to
> > be consumed later, it makes the code hard to follow.  AFAICT, a lot of
> > the shenanigans are done to reuse code because exinfo was overloaded,
> > but that's actually pointless, i.e. it's unnecessarily complex.
> > 
> > Overlading the struct is pointless becase if you make it mandatory when
> > using a callback then it can be nullified (when passed to the callback)
> > to indicate EEXIT.  If it's still optional, then the callback needs an
> > extra param to explicitly indicate EEXIT vs. -EFAULT, otherwise the
> > callback has no indication whatsover of why it was invoked.
> > 
> > My preference is for the latter since it's more explicit and obvious.
> 
> I wouldn't nullify @exinfo at EEXIT because that could be used as a "context"
> pointer to the callback.
> 
> Making it optionally because the callback can still use other registers (e.g.
> %rdi) determine the reason without it. For example, an enclave may set %rdi
> to non-zero to signify o-call/e-ret while zero (as set by AEX) would indicate
> an exception. My intention is not to impose unnecessary restrictions on
> applications.
> > Tangetially related, I'm not opposed to renaming it
> > 'struct sgx_enclave_exception_info' for clarity.
> 
> I'd still prefer sgx_enclave_exinfo to imply its use in both _ex_it and
> _ex_ception cases, for the reason stated above.

Because an enclave *could* choose a different route?  The cost is
essentially one memory access, which is likely offset by the fact that
the callback can directly check %rcx for the exit reason.  You are not
the sole consumer of this code.

> > > +	mov	%eax, EX_LEAF(%rcx)
> > > +	jnc	4f
> > > +	mov	%di, EX_TRAPNR(%rcx)
> > > +	mov	%si, EX_ERROR_CODE(%rcx)
> > > +	mov	%rdx, EX_ADDRESS(%rcx)
> > 
> > My strong preference would be to organize the code to separate the
> > various
> > flows, i.e. happy common case, invalid input, exception handler and
> > callback invocation.  And make the common case a straight shot so that
> > it's more obvious what happens in the happy non-callback case.
> 
> I'd prefer more concise code. After all, this is assembly without
> optimization done by the compiler. If it were for C, I'd agree with you
> because either case would end up in roughly the same code (in terms of code
> size or efficiency) due to the optimizer.

a) You've argued multiple times this isn't performance critical code.

b) A straight shot is optimal for the common case where userspace is not
   using a callback.

c) The performance difference is something like one or two memory accesses.
   That's peanuts compared to how readable the code is by people without
   prior knowledge of exactly what this code is doing.

> > > -	add	$0x8, %rsp
> > > -	xor	%eax, %eax
> > > +4:	/* Call *callback if supplied */
> > > +	mov	0x20(%rbp), %rax
> > > +	test	%rax, %rax
> > > +	/* At this point, %ebx holds the effective return value, which
> > shall be
> > > +	 * returned if no callback is specified */
> > 
> > Use kernel-doc format for multi-line comments.  Missing punctionation at
> > the end.  Blank lines between logical blocks would also help readability.
> > 
> > E.g.:
> > 
> > 4:      /* Call *callback if supplied */
> >         mov     0x20(%rbp), %rax
> >         test    %rax, %rax
> > 
> >         /*
> >          * At this point, %ebx holds the effective return value, which
> > shall be
> >          * returned if no callback is specified.
> >          */
> >         cmovz   %rbx, %rax
> >         jz      7f
> 
> I know very little about kernel-doc. Would you please point me to documents
> describing it? What you suggested here will of course be incorporated into my
> next revision. 

Documentation/doc-guide/kernel-doc.rst

[...]

> > Note, swapping 'long rc' and '... *e' would allow the callback flow
> > to save one memory access, but IMO the exception info belongs at the end
> > since it's optional.
> 
> You have probably misunderstood my code. 'ret' is _not_ passed while @exinfo
> is passed in %rcx. exinfo->leaf implies the reason of the callback.
> 
> Or are you suggesting to add 'ret' as one more parameter to the callback? I
> don't think it necessary because exinfo contains everything the callback
> would ever need.

The latter, because exinfo is optional, as stated above.
Jarkko Sakkinen May 2, 2019, 8:28 a.m. UTC | #4
On Fri, Apr 26, 2019 at 02:00:18PM -0700, Sean Christopherson wrote:
> On Thu, Apr 25, 2019 at 04:31:40PM -0700, Xing, Cedric wrote:
> > > While this may format nicely in .rst (I haven't checked), it's damn near
> > > unreadable in its raw form.  Escaping '%' in kernel-doc is unresolved at
> > > this point, but this definitely is an argument for allowing the %CONST
> > > interpretation to be disabled entirely.
> > > 
> > 
> > I know very little about kernel-doc. Not quite sure which markup means what.
> > Or if you don't mind, could you just write up the whole thing (you have
> > written half of it in your email already) so I can include it into my next
> > revision?
> 
> Hmm, at this point I'd say just ignore the comment entirely.  It's
> something we need to sort out in the "official" series anyways.

To get started: https://www.kernel.org/doc/Documentation/kernel-doc-nano-HOWTO.txt

/Jarkko

Patch
diff mbox series

diff --git a/arch/x86/entry/vdso/vsgx_enter_enclave.S b/arch/x86/entry/vdso/vsgx_enter_enclave.S
index fe0bf6671d6d..debecb0d785c 100644
--- a/arch/x86/entry/vdso/vsgx_enter_enclave.S
+++ b/arch/x86/entry/vdso/vsgx_enter_enclave.S
@@ -19,83 +19,150 @@ 
  * __vdso_sgx_enter_enclave() - Enter an SGX enclave
  *
  * @leaf:	**IN \%eax** - ENCLU leaf, must be EENTER or ERESUME
- * @tcs:	**IN \%rbx** - TCS, must be non-NULL
- * @ex_info:	**IN \%rcx** - Optional 'struct sgx_enclave_exception' pointer
+ * @tcs:	**IN 0x08(\%rsp)** - TCS, must be non-NULL
+ * @ex_info:	**IN 0x10(\%rsp)** - Optional 'struct sgx_enclave_exinfo'
+ *				     pointer
+ * @callback:	**IN 0x18(\%rsp)** - Optional callback function to be called on
+ *				     enclave exit or exception
  *
  * Return:
  *  **OUT \%eax** -
- *  %0 on a clean entry/exit to/from the enclave, %-EINVAL if ENCLU leaf is
- *  not allowed or if TCS is NULL, %-EFAULT if ENCLU or the enclave faults
+ *  %0 on a clean entry/exit to/from the enclave, %-EINVAL if ENCLU leaf is not
+ *  allowed, %-EFAULT if ENCLU or the enclave faults, or a non-positive value
+ *  returned from ``callback`` (if one is supplied).
  *
  * **Important!**  __vdso_sgx_enter_enclave() is **NOT** compliant with the
- * x86-64 ABI, i.e. cannot be called from standard C code.   As noted above,
- * input parameters must be passed via ``%eax``, ``%rbx`` and ``%rcx``, with
- * the return value passed via ``%eax``.  All registers except ``%rsp`` must
- * be treated as volatile from the caller's perspective, including but not
- * limited to GPRs, EFLAGS.DF, MXCSR, FCW, etc...  Conversely, the enclave
- * being run **must** preserve the untrusted ``%rsp`` and stack.
+ * x86-64 ABI, i.e. cannot be called from standard C code. As noted above,
+ * input parameters must be passed via ``%eax``, ``8(%rsp)``, ``0x10(%rsp)`` and
+ * ``0x18(%rsp)``, with the return value passed via ``%eax``. All other
+ * registers will be passed through to the enclave as is. All registers except
+ * ``%rbp`` must be treated as volatile from the caller's perspective, including
+ * but not limited to GPRs, EFLAGS.DF, MXCSR, FCW, etc... Conversely, the
+ * enclave being run **must** preserve the untrusted ``%rbp``.
+ *
+ * ``callback`` has the following signature:
+ * int callback(long rdi, long rsi, long rdx,
+ *		struct sgx_enclave_exinfo *exinfo, long r8, long r9,
+ *		void *tcs, long ursp);
+ * ``callback`` **shall** follow x86_64 ABI. All GPRs **except** ``%rax``,
+ * ``%rbx`` and ``rcx`` are passed through to ``callback``. ``%rdi``, ``%rsi``,
+ * ``%rdx``, ``%r8``, ``%r9``, along with the value of ``%rsp`` when the enclave
+ * exited/excepted, can be accessed directly as input parameters, while other
+ * GPRs can be accessed in assembly if needed.  A positive value returned from
+ * ``callback`` will be treated as an ENCLU leaf (e.g. EENTER/ERESUME) to
+ * reenter the enclave (without popping the extra data pushed by the enclave off
+ * the stack), while 0 (zero) or a negative return value will be passed back to
+ * the caller of __vdso_sgx_enter_enclave(). It is also safe to leave
+ * ``callback`` via ``longjmp()`` or by throwing a C++ exception.
  */
-__vdso_sgx_enter_enclave(u32 leaf, void *tcs,
-			 struct sgx_enclave_exception *ex_info)
+typedef int (*sgx_callback)(long rdi, long rsi, long rdx,
+			    struct sgx_enclave_exinfo *exinfo, long r8,
+			    long r9, void *tcs, long ursp);
+int __vdso_sgx_enter_enclave(int leaf, void *tcs,
+			     struct sgx_enclave_exinfo *exinfo,
+			     sgx_callback callback)
 {
-	if (leaf != SGX_EENTER && leaf != SGX_ERESUME)
-		return -EINVAL;
-
-	if (!tcs)
-		return -EINVAL;
-
-	try {
-		ENCLU[leaf];
-	} catch (exception) {
-		if (e)
-			*e = exception;
-		return -EFAULT;
+	while (leaf == EENTER || leaf == ERESUME) {
+		int rc;
+		try {
+			ENCLU[leaf];
+			rc = 0;
+			if (exinfo)
+				exinfo->leaf = EEXIT;
+		} catch (exception) {
+			rc = -EFAULT;
+			if (exinfo)
+				*exinfo = exception;
+		}
+
+		leaf = callback ? (*callback)(
+			rdi, rsi, rdx, exinfo, r8, r9, tcs, ursp) : rc;
 	}
 
-	return 0;
+	return leaf > 0 ? -EINVAL : leaf;
 }
 #endif
 ENTRY(__vdso_sgx_enter_enclave)
-	/* EENTER <= leaf <= ERESUME */
+	/* Prolog */
+	.cfi_startproc
+	push	%rbp
+	.cfi_adjust_cfa_offset	8
+	.cfi_rel_offset		%rbp, 0
+	mov	%rsp, %rbp
+	.cfi_def_cfa_register	%rbp
+
+1:	/* EENTER <= leaf <= ERESUME */
 	cmp	$0x2, %eax
-	jb	bad_input
-
+	jb	6f
 	cmp	$0x3, %eax
-	ja	bad_input
+	ja	6f
 
-	/* TCS must be non-NULL */
-	test	%rbx, %rbx
-	je	bad_input
+	/* Load TCS and AEP */
+	mov	0x10(%rbp), %rbx
+	lea	2f(%rip), %rcx
 
-	/* Save @exception_info */
-	push	%rcx
+	/* Single ENCLU serving as both EENTER and AEP (ERESUME) */
+2:	enclu
 
-	/* Load AEP for ENCLU */
-	lea	1f(%rip),  %rcx
-1:	enclu
+	/* EEXIT path */
+	xor	%ebx, %ebx
+3:	mov	0x18(%rbp), %rcx
+	jrcxz	4f
+	mov	%eax, EX_LEAF(%rcx)
+	jnc	4f
+	mov	%di, EX_TRAPNR(%rcx)
+	mov	%si, EX_ERROR_CODE(%rcx)
+	mov	%rdx, EX_ADDRESS(%rcx)
 
-	add	$0x8, %rsp
-	xor	%eax, %eax
+4:	/* Call *callback if supplied */
+	mov	0x20(%rbp), %rax
+	test	%rax, %rax
+	/* At this point, %ebx holds the effective return value, which shall be
+	 * returned if no callback is specified */
+	cmovz	%rbx, %rax
+	jz	7f
+	/* Align stack per x86_64 ABI. The original %rsp is saved in %rbx to be
+	 * restored after *callback returns. */
+	mov	%rsp, %rbx
+	and	$-0x10, %rsp
+	/* Clear RFLAGS.DF per x86_64 ABI */
+	cld
+	/* Parameters for *callback */
+	push	%rbx
+	push	0x10(%rbp)
+	/* Call *%rax via retpoline */
+	call	40f
+	/* Restore %rsp to its original value left off by the enclave from last
+	 * exit */
+	mov	%rbx, %rsp
+	/* Positive return value from *callback will be interpreted as an ENCLU
+	 * leaf, while a non-positive value will be interpreted as the return
+	 * value to be passed back to the caller. */
+	jmp	1b
+40:	/* retpoline */
+	call	42f
+41:	pause
+	lfence
+	jmp	41b
+42:	mov	%rax, (%rsp)
 	ret
 
-bad_input:
-	mov     $(-EINVAL), %rax
-	ret
+5:	/* Exception path */
+	mov	$-EFAULT, %ebx
+	stc
+	jmp	3b
 
-.pushsection .fixup, "ax"
-	/* Re-load @exception_info and fill it (if it's non-NULL) */
-2:	pop	%rcx
-	test    %rcx, %rcx
-	je      3f
+6:	/* Unsupported ENCLU leaf */
+	cmp	$0, %eax
+	jle	7f
+	mov	$-EINVAL, %eax
 
-	mov	%eax, EX_LEAF(%rcx)
-	mov	%di,  EX_TRAPNR(%rcx)
-	mov	%si,  EX_ERROR_CODE(%rcx)
-	mov	%rdx, EX_ADDRESS(%rcx)
-3:	mov	$(-EFAULT), %rax
+7:	/* Epilog */
+	leave
+	.cfi_def_cfa		%rsp, 8
 	ret
-.popsection
+	.cfi_endproc
 
-_ASM_VDSO_EXTABLE_HANDLE(1b, 2b)
+_ASM_VDSO_EXTABLE_HANDLE(2b, 5b)
 
 ENDPROC(__vdso_sgx_enter_enclave)
diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
index 9ed690a38c70..50d2b5143e5e 100644
--- a/arch/x86/include/uapi/asm/sgx.h
+++ b/arch/x86/include/uapi/asm/sgx.h
@@ -24,7 +24,7 @@ 
 
 /**
  * struct sgx_enclave_create - parameter structure for the
- *                             %SGX_IOC_ENCLAVE_CREATE ioctl
+ *			       %SGX_IOC_ENCLAVE_CREATE ioctl
  * @src:	address for the SECS page data
  */
 struct sgx_enclave_create  {
@@ -33,7 +33,7 @@  struct sgx_enclave_create  {
 
 /**
  * struct sgx_enclave_add_page - parameter structure for the
- *                               %SGX_IOC_ENCLAVE_ADD_PAGE ioctl
+ *				 %SGX_IOC_ENCLAVE_ADD_PAGE ioctl
  * @addr:	address within the ELRANGE
  * @src:	address for the page data
  * @secinfo:	address for the SECINFO data
@@ -49,7 +49,7 @@  struct sgx_enclave_add_page {
 
 /**
  * struct sgx_enclave_init - parameter structure for the
- *                           %SGX_IOC_ENCLAVE_INIT ioctl
+ *			     %SGX_IOC_ENCLAVE_INIT ioctl
  * @sigstruct:	address for the SIGSTRUCT data
  */
 struct sgx_enclave_init {
@@ -66,16 +66,16 @@  struct sgx_enclave_set_attribute {
 };
 
 /**
- * struct sgx_enclave_exception - structure to report exceptions encountered in
- *				  __vdso_sgx_enter_enclave()
+ * struct sgx_enclave_exinfo - structure to report exceptions encountered in
+ *			       __vdso_sgx_enter_enclave()
  *
- * @leaf:	ENCLU leaf from \%eax at time of exception
+ * @leaf:	ENCLU leaf from \%eax at time of exception/exit
  * @trapnr:	exception trap number, a.k.a. fault vector
  * @error_code:	exception error code
  * @address:	exception address, e.g. CR2 on a #PF
  * @reserved:	reserved for future use
  */
-struct sgx_enclave_exception {
+struct sgx_enclave_exinfo {
 	__u32 leaf;
 	__u16 trapnr;
 	__u16 error_code;